
Core Responsibilities

The capabilities that justify the gateway's existence — routing, load balancing, SSL termination, protocol translation, and request/response transformation.


What is an API Gateway

An API Gateway is a single entry point that sits between external clients and your internal services. It accepts all incoming API requests, applies cross-cutting policies (auth, rate limiting, logging), and routes them to the appropriate backend service. Clients never talk directly to your microservices — they talk to the gateway.

🏨

The Hotel Front Desk

An API Gateway is like a hotel front desk. Guests (clients) don't wander the hallways looking for housekeeping, room service, or maintenance. They go to the front desk, which verifies their identity (auth), checks if they're allowed (authorization), and routes their request to the right department. The departments don't need to handle check-in or verify room keys — the front desk already did that.

The Three Things a Gateway Does Well

Core Gateway Functions

✅ Traffic management: routing, load balancing, rate limiting, circuit breaking
✅ Security boundary: authentication, authorization, TLS termination, IP filtering
✅ API lifecycle: versioning, transformation, documentation, monitoring

What a Gateway is NOT

What Does Not Belong in a Gateway

❌ Business logic: never put domain rules in the gateway
❌ Data storage: the gateway should be stateless (except caching)
❌ Service-to-service communication: that's a service mesh concern (east-west)
❌ Heavy computation: transformation should be lightweight, not CPU-intensive

North-South vs East-West Traffic

The gateway handles north-south traffic — requests from external clients entering your system. East-west traffic (service-to-service communication within your cluster) is handled by a service mesh (Istio, Linkerd) or direct calls. Conflating the two leads to the gateway becoming a bottleneck for internal communication.

Concept	Gateway (North-South)	Service Mesh (East-West)
Traffic direction	External → Internal	Internal → Internal
Clients	Mobile apps, browsers, partners	Microservices talking to each other
Auth model	API keys, JWT, OAuth	mTLS, SPIFFE identities
Typical tool	Kong, AWS API Gateway	Istio, Linkerd, Consul Connect

Request Routing

Routing is the gateway's most fundamental job — matching an incoming request to the correct upstream service. Modern gateways support multiple routing dimensions that can be combined.

Routing Type	Match On	Example
Path-based	URL path prefix or exact match	/api/users/* → user-service
Host-based	Request Host header	api.example.com → api-service
Header-based	Custom header value	X-Version: v2 → service-v2
Method-based	HTTP method	GET /orders → read-replica, POST /orders → primary
Weighted	Percentage split	90% → stable, 10% → canary
Query param	URL query parameters	?region=eu → eu-cluster

kong-routes.yaml (yaml)

# Kong declarative routing configuration
_format_version: "3.0"

services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: users-route
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
        strip_path: true

  - name: order-service
    url: http://order-service:8080
    routes:
      - name: orders-route
        paths:
          - /api/v1/orders
        headers:
          X-Region:
            - us-east
            - us-west
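The "Weighted" row of the routing table can be illustrated with a small Python sketch. This is not how any particular gateway implements traffic splitting; the function name `pick_upstream`, the upstream names, and the choice to hash a request key (so the same client keeps landing on the same side of the split) are assumptions made for illustration.

```python
import hashlib


def pick_upstream(request_key: str, weights: dict[str, int]) -> str:
    """Deterministically map a request key to an upstream, proportional to weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the key into a stable bucket in the range [0, total).
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the cumulative weights until the bucket falls inside one range.
    cumulative = 0
    for upstream, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return upstream
    raise AssertionError("unreachable: bucket is always below total")


# Hypothetical 90/10 canary split, matching the table row above.
WEIGHTS = {"stable": 90, "canary": 10}
```

Calling `pick_upstream("user-42", WEIGHTS)` repeatedly returns the same upstream for that key, while a large population of keys divides roughly 90/10. A purely random split would also satisfy the percentages but would bounce individual clients between versions.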

Dynamic Routing

Static routing maps paths to services at configuration time. Dynamic routing resolves the target at request time — based on JWT claims, database lookups, or service registry queries. This enables tenant-specific routing, feature flags, and A/B testing without redeploying the gateway.
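A minimal Python sketch of request-time resolution follows. It assumes the JWT has already been verified by the gateway's auth step and that its claims arrive as a plain dictionary; the claim names (`tenant_id`, `features`), the hostnames, and the `resolve_upstream` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upstream:
    host: str
    port: int


# Hypothetical lookup tables; in practice these might come from a
# service registry or a database rather than module-level constants.
TENANT_UPSTREAMS = {
    "acme": Upstream("orders-acme.internal", 8080),
}
DEFAULT_UPSTREAM = Upstream("orders-shared.internal", 8080)
BETA_UPSTREAM = Upstream("orders-beta.internal", 8080)


def resolve_upstream(claims: dict) -> Upstream:
    """Pick an upstream at request time from already-verified JWT claims."""
    # A feature flag carried in the token takes precedence over tenant routing.
    if "orders-beta" in claims.get("features", []):
        return BETA_UPSTREAM
    # Dedicated tenants get their own upstream; everyone else shares one.
    return TENANT_UPSTREAMS.get(claims.get("tenant_id"), DEFAULT_UPSTREAM)
```

Because the decision depends only on data available at request time, adding a tenant or toggling a flag changes routing without touching the gateway's static configuration.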

Route Priority

When multiple routes could match, gateways use priority rules: exact path beats prefix, longer prefix beats shorter, specific headers beat wildcards. Understand your gateway's priority model — ambiguous routing is a common source of production incidents.
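The path-related rules above (exact beats prefix, longer prefix beats shorter) can be sketched in a few lines of Python. This is a simplified model, not the priority algorithm of any specific gateway: it ignores headers, hosts, and methods, and the `Route` type and `match_route` function are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    path: str
    upstream: str
    exact: bool = False


def match_route(routes: list[Route], request_path: str) -> Optional[Route]:
    """Return the highest-priority route that matches, or None."""
    candidates = []
    for route in routes:
        if route.exact:
            if request_path == route.path:
                candidates.append(route)
        else:
            # A prefix matches only on a path-segment boundary,
            # so "/api" matches "/api/users" but not "/apiary".
            prefix = route.path.rstrip("/") + "/"
            if request_path == route.path or request_path.startswith(prefix):
                candidates.append(route)
    if not candidates:
        return None
    # Exact routes sort above prefix routes; longer paths break ties.
    return max(candidates, key=lambda r: (r.exact, len(r.path)))
```

With routes for `/`, `/api`, `/api/users`, and an exact `/api/users/me`, a request for `/api/users/42` resolves to the `/api/users` prefix, while `/api/users/me` resolves to the exact route even though the prefix also matches.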

Load Balancing

Once the gateway knows which service to route to, it must choose which instance of that service receives the request. The load balancing algorithm determines fairness, latency, and resilience.

Algorithm	How It Works	Best For
Round Robin	Rotate through instances sequentially	Homogeneous instances, simple workloads
Weighted Round Robin	Rotate but send more to higher-weight instances	Mixed instance sizes (2x CPU = 2x weight)
Least Connections	Send to instance with fewest active requests	Variable request duration (long-running queries)
IP Hash	Hash client IP to pick consistent instance	Session affinity without cookies
Random	Pick a random instance	Large pools where simplicity wins
Least Latency	Send to instance with lowest response time	Latency-sensitive APIs

nginx-load-balancing.conf (nginx)

upstream user_service {
    # Least connections — best for variable response times
    least_conn;

    server user-svc-1:8080 weight=3;
    server user-svc-2:8080 weight=2;
    server user-svc-3:8080 weight=1;

    # Passive health check: after 3 failed attempts within 30s, skip this server for 30s
    server user-svc-4:8080 max_fails=3 fail_timeout=30s;
}

server {
    location /api/users {
        proxy_pass http://user_service;
        proxy_next_upstream error timeout http_502 http_503;
        proxy_next_upstream_tries 2;
    }
}
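To make two rows of the algorithm table concrete, here is a small Python sketch of weighted round robin and least connections. It is deliberately naive: the weighted variant simply repeats each instance in the rotation, whereas production balancers typically interleave picks more evenly. The class names and instance names are illustrative.

```python
import itertools
from typing import Optional


class RoundRobin:
    """Rotate through instances; weights repeat an instance in the rotation."""

    def __init__(self, instances: list[str], weights: Optional[dict[str, int]] = None):
        weights = weights or {}
        expanded = [name for name in instances for _ in range(weights.get(name, 1))]
        if not expanded:
            raise ValueError("at least one instance with positive weight is required")
        self._cycle = itertools.cycle(expanded)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Send each request to the instance with the fewest in-flight requests."""

    def __init__(self, instances: list[str]):
        self.active = {name: 0 for name in instances}

    def acquire(self) -> str:
        # Ties resolve to the earliest-registered instance.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        # Call when the request finishes so the count reflects in-flight work.
        self.active[name] -= 1
```

The key difference is what each algorithm tracks: round robin needs only a position in the rotation, while least connections needs an accurate in-flight count, which means every request must be released when it completes.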

Health-Check-Aware Balancing

A load balancer that doesn't health-check is a traffic distributor that sends requests to dead instances. Active health checks (periodic pings) detect failures before clients hit them. Passive health checks (monitoring response codes) react after failures. Use both — active for fast detection, passive for catching intermittent issues.
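Passive health checking can be sketched as a failure counter with a time window. The Python below is loosely modeled on the `max_fails` / `fail_timeout` idea from the NGINX example, but it is an illustration rather than a reimplementation of NGINX's behavior; the `PassiveHealth` class and its injectable `clock` parameter are assumptions made so the logic can be tested.

```python
import time
from typing import Callable


class PassiveHealth:
    """Mark an instance unavailable after too many recent failures."""

    def __init__(self, max_fails: int = 3, fail_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self._clock = clock
        self._failures: dict[str, list[float]] = {}
        self._down_until: dict[str, float] = {}

    def record_failure(self, instance: str) -> None:
        now = self._clock()
        # Keep only failures that happened inside the sliding window.
        recent = [t for t in self._failures.get(instance, [])
                  if now - t < self.fail_timeout]
        recent.append(now)
        self._failures[instance] = recent
        if len(recent) >= self.max_fails:
            # Take the instance out of rotation for one timeout period.
            self._down_until[instance] = now + self.fail_timeout
            self._failures[instance] = []

    def is_available(self, instance: str) -> bool:
        return self._clock() >= self._down_until.get(instance, float("-inf"))
```

A balancer would call `record_failure` when a proxied request errors out and consult `is_available` before picking an instance. An active checker would complement this by probing instances on a timer, independent of client traffic.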

SSL/TLS Termination

TLS termination at the gateway means the gateway handles the expensive TLS handshake and decryption. Backend services receive plain HTTP, eliminating certificate management from every service and offloading CPU-intensive cryptographic operations.

Mode	Description	Use When
TLS Termination	Gateway decrypts, forwards plain HTTP to backends	Most common — internal network is trusted
TLS Passthrough	Gateway forwards encrypted traffic without decrypting	Gateway can't inspect traffic, end-to-end encryption required
Re-encryption	Gateway decrypts, inspects, re-encrypts to backend	Need inspection + backend encryption (compliance)
SNI Routing	Route based on the SNI extension in the TLS ClientHello, without decrypting	Multi-tenant with passthrough per domain

nginx-tls-termination.conf (nginx)

server {
    listen 443 ssl http2;  # NGINX 1.25.1+ deprecates this form in favor of a separate "http2 on;" directive
    server_name api.example.com;

    # Certificate management
    ssl_certificate     /etc/ssl/certs/api.example.com.pem;
    ssl_certificate_key /etc/ssl/private/api.example.com.key;

    # Modern TLS configuration
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    # OCSP stapling for faster handshakes
    ssl_stapling on;
    ssl_stapling_verify on;

    # Forward to backend as plain HTTP
    location / {
        proxy_pass http://backend_service;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}

Certificate Management at Scale

With dozens of domains, manual certificate management is unsustainable. Automate it: use Let's Encrypt via the ACME protocol, or cloud-managed certificates (AWS ACM, GCP managed certs). The gateway should renew certificates without downtime; Traefik and Caddy do this natively.
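The decision at the heart of automated renewal is simple: renew once a certificate is within some window of its expiry. The Python sketch below shows only that decision, not the ACME exchange itself. The 30-day window, the function names, and the domain names in the usage note are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Assumed renewal window; pick a value that suits your certificate lifetime.
RENEW_BEFORE = timedelta(days=30)


def needs_renewal(not_after: datetime, now: datetime,
                  renew_before: timedelta = RENEW_BEFORE) -> bool:
    """True when the certificate expires within the renewal window."""
    return not_after - now <= renew_before


def due_for_renewal(certs: dict[str, datetime], now: datetime) -> list[str]:
    """Return the domains whose certificates should be renewed, sorted by name."""
    return sorted(domain for domain, not_after in certs.items()
                  if needs_renewal(not_after, now))
```

Given a mapping such as `{"api.example.com": expiry_a, "admin.example.com": expiry_b}`, a scheduled job could call `due_for_renewal` daily and hand the result to whatever performs the actual renewal.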

Protocol Translation

The gateway can translate between protocols — exposing a REST API to external clients while backends communicate via gRPC, or upgrading HTTP/1.1 connections to HTTP/2 for multiplexed backend communication.

Translation	Client Sees	Backend Uses	Why
HTTP/1.1 → HTTP/2	Standard HTTP/1.1	Multiplexed HTTP/2	Connection efficiency, header compression
REST → gRPC	JSON over HTTP	Protobuf over gRPC	Performance for internal services
WebSocket Upgrade	HTTP → WS handshake	WebSocket connection	Bidirectional real-time communication
HTTP → AMQP	Synchronous HTTP POST	Async message queue	Decouple client from async processing
gRPC-Web → gRPC	gRPC-Web (browser)	Native gRPC	Browser clients can't use native gRPC

envoy-grpc-transcoding.yaml (yaml)

# Envoy gRPC-JSON transcoding filter
http_filters:
  - name: envoy.filters.http.grpc_json_transcoder
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
      proto_descriptor: "/etc/envoy/proto.pb"
      services:
        - user.UserService
      print_options:
        add_whitespace: true
        always_print_primitive_fields: true

# Client sends: POST /v1/users {"name": "Alice"}
# Gateway translates to: gRPC UserService.CreateUser(CreateUserRequest{name: "Alice"})
# Backend responds with protobuf, gateway translates back to JSON

🌐

The Universal Translator

Protocol translation makes the gateway like a UN interpreter. External clients speak REST (the common language), but internally your services speak gRPC (faster, typed). The gateway translates in both directions so neither side needs to learn the other's language. Clients get simplicity, services get performance.
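The REST-to-gRPC flow described above can be modeled without any gRPC library. In the Python sketch below, plain dataclasses stand in for protobuf messages and an ordinary class stands in for the gRPC stub, so everything here (`HTTP_RULES`, `transcode`, the `UserService` stand-in) is a hypothetical illustration of the mapping step, not Envoy's implementation.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CreateUserRequest:  # stand-in for a protobuf request message
    name: str


@dataclass
class User:  # stand-in for a protobuf response message
    id: int
    name: str


class UserService:  # stand-in for a generated gRPC client stub
    def CreateUser(self, request: CreateUserRequest) -> User:
        return User(id=1, name=request.name)


# (HTTP method, path) -> (RPC method name, request message type).
# A real transcoder derives this table from annotations in the proto file.
HTTP_RULES = {("POST", "/v1/users"): ("CreateUser", CreateUserRequest)}


def transcode(method: str, path: str, body: str, backend) -> tuple[int, str]:
    """Translate a JSON/HTTP call into a typed RPC call and back again."""
    rule = HTTP_RULES.get((method, path))
    if rule is None:
        return 404, json.dumps({"error": "no matching RPC"})
    rpc_name, request_type = rule
    try:
        # JSON body -> typed request; unknown or missing fields are rejected.
        request = request_type(**json.loads(body))
    except (TypeError, ValueError):
        return 400, json.dumps({"error": "invalid request body"})
    response = getattr(backend, rpc_name)(request)
    # Typed response -> JSON for the external client.
    return 200, json.dumps(asdict(response))
```

The client only ever sees JSON and HTTP status codes, while the backend only ever sees typed messages, which is the decoupling the section describes.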

Request/Response Transformation

Transformation modifies requests before they reach backends and responses before they reach clients. This decouples the external API contract from internal service interfaces.

Transformation	Direction	Example
Header injection	Request	Add X-Request-ID, X-User-ID from JWT
Header removal	Response	Strip internal headers (X-Powered-By, Server)
Body transformation	Both	Rename fields, flatten nested objects
URL rewriting	Request	/api/v1/users → /users (strip prefix)
Response filtering	Response	Remove internal fields from public API
Aggregation	Response	Merge responses from multiple services

kong-transformation-plugin.yaml (yaml)

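plugins:
  # Request IDs: request-transformer templates read request data
  # (headers, query params, URI captures) and have no UUID helper,
  # so generation is left to Kong's correlation-id plugin.
  - name: correlation-id
    config:
      header_name: X-Request-ID
      generator: uuid
      echo_downstream: false

  - name: request-transformer
    config:
      add:
        headers:
          - "X-Forwarded-Host:$(headers.host)"
      remove:
        headers:
          - Cookie
      rename:
        headers:
          - "Authorization:X-Original-Auth"

  - name: response-transformer
    config:
      # A YAML mapping can hold only one "remove" key, so headers and
      # JSON body fields are listed under the same entry.
      remove:
        headers:
          - X-Powered-By
          - Server
          - X-Internal-Trace
        json:
          - internal_id
          - debug_info
      # Timing is already exposed by Kong's own X-Kong-Proxy-Latency and
      # X-Kong-Upstream-Latency response headers.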

Keep Transformations Lightweight

The gateway processes every request, so heavy transformations (large body rewrites, complex JSON manipulation) add latency to every call. If you need significant transformation, consider a dedicated backend-for-frontend (BFF) service behind the gateway. The gateway should handle header manipulation and simple field filtering, not complex reshaping driven by business logic.
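For a sense of what "lightweight" means here, the following Python sketch performs the kinds of transformation listed in the table: header injection on the way in, and header removal plus field filtering on the way out. The function names and the `user_id` parameter (assumed to come from an already-verified JWT) are illustrative.

```python
import uuid

# Names mirror the examples used in this section.
INTERNAL_RESPONSE_HEADERS = {"x-powered-by", "server", "x-internal-trace"}
INTERNAL_FIELDS = {"internal_id", "debug_info"}


def transform_request(headers: dict[str, str], user_id: str) -> dict[str, str]:
    """Inject tracing and identity headers before forwarding upstream."""
    out = dict(headers)
    # Keep a caller-supplied request ID if one exists (exact-case match only).
    out.setdefault("X-Request-ID", str(uuid.uuid4()))
    out["X-User-ID"] = user_id
    return out


def transform_response(headers: dict[str, str],
                       body: dict) -> tuple[dict[str, str], dict]:
    """Strip internal headers and top-level internal fields before replying."""
    clean_headers = {k: v for k, v in headers.items()
                     if k.lower() not in INTERNAL_RESPONSE_HEADERS}
    clean_body = {k: v for k, v in body.items() if k not in INTERNAL_FIELDS}
    return clean_headers, clean_body
```

Both functions are single passes over small dictionaries with no parsing of nested structures. Once a transformation needs to walk deep object trees or join data from several services, it has likely outgrown the gateway.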

Gateway vs Related Concepts

The API Gateway overlaps with several other infrastructure components. Understanding the boundaries prevents architectural confusion.

Component	Primary Role	Overlap with Gateway	Key Difference
Reverse Proxy	Forward requests to backends	Routing, TLS, load balancing	No API-specific features (auth, rate limiting, versioning)
Load Balancer	Distribute traffic across instances	Traffic distribution	Often Layer 4 (TCP) with no request inspection; Layer 7 balancers inspect HTTP but lack API management features
Service Mesh	Service-to-service communication	mTLS, observability, retries	East-west traffic, sidecar model, not client-facing
CDN	Cache and serve static content at edge	Caching, TLS termination	Optimized for static assets, not API logic
WAF	Block malicious requests	Security filtering	Focused on attack patterns (SQLi, XSS), not API management
Ingress Controller	Route external traffic into Kubernetes	Routing, TLS	Kubernetes-specific, often backed by a gateway (NGINX, Envoy)

They're Complementary, Not Competing

In production, you often use several together: CDN → WAF → API Gateway → Service Mesh. The CDN handles static content and edge caching. The WAF blocks attacks. The gateway handles API-specific concerns. The service mesh handles internal communication. Each layer does what it does best.

Interview Questions

Q: What's the difference between an API Gateway and a reverse proxy?

A: A reverse proxy forwards requests to backend servers and handles basic concerns like TLS and load balancing. An API Gateway does everything a reverse proxy does PLUS API-specific features: authentication, rate limiting, request/response transformation, API versioning, developer portal, and analytics. NGINX is a reverse proxy; Kong (built on NGINX) is an API Gateway. The gateway is a superset.

Q: Why terminate TLS at the gateway instead of at each service?

A: Three reasons: (1) Certificate management — manage certs in one place instead of N services. (2) Performance — TLS handshakes are CPU-intensive; offload to dedicated gateway hardware. (3) Inspection — the gateway needs to read request headers/body for routing, auth, and rate limiting. With TLS passthrough, the gateway can't inspect traffic. The trade-off: internal traffic is unencrypted unless you re-encrypt.

Q: How does an API Gateway handle the single point of failure problem?

A: Deploy multiple gateway instances behind a network load balancer (L4). The gateway itself should be stateless: all state (rate limit counters, sessions) lives in external stores such as Redis. This allows horizontal scaling and fast failover. Use active-active deployment across availability zones. If one gateway instance dies, the NLB stops routing to it once health checks fail, so only requests in flight on that instance are affected.

Q: When would you use TLS passthrough instead of termination?

A: Use passthrough when: (1) Compliance requires end-to-end encryption with no intermediary decryption. (2) The backend needs to verify the client certificate directly (mTLS where the backend is the trust anchor). (3) You don't need the gateway to inspect request content. The trade-off: the gateway can only route based on SNI (hostname) — it can't do path-based routing, header inspection, or body transformation.

Q: Explain how protocol translation at the gateway benefits a microservices architecture.

A: External clients use REST/JSON (simple, universal, browser-friendly). Internal services use gRPC (fast, typed, streaming). The gateway translates between them — clients get a simple API, services get performance. This also means you can change internal protocols without breaking external clients. The gateway absorbs the protocol mismatch, acting as an anti-corruption layer between external and internal contracts.

Common Mistakes

⚠️

Putting business logic in the gateway

Adding domain-specific validation, data enrichment, or workflow orchestration to gateway plugins.

✅ The gateway handles cross-cutting concerns only: auth, rate limiting, routing, transformation. Business logic belongs in services. If you need request enrichment, use a BFF service behind the gateway.

⚠️

Single gateway instance with no redundancy

Running one gateway instance because 'it's just a proxy' — creating a single point of failure for all traffic.

✅ Deploy at least 2 instances across availability zones behind a network load balancer. The gateway sits on the path of every external request, so if it goes down, everything goes down.

⚠️

Using the gateway for east-west traffic

Routing all service-to-service calls through the API Gateway, creating a bottleneck and unnecessary hop.

✅ The gateway handles north-south (external → internal) traffic only. For east-west (service-to-service), use direct calls or a service mesh. Internal services don't need the gateway's auth or rate limiting because they have their own trust model.

⚠️

Heavy body transformation in the gateway

Performing complex JSON restructuring, data aggregation from multiple sources, or XML-to-JSON conversion for every request in the gateway layer.

✅ Keep gateway transformations lightweight: header manipulation, field filtering, simple renames. For complex transformations, use a dedicated BFF or transformation service. The gateway processes every request, so heavy computation here adds latency to everything.