
Core Responsibilities

The capabilities that justify the gateway's existence — routing, load balancing, SSL termination, protocol translation, and request/response transformation.

01. What is an API Gateway

An API Gateway is a single entry point that sits between external clients and your internal services. It accepts all incoming API requests, applies cross-cutting policies (auth, rate limiting, logging), and routes them to the appropriate backend service. Clients never talk directly to your microservices — they talk to the gateway.
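The flow above can be sketched in a few lines of Python. The route table, API-key check, and service names are illustrative assumptions, not any particular gateway's API: cross-cutting policies run once at the edge, then the request is forwarded by path prefix.

```python
# Hypothetical gateway pipeline: policy checks first, then routing.
ROUTES = {"/api/users": "user-service", "/api/orders": "order-service"}

def handle(request: dict) -> str:
    # 1. Cross-cutting policy (stub auth check) runs once, at the edge
    if request.get("api_key") != "valid-key":
        return "401 Unauthorized"
    # 2. Route to the backend whose path prefix matches
    for prefix, service in ROUTES.items():
        if request["path"].startswith(prefix):
            return f"forward to {service}"
    return "404 Not Found"
```

Backends never see unauthenticated traffic; the gateway rejected it before routing.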

🏨

The Hotel Front Desk

An API Gateway is like a hotel front desk. Guests (clients) don't wander the hallways looking for housekeeping, room service, or maintenance. They go to the front desk, which verifies their identity (auth), checks if they're allowed (authorization), and routes their request to the right department. The departments don't need to handle check-in or verify room keys — the front desk already did that.

The Three Things a Gateway Does Well

Core Gateway Functions

  • Traffic management — routing, load balancing, rate limiting, circuit breaking
  • Security boundary — authentication, authorization, TLS termination, IP filtering
  • API lifecycle — versioning, transformation, documentation, monitoring

What a Gateway is NOT

What Does Not Belong in a Gateway

  • Business logic — never put domain rules in the gateway
  • Data storage — the gateway should be stateless (except caching)
  • Service-to-service communication — that's a service mesh concern (east-west)
  • Heavy computation — transformation should be lightweight, not CPU-intensive

North-South vs East-West Traffic

The gateway handles north-south traffic — requests from external clients entering your system. East-west traffic (service-to-service communication within your cluster) is handled by a service mesh (Istio, Linkerd) or direct calls. Conflating the two leads to the gateway becoming a bottleneck for internal communication.

| Concept | Gateway (North-South) | Service Mesh (East-West) |
| --- | --- | --- |
| Traffic direction | External → Internal | Internal → Internal |
| Clients | Mobile apps, browsers, partners | Microservices talking to each other |
| Auth model | API keys, JWT, OAuth | mTLS, SPIFFE identities |
| Typical tool | Kong, AWS API Gateway | Istio, Linkerd, Consul Connect |
02. Request Routing

Routing is the gateway's most fundamental job — matching an incoming request to the correct upstream service. Modern gateways support multiple routing dimensions that can be combined.

| Routing Type | Match On | Example |
| --- | --- | --- |
| Path-based | URL path prefix or exact match | /api/users/* → user-service |
| Host-based | Request Host header | api.example.com → api-service |
| Header-based | Custom header value | X-Version: v2 → service-v2 |
| Method-based | HTTP method | GET /orders → read-replica, POST /orders → primary |
| Weighted | Percentage split | 90% → stable, 10% → canary |
| Query param | URL query parameters | ?region=eu → eu-cluster |
`kong-routes.yaml`

```yaml
# Kong declarative routing configuration
_format_version: "3.0"

services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: users-route
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
        strip_path: true

  - name: order-service
    url: http://order-service:8080
    routes:
      - name: orders-route
        paths:
          - /api/v1/orders
        headers:
          X-Region:
            - us-east
            - us-west
```

Dynamic Routing

Static routing maps paths to services at configuration time. Dynamic routing resolves the target at request time — based on JWT claims, database lookups, or service registry queries. This enables tenant-specific routing, feature flags, and A/B testing without redeploying the gateway.
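A minimal sketch of the idea, assuming a hypothetical `tenant` claim in the JWT and an in-memory tenant registry (in practice this would be a service registry or database lookup):

```python
# Dynamic routing: the upstream is resolved per request from a JWT claim,
# not fixed at configuration time. Names here are illustrative assumptions.
TENANT_CLUSTERS = {
    "acme": "http://acme-cluster:8080",
    "globex": "http://globex-cluster:8080",
}
DEFAULT_UPSTREAM = "http://shared-cluster:8080"

def resolve_upstream(jwt_claims: dict) -> str:
    # Look up the tenant claim at request time; fall back to the shared pool
    return TENANT_CLUSTERS.get(jwt_claims.get("tenant"), DEFAULT_UPSTREAM)
```

Adding a tenant means updating the registry, not redeploying the gateway.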

Route Priority

When multiple routes could match, gateways use priority rules: exact path beats prefix, longer prefix beats shorter, specific headers beat wildcards. Understand your gateway's priority model — ambiguous routing is a common source of production incidents.
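One common priority model (exact match first, then longest matching prefix) can be sketched as follows; your gateway may order routes differently, so treat this as an illustration, not a spec:

```python
# Route priority sketch: exact path beats prefix, longer prefix beats shorter.
def pick_route(path: str, routes):
    """routes is a list of (path_pattern, service) pairs."""
    exact = [svc for pattern, svc in routes if pattern == path]
    if exact:
        return exact[0]                      # exact match wins outright
    prefixes = [(p, svc) for p, svc in routes if path.startswith(p)]
    if not prefixes:
        return None
    # Among prefix matches, the longest (most specific) prefix wins
    return max(prefixes, key=lambda r: len(r[0]))[1]
```

With routes `/api`, `/api/users`, and `/api/users/me`, a request to `/api/users/42` lands on the `/api/users` service, not the `/api` catch-all.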

03. Load Balancing

Once the gateway knows which service to route to, it must choose which instance of that service receives the request. The load balancing algorithm determines fairness, latency, and resilience.

| Algorithm | How It Works | Best For |
| --- | --- | --- |
| Round Robin | Rotate through instances sequentially | Homogeneous instances, simple workloads |
| Weighted Round Robin | Rotate but send more to higher-weight instances | Mixed instance sizes (2x CPU = 2x weight) |
| Least Connections | Send to instance with fewest active requests | Variable request duration (long-running queries) |
| IP Hash | Hash client IP to pick consistent instance | Session affinity without cookies |
| Random | Pick a random instance | Large pools where simplicity wins |
| Least Latency | Send to instance with lowest response time | Latency-sensitive APIs |
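Two of the algorithms above, sketched in simplified, single-threaded Python (a real balancer tracks this state concurrently):

```python
import itertools

def weighted_round_robin(servers: dict):
    """Yield servers in proportion to weight: {'a': 2, 'b': 1} -> a, a, b, ..."""
    expanded = [s for s, weight in servers.items() for _ in range(weight)]
    return itertools.cycle(expanded)

def least_connections(active: dict) -> str:
    """Pick the instance with the fewest in-flight requests."""
    return min(active, key=active.get)
```

Smoother weighted schedules exist (NGINX interleaves rather than bursting), but the proportion per cycle is the same.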
`nginx-load-balancing.conf`

```nginx
upstream user_service {
    # Least connections: best for variable response times
    least_conn;

    server user-svc-1:8080 weight=3;
    server user-svc-2:8080 weight=2;
    server user-svc-3:8080 weight=1;

    # Passive health check: mark as down after 3 failures
    server user-svc-4:8080 max_fails=3 fail_timeout=30s;
}

server {
    location /api/users {
        proxy_pass http://user_service;
        proxy_next_upstream error timeout http_502 http_503;
        proxy_next_upstream_tries 2;
    }
}
```

Health-Check-Aware Balancing

A load balancer that doesn't health-check is a traffic distributor that sends requests to dead instances. Active health checks (periodic pings) detect failures before clients hit them. Passive health checks (monitoring response codes) react after failures. Use both — active for fast detection, passive for catching intermittent issues.
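A passive check along the lines of NGINX's `max_fails`/`fail_timeout` can be sketched as follows (simplified and not thread-safe; real implementations also distinguish failure types):

```python
import time

class PassiveHealth:
    """Eject an instance after max_fails consecutive failures,
    for fail_timeout seconds, then allow it to be tried again."""

    def __init__(self, max_fails: int = 3, fail_timeout: float = 30.0):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = {}       # instance -> consecutive failure count
        self.down_until = {}  # instance -> timestamp when it may be retried

    def record(self, instance: str, ok: bool) -> None:
        if ok:
            self.fails[instance] = 0  # any success resets the streak
            return
        self.fails[instance] = self.fails.get(instance, 0) + 1
        if self.fails[instance] >= self.max_fails:
            self.down_until[instance] = time.time() + self.fail_timeout

    def healthy(self, instance: str) -> bool:
        return time.time() >= self.down_until.get(instance, 0.0)
```

An active checker would run alongside this, pinging each instance on a timer and calling `record()` with the result.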

04. SSL/TLS Termination

TLS termination at the gateway means the gateway handles the expensive TLS handshake and decryption. Backend services receive plain HTTP, eliminating certificate management from every service and offloading CPU-intensive cryptographic operations.

| Mode | Description | Use When |
| --- | --- | --- |
| TLS Termination | Gateway decrypts, forwards plain HTTP to backends | Most common — internal network is trusted |
| TLS Passthrough | Gateway forwards encrypted traffic without decrypting | Gateway can't inspect traffic, end-to-end encryption required |
| Re-encryption | Gateway decrypts, inspects, re-encrypts to backend | Need inspection + backend encryption (compliance) |
| SNI Routing | Route based on TLS SNI header without decrypting | Multi-tenant with passthrough per domain |
`nginx-tls-termination.conf`

```nginx
server {
    listen 443 ssl http2;
    server_name api.example.com;

    # Certificate management
    ssl_certificate     /etc/ssl/certs/api.example.com.pem;
    ssl_certificate_key /etc/ssl/private/api.example.com.key;

    # Modern TLS configuration
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    # OCSP stapling for faster handshakes
    ssl_stapling on;
    ssl_stapling_verify on;

    # Forward to backend as plain HTTP
    location / {
        proxy_pass http://backend_service;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}
```

Certificate Management at Scale

With dozens of domains, manual cert management is unsustainable. Use automated certificate management: Let's Encrypt with ACME protocol, or cloud-managed certificates (AWS ACM, GCP managed certs). The gateway should auto-renew certificates without downtime — Traefik and Caddy do this natively.

05. Protocol Translation

The gateway can translate between protocols — exposing a REST API to external clients while backends communicate via gRPC, or upgrading HTTP/1.1 connections to HTTP/2 for multiplexed backend communication.

| Translation | Client Sees | Backend Uses | Why |
| --- | --- | --- | --- |
| HTTP/1.1 → HTTP/2 | Standard HTTP/1.1 | Multiplexed HTTP/2 | Connection efficiency, header compression |
| REST → gRPC | JSON over HTTP | Protobuf over gRPC | Performance for internal services |
| WebSocket Upgrade | HTTP → WS handshake | WebSocket connection | Bidirectional real-time communication |
| HTTP → AMQP | Synchronous HTTP POST | Async message queue | Decouple client from async processing |
| gRPC-Web → gRPC | gRPC-Web (browser) | Native gRPC | Browser clients can't use native gRPC |
`envoy-grpc-transcoding.yaml`

```yaml
# Envoy gRPC-JSON transcoding filter
http_filters:
  - name: envoy.filters.http.grpc_json_transcoder
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
      proto_descriptor: "/etc/envoy/proto.pb"
      services:
        - user.UserService
      print_options:
        add_whitespace: true
        always_print_primitive_fields: true

# Client sends: POST /v1/users {"name": "Alice"}
# Gateway translates to: gRPC UserService.CreateUser(CreateUserRequest{name: "Alice"})
# Backend responds with protobuf, gateway translates back to JSON
```
🌐

The Universal Translator

Protocol translation makes the gateway like a UN interpreter. External clients speak REST (the common language), but internally your services speak gRPC (faster, typed). The gateway translates in both directions so neither side needs to learn the other's language. Clients get simplicity, services get performance.

06. Request/Response Transformation

Transformation modifies requests before they reach backends and responses before they reach clients. This decouples the external API contract from internal service interfaces.

| Transformation | Direction | Example |
| --- | --- | --- |
| Header injection | Request | Add X-Request-ID, X-User-ID from JWT |
| Header removal | Response | Strip internal headers (X-Powered-By, Server) |
| Body transformation | Both | Rename fields, flatten nested objects |
| URL rewriting | Request | /api/v1/users → /users (strip prefix) |
| Response filtering | Response | Remove internal fields from public API |
| Aggregation | Response | Merge responses from multiple services |
`kong-transformation-plugin.yaml`

```yaml
plugins:
  - name: request-transformer
    config:
      add:
        headers:
          - "X-Request-ID:$(uuid)"
          - "X-Forwarded-Host:$(headers.host)"
      remove:
        headers:
          - Cookie
      rename:
        headers:
          - "Authorization:X-Original-Auth"

  - name: response-transformer
    config:
      remove:
        headers:
          - X-Powered-By
          - Server
          - X-Internal-Trace
        json:
          - internal_id
          - debug_info
      add:
        headers:
          - "X-Response-Time:$(latency)"
```

Keep Transformations Lightweight

The gateway processes every request. Heavy transformations (large body rewrites, complex JSON manipulation) add latency to every call. If you need significant transformation, consider a dedicated BFF service behind the gateway. The gateway should handle header manipulation and simple field filtering — not complex business logic reshaping.

07. Gateway vs Related Concepts

The API Gateway overlaps with several other infrastructure components. Understanding the boundaries prevents architectural confusion.

| Component | Primary Role | Overlap with Gateway | Key Difference |
| --- | --- | --- | --- |
| Reverse Proxy | Forward requests to backends | Routing, TLS, load balancing | No API-specific features (auth, rate limiting, versioning) |
| Load Balancer | Distribute traffic across instances | Traffic distribution | Layer 4 (TCP) vs Layer 7 (HTTP) — no request inspection |
| Service Mesh | Service-to-service communication | mTLS, observability, retries | East-west traffic, sidecar model, not client-facing |
| CDN | Cache and serve static content at edge | Caching, TLS termination | Optimized for static assets, not API logic |
| WAF | Block malicious requests | Security filtering | Focused on attack patterns (SQLi, XSS), not API management |
| Ingress Controller | Route external traffic into Kubernetes | Routing, TLS | Kubernetes-specific, often backed by a gateway (NGINX, Envoy) |

They're Complementary, Not Competing

In production, you often use several together: CDN → WAF → API Gateway → Service Mesh. The CDN handles static content and edge caching. The WAF blocks attacks. The gateway handles API-specific concerns. The service mesh handles internal communication. Each layer does what it does best.

08. Interview Questions

Q: What's the difference between an API Gateway and a reverse proxy?

A: A reverse proxy forwards requests to backend servers and handles basic concerns like TLS and load balancing. An API Gateway does everything a reverse proxy does PLUS API-specific features: authentication, rate limiting, request/response transformation, API versioning, developer portal, and analytics. NGINX is a reverse proxy; Kong (built on NGINX) is an API Gateway. The gateway is a superset.

Q: Why terminate TLS at the gateway instead of at each service?

A: Three reasons: (1) Certificate management — manage certs in one place instead of N services. (2) Performance — TLS handshakes are CPU-intensive; offload to dedicated gateway hardware. (3) Inspection — the gateway needs to read request headers/body for routing, auth, and rate limiting. With TLS passthrough, the gateway can't inspect traffic. The trade-off: internal traffic is unencrypted unless you re-encrypt.

Q: How does an API Gateway handle the single point of failure problem?

A: Deploy multiple gateway instances behind a network load balancer (L4). The gateway itself should be stateless — all state (rate limit counters, sessions) lives in external stores (Redis). This allows horizontal scaling and instant failover. Use active-active deployment across availability zones. If one gateway instance dies, the NLB routes to healthy instances with zero downtime.
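The statelessness point can be sketched with a fixed-window rate limiter whose counter lives in a shared store. A plain dict stands in for Redis INCR here, so any gateway instance consulting the same store gives the same answer:

```python
import time

class SharedStore:
    """Stand-in for an external store like Redis (INCR semantics)."""
    def __init__(self):
        self.data = {}
    def incr(self, key: str) -> int:
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

def allow(store: SharedStore, client: str, limit: int, window: int = 60) -> bool:
    # One counter per client per time window; no state on the gateway itself
    key = f"{client}:{int(time.time()) // window}"
    return store.incr(key) <= limit
```

Because the counter is keyed externally, an instance can die mid-window and its replacement enforces the same limit.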

Q: When would you use TLS passthrough instead of termination?

A: Use passthrough when: (1) Compliance requires end-to-end encryption with no intermediary decryption. (2) The backend needs to verify the client certificate directly (mTLS where the backend is the trust anchor). (3) You don't need the gateway to inspect request content. The trade-off: the gateway can only route based on SNI (hostname) — it can't do path-based routing, header inspection, or body transformation.

Q: Explain how protocol translation at the gateway benefits a microservices architecture.

A: External clients use REST/JSON (simple, universal, browser-friendly). Internal services use gRPC (fast, typed, streaming). The gateway translates between them — clients get a simple API, services get performance. This also means you can change internal protocols without breaking external clients. The gateway absorbs the protocol mismatch, acting as an anti-corruption layer between external and internal contracts.

09. Common Mistakes

⚠️

Putting business logic in the gateway

Adding domain-specific validation, data enrichment, or workflow orchestration to gateway plugins.

The gateway handles cross-cutting concerns only: auth, rate limiting, routing, transformation. Business logic belongs in services. If you need request enrichment, use a BFF service behind the gateway.

⚠️

Single gateway instance with no redundancy

Running one gateway instance because 'it's just a proxy' — creating a single point of failure for all traffic.

Deploy at minimum 2 instances across availability zones behind a network load balancer. The gateway is the most critical piece of infrastructure — if it goes down, everything goes down.

⚠️

Using the gateway for east-west traffic

Routing all service-to-service calls through the API Gateway, creating a bottleneck and unnecessary hop.

The gateway handles north-south (external → internal) traffic only. For east-west (service-to-service), use direct calls or a service mesh. Internal services don't need the gateway's auth or rate limiting — they have their own trust model.

⚠️

Heavy body transformation in the gateway

Performing complex JSON restructuring, data aggregation from multiple sources, or XML-to-JSON conversion for every request in the gateway layer.

Keep gateway transformations lightweight: header manipulation, field filtering, simple renames. For complex transformations, use a dedicated BFF or transformation service. The gateway processes every request — heavy computation here adds latency to everything.