Security & WebSocket Support
The gateway as the security perimeter — DDoS mitigation, CORS, bot detection — plus handling persistent connections: WebSocket, SSE, gRPC, and GraphQL.
DDoS Mitigation
The gateway is your main defense against application-layer (L7) DDoS attacks. It can't stop volumetric attacks (that's your CDN or cloud provider's job), but it can mitigate L7 attacks that pass through network-level defenses.
| Attack Type | Description | Gateway Mitigation |
|---|---|---|
| Volumetric | Flood bandwidth (UDP flood, amplification) | Not gateway's job — use CDN/cloud DDoS protection |
| Connection flood | Exhaust connection slots (SYN flood) | Connection rate limiting, SYN cookies at OS level |
| Slowloris | Open connections, send data very slowly | Request timeout, minimum data rate enforcement |
| Application L7 | Valid-looking requests at high volume | Rate limiting, behavioral analysis, CAPTCHA |
| Payload attacks | Oversized bodies, deeply nested JSON | Body size limits, parsing depth limits |
    # Connection limiting: track concurrent connections per client IP
    limit_conn_zone $binary_remote_addr zone=conn_limit:10m;
    limit_conn conn_limit 100;  # Default: max 100 concurrent connections per IP

    # Request rate limiting, applied before auth (cheap rejection)
    limit_req_zone $binary_remote_addr zone=req_limit:10m rate=50r/s;

    server {
        # Body size limit: reject oversized payloads immediately
        client_max_body_size 10m;

        # Slowloris protection: drop clients that stall between reads or writes
        client_body_timeout 10s;
        client_header_timeout 10s;
        send_timeout 10s;

        # Limit request header size
        large_client_header_buffers 4 8k;

        location /api/ {
            limit_req zone=req_limit burst=100 nodelay;
            limit_conn conn_limit 50;  # Stricter per-IP cap for API routes
            proxy_pass http://backend;
        }
    }
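The `limit_req` settings above pair a steady rate with a burst allowance. A minimal sketch of that idea is a per-IP token bucket. This is an illustration only: NGINX itself uses a leaky-bucket algorithm, and the `TokenBucket` and `PerIpLimiter` names are hypothetical.

```python
import time
from typing import Optional


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # start full so a new client may burst
        self.updated = None   # timestamp of the last refill

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerIpLimiter:
    """Keeps one bucket per client IP (hypothetical helper, no eviction)."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.buckets: dict[str, TokenBucket] = {}

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if ip not in self.buckets:
            self.buckets[ip] = TokenBucket(self.rate, self.burst)
        return self.buckets[ip].allow(now)
```

With `PerIpLimiter(rate=50, burst=100)`, a client can send 100 requests at once and is then held to roughly 50 per second. A production limiter would also evict idle buckets to bound memory.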
Defense in Depth
DDoS mitigation is layered: (1) Cloud provider absorbs volumetric attacks (AWS Shield, Cloudflare). (2) CDN/WAF filters known bad patterns. (3) Gateway rate-limits per IP before auth. (4) Gateway rate-limits per consumer after auth. No single layer handles everything — each catches what the previous layer missed.
IP Filtering & Bot Detection
IP filtering is the simplest security mechanism — allow or deny traffic based on source IP. Bot detection goes further, identifying automated traffic that mimics legitimate clients.
| Technique | How It Works | Effectiveness |
|---|---|---|
| IP Allowlist | Only permit traffic from known IPs | High for admin/internal APIs |
| IP Blocklist | Block known malicious IPs (threat feeds) | Low — attackers rotate IPs |
| Geo-blocking | Block traffic from specific countries | Moderate — reduces attack surface |
| User-Agent filtering | Block known bot user-agents | Low — trivially spoofed |
| Behavioral analysis | Detect patterns: request rate, path traversal, timing | High — hard to mimic human behavior |
| CAPTCHA challenge | Challenge suspicious traffic with a human-verification test | High, but hurts UX |
    # IP restriction plugin: allowlist for admin API
    plugins:
      - name: ip-restriction
        route: admin-api
        config:
          allow:
            - 10.0.0.0/8        # Internal network
            - 203.0.113.0/24    # Office IP range
          deny:
            - 198.51.100.0/24   # Known bad actor range
          status: 403
          message: "Access denied from your IP address"

    # Geo-blocking (requires GeoIP database)
    # Block traffic from countries where you have no customers
    # Reduces attack surface without affecting legitimate users
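The allow/deny check behind a plugin like this can be sketched with Python's standard `ipaddress` module. The CIDR ranges are the ones from the configuration above. The rule that a deny match wins over an allow match is an assumption of this sketch, not a statement about any particular gateway.

```python
import ipaddress

ALLOW = [ipaddress.ip_network(c) for c in ("10.0.0.0/8", "203.0.113.0/24")]
DENY = [ipaddress.ip_network(c) for c in ("198.51.100.0/24",)]


def is_allowed(client_ip: str) -> bool:
    """Return True only for addresses inside ALLOW and outside DENY."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: fail closed
    # Assumption: an explicit deny takes precedence over an allow.
    if any(addr in net for net in DENY):
        return False
    return any(addr in net for net in ALLOW)
```

Behind a CDN or load balancer, the address passed in must come from a trusted source (for example a forwarded-for header set by your own proxy), otherwise the check can be spoofed.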
Bot Detection Signals
- ✅Request rate anomalies — 1000 req/min from a single source
- ✅Missing or unusual headers — no Accept, no Referer on browser requests
- ✅Sequential path scanning — /admin, /wp-admin, /.env, /config
- ✅Timing patterns — perfectly uniform request intervals (not human)
- ✅TLS fingerprinting — JA3 hash identifies client libraries vs browsers
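A few of the signals above can be combined into a simple score. The sketch below is illustrative only: the weights, the `max_cv` threshold, and the function names are assumptions, and a real detector would use many more signals.

```python
from statistics import mean, pstdev

SCAN_PATHS = {"/admin", "/wp-admin", "/.env", "/config"}


def timing_is_uniform(timestamps: list[float], max_cv: float = 0.05) -> bool:
    """True when the gaps between requests are nearly identical."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 5:
        return False  # too few samples to judge
    avg = mean(gaps)
    if avg <= 0:
        return False  # avoid dividing by zero; bursts are a rate-limit concern
    # Coefficient of variation: low values mean machine-like regularity.
    return pstdev(gaps) / avg <= max_cv


def bot_score(timestamps: list[float], paths: list[str], headers: dict[str, str]) -> int:
    """Sum of weighted signals; higher means more bot-like."""
    score = 0
    if timing_is_uniform(timestamps):
        score += 2
    if len(SCAN_PATHS & set(paths)) >= 2:
        score += 2  # probing several well-known sensitive paths
    if "accept" not in {name.lower() for name in headers}:
        score += 1  # browsers send an Accept header
    return score
```

A gateway could challenge or throttle clients above a chosen score rather than blocking them outright, which limits the damage from false positives.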
CORS
CORS (Cross-Origin Resource Sharing) controls which web origins can call your API from a browser. The gateway is the ideal place to handle CORS centrally — instead of every service implementing it independently.
| Header | Purpose | Example |
|---|---|---|
| Access-Control-Allow-Origin | Which origins can access the API | https://app.example.com |
| Access-Control-Allow-Methods | Which HTTP methods are permitted | GET, POST, PUT, DELETE |
| Access-Control-Allow-Headers | Which request headers are permitted | Authorization, Content-Type |
| Access-Control-Max-Age | How long to cache preflight response | 86400 (24 hours) |
| Access-Control-Allow-Credentials | Whether credentialed requests (cookies, HTTP auth) are allowed | true |
| Access-Control-Expose-Headers | Which response headers JS can read | X-RateLimit-Remaining |
    plugins:
      - name: cors
        config:
          origins:
            - https://app.example.com
            - https://admin.example.com
          methods:
            - GET
            - POST
            - PUT
            - DELETE
            - OPTIONS
          headers:
            - Authorization
            - Content-Type
            - X-Request-ID
          exposed_headers:
            - X-RateLimit-Limit
            - X-RateLimit-Remaining
            - X-RateLimit-Reset
          credentials: true
          max_age: 86400              # Cache preflight for 24 hours
          preflight_continue: false   # Gateway handles OPTIONS, don't forward
Preflight Requests (OPTIONS)
Browsers send a preflight OPTIONS request before any "non-simple" request (custom headers, PUT/DELETE methods, JSON content-type). The gateway should handle OPTIONS directly and return CORS headers without forwarding to the upstream service. Set a long max_age to reduce preflight frequency: browsers cache the response, although each browser caps the value (Chromium at 2 hours, Firefox at 24 hours).
CORS Security Rules
- ❌Access-Control-Allow-Origin: * with credentials — browsers reject this combination
- ❌Reflecting the Origin header as Allow-Origin without validation — allows any site
- ❌Allowing all headers without restriction — expands attack surface
- ❌Not handling OPTIONS at the gateway — preflight hits backend unnecessarily
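The safe alternative to reflecting any origin is an exact-match allowlist lookup. A minimal sketch, assuming the two origins from the plugin configuration above and a hypothetical `cors_headers` helper:

```python
from typing import Optional

ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
    "https://admin.example.com",
})


def cors_headers(origin: Optional[str]) -> dict[str, str]:
    """Return CORS response headers for an allowed origin, else none."""
    # Exact string match: substring or suffix checks would let
    # https://app.example.com.evil.test through.
    if origin is None or origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # The response varies by origin, so caches must key on it.
        "Vary": "Origin",
    }
```

Returning no CORS headers for an unknown origin is enough: the browser blocks the calling script from reading the response.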
Request Validation & Sanitization
The gateway can validate requests against a schema before forwarding — rejecting malformed requests early and protecting backends from unexpected input.
| Validation Type | What It Checks | Rejects |
|---|---|---|
| Schema validation | Request body matches OpenAPI/JSON Schema | Missing required fields, wrong types |
| Header validation | Required headers present, format correct | Missing Content-Type, invalid Accept |
| Parameter validation | Path/query params match expected format | Non-numeric ID, invalid enum value |
| Size validation | Body size, header count, URL length | Oversized payloads, header bombs |
| Content-Type enforcement | Body matches declared Content-Type | JSON body with text/plain header |
    # Gateway validates request bodies and parameters against a schema
    plugins:
      - name: request-validator
        config:
          # Inline JSON Schema (as used in an OpenAPI spec) for the request body
          body_schema: |
            {
              "type": "object",
              "required": ["name", "email"],
              "properties": {
                "name": {"type": "string", "minLength": 1, "maxLength": 100},
                "email": {"type": "string", "format": "email"},
                "age": {"type": "integer", "minimum": 0, "maximum": 150}
              },
              "additionalProperties": false
            }
          allowed_content_types:
            - application/json
          verbose_response: false  # Don't expose schema details in errors
          parameter_schema:
            - name: id
              in: path
              required: true
              schema:
                type: string
                pattern: "^[a-f0-9-]{36}$"  # UUID-shaped: 36 hex or hyphen characters
Validate at Gateway, Trust in Service
Gateway validation catches obviously malformed requests (wrong types, missing fields, oversized bodies). Service-level validation handles business rules (is this email already registered? is this product in stock?). The gateway rejects garbage early; the service validates business semantics.
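To make the structural checks concrete, here is a hand-rolled sketch of the body schema shown earlier (required `name` and `email`, optional `age`, no extra fields). It uses only the standard library. The `validate_user` name, the `max_bytes` default, and the simplistic email test are assumptions of this sketch, and a real gateway would use a JSON Schema validator instead.

```python
import json

KNOWN_FIELDS = {"name", "email", "age"}


def validate_user(raw_body: bytes, max_bytes: int = 10_000) -> list[str]:
    """Return a list of validation errors; an empty list means the body passes."""
    if len(raw_body) > max_bytes:
        return ["body too large"]
    try:
        body = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return ["body is not valid JSON"]
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    errors = []
    for key in ("name", "email"):
        if key not in body:
            errors.append(f"missing required field: {key}")
    extra = set(body) - KNOWN_FIELDS
    if extra:
        errors.append(f"unexpected fields: {sorted(extra)}")

    name = body.get("name")
    if "name" in body and not (isinstance(name, str) and 1 <= len(name) <= 100):
        errors.append("name must be a string of 1-100 characters")
    email = body.get("email")
    # Simplistic format check, for illustration only.
    if "email" in body and not (isinstance(email, str) and "@" in email):
        errors.append("email must look like an email address")
    age = body.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if "age" in body and not (
        isinstance(age, int) and not isinstance(age, bool) and 0 <= age <= 150
    ):
        errors.append("age must be an integer between 0 and 150")
    return errors
```

None of these checks needs a database or business context, which is why they belong at the gateway.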
WebSocket Proxying
WebSocket connections start as HTTP and upgrade to a persistent, bidirectional channel. The gateway must handle the upgrade handshake and then proxy frames in both directions without buffering.
| Aspect | HTTP | WebSocket |
|---|---|---|
| Connection | Short-lived, request-response | Long-lived, persistent |
| Direction | Client → Server (request), Server → Client (response) | Bidirectional (both can send anytime) |
| Gateway behavior | Buffer request, forward, buffer response, return | Proxy frames in both directions continuously |
| Load balancing | Per-request (any instance) | Sticky — entire session on one instance |
| Timeout | Request timeout (30-60s) | Idle timeout (minutes to hours) |
| Health check impact | None — each request is independent | Must drain connections on shutdown |
    # WebSocket proxying in NGINX
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    upstream ws_backend {
        # IP hash for sticky sessions: same client always hits same backend
        ip_hash;
        server ws-service-1:8080;
        server ws-service-2:8080;
    }

    server {
        location /ws/ {
            proxy_pass http://ws_backend;
            proxy_http_version 1.1;

            # WebSocket upgrade headers
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;

            # Longer timeouts for persistent connections
            proxy_read_timeout 3600s;  # 1 hour idle timeout
            proxy_send_timeout 3600s;

            # Forward client info
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
WebSocket and Scaling
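WebSocket connections are stateful: once the upgrade completes, every frame travels over the same TCP connection to the same backend, so you can't load-balance individual messages across instances. Use sticky sessions (IP hash or cookie-based) so that reconnects and any HTTP fallback requests return to the instance that holds the session state. For horizontal scaling, add a pub/sub layer (Redis Pub/Sub, NATS) so any backend instance can broadcast to connections held by other instances.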
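IP-hash stickiness comes down to a deterministic mapping from client address to backend. The sketch below illustrates the idea and is not NGINX's actual hash function. It does mirror one documented `ip_hash` behavior: for IPv4, only the first three octets are used as the key. The backend names come from the configuration above.

```python
import hashlib
import ipaddress

BACKENDS = ["ws-service-1:8080", "ws-service-2:8080"]


def pick_backend(client_ip: str, backends: list[str] = BACKENDS) -> str:
    """Map a client IP to one backend, deterministically."""
    addr = ipaddress.ip_address(client_ip)
    # IPv4: key on the first three octets, so a whole /24 shares a backend.
    # IPv6: key on the full address.
    key = addr.packed[:3] if addr.version == 4 else addr.packed
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

A plain modulo remaps most clients whenever the backend list changes size. Consistent hashing reduces that churn, at the cost of a more involved implementation.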
Server-Sent Events & gRPC
Server-Sent Events (SSE)
SSE is a simpler alternative to WebSocket for server-to-client streaming. The client opens a standard HTTP connection, and the server sends events as they occur. The gateway must support long-lived HTTP responses without buffering.
    # SSE proxying: disable buffering for streaming
    location /api/events {
        proxy_pass http://event_service;

        # Critical for SSE: disable response buffering and caching
        proxy_buffering off;
        proxy_cache off;

        # HTTP/1.1 to the upstream, with the Connection header cleared
        proxy_http_version 1.1;
        proxy_set_header Connection '';
        chunked_transfer_encoding on;  # NGINX default, shown for clarity

        # Long timeout for streaming connection
        proxy_read_timeout 86400s;  # 24 hours

        # Forward client info
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
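Buffering matters because of what travels over an SSE connection: a `text/event-stream` body made of small text records, each ended by a blank line. A minimal sketch of the wire format follows, with `format_sse` as a hypothetical helper name.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> bytes:
    """Encode one Server-Sent Event as bytes ready to write to the response."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become one "data:" line per line of text.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return ("\n".join(lines) + "\n\n").encode("utf-8")
```

Each event is often only a few dozen bytes, so a proxy that waits to fill a buffer before flushing can delay events for a long time.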
gRPC Proxying
gRPC uses HTTP/2 with Protocol Buffers. The gateway must support HTTP/2 upstream connections and handle gRPC-specific features: streaming, trailers, and binary framing.
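    # gRPC proxying requires HTTP/2
    server {
        listen 443 ssl http2;
        # ssl_certificate and ssl_certificate_key are required here (omitted)

        location /grpc.UserService/ {
            grpc_pass grpcs://user-service:50051;

            # gRPC timeouts
            grpc_read_timeout 60s;
            grpc_send_timeout 60s;

            # Error handling: turn an HTTP 502 from NGINX into a gRPC status
            error_page 502 = /error502grpc;
        }

        # Return gRPC status 14 (UNAVAILABLE) instead of an HTML error page
        location = /error502grpc {
            internal;
            default_type application/grpc;
            add_header grpc-status 14;
            add_header grpc-message "unavailable";
            return 204;
        }

        # gRPC-Web (browser clients)
        location /grpc-web/ {
            # grpc_pass does not translate gRPC-Web into native gRPC
            # (HTTP/2 + binary). That requires Envoy or a grpc-web proxy.
            grpc_pass grpc://user-service:50051;
        }
    }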
| Protocol | Direction | Connection | Gateway Requirement |
|---|---|---|---|
| SSE | Server → Client only | Long-lived HTTP | Disable buffering, long timeouts |
| WebSocket | Bidirectional | Upgraded HTTP → WS | Upgrade handling, sticky sessions |
| gRPC Unary | Request → Response | HTTP/2 stream | HTTP/2 support, binary framing |
| gRPC Streaming | Client, server, or bidirectional streaming | HTTP/2 stream | HTTP/2, no buffering, trailers |
| gRPC-Web | Browser → gRPC service | HTTP/1.1 or HTTP/2, binary or base64 | Protocol translation layer |
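The "binary framing" in the table refers to gRPC's length-prefixed messages: each message is preceded by a 1-byte compression flag and a 4-byte big-endian length. The sketch below shows that framing, and how the base64 text variant of gRPC-Web wraps the same bytes. Function names are hypothetical, and trailers, compression, and HTTP/2 itself are out of scope.

```python
import base64
import struct


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the 5-byte gRPC frame header."""
    return struct.pack(">BI", int(compressed), len(payload)) + payload


def parse_frames(buf: bytes) -> list[tuple[bool, bytes]]:
    """Split a buffer into (compressed, payload) pairs."""
    frames = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 5:
            raise ValueError("truncated frame header")
        flag, length = struct.unpack_from(">BI", buf, offset)
        offset += 5
        if len(buf) - offset < length:
            raise ValueError("truncated frame body")
        frames.append((bool(flag & 1), buf[offset:offset + length]))
        offset += length
    return frames


def to_grpc_web_text(framed: bytes) -> bytes:
    """The base64 text variant of gRPC-Web carries the same frames, encoded."""
    return base64.b64encode(framed)
```

A proxy that buffers or rewrites the body must respect these frame boundaries, which is one reason gRPC routes need a gRPC-aware proxy rather than a generic HTTP/1.1 one.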
GraphQL Support
GraphQL presents unique challenges for API gateways. All requests go to a single endpoint (POST /graphql), making path-based routing useless. The gateway must inspect the query body to apply security policies.
| Challenge | REST | GraphQL |
|---|---|---|
| Routing | Path-based (/users, /orders) | Single endpoint — must parse query |
| Rate limiting | Per endpoint | Per query complexity/cost |
| Caching | GET + URL = cache key | POST body varies — harder to cache |
| Authorization | Per endpoint + method | Per field/type in schema |
| DDoS | Body size and rate limits cover most cases | Deeply nested queries can be small but expensive |
    {
      "graphql_security": {
        "max_query_depth": 10,
        "max_query_complexity": 1000,
        "max_aliases": 5,
        "max_directives": 10,
        "introspection_enabled": false,
        "persisted_queries_only": true,
        "cost_analysis": {
          "default_field_cost": 1,
          "default_list_cost": 10,
          "custom_costs": {
            "Query.searchProducts": 50,
            "Query.analytics": 200,
            "Mutation.generateReport": 500
          }
        },
        "rate_limiting": {
          "max_cost_per_minute": 10000,
          "max_requests_per_minute": 100
        }
      }
    }
GraphQL Gateway Security
- ✅Query depth limiting — prevent deeply nested queries that explode into millions of DB queries
- ✅Complexity scoring — assign cost to each field, reject queries exceeding budget
- ✅Persisted queries — only allow pre-registered query hashes in production
- ✅Disable introspection — don't expose your schema to attackers in production
- ✅Cost-based rate limiting — limit by query cost, not just request count
Persisted Queries for Security
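In production, consider allowing only persisted (pre-registered) queries. Clients send a query hash instead of the full query text; the gateway looks up the hash and forwards the known query, rejecting unknown hashes. This prevents arbitrary query injection, simplifies caching (hash plus variables as the cache key), and removes attacker-crafted depth/complexity attacks, since only queries you registered can run. Registered queries that take expensive arguments still need cost limits.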
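The lookup itself is small. A minimal sketch, assuming SHA-256 hex digests as query identifiers and an in-memory registry built at deploy time (`PersistedQueryStore` is a hypothetical name):

```python
import hashlib
from typing import Iterable, Optional


def query_hash(query: str) -> str:
    """Identifier a client sends in place of the query text."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryStore:
    """Registry of approved queries, keyed by hash."""

    def __init__(self, queries: Iterable[str]) -> None:
        self._by_hash = {query_hash(q): q for q in queries}

    def resolve(self, sha256_hash: str) -> Optional[str]:
        # Unknown hashes return None; the gateway should reject the request.
        return self._by_hash.get(sha256_hash)
```

The hash is computed over the exact query text, so clients and the registry must agree on whitespace and formatting, typically by generating both from the same build step.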
Interview Questions
Q: How does an API Gateway handle WebSocket connections differently from HTTP?
A: HTTP: gateway buffers request, forwards to any backend instance, buffers response, returns. WebSocket: gateway handles the HTTP Upgrade handshake, then proxies frames bidirectionally without buffering. Key differences: (1) Sticky sessions required — the connection is stateful. (2) Long timeouts (hours, not seconds). (3) No request-level load balancing — the entire session is pinned. (4) Graceful shutdown must drain existing connections. (5) Health checks must account for connection count, not just request rate.
Q: Why is CORS handled at the gateway instead of in each service?
A: CORS is a cross-cutting concern — every service needs the same origin validation. Handling it at the gateway means: (1) One place to update allowed origins. (2) Preflight OPTIONS requests are handled without hitting backends. (3) Consistent headers across all endpoints. (4) No risk of one service having a misconfigured CORS policy that creates a security hole. The gateway is the natural enforcement point because it sees all cross-origin requests.
Q: How would you protect a GraphQL API from abuse at the gateway?
A: Multi-layered: (1) Query depth limiting (max 10 levels) — prevents exponential query expansion. (2) Complexity scoring — assign cost per field, reject queries exceeding budget. (3) Persisted queries in production — only allow pre-registered query hashes. (4) Cost-based rate limiting — limit by total query cost per minute, not just request count. (5) Disable introspection in production. (6) Timeout per query (5s). A simple request count limit is insufficient because one GraphQL query can be trivial or catastrophically expensive.
Q: What's the difference between a Slowloris attack and a regular DDoS, and how does the gateway mitigate it?
A: Regular DDoS floods with volume — many complete requests per second. Slowloris opens many connections and sends data extremely slowly (one byte per second), holding connections open indefinitely. This exhausts the gateway's connection pool without triggering rate limits (few 'requests' per second). Mitigation: (1) Minimum data rate enforcement — close connections sending below threshold. (2) Request header/body timeouts (10s). (3) Maximum concurrent connections per IP. (4) Connection idle timeout. The key insight: Slowloris attacks connection capacity, not request capacity.
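The minimum data rate check from the answer above can be sketched as a single predicate that a gateway evaluates periodically for each in-flight request. The thresholds and the `should_drop` name are assumptions of this sketch.

```python
def should_drop(bytes_received: int, elapsed_s: float,
                min_bytes_per_s: float = 100.0, grace_s: float = 2.0) -> bool:
    """True when a connection has been sending too slowly for too long."""
    if elapsed_s < grace_s:
        return False  # allow for handshake and initial latency
    return bytes_received / elapsed_s < min_bytes_per_s
```

A connection that has sent 10 bytes in 10 seconds is dropped, while one that has sent 5000 bytes in the same time is kept. The grace period avoids penalizing clients on slow but legitimate links during connection setup.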
Q: How do you handle CORS with credentials (cookies) securely?
A: When Access-Control-Allow-Credentials: true is set: (1) Access-Control-Allow-Origin CANNOT be * — must be the specific requesting origin. (2) The gateway must validate the Origin header against an allowlist before reflecting it. (3) Never reflect an arbitrary Origin header — this allows any site to make authenticated requests. (4) Set SameSite=Strict or Lax on cookies. (5) Limit exposed headers to what the frontend actually needs. The combination of credentials + wildcard origin is explicitly forbidden by browsers for security.
Common Mistakes
Setting Access-Control-Allow-Origin to wildcard with credentials
Configuring CORS with Allow-Origin: * and Allow-Credentials: true — browsers reject this, but the real danger is reflecting any Origin header without validation.
✅ Maintain an explicit allowlist of permitted origins and validate the incoming Origin header against it. Reflect only origins that are on the allowlist, and send Vary: Origin so caches don't serve one origin's response to another. Blindly reflecting the Origin header effectively disables CORS protection.
Buffering WebSocket/SSE responses
The gateway buffers streaming responses, causing clients to receive data in large chunks instead of real-time events.
✅ Set proxy_buffering off and proxy_cache off on WebSocket and SSE routes. For SSE, keep chunked_transfer_encoding enabled (the NGINX default). Test with a simple event stream to verify that events arrive immediately, not batched.
No query depth limit on GraphQL
Allowing arbitrary query depth — an attacker sends a deeply nested query that causes exponential database queries and crashes the backend.
✅Set max query depth (10 is reasonable for most schemas). Implement complexity scoring that accounts for list fields (each list multiplies child cost). Use persisted queries in production to eliminate arbitrary query injection entirely.
Relying solely on User-Agent for bot detection
Blocking requests with bot-like User-Agent strings — trivially bypassed by setting a browser User-Agent.
✅Use behavioral signals: request rate patterns, TLS fingerprinting (JA3), header ordering anomalies, timing analysis, and challenge-response (CAPTCHA) for suspicious traffic. User-Agent is one weak signal among many — never the sole detection mechanism.