
Kong AI Gateway is a set of plugins for Kong Gateway that turn Kong into an LLM-aware API proxy. It adds multi-provider routing, semantic routing, prompt guardrails, token-based rate limiting, and MCP exposure to Kong's established gateway runtime built on OpenResty and Nginx.
The AI Proxy plugin routes requests to OpenAI, Anthropic, Azure, Cohere, Mistral, Gemini, Llama, Bedrock, and OpenAI-compatible upstreams, translating request and response formats between them. The AI Prompt Guard and AI Prompt Decorator plugins enforce prompt-level policies: blocking prohibited content, injecting system prompts, or redacting inputs. AI Rate Limiting Advanced counts tokens rather than requests for more accurate budget enforcement. AI Semantic Routing dispatches traffic to different upstream models based on intent classification. The Kong MCP plugin (3.12+, late 2025) exposes Kong-managed services as MCP servers for agent clients.
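As a rough illustration, the AI Proxy plugin attaches to a service or route in Kong's declarative config. The sketch below follows the plugin's published schema, but field names and defaults vary by Kong version, and the route name and API-key reference are placeholders — verify against the AI Proxy documentation for your release:

```yaml
# Sketch: AI Proxy attached to a route in DB-less declarative config.
# "chat-route" and the key reference are illustrative, not from the source.
plugins:
  - name: ai-proxy
    route: chat-route            # assumed route name
    config:
      route_type: llm/v1/chat    # normalize to the provider's chat format
      auth:
        header_name: Authorization
        header_value: Bearer ${OPENAI_API_KEY}   # resolve via decK env substitution or a vault
      model:
        provider: openai
        name: gpt-4o
```

Swapping `provider` and `model.name` (e.g. to `anthropic` or a Bedrock model) is how the same route is repointed at a different upstream without changing clients.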
Kong AI Gateway suits organizations that already run Kong for API traffic management and want to apply the same governance layer to LLM and agent traffic without introducing a parallel gateway. Kong's declarative configuration model also fits GitOps-style operations.
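The GitOps fit comes from keeping the whole gateway state in a versioned declarative file. A minimal sketch of such a file, assuming the hypothetical service and route names below (applied with a tool like decK or loaded directly in DB-less mode):

```yaml
# Sketch: kong.yml kept in Git; service and route names are illustrative.
_format_version: "3.0"
services:
  - name: llm-service
    url: https://api.openai.com    # nominal upstream; AI plugins can redirect per provider
    routes:
      - name: chat-route
        paths:
          - /chat
```

Changes to routes, plugins, and rate limits then flow through the same pull-request review process as application code.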
Kong Gateway core is Apache 2.0. Several AI-focused plugins and supporting features are Kong Enterprise (commercial): AI Rate Limiting Advanced, AI Semantic Routing, AI Proxy Advanced, the OIDC authentication plugin, and advanced observability. The open-source AI Proxy and Prompt Guard plugins are usable on their own; the open-source JWT plugin handles token auth without automatic JWKS refresh.
Kong runs as a Docker container on any container platform, including Docker Swarm and Kubernetes, in either database mode (Postgres or Cassandra) or DB-less mode (declarative YAML). Konga or the open-source admin UI handles configuration, though Enterprise-only plugins do not render in the open-source admin.
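A minimal sketch of a DB-less deployment with Docker Compose, assuming a `kong.yml` declarative file alongside the compose file (the image tag and port mappings are illustrative; `KONG_DATABASE=off` and `KONG_DECLARATIVE_CONFIG` are the standard DB-less settings):

```yaml
# Sketch: DB-less Kong under Docker Compose; declarative config mounted read-only.
services:
  kong:
    image: kong:3.9                # version illustrative
    environment:
      KONG_DATABASE: "off"                      # DB-less mode
      KONG_DECLARATIVE_CONFIG: /kong/kong.yml   # config loaded at startup
      KONG_PROXY_LISTEN: 0.0.0.0:8000
      KONG_ADMIN_LISTEN: 0.0.0.0:8001
    volumes:
      - ./kong.yml:/kong/kong.yml:ro
    ports:
      - "8000:8000"   # proxy
      - "8001:8001"   # admin API (read-only in DB-less mode)
```

In DB-less mode the Admin API is effectively read-only; configuration changes go through the mounted file, which pairs naturally with the GitOps workflow above.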