

Helicone is an open-source LLM observability platform. It proxies requests to 20+ LLM providers — OpenAI, Anthropic, Gemini, Azure, Together, OpenRouter, and OpenAI-compatible endpoints — while capturing traces, cost data, prompts, and user analytics for the web UI and downstream systems.
Every request routed through Helicone is recorded with token counts, latency, cost, request and response bodies, and custom metadata. The web UI presents cost analytics broken down by user, model, feature, or any custom property, and lets operators drill into individual traces for debugging. Prompt management supports versioning, A/B testing, and evaluation workflows, and the scoring framework measures output quality over time against custom rubrics.
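To make the cost accounting concrete, here is a minimal sketch of how per-request cost can be derived from the recorded token counts. The per-million-token rates below are hypothetical placeholders, not Helicone's actual pricing tables, and the model names are illustrative.

```python
# Hypothetical per-1M-token rates, keyed by model. Real rates come from the
# provider's price list; these numbers are placeholders for illustration.
RATES_PER_1M = {
    "gpt-4o": {"prompt": 2.50, "completion": 10.00},
    "claude-3-5-sonnet": {"prompt": 3.00, "completion": 15.00},
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request given its recorded token counts."""
    rates = RATES_PER_1M[model]
    return (prompt_tokens * rates["prompt"]
            + completion_tokens * rates["completion"]) / 1_000_000

# 1200 prompt tokens and 300 completion tokens under the placeholder rates:
cost = request_cost("gpt-4o", prompt_tokens=1200, completion_tokens=300)
# 1200 * 2.50/1e6 + 300 * 10.00/1e6 = 0.006
```

Summing this per-request figure grouped by user, model, or property is what produces the dashboard breakdowns.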
Custom properties let applications tag each request with user, session, workflow, or experiment labels that flow into all downstream filters, dashboards, and alerts.
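Custom properties travel as per-request headers of the form `Helicone-Property-<Name>`. The helper below is a sketch of building those headers from a plain label dict; the property names (`User-Id`, `Session`, `Experiment`) are illustrative, not a fixed schema.

```python
def property_headers(props: dict[str, str]) -> dict[str, str]:
    """Build Helicone custom-property headers from a dict of labels.

    Each key becomes a "Helicone-Property-<Name>" header that the proxy
    records alongside the trace and exposes in filters and dashboards.
    """
    return {f"Helicone-Property-{name}": value for name, value in props.items()}

headers = property_headers({
    "User-Id": "user_42",       # illustrative label names
    "Session": "sess_9f3",
    "Experiment": "prompt-v2",
})
# headers["Helicone-Property-User-Id"] == "user_42"
```

These headers are merged into whatever request headers the SDK already sends, so tagging is additive and per-request.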
Two integration paths are supported: changing the base URL of the OpenAI SDK (or any OpenAI-compatible SDK) to route through Helicone's proxy, or using Helicone's native client library for additional control. A gateway routing layer on top of the observability proxy adds retries, fallbacks, caching, and rate limiting.
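The base-URL path can be sketched without the OpenAI SDK at all: the request body is a normal OpenAI-style chat completion, and only the host plus a few extra headers change. This stdlib-only sketch assumes Helicone's documented `oai.helicone.ai` endpoint, `Helicone-Auth` header, and `Helicone-Cache-Enabled` gateway flag; the keys are placeholders, and header names should be checked against current Helicone docs.

```python
import json
import urllib.request

# Assumed Helicone proxy endpoint for OpenAI-compatible traffic.
HELICONE_BASE = "https://oai.helicone.ai/v1"

def build_request(provider_key: str, helicone_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request routed via Helicone."""
    body = json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode()
    return urllib.request.Request(
        url=f"{HELICONE_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {provider_key}",   # provider key, unchanged
            "Helicone-Auth": f"Bearer {helicone_key}",   # attributes the trace
            "Helicone-Cache-Enabled": "true",            # gateway-layer caching
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("sk-provider-key", "sk-helicone-key")
```

With an SDK, the same effect is achieved by pointing the client's base URL at the proxy and passing the `Helicone-*` entries as default headers.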
The core is Apache 2.0 licensed, with Helicone Cloud as the hosted offering. Enterprise-tier features in the commercial product cover SAML SSO, advanced RBAC, and extended data governance, while the core proxy, observability backend, dashboards, and generic OIDC live in the OSS repo.
A Docker Compose reference stack bundles the gateway, ClickHouse for analytics, Postgres, MinIO for object storage, and the web UI; Helm charts are available for Kubernetes deployments.
Open-source LLM observability proxy with traces, prompts, and cost analytics