5-Minute LangChain Observability: Cost, Latency, and Policy Without Rewriting a Line
Your LangChain agent is a black box in production. Here's how to get cost per run, full trace history, and policy enforcement by changing one environment variable — no SDK swap, no code rewrite.
You built a LangChain agent. It works on your laptop. You pushed it to production. Now your manager is asking three questions you cannot answer: how much did it cost last week, why was it slow yesterday afternoon, and what exactly did it say to that customer on Monday at 3:47pm.
Every LangChain developer I talk to hits this wall within the first two weeks of shipping. The framework is brilliant at orchestration and hostile to observability. The callback system exists, but wiring it up to something that survives a restart, aggregates across agents, and doesn't lose data when your app crashes is a weekend project at best. Most teams skip it and hope.
The shortcut: treat Dobby as your LLM endpoint
Dobby's Agentic Gateway is OpenAI-compatible. That means LangChain — which already speaks the OpenAI API contract — can hit it by changing exactly one environment variable. No new SDK. No new import. No fork of LangChain's LLM classes.
Point LangChain's OpenAI client at the gateway, and every call your agent makes flows through Dobby on the way to the actual provider (OpenAI, Anthropic, Bedrock, Vertex, whichever you configured). The gateway records the prompt, the response, the cost, the latency, the model, and the agent identity. You see all of it in a dashboard named after your agent, not a random key fingerprint.
The actual 5 minutes
- Sign up at dobby-ai.com and land on the onboarding page. Click "Try the Gateway" with "With agent" enabled. Give your agent a name (say, `support-summarizer`) and pick a default model.
- The gateway mints a service key (`gk_svc_*`) scoped to that agent. Copy it — it's shown once.
- In your LangChain app, set two environment variables: `OPENAI_API_BASE=https://api.dobby-ai.com/v1` and `OPENAI_API_KEY=gk_svc_...`. That's it. No code changes to your chains, tools, or callbacks.
- Re-run your agent. Any `ChatOpenAI()` or `OpenAI()` instance in your code now routes through Dobby.
- Open the Usage dashboard for `support-summarizer`. You now see every call, grouped by your agent name, with cost, latency, prompt preview, response, model, and hook decisions.
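In practice the whole change can live in your deployment config. A minimal sketch, assuming the env-var names above, a placeholder key, and a placeholder `run_agent.py` entry point:

```shell
# The entire integration: point the OpenAI-compatible client at the gateway.
export OPENAI_API_BASE="https://api.dobby-ai.com/v1"
export OPENAI_API_KEY="gk_svc_your_key_here"   # placeholder; use your real service key

# Re-run your agent unchanged; every ChatOpenAI()/OpenAI() call now routes through Dobby.
python run_agent.py
```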
If your LangChain code uses Anthropic or another provider, the same pattern works — the gateway speaks an OpenAI-compatible protocol to every backend, so you keep a single base URL regardless of which provider actually serves the request. Dobby routes to the provider you configured for that agent.
What you get out of the box
The moment traffic flows, you have:
- **Cost per run.** Input tokens times the input rate, plus output tokens times the output rate, aggregated per agent. You can finally answer the "how much did this cost last week?" question.
- **Latency breakdowns.** p50, p90, and p99 by agent, including hook overhead (usually under 5 ms).
- **Full trace history.** Every prompt and response, searchable, 365-day retention by default. If someone asks what the agent said on Monday at 3:47pm, you can show them.
- **Error visibility.** Timeouts, rate limits, and provider errors surfaced per agent, not buried in your application logs.
- **Model routing.** Override the model per agent or per tenant without touching code — useful when you want to flip from gpt-4o to claude-sonnet-4.6 for one agent and measure the delta.
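The cost math is simple enough to sanity-check by hand. A minimal sketch, using illustrative per-token prices rather than any provider's actual rates:

```python
# Per-run cost = input_tokens * input_rate + output_tokens * output_rate.
# Rates below are illustrative, expressed in USD per 1M tokens.
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single run in USD, given token counts and per-1M-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A run with a 1,200-token prompt and a 300-token response:
print(f"${run_cost('gpt-4o', 1200, 300):.6f}")  # → $0.006000
```

The dashboard does this per call with the real provider rate card, then sums per agent; the sketch is just the arithmetic behind the number you see.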
This is the baseline. You got it for free by changing one env var.
What you can layer on when you're ready
The thing that makes the 5-minute integration worthwhile is that it's the same surface you use later for governance. Observability is a side effect of routing traffic through a control point, and once you have that control point, adding policy is a config change, not a code change:
- **Budget cap** per agent. "Stop `support-summarizer` from spending more than $200 this month." Toggle on in the dashboard; enforced on every request.
- **Content Shield** (DLP + prompt-injection detection). 26 built-in patterns — credit cards, SSNs, API keys, jailbreak attempts — enforced before the request reaches the provider.
- **Human-in-the-loop** approval for any tool call that matches a pattern. Your agent pauses, an approver on Slack gets notified, the agent continues once approved.
- **Per-agent model restriction.** "This agent may only use gpt-4o-mini." Enforced server-side. Impossible to accidentally override in code.
- **Audit trail.** Every decision — every hook, every policy match, every approval, every block — retained in an immutable log, exportable for compliance.
None of this requires you to change your LangChain code. You configure it once, when you register the agent, and the config follows the traffic.
How this differs from LangSmith and other tracer-first tools
LangSmith and similar LangChain-integrated tracers are valuable and solve a real problem — they give you deep insight into chain execution, tool selection, and intermediate state. They're a debugging tool and they're excellent at it.
Dobby is a different category. It's a control plane, not a tracer. The distinction matters when your second agent lands — or your fiftieth. A tracer tells you what happened inside one agent. A control plane tells you what's happening across all your agents: who's governing them, which ones are over budget, which ones just tried to send a customer's credit card number to the LLM. You can — and probably should — run both side by side: tracer for development, control plane for production.
The 5-minute setup above is specifically the control-plane entry point. It's OpenAI-compatible on purpose — we want LangChain, LlamaIndex, Vercel AI SDK, AutoGen, and raw OpenAI SDK users to all have the same two-env-var path to first value.
The one thing that trips people up
If your LangChain code hardcodes the provider base URL somewhere (inside a custom `ChatOpenAI` subclass, for instance), the env var won't override it. Search for `openai_api_base`, `base_url`, or `api_base` in your code and make sure they read from the environment. This is a one-time fix — and the fact that LangChain supports env-var configuration at all is why this integration takes 5 minutes instead of 50.
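One defensive pattern, sketched here with plain stdlib code, is to resolve the base URL in a single helper and pass it explicitly, so a hardcoded default can never silently win. `resolve_base_url` is a hypothetical helper name; the env-var names are the ones from the setup above plus `OPENAI_BASE_URL`, which newer OpenAI SDKs also read:

```python
import os

def resolve_base_url(default: str = "https://api.openai.com/v1") -> str:
    """Prefer gateway overrides from the environment over the provider default."""
    return (
        os.environ.get("OPENAI_API_BASE")     # name used in this guide
        or os.environ.get("OPENAI_BASE_URL")  # name newer OpenAI SDKs also read
        or default
    )

os.environ["OPENAI_API_BASE"] = "https://api.dobby-ai.com/v1"
print(resolve_base_url())  # → https://api.dobby-ai.com/v1, not the provider default
```

Passing the result explicitly (e.g. as a `base_url` argument when you construct your chat model) makes the routing decision visible in one place instead of scattered across subclasses.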
What to do next
Try it on one agent. Not all of them. Pick the one you wish you had better visibility into — the one your manager keeps asking about — and point it at Dobby. Come back in 24 hours. The answers to the three questions will be in your dashboard.
— Gil
Ready to take control of your AI agents?
Start free with Dobby AI — connect, monitor, and govern agents from any framework.
Get Started Free