How We Track Per-Agent AI Costs Across 13 LLM Providers
Most teams know their total LLM spend. Few know which agent costs what. Here is how we built a gateway that breaks down costs per agent, per model, per day — with one line of code.
The $2,400 Black Box
Last month we spent $2,400 on LLM API calls. OpenAI, Claude, Gemini — spread across 8 agents running in production. The invoice told us the total. It told us nothing about which agent was responsible. Was it the code review agent hitting GPT-4o on every PR? The research agent running Gemini queries in a loop? The security scanner using Claude for policy validation? We had no idea.
This is the reality for every team running multiple AI agents. You know your total spend. You have zero per-agent visibility. And without that visibility, you cannot optimize, budget, or even explain your costs to finance.
The One-Line Fix
The solution is embarrassingly simple. Instead of calling LLM providers directly, route through a gateway. The gateway logs every request with the agent identity attached. Now you know exactly who spent what.
With Dobby Gateway, the change is one line. If you are using the OpenAI SDK (which most agent frameworks use under the hood), you change your base_url to point at the gateway. Your API key, your model selection, your prompts — nothing else changes.
from openai import OpenAI
# Before — calls OpenAI directly
client = OpenAI(api_key="sk-...")
# After — routes through Dobby Gateway
client = OpenAI(
api_key="gk_user_...", # Your Dobby gateway key
base_url="https://gateway.dobby-ai.com/v1",
)

Every request now flows through the gateway, which logs the agent identity (extracted from request headers or key metadata), the model used, the tokens consumed, the calculated cost, and the measured latency. All of this lands in a real-time dashboard within seconds.
What You See After 5 Minutes
Once your agents are routing through the gateway, you get instant visibility into the three dimensions that matter: cost per agent (which agent is spending the most?), cost per model (is GPT-4o worth 10x the price of GPT-4o-mini for this use case?), and cost per day (are there spending spikes on certain days?).
The FinOps dashboard shows all three in real time. No waiting for end-of-month invoices. No spreadsheet reconciliation. Just open the dashboard and see the numbers.
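All three breakdowns are the same operation: group the gateway's request log by one key and sum the cost. A minimal sketch, using an illustrative record shape rather than Dobby's actual schema:

```python
from collections import defaultdict
from datetime import date

# Illustrative request-log records, one per gateway call.
log = [
    {"agent": "code-review", "model": "gpt-4o",     "day": date(2025, 6, 2), "cost": 1.75},
    {"agent": "code-review", "model": "gpt-4o",     "day": date(2025, 6, 3), "cost": 2.25},
    {"agent": "research",    "model": "gemini-1.5", "day": date(2025, 6, 2), "cost": 0.50},
]

def cost_by(log, key):
    """Sum cost over one dimension: 'agent', 'model', or 'day'."""
    totals = defaultdict(float)
    for record in log:
        totals[record[key]] += record["cost"]
    return dict(totals)

print(cost_by(log, "agent"))  # per-agent totals
print(cost_by(log, "day"))    # per-day totals
```

The dashboard is essentially this aggregation rendered live, which is why the numbers appear seconds after a request rather than at invoice time.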
Beyond Visibility: Budget Enforcement
Seeing costs is step one. Controlling them is step two. Dobby lets you set per-agent monthly budgets. When an agent hits 80% of its budget, you get a Slack alert. At 100%, the gateway blocks further requests from that agent. No more runaway spending over weekends.
You can also set approval gates — the agent stops and waits for human approval before executing expensive operations. This is especially useful for agents that make decisions autonomously. The cost of a 30-second human review is nothing compared to a $500 mistake.
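The 80% and 100% thresholds amount to a small decision made before each request is forwarded. This is an illustrative sketch of that policy, not Dobby's implementation:

```python
def budget_action(spent: float, budget: float, alert_at: float = 0.8) -> str:
    """Decide what the gateway does with an agent's next request.

    Mirrors the behavior described above: a Slack alert at 80% of the
    monthly budget, a hard block at 100%.
    """
    if spent >= budget:
        return "block"  # reject further requests this month
    if spent >= alert_at * budget:
        return "alert"  # allow the request, but notify via Slack
    return "ok"

# An agent with a $100/month budget:
budget_action(50, 100)   # "ok"
budget_action(85, 100)   # "alert"
budget_action(100, 100)  # "block"
```

An approval gate is the same idea applied per request rather than per month: instead of comparing spend to a budget, the gateway holds any call whose estimated cost exceeds a threshold until a human approves it.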
Works With Every Framework
Because Dobby Gateway is OpenAI SDK compatible, it works with any agent framework that uses the OpenAI client under the hood. That includes CrewAI, LangChain, AutoGen, and custom agents. You do not need to rewrite your agent code or adopt a new SDK. One line change per agent.
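Because the OpenAI SDK also reads its key and base URL from environment variables, frameworks that construct the client internally can often be redirected without touching any framework code at all. A sketch, assuming the framework creates its OpenAI client with no explicit base_url so the SDK falls back to the environment:

```python
import os

# The OpenAI Python SDK picks these up when the client is constructed
# without explicit api_key/base_url arguments.
os.environ["OPENAI_API_KEY"] = "gk_user_..."  # Dobby gateway key
os.environ["OPENAI_BASE_URL"] = "https://gateway.dobby-ai.com/v1"
```

Set these before your agent framework starts and every OpenAI-compatible call it makes flows through the gateway.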
We currently support 13 LLM providers, including OpenAI, Anthropic Claude, Google Gemini, Mistral, AWS Bedrock, Azure OpenAI, DeepSeek, Perplexity, Grok, Replicate (Llama), GitHub Copilot, and Microsoft 365 Copilot. The gateway handles authentication, routing, and cost calculation for all of them.
Try It Free
Dobby Gateway is free for up to 1,000 requests per month. No credit card required. Sign up, get your gateway key, change one line in your agent code, and see your per-agent costs in real time within 5 minutes.
Ready to take control of your AI agents?
Start free with Dobby AI — connect, monitor, and govern agents from any framework.
Get Started Free