Advanced

FinOps for AI Agents: Per-Agent Cost Tracking & Optimization

Track, analyze, and optimize AI agent costs. Per-agent breakdown, provider comparison, forecasting, and budget automation.

12 min read · Gil Kal · Mar 19, 2026

What you will learn

  • Implement per-agent, per-provider, and per-user cost attribution
  • Use the FinOps dashboard to identify optimization opportunities
  • Build cost forecasts based on historical usage patterns
  • Reduce AI spend by 40-60% with model tiering and caching

AI FinOps: Beyond Cost Tracking

FinOps for AI agents goes beyond knowing how much you spend. It is about understanding where the money goes (per agent, per provider, per user), predicting future spend, and systematically reducing costs without sacrificing quality. The teams that master AI FinOps spend 40-60% less than those that do not.

Without Dobby

Monthly AI bill: $8,000. Nobody knows which agents cost the most. The team suspects the QA agent is expensive but cannot prove it. Cost optimization is guesswork.

With Dobby

Monthly AI bill: $3,200. Dashboard shows the QA agent costs $2,100/month on GPT-4o. Switched to Claude Sonnet for QA — same quality, 60% less cost. Forecast: $2,800 next month.

The FinOps Dashboard

The FinOps dashboard provides five views of your AI spend, each answering a different question:

  • Overview — Total spend, daily trend, 30-day forecast, budget tracking. How much are we spending and where is it going?
  • Cost by Agent — Per-agent breakdown with expandable provider drill-down. Which agents cost the most?
  • Cost by Provider — OpenAI vs Anthropic vs Google cost comparison. Which provider gives the best value?
  • Cost by User — Per-user cost attribution. Who is consuming the most resources?
  • Cost by Department — Team-level cost allocation for chargeback. Which department should be billed?
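
The article does not say how the Overview's 30-day forecast is computed. As a rough illustration of forecasting from historical usage, the sketch below fits a least-squares line to daily spend and sums the projected days. The function name `forecast_spend` and the linear-trend assumption are illustrative only, not Dobby's method.

```python
from typing import Sequence


def forecast_spend(daily_costs: Sequence[float], horizon_days: int = 30) -> float:
    """Project total spend for the next `horizon_days` from a linear trend.

    Assumption: spend follows a straight-line trend; real workloads may be
    seasonal or bursty and need a better model.
    """
    n = len(daily_costs)
    if n < 2:
        raise ValueError("need at least two days of history")

    # Ordinary least squares with day index 0..n-1 as the x-axis.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_costs) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_costs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Sum the predicted cost of each future day, never going below zero.
    return sum(max(0.0, intercept + slope * day) for day in range(n, n + horizon_days))
```

With a flat history of $100 per day, this projects $3,000 for the next 30 days; a rising history projects proportionally more.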

Cost Attribution Architecture

Every LLM call through the Gateway is tagged with: agent ID, user ID, tenant ID, organization ID, provider, model, and request metadata. This creates a multi-dimensional cost cube that you can slice any way you need.

```sql
-- Example: Top 10 most expensive agents (last 30 days)
SELECT
  agent_id,
  agent_name,
  COUNT(*) AS total_requests,
  SUM(total_tokens) AS total_tokens,
  SUM(estimated_cost_usd) AS total_cost,
  ROUND(SUM(estimated_cost_usd) / COUNT(*), 4) AS avg_cost_per_request
FROM ds_platform.llm_gateway_requests
WHERE organization_id = @orgId
  AND created_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY agent_id, agent_name
ORDER BY total_cost DESC
LIMIT 10
```

The FinOps dashboard runs six parallel BigQuery queries to build the complete cost picture. Data updates in real time as Gateway requests flow through. There is no manual ETL and no separate analytics pipeline; it is built into the platform.
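
To make the "cost cube" idea concrete, here is a minimal in-memory sketch of slicing tagged requests along any combination of dimensions. The `GatewayRequest` class and `cost_by` helper are hypothetical; `agent_id` and `estimated_cost_usd` mirror the query above, while `user_id`, `provider`, and `model` are assumed field names based on the tags the Gateway is described as attaching.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayRequest:
    # Field names beyond agent_id / estimated_cost_usd are assumptions.
    agent_id: str
    user_id: str
    provider: str
    model: str
    estimated_cost_usd: float


def cost_by(requests, *dimensions):
    """Total cost grouped by the given tag names, most expensive first."""
    totals = defaultdict(float)
    for req in requests:
        key = tuple(getattr(req, dim) for dim in dimensions)
        totals[key] += req.estimated_cost_usd
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Illustrative records only; not real usage data.
requests = [
    GatewayRequest("qa-agent", "alice", "openai", "gpt-4o", 0.50),
    GatewayRequest("qa-agent", "bob", "openai", "gpt-4o", 0.25),
    GatewayRequest("triage-agent", "alice", "google", "gemini-flash", 0.125),
]

per_agent = cost_by(requests, "agent_id")
per_agent_and_user = cost_by(requests, "agent_id", "user_id")
```

Because every record carries all of its tags, the same data answers "which agent?", "which user?", and "which agent for which user?" without a separate pipeline per question.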

5 Optimization Strategies

1. Model Tiering

Match model capability to task complexity. Use GPT-4o-mini or Gemini Flash for simple tasks such as classification and extraction, at roughly one-tenth to one-twentieth the cost of GPT-4o. Reserve expensive models for complex reasoning.
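
A tiering rule can be as small as a lookup on task type. The sketch below is an assumption about how such routing might look, not the Gateway's actual configuration; the task labels and the `pick_model` function are hypothetical.

```python
# Task types considered simple enough for the cheaper tier (assumed labels).
SIMPLE_TASKS = {"classification", "extraction"}

# Tier-to-model mapping; swap in whichever models your providers offer.
MODEL_TIERS = {
    "economy": "gpt-4o-mini",
    "premium": "gpt-4o",
}


def pick_model(task_type: str) -> str:
    """Route simple tasks to the economy tier and everything else to premium.

    Unknown task types fall through to the premium tier so that quality is
    never silently degraded.
    """
    tier = "economy" if task_type in SIMPLE_TASKS else "premium"
    return MODEL_TIERS[tier]
```

Defaulting unknown work to the premium tier trades some cost for safety; teams that prefer the opposite default can invert the check.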

2. Semantic Caching

The Gateway semantic cache checks if a semantically similar question was already answered. Cache hits return in under 1ms at zero LLM cost. Typical savings: 20-40% for repetitive workloads.
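
The general technique behind a semantic cache is a similarity lookup over embedded prompts. The sketch below illustrates it with a toy bag-of-words embedding and cosine similarity; a production cache would use a real embedding model and a vector index. The `SemanticCache` class and the 0.8 threshold are assumptions for illustration and do not describe the Gateway's internals.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. Stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity that counts as a hit
        self.entries = []           # list of (embedding, cached answer)

    def put(self, prompt, answer):
        self.entries.append((embed(prompt), answer))

    def get(self, prompt):
        """Return the cached answer of the most similar prompt, or None on a miss."""
        query = embed(prompt)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

On a miss, the caller sends the prompt to the LLM and stores the result with `put`, so only the first occurrence of a repeated question incurs model cost.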

3. Token Budget Automation

Set daily and monthly budgets per agent. Alerts fire automatically at 80% and 90% of the budget, and requests are blocked automatically at 100%. This stops runaway costs before they reach the bill.
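
The threshold logic can be sketched as a single function that maps current spend to an action. The function name and the returned labels are hypothetical; only the 80%, 90%, and 100% thresholds come from the text above.

```python
def budget_action(spent_usd: float, budget_usd: float) -> str:
    """Return the action for an agent given its spend against its budget."""
    if budget_usd <= 0:
        raise ValueError("budget must be positive")

    used = spent_usd / budget_usd
    if used >= 1.0:
        return "block"      # budget exhausted: reject further requests
    if used >= 0.9:
        return "alert_90"   # second warning
    if used >= 0.8:
        return "alert_80"   # first warning
    return "ok"
```

The same function works for daily and monthly budgets; call it once per window with that window's spend and limit.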

4. Provider Shopping

Use the Cost by Provider view to compare quality-per-dollar across providers. Often, switching one agent from Provider A to Provider B saves 30-50% with no quality loss.
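
One way to act on a quality-per-dollar comparison is to pick the cheapest provider that still clears a quality bar. The sketch below assumes you have a quality score per provider from your own evaluation set; the `ProviderResult` fields and the example numbers are hypothetical, not dashboard output.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProviderResult:
    provider: str
    quality: float               # score in [0, 1] from your own eval set (assumed)
    cost_per_1k_requests: float  # observed cost in USD (assumed unit)


def cheapest_meeting_quality(
    results: Iterable[ProviderResult], min_quality: float
) -> Optional[ProviderResult]:
    """Return the lowest-cost provider whose quality meets the floor, if any."""
    eligible = [r for r in results if r.quality >= min_quality]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.cost_per_1k_requests)


# Placeholder figures for illustration only.
candidates = [
    ProviderResult("provider_a", quality=0.92, cost_per_1k_requests=40.0),
    ProviderResult("provider_b", quality=0.91, cost_per_1k_requests=22.0),
    ProviderResult("provider_c", quality=0.70, cost_per_1k_requests=5.0),
]
choice = cheapest_meeting_quality(candidates, min_quality=0.90)
```

Setting the floor explicitly keeps "no quality loss" a measured claim rather than an impression.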

5. Request Optimization

Reduce token consumption by shortening system prompts, using structured outputs (JSON mode), batching similar requests, and pruning conversation history. The savings from these optimizations compound.
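
As an example of pruning conversation history, the sketch below keeps the system prompt plus the most recent messages that fit a token budget. The four-characters-per-token estimate is a rough heuristic, not a real tokenizer, and the function names and message shape (`role`, `content` dictionaries) are assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token)."""
    return max(1, len(text) // 4)


def prune_history(messages, max_tokens):
    """Keep system messages and the newest other messages within `max_tokens`."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # System prompts are always kept, so they are charged against the budget first.
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)

    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break                   # stop at the first message that does not fit
        kept.append(message)
        budget -= cost

    return system + kept[::-1]      # restore chronological order
```

Stopping at the first message that does not fit keeps the retained history contiguous, which avoids handing the model a conversation with gaps in the middle.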

Start with the Cost by Agent view. Find your top 3 most expensive agents. For each, check if a cheaper model would work. This single action typically saves 20-30% of total AI spend.
