
Why AI Agents Need a Governance Layer

Without centralized governance, AI agents create security risks, budget overruns, and compliance gaps. Learn how to build a governance layer that works.

Gil Kal · March 18, 2026 · 7 min read

Most teams start with one AI agent. Maybe a code assistant or a documentation bot. It works great. Then they add another. And another. Before long, they have 15 agents from 4 different frameworks -- CrewAI, LangChain, custom Python, and an OpenAI assistant -- each with its own API keys, cost profiles, and zero oversight. This is how every ungoverned AI deployment begins: quietly, incrementally, and with the best intentions.

The pattern is familiar because we have seen it before. Shadow IT started the same way -- employees signing up for SaaS tools without IT approval. But shadow AI is more dangerous. A SaaS tool might leak data if misconfigured. An autonomous AI agent can actively make decisions, modify systems, and spend real money at machine speed. Without governance, the question is not whether something will go wrong, but when.

The Three Risks of Ungoverned Agents

Without a governance layer, organizations face three escalating risks that compound as the number of agents grows. Each risk is manageable when you have two or three agents. At ten agents, they become painful. At fifty, they become existential threats to your business.

The first risk is uncontrolled costs. LLM API calls are deceptively expensive at scale. A single GPT-4 conversation costs pennies, but an agent running autonomously can burn through hundreds of dollars in a single day. One team we spoke with discovered a runaway agent that had consumed $4,200 in API credits over a weekend because no one had set a budget limit. Multiply that across 10 teams with 5 agents each, and monthly LLM spend can easily exceed $50,000 without anyone noticing until the invoice arrives.
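A budget limit like the one that team was missing is simple to express in code. Here is a minimal sketch of the kind of per-tenant budget check a gateway might apply before forwarding each LLM call; the class and field names are illustrative, not part of any particular product's API:

```python
# Minimal sketch of a per-tenant budget guard (names are illustrative).
# A governance layer would consult something like this before every LLM call.
from dataclasses import dataclass

@dataclass
class BudgetGuard:
    daily_budget: float        # USD allowed per day
    spent_today: float = 0.0   # running total, reset by a daily job

    def authorize(self, estimated_cost: float) -> bool:
        """Reject the request if it would push spend past the daily budget."""
        if self.spent_today + estimated_cost > self.daily_budget:
            return False
        self.spent_today += estimated_cost
        return True

guard = BudgetGuard(daily_budget=500.00)
assert guard.authorize(2.00)        # within budget: allowed and metered
assert not guard.authorize(499.00)  # would exceed the daily cap: blocked
```

The point is not the arithmetic, which is trivial, but where it runs: the check has to sit in front of the provider, not inside each agent, or a single unguarded agent can still burn through the weekend.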

The second risk is security exposure. When each agent manages its own API keys, those keys end up scattered across environment variables, config files, CI/CD pipelines, and developer laptops. There is no centralized key rotation, no audit trail of which agent used which key, and no way to revoke access instantly. A single leaked key can expose your entire LLM budget. Worse, agents that interact with production systems -- databases, APIs, deployment pipelines -- can become attack vectors if their credentials are compromised.
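The fix for scattered keys is indirection: agents hold short-lived tokens, and a central broker maps tokens to real provider credentials. The sketch below illustrates the idea with made-up names; revoking one entry cuts off one agent instantly, and every lookup leaves an audit record:

```python
# Sketch of a central key broker (illustrative): agents never hold provider
# keys directly. They present an agent token, the broker resolves it to the
# real credential, and revoking the grant disables that agent immediately.
class KeyBroker:
    def __init__(self):
        self._grants = {}   # agent_token -> provider API key
        self.audit = []     # (agent_token, action) records for every lookup

    def grant(self, agent_token, provider_key):
        self._grants[agent_token] = provider_key

    def revoke(self, agent_token):
        self._grants.pop(agent_token, None)

    def resolve(self, agent_token):
        self.audit.append((agent_token, "resolve"))
        key = self._grants.get(agent_token)
        if key is None:
            raise PermissionError("token revoked or unknown")
        return key

broker = KeyBroker()
broker.grant("agent-docs-bot", "sk-prod-example")
assert broker.resolve("agent-docs-bot") == "sk-prod-example"
broker.revoke("agent-docs-bot")
try:
    broker.resolve("agent-docs-bot")   # revoked: the token is now useless
except PermissionError:
    pass
```

Because the provider key never leaves the broker, a leaked agent token exposes one agent's access, not your entire LLM budget.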

The third risk is compliance violations. Regulations like the EU AI Act, SOC 2, and GDPR now require audit trails for autonomous AI decisions. If an agent modifies customer data, approves a code deployment, or accesses sensitive information, you need an immutable record of what happened, when, and why. Without centralized logging, reconstructing an agent's decision chain after an incident is nearly impossible.

The cost of implementing governance after an incident is always ten times higher than implementing it before one. Every team that has lived through a runaway agent or a leaked API key will tell you the same thing.

What Good Governance Looks Like

A proper governance layer provides four capabilities that work together to keep agents productive and safe. These are not nice-to-haves. They are the minimum requirements for running AI agents in any environment where reliability, security, and cost predictability matter.

First, policy enforcement. This means defining rules that apply consistently across all agents regardless of framework: which models are approved for use, maximum tokens per request, content filtering rules, and data residency requirements. Policies should be hierarchical -- platform-level defaults, organization overrides, and tenant-specific rules -- so that security teams can set guardrails without micromanaging every workspace.
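Hierarchical resolution can be as simple as a layered merge. This sketch, with invented field names, shows one plausible way platform defaults, organization overrides, and tenant rules could combine, with the most specific level winning:

```python
# Sketch of hierarchical policy resolution (illustrative): platform defaults
# are overridden by organization settings, which are overridden by tenant rules.
def resolve_policy(platform: dict, org: dict, tenant: dict) -> dict:
    merged = dict(platform)
    merged.update(org)      # org overrides platform defaults
    merged.update(tenant)   # tenant overrides both
    return merged

platform = {"allowed_models": ["gpt-4o-mini"], "daily_budget": 100.0}
org      = {"daily_budget": 500.0}                  # org raises the budget
tenant   = {"allowed_models": ["claude-sonnet-4"]}  # tenant narrows the models

policy = resolve_policy(platform, org, tenant)
assert policy == {"allowed_models": ["claude-sonnet-4"], "daily_budget": 500.0}
```

A real implementation would also need rules for which levels are allowed to loosen a setting versus only tighten it, but the layering principle is the same.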

Second, a kill-switch. When something goes wrong, you need the ability to instantly stop all agent activity across your organization. Not in five minutes after someone finds the right config file. Instantly. A well-designed kill-switch has scopes -- stop all traffic, stop only LLM calls, or block only new API keys -- so you can respond proportionally to the severity of the situation.
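Scoped kill-switches reduce to a set of flags checked on every request. The following is a minimal sketch under assumed scope names; the one behavior worth noting is that the broadest scope dominates the narrower ones:

```python
# Sketch of a scoped kill-switch (illustrative scope names): one flag per
# scope, checked on every request, so flipping a flag stops traffic at once.
from enum import Enum

class Scope(Enum):
    ALL_TRAFFIC = "all_traffic"
    LLM_CALLS = "llm_calls"
    NEW_KEYS = "new_keys"

class KillSwitch:
    def __init__(self):
        self._tripped = set()

    def trip(self, scope: Scope):
        self._tripped.add(scope)

    def allows(self, scope: Scope) -> bool:
        # ALL_TRAFFIC dominates every narrower scope.
        return Scope.ALL_TRAFFIC not in self._tripped and scope not in self._tripped

switch = KillSwitch()
switch.trip(Scope.LLM_CALLS)
assert not switch.allows(Scope.LLM_CALLS)   # LLM calls blocked
assert switch.allows(Scope.NEW_KEYS)        # other scopes still open
switch.trip(Scope.ALL_TRAFFIC)
assert not switch.allows(Scope.NEW_KEYS)    # global stop blocks everything
```

The hard part in production is not this logic but propagation: the flag must be consulted on the request path itself, not polled lazily, or "instantly" becomes "within the next polling interval."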

Third, an immutable audit trail. Every agent action, every LLM request, every cost event should be logged in a tamper-proof record. This is not just for compliance. It is essential for debugging agent behavior, understanding cost spikes, and building trust with stakeholders. When the VP of Engineering asks what your agents did last Tuesday, you should have a definitive answer.
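One common way to make a log tamper-evident is hash chaining: each entry carries the hash of the previous one, so editing any record breaks every hash after it. This is a simplified sketch of the technique, not any specific product's storage format:

```python
# Sketch of a tamper-evident audit trail via hash chaining (illustrative):
# each entry hashes the previous entry's hash plus its own payload, so
# modifying any historical record invalidates the rest of the chain.
import hashlib, json

class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, event: dict):
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev, "hash": digest})

    def verify(self) -> bool:
        prev = "genesis"
        for e in self.entries:
            payload = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"agent": "deploy-bot", "action": "llm_call", "cost_usd": 0.42})
log.append({"agent": "deploy-bot", "action": "deploy_code"})
assert log.verify()
log.entries[0]["event"]["cost_usd"] = 0.0   # tamper with history
assert not log.verify()                     # the chain detects the edit
```

Combined with append-only storage, this gives you a record you can hand to an auditor, or to that VP of Engineering, with confidence.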

Fourth, human-in-the-loop approval gates. Agents should be able to work autonomously on low-risk tasks, but high-risk actions -- modifying production databases, deploying code, spending above a threshold -- should require human approval before proceeding. This is not about slowing agents down. It is about defining clear boundaries between autonomous operation and supervised execution.
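The boundary between autonomous and supervised execution can be captured in a small state machine: low-risk actions execute immediately, high-risk ones park in a pending queue until a human signs off. A minimal sketch, with the risk classification and method names invented for illustration:

```python
# Sketch of a human-in-the-loop approval gate (illustrative): low-risk
# actions run immediately; high-risk ones wait for explicit human approval.
HIGH_RISK = {"deploy_code", "modify_database"}  # assumed risk classification

class ApprovalGate:
    def __init__(self):
        self.pending = {}   # request_id -> action awaiting approval

    def submit(self, request_id: str, action: str) -> str:
        if action not in HIGH_RISK:
            return "executed"            # autonomous path
        self.pending[request_id] = action
        return "awaiting_approval"       # supervised path

    def approve(self, request_id: str) -> str:
        self.pending.pop(request_id)
        return "executed"

gate = ApprovalGate()
assert gate.submit("r1", "summarize_docs") == "executed"       # low risk
assert gate.submit("r2", "deploy_code") == "awaiting_approval" # parked
assert gate.approve("r2") == "executed"                        # human signed off
```

A production version would add the timeout and notification behavior shown in the policy example later in this post, but the core contract is this two-path split.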

A Governance Policy in Practice

What does governance look like in code? Here is an example of a policy configuration that enforces model restrictions, cost limits, and approval gates for a production workspace. In Dobby AI, policies like this are defined at the organization or tenant level and enforced automatically on every gateway request.

# Example: Dobby governance policy for a production workspace
policy:
  name: production-safety
  scope: tenant
  rules:
    allowed_models:
      - claude-sonnet-4
      - gpt-4o-mini
      - gemini-2.5-flash
    blocked_models:
      - "*-preview"     # No preview models in production
    cost_limits:
      per_request_max: 2.00      # USD
      daily_budget: 500.00       # USD per tenant
      monthly_budget: 10000.00   # USD per tenant
    approval_gates:
      - action: deploy_code
        requires: human_approval
        timeout: 24h
      - action: modify_database
        requires: human_approval
        notify: ["#dobby-approvals"]
    kill_switch:
      enabled: true
      scope: all_traffic
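To make the enforcement side concrete, here is a sketch of how a gateway might evaluate an incoming request against a policy like the one above. The field names mirror the YAML; the evaluation logic is illustrative, not Dobby's actual implementation:

```python
# Sketch of request-time policy evaluation (illustrative). Field names
# mirror the YAML policy above; the logic itself is a simplified example.
import fnmatch

policy = {
    "allowed_models": ["claude-sonnet-4", "gpt-4o-mini", "gemini-2.5-flash"],
    "blocked_models": ["*-preview"],
    "per_request_max": 2.00,
}

def check_request(model: str, estimated_cost: float, policy: dict):
    """Return (allowed, reason). Blocklist wins, then allowlist, then cost."""
    if any(fnmatch.fnmatch(model, pat) for pat in policy["blocked_models"]):
        return False, "model matches a blocked pattern"
    if model not in policy["allowed_models"]:
        return False, "model is not on the allow list"
    if estimated_cost > policy["per_request_max"]:
        return False, "estimated cost exceeds per-request limit"
    return True, "ok"

assert check_request("claude-sonnet-4", 0.50, policy) == (True, "ok")
assert not check_request("gpt-5-preview", 0.10, policy)[0]   # blocked pattern
assert not check_request("claude-sonnet-4", 5.00, policy)[0] # over cost limit
```

Note the ordering: the blocklist is checked before the allowlist, so a wildcard like `*-preview` cannot be bypassed by also appearing on an allow list.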

The Control Plane Pattern

Just as Kubernetes provides a control plane for containers and Datadog provides one for servers, AI agents need their own control plane. This is the layer that sits between your agents and the outside world, enforcing policies, tracking costs, and providing visibility -- regardless of which framework built the agent.

The key insight is that governance should be framework-agnostic. Whether an agent is built with CrewAI, LangChain, the OpenAI Assistants API, Google ADK, or a custom Python script, the governance rules are the same. The control plane handles the cross-cutting concerns -- authentication, authorization, metering, policy enforcement, and audit logging -- so that individual agents can focus on their tasks without reimplementing security and compliance from scratch. This separation of concerns is what makes the pattern scalable: adding a new agent to a governed fleet should take minutes, not days of integration work.

Dobby AI implements this pattern through the Agentic Gateway, a unified proxy that all LLM and MCP traffic flows through. Every request is authenticated, metered, and checked against the active policy before it reaches the provider. This means governance is not optional or opt-in. It is the default for every agent in the organization.

Start Before You Scale

The best time to implement governance is before you hit 10 agents. Retrofitting governance onto a fleet of autonomous agents is significantly harder than building it in from the start. Teams that wait often discover they have no reliable inventory of which agents exist, which keys they use, or what they cost.

Start with three actions today. First, inventory your agents: list every AI agent in your organization, the framework it uses, the API keys it holds, and what systems it can access. Most teams discover agents they did not know existed. Second, centralize your keys: route all LLM traffic through a single proxy so you have one place to monitor costs, enforce rate limits, and revoke access. Third, define your policies: which models are approved for production use, what is the monthly budget per team, what is the maximum cost per request, and who can approve agent actions that modify production systems. These three steps take less than a day and immediately reduce your risk surface by orders of magnitude.

The organizations that will thrive with AI agents are not the ones with the most agents. They are the ones that can trust their agents to operate safely, transparently, and within defined boundaries. Governance is what makes that trust possible. It transforms AI agents from unpredictable experiments into reliable production infrastructure. And the teams that build it in from the beginning will have an enormous advantage over those scrambling to retrofit it after their first major incident.

Ready to take control of your AI agents?

Start free with Dobby AI — connect, monitor, and govern agents from any framework.

Get Started Free