The 4 Control Modes of AI Governance
The AI gateway was the wrong unit. Governance needs a framework that covers the traffic you control, the traffic you observe, and the traffic you don't know exists. Here are the 4 Control Modes that define the category.
The AI security industry has converged on the wrong unit of analysis. Almost every vendor in the Forrester AI Control Plane landscape — including some of the ones I respect most — positions around the same core metaphor: the gateway. A proxy in front of your LLM providers that logs every request, enforces policy, and returns a dashboard.
This metaphor has worked well for the early adopters. It's concrete, it maps onto existing security tooling, and it sells. It also describes roughly 30% of the AI traffic in a real enterprise. The other 70% — the part that's actually scaring CISOs in 2026 — doesn't flow through the gateway. Which means it doesn't appear in the dashboard. Which means it might as well not exist, from a governance standpoint.
If your governance metric is "requests per second through my gateway," you're measuring how much AI you happen to have sanctioned. Not how much AI you actually have.
The framework: 4 Control Modes
After running discovery conversations with roughly forty CISOs over the last quarter, I've landed on a framework that describes the actual state of AI in production. It divides every AI agent in your organization into exactly one of four modes based on how you can see and control it.
Mode 1 — Inline
Traffic flows through a proxy you control. Your governance layer sits on the request path. You can enforce policy (block, rewrite, approve), apply DLP, require human approval for sensitive operations, cap spend, and log everything.
This is the mode every AI Gateway vendor sells. It's the strongest mode — full enforcement, deterministic — and also the hardest to get adoption for, because it requires the engineering team to route their traffic through your proxy. Most enterprises have Mode 1 coverage for 10-30% of their AI agents.
Mode 2 — Hybrid (tool proxy)
The LLM traffic does NOT flow through you, but the tool calls do. This is the MCP proxy pattern: the agent talks directly to OpenAI or Anthropic, but when it wants to hit Salesforce, GitHub, or your internal database, it goes through an MCP gateway you operate.
Mode 2 became strategically important in Q1 2026, when Anthropic, Harmonic, Lakera, and Portkey all shipped MCP-aware products within weeks of each other. The industry recognized simultaneously that the tool layer is where agents cause damage — sending data out, writing to systems of record, calling external APIs. Governing the LLM without governing the tools is like installing airbags but skipping the brakes.
Mode 2 is where agentic DLP actually lives. You can't stop a LangChain agent from hallucinating PII in its output — but you CAN stop it from retrieving customer rows it shouldn't see, and you CAN mask credit card fields in the response before the agent gets them.
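A minimal sketch of that tool-proxy check, assuming a proxy that intercepts tool responses before the agent sees them. The field names, patterns, and masking behavior here are illustrative, not any vendor's actual API:

```python
import re

# Hypothetical Mode 2 tool-proxy step: the proxy sits between the agent and
# the tool, and masks sensitive values in the tool's response before the
# agent receives them. Field names and the card pattern are illustrative.

CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")  # card-number-like digit runs

def mask_tool_response(response: dict, sensitive_fields=("card_number", "ssn")) -> dict:
    """Return a copy of the tool response with sensitive values masked."""
    masked = {}
    for key, value in response.items():
        if key in sensitive_fields:
            masked[key] = "****"  # drop the field's value entirely
        elif isinstance(value, str):
            masked[key] = CARD_RE.sub("[REDACTED]", value)  # scrub card-like numbers
        else:
            masked[key] = value
    return masked
```

The point of the sketch is the placement: the agent never holds the raw value, so nothing downstream (logs, context windows, model output) can leak it.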
Mode 3 — Surrounding (telemetry-only)
You don't control the path. The agent talks directly to its provider, its tools, its data. What you have is telemetry — OpenAI billing exports, SIEM logs showing traffic to api.anthropic.com, browser-extension events from your MDM, audit logs from the LLM-embedded SaaS vendors your employees use.
Mode 3 is where most mature enterprises end up for most of their AI. Security teams aren't going to rewrite the 40 product-team agents; they aren't going to force Salesforce to stop embedding OpenAI. What they CAN do is observe — and alert, and report, and eventually promote the high-risk cases up to Mode 2 or Mode 1.
Telemetry-mode governance is not weaker governance. It's governance applied to a larger surface. The question isn't "can I block this?" — it's "do I know this exists, do I know who owns it, is it within policy?" You can enforce a lot of policy without blocking a single request.
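The SIEM-log side of Mode 3 can be sketched in a few lines. The hostnames and log format below are assumptions; a real deployment would pull provider endpoints from a maintained feed and parse its SIEM's actual schema:

```python
# Hypothetical Mode 3 detection: scan egress or SIEM log lines for traffic
# to known LLM provider endpoints. Hostnames are examples, not a complete
# list; the "first field is the source" assumption is also illustrative.

AI_PROVIDER_HOSTS = {
    "api.openai.com": "OpenAI",
    "api.anthropic.com": "Anthropic",
    "generativelanguage.googleapis.com": "Google",
}

def detect_ai_traffic(log_lines):
    """Return (source, provider) pairs for lines mentioning a known AI host."""
    hits = []
    for line in log_lines:
        for host, provider in AI_PROVIDER_HOSTS.items():
            if host in line:
                source = line.split()[0]  # assume source identifier is the first field
                hits.append((source, provider))
    return hits
```

Each hit is a candidate for the inventory: an owner to find, a classification to assign, and possibly a promotion to Mode 2.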
Mode 4 — Shadow
You have no control and no telemetry. The agent exists and you don't know it. Someone's Zapier workflow that calls ChatGPT. A browser extension your marketing team uses. A data science notebook that quietly hits Anthropic weekly. A vendor embedding an LLM in a product that used to be deterministic.
Mode 4 is the largest category. By definition, you don't know how big it is — you can only estimate. The best estimate I have from discovery conversations is that Mode 4 represents 40-60% of the AI traffic in a 500-person organization. That's several times the Mode 1 coverage at most companies.
Why the 4 modes aren't equal — and that's the point
A naive read of the framework says: "the goal is to move everything into Mode 1." That's wrong. The goal is to know what's in each mode and apply the strongest governance you can, given the mode's constraints.
An agent in Mode 1 gets deterministic enforcement: blocks, rewrites, approvals. An agent in Mode 3 gets detection: alerts, audit, quarterly review. An agent in Mode 4 gets discovery: find it, promote it up a mode, repeat. The right governance doesn't look the same in each mode — but the framework it's measured against does.
This is why we built the Governance Posture Score. Every agent, regardless of mode, scores 0-100 against the same 18 controls. An agent in Mode 1 can max out every control. An agent in Mode 4 can only score on Discovery — the control that asks "do we know this exists and who owns it." That's still a number. It's still a delta you can improve quarter-over-quarter. And it's still the metric your board can ask about.
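The mechanics can be sketched compactly. This collapses the 18 controls down to a handful of hypothetical ones to keep the example short; the only properties taken from the text are the 0-100 scale, the shared denominator across modes, and Mode 4 scoring only on Discovery:

```python
# Illustrative mode-aware posture score. Control names and eligibility sets
# are hypothetical stand-ins for the real 18 controls.

ELIGIBLE = {  # which controls each mode can even earn credit for
    1: {"discovery", "dlp", "approvals", "spend_caps", "logging"},
    2: {"discovery", "dlp", "logging"},
    3: {"discovery", "logging"},
    4: {"discovery"},  # a shadow agent can only score on Discovery
}

def posture_score(mode: int, passed_controls: set) -> int:
    """Score 0-100 against ALL controls, so the mode caps the ceiling."""
    all_controls = ELIGIBLE[1]  # Mode 1 can max out everything
    credited = passed_controls & ELIGIBLE[mode]  # only eligible controls count
    return round(100 * len(credited) / len(all_controls))
```

The design choice worth noting: the denominator never changes. A Mode 4 agent scoring 20 isn't being graded on a curve; the low ceiling is the signal that promotion is the next move.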
What this means for vendor evaluation
If you're a CISO or head of security shopping for an AI governance platform this year, the framework gives you a cleaner evaluation criterion than "does it do prompt injection defense" or "does it do DLP." The criterion is: which of the 4 modes does this vendor actually cover?
- **Gateway-only vendors** cover Mode 1. Some are excellent at it. They are useless for Mode 3 or Mode 4. Ask: "how do I govern an agent that doesn't route through you?" — if the answer is "don't allow agents that don't route through us," that vendor covers Mode 1 only.
- **MCP-proxy vendors** cover Modes 1 and 2. They saw the tool-governance wave before most others and shipped. Strong choice for teams that are agent-heavy and tool-heavy (typical of 2026 architecture).
- **Browser + DLP vendors** cover Modes 3 and sometimes 4 (via browser extension). Weak on Mode 1. Best choice if your AI risk is employee-usage-driven, not agent-driven.
- **Discovery vendors** cover Mode 4 exclusively. They're necessary — you can't govern what you can't see — but insufficient. A discovery-only tool finds the problem and hands you a spreadsheet. That's a step, not a solution.
- **Control-plane vendors** (our category) cover all 4. Wider surface, more integration points, shallower at each one. This is the explicit tradeoff: we give up 10% of what a specialist can do in Mode 1 to get 90% coverage across all four modes.
There is no "best" — there is only "right for your shape." A pure-developer org with tight API control probably wants a specialist Mode 1 product. A large enterprise with regulatory exposure and employee-heavy AI usage needs the control-plane coverage. Both answers are defensible.
Promotion — how an agent moves up the modes
The most useful operational loop this framework unlocks is promotion. You start in Mode 4, promote to Mode 3, then Mode 2, then Mode 1. Each promotion unlocks stricter governance.
- **Mode 4 → Mode 3:** discover the agent. Log-based scan (SIEM), employee survey, vendor audit. Now it exists in your inventory with an owner and a classification.
- **Mode 3 → Mode 2:** wrap its tool calls in an MCP proxy you operate. Now you can enforce tool-layer policy — sensitivity masking, row scoping, RAG ACLs.
- **Mode 2 → Mode 1:** route its LLM traffic through your gateway. Now you have full path control — budgets, model restrictions, content policy, approvals.
- **Stopping points are fine.** Not every agent needs Mode 1. A summarization agent for public marketing copy is fine at Mode 3. A customer-support agent handling PII should be Mode 2 minimum. An internal finance agent writing to production data should be Mode 1.
The governance conversation gets cleaner once you have the framework. "Is this agent compliant?" becomes "What mode is it in, what score does it have in that mode, and what's our promotion plan?"
What this means for Dobby
We built Dobby as an AI Control Plane — explicit choice — because we think the single-mode vendors are incomplete for the 2026 enterprise. The same platform has to handle the engineer registering a new agent in the gateway (Mode 1), the MCP tool call from a LangChain agent running in a customer VPC (Mode 2), the telemetry import from OpenAI billing for the data-science team (Mode 3), and the discovery scan that finds the 47 unmanaged ChatGPT users (Mode 4). One inventory. One governance score. Four modes of visibility.
That's the category we're defining. It's not "AI Gateway." It's "Governance Posture for AI Fleets" — the fleet being all four modes of AI in your org. Gateway is the vehicle for Mode 1. The framework is the product.
What to do with this framework
Regardless of which vendor you pick, the framework is operationally useful on its own. If you're a security leader right now, three actions:
- Inventory your AI by mode. For each agent or AI-powered feature you know about, classify it into one of the four modes. The exercise usually produces the first "oh, we have way more Mode 4 than I thought" moment in week one.
- Define a target distribution. In 12 months, what percentage of your AI should be in Mode 1 vs 2 vs 3 vs 4? This is a policy decision, not a technical one. Write it down; present it to your CIO.
- Pick one agent per mode to instrument this quarter. One to promote from Mode 4 → Mode 3 (discover it). One from Mode 3 → Mode 2 (MCP proxy it). One from Mode 2 → Mode 1 (gateway route). Small, high-signal, reportable wins.
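The first two actions reduce to a tally you can put in front of a CIO. A minimal sketch, assuming a flat inventory where each agent record carries its mode:

```python
from collections import Counter

# Sketch of the inventory-by-mode report: tally agents per mode as whole
# percentage points. The inventory shape (list of dicts with a "mode" key)
# is an assumption for illustration.

def mode_distribution(inventory):
    """Percentage of agents in each of the four modes, rounded."""
    counts = Counter(agent["mode"] for agent in inventory)
    total = len(inventory)
    return {mode: round(100 * counts.get(mode, 0) / total) for mode in (1, 2, 3, 4)}
```

Run it once for the current state, once for the 12-month target, and the delta between the two dicts is the promotion roadmap.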
If you want to see how the framework translates into a product — specifically the Governance Posture Score that unifies measurement across modes — I'd be happy to walk you through it. 30 minutes, no demo unless you ask for one, just the framework applied to your actual fleet.
— Gil
Ready to take control of your AI agents?
Start free with Dobby AI — connect, monitor, and govern agents from any framework.
Get Started Free