Stage0

Why I started building a decision layer before AI agents execute actions

supervenus928 · 7d ago · 0 replies

Recently I’ve been thinking about a problem that only becomes obvious once you start building real agent workflows. When people talk about AI agents, most of the focus is on things like:

- prompting
- planning
- memory
- tool selection
- multi-agent coordination

But once you actually run these systems in practice, a different kind of problem shows up. It’s not just about whether the model is “smart enough.” It’s something more practical: when an agent is about to execute an action, should the system actually let it?

Where things start to break

In most setups today, the flow looks roughly like this:

- define tools
- let the model decide when to call them
- maybe add some guardrails in prompts

And once the agent decides to act, the system usually just executes. That’s the moment where things change, because from that point on it’s no longer about “answer quality” but about real side effects:

- calling the wrong tool
- executing in the wrong context
- repeating the same action unintentionally
- acting on incomplete or ambiguous information

This is no longer really a reasoning problem. It’s an execution control problem.

What I wanted to change

Instead of trying to make the agent ever “more correct,” I started exploring a different approach: what if there were a clear decision point right before execution? Something that sits between what the agent wants to do and what the system actually allows it to do.

The idea

The structure is intentionally simple:

- the agent proposes an action
- the action plus its context is sent to a decision layer
- the layer returns one of: ALLOW, DENY, DEFER
- the original runtime still owns the final execution

Why this feels different

This is not another agent framework, and it’s not just prompt-level guardrails. The main goal is to introduce a clear authorization boundary at execution time.
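To make the propose → decide → execute flow above concrete, here is a minimal sketch of such an authorization gate. Everything in it is illustrative: the `Verdict`, `ProposedAction`, `authorize`, and `run` names and the placeholder rules are my assumptions, not an existing API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    DEFER = "defer"


@dataclass
class ProposedAction:
    tool: str                                   # the tool the agent wants to call
    args: dict = field(default_factory=dict)    # arguments proposed by the model
    context: dict = field(default_factory=dict) # runtime context for the decision


def authorize(action: ProposedAction) -> Verdict:
    """Decision layer: structured rules, not prompt-level guardrails.
    The specific rules below are placeholders for illustration."""
    # Hard boundary: some tools should never auto-execute.
    if action.tool in {"delete_database", "wire_transfer"}:
        return Verdict.DENY
    # Ambiguous or incomplete context: defer instead of rejecting.
    if action.tool == "send_email" and not action.context.get("user_confirmed"):
        return Verdict.DEFER
    return Verdict.ALLOW


def run(action: ProposedAction, execute: Callable[[ProposedAction], Any]) -> Any:
    """The original runtime still owns execution; the layer only
    answers the question 'should this action happen at all?'"""
    verdict = authorize(action)
    if verdict is Verdict.ALLOW:
        return execute(action)
    return {"status": verdict.value, "action": action.tool}
```

The point of the sketch is the separation: `authorize` never touches side effects, and `run` never reasons about policy.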
A few properties I care about:

- decisions are structured (not hidden in prompts)
- decisions are explainable (based on input context)
- decisions are replayable (same input → same output)
- execution control remains explicit

It doesn’t make the agent perfect, but it creates a place where you can reason about whether an action should happen at all.

What changed for me

Once I started thinking in terms of execution boundaries, a lot of previously messy issues became easier to understand:

- which actions should never be allowed
- which situations call for deferring instead of rejecting
- which risks come from the model vs. the system design
- which failures are not “bad reasoning” but “over-eager execution”

Where this seems to matter

This kind of layer becomes more relevant when agents start interacting with:

- internal tools
- external APIs
- write operations
- financial or transactional systems
- long, multi-step workflows

In these cases, mistakes are not just “bad outputs”; they have real consequences.

What I’m still figuring out

I’m still refining how this should look as a real building block for developers:

- how much context is actually needed for good decisions
- how to keep it simple without losing usefulness
- where the boundary should live in different architectures

A working belief

The more I build in this space, the more I feel that the hard part of agents is not just getting them to act, but getting them to act only when they should. And that might require a layer separate from both prompting and execution.

If you’re working on agents, MCP servers, or tool-calling workflows, I’d be interested in how you think about this, especially if you’ve run into situations where the model was “reasonable” but execution still felt unsafe, or where adding more prompts didn’t really solve the issue. I’m trying to understand whether this layer is actually necessary in real systems, or just something that feels useful in theory.
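On the replayability and explainability properties mentioned above (same input → same output, decisions based on input context), one way to make them concrete is to implement the decision as a pure function of its input and fingerprint that input for auditing. This is a sketch under my own assumptions: `decide`, its single placeholder rule, and the record fields are hypothetical.

```python
import hashlib
import json


def decide(action: dict) -> dict:
    """Pure function of its input: the same action dict always
    produces the same decision record, so decisions can be replayed."""
    if action.get("operation") == "write" and not action.get("approved"):
        verdict, reason = "DEFER", "write operation without approval"
    else:
        verdict, reason = "ALLOW", "no rule matched"
    # Fingerprint the exact input so a past decision can be re-derived
    # and audited later from the logged record alone.
    digest = hashlib.sha256(
        json.dumps(action, sort_keys=True).encode()
    ).hexdigest()
    return {"verdict": verdict, "reason": reason, "input_sha256": digest}
```

Because `decide` reads nothing but its argument, replaying a logged input reproduces both the verdict and the reason, which is what makes the decision explainable after the fact.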
