Almost nobody sets out to build an AI agent platform. They set out to ship one agent. But production has requirements, and each requirement has a best-in-class tool, so the stack assembles itself: a framework, a vector database, an observability tool, an eval tool, an experimentation tool, a secrets manager, and a layer of glue holding it all together.
Each piece is a reasonable choice on its own. The cost isn't in any single tool. It's in the seams between them, and that bill arrives later, as maintenance rather than a line item.
The stack you end up assembling
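Map the requirements of a production agent to the tools teams reach for, and a familiar shape appears: orchestration goes to a framework, retrieval to a vector database, tracing to an observability tool, quality checks to an eval tool, A/B tests to an experimentation tool, and credentials to a secrets manager. Each of these is a category leader. None of them was designed to know about the others.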
The tax nobody prices in
Add up the subscriptions and the number looks manageable. The expensive part isn't on any invoice.
Integration is code you own forever
Wiring the framework to the eval tool, the eval results to the experimentation tool, and the traces to the observability backend is custom code. It has no product owner, it's the first thing to break on an upgrade, and it's the last thing anyone wants to maintain.
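To make "glue" concrete, here is a minimal sketch of the kind of adapter that sits between an eval tool and an experimentation tool. Every name in it (EvalResult, ExperimentEvent, to_experiment_event, the field names, the 0-100 scale) is hypothetical and stands in for whatever your actual tools emit and accept; it does not show any real vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical shape of one record exported by an eval tool."""
    run_id: str
    variant: str
    score: float  # assumed to be on a 0.0-1.0 scale


@dataclass(frozen=True)
class ExperimentEvent:
    """Hypothetical shape an experimentation tool expects."""
    experiment_key: str
    arm: str
    metric: str
    value: float


def to_experiment_event(result: EvalResult, experiment_key: str) -> ExperimentEvent:
    """Translate one tool's schema into the other's.

    This mapping is the seam: if either side renames a field or changes
    its score scale, this function is what breaks.
    """
    if not 0.0 <= result.score <= 1.0:
        raise ValueError(f"score out of range for run {result.run_id}: {result.score}")
    return ExperimentEvent(
        experiment_key=experiment_key,
        arm=result.variant,
        metric="judge_score",
        value=round(result.score * 100, 1),  # assumed: the target wants a 0-100 scale
    )
```

Nothing here is difficult, which is the point: the function encodes assumptions about two schemas that neither vendor knows you depend on, and a stack of six tools tends to accumulate several adapters like it.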
Version drift compounds
Six tools means six release cadences. A breaking change in one ripples through your glue. The team spends a steady fraction of every sprint keeping the seams intact instead of improving the agent.
The on-call surface is the whole stack
When an agent misbehaves at 2 a.m., the question is which layer failed, and the answer requires correlating across tools that don't share a request ID. There is no single trace from trigger to tool call to evaluation, so debugging is a join across dashboards.
It looks a lot like self-hosting
This is the same shape as running your own infrastructure, just one level up: the cost lives in people and time, not compute. We put hard numbers on the infrastructure version in the hidden costs of self-hosting AI agents and the managed vs. self-hosted TCO comparison; the assembled-stack tax stacks on top of it.
When building your own stack is the right call
Assembling is not a mistake by default. If one of these layers is your product (you sell observability, or retrieval quality is your moat), owning it end to end is the right investment, and a general platform would hold you back. The same is true if you have unusual requirements no integrated product serves, or an existing stack your team knows cold and won't outgrow. Build where the layer is your differentiator; buy where it's table stakes.
What an integrated platform changes
The alternative to assembling is a platform where these layers already share a data model. On Connic, retrieval, observability, evaluation, A/B testing, connectors, and secrets are part of one runtime: one trace runs from trigger to tool call to judge score, upgrades are the platform's problem, and there is no glue to own. You define the agent in YAML, write tools in Python, and push to Git. For the wider landscape of where agents run, see our guide to AI agent deployment platforms in 2026.