"We'll just host it ourselves" sounds cheaper than it ever turns out to be, because the agent code is rarely what makes production hard. Write the agent in LangChain, Google ADK, CrewAI, Mastra, or Connic's Composer SDK and the file is portable. What differs across platforms is everything that wraps it at runtime: observability with token attribution, LLM-as-judge evaluation, real-time guardrails, HITL approvals, A/B testing, per-tenant cost dashboards, EU AI Act audit logs, signed connectors with DLQs and retries. This post puts numbers on the cost of building that wrapper yourself versus buying it bundled at 50,000 agent runs per month — a realistic DACH mid-market scale.
The Agent Code Is the Same. The Wrapper Isn't.
A production AI agent isn't a script that calls an LLM. It's a script that calls an LLM, plus the observability stack that shows you what the script did, plus the judge that grades its outputs, plus the guardrail that blocks bad ones, plus the approvals queue when something consequential is about to happen, plus the A/B harness when you want to change a prompt, plus the cost dashboard that tells you if any of this is profitable, plus the audit trail when a regulator asks.
Frameworks (LangChain, ADK, CrewAI, Mastra) stop at the first "plus." They ship the script. Connic ships the wrapper as platform primitives, so the only code you write is the agent itself. The TCO model below holds the agent constant and prices what it costs to surround it. The narrative companion post on the hidden costs of self-hosting AI agents covers the categories; this one puts line-item numbers on them.
Scenario Assumptions
Every figure below derives from these assumptions, which model a DACH mid-market enterprise running a production agent for a customer-facing or internal-ops use case.
| Assumption | Value | Rationale |
|---|---|---|
| Runs per month | 50,000 | Scenario anchor. Adjust for your volume. |
| LLM calls per run | 3 | Mixed workload: query routing, tool-call decision, response generation. |
| Avg input tokens / call | 10,000 | System prompt + retrieved context + conversation history. Realistic for RAG and multi-turn agentic workloads. |
| Avg output tokens / call | 1,500 | Tool-call JSON, reasoning traces, or structured response. Higher than simple completion tasks. |
| Total input tokens / mo | 1.5B | 50,000 runs × 3 calls × 10,000 tokens. |
| Total output tokens / mo | 225M | 50,000 runs × 3 calls × 1,500 tokens. |
| Model | Claude Sonnet 4.6 | Current production-grade reasoning model. Input $3/1M, output $15/1M as of 2026-05-16. Source: anthropic.com/pricing, accessed 2026-05-16. LLM spend is identical in all three columns — model costs pass through to your provider account regardless of platform. |
| Avg run duration | 45 seconds | Typical for a 3-LLM-call agent with tool latency. |
| Compute-hours / mo | 625 hours | 50,000 runs × 45s = 2,250,000 sec = 625 hours. Avg concurrent ~0.86, peak ~5. |
| Self-hosted infra | 3-node m5.xlarge Kubernetes cluster, eu-central-1 | AWS EKS with m5.xlarge nodes (4 vCPU / 16 GB). Adequate headroom for peak concurrency of ~5 at this run volume. |
| Engineering salary (DACH) | €110,000/yr fully loaded | Senior software engineer, Germany. Fully loaded including employer social contributions. Source: Levels.fyi Germany senior software engineer data, accessed 2026-05-16. |
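The derived rows in the table follow mechanically from the per-run assumptions. A minimal Python sketch of that arithmetic is below. The 730-hour month used for average concurrency is an assumption of this sketch (the table does not state it), and `derive_volumes` is an illustrative helper name:

```python
# Scenario inputs taken from the assumptions table above.
RUNS_PER_MONTH = 50_000
LLM_CALLS_PER_RUN = 3
INPUT_TOKENS_PER_CALL = 10_000
OUTPUT_TOKENS_PER_CALL = 1_500
AVG_RUN_SECONDS = 45
# Assumption (not from the table): an average month of 730 hours.
HOURS_PER_MONTH = 730


def derive_volumes(runs: int = RUNS_PER_MONTH) -> dict:
    """Derive monthly token and compute volumes from the run count."""
    calls = runs * LLM_CALLS_PER_RUN
    compute_hours = runs * AVG_RUN_SECONDS / 3600
    return {
        "input_tokens": calls * INPUT_TOKENS_PER_CALL,
        "output_tokens": calls * OUTPUT_TOKENS_PER_CALL,
        "compute_hours": compute_hours,
        # Average concurrency = busy hours / wall-clock hours in the month.
        "avg_concurrency": compute_hours / HOURS_PER_MONTH,
    }


volumes = derive_volumes()
print(volumes)
```

Changing `runs` rescales every derived volume linearly, which is a quick way to adapt the scenario to a different run count.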
The Three Approaches
The model compares three setups: self-hosting on a Kubernetes cluster (AWS EKS), Connic Pro, and Inngest Pro with the wrapper assembled from third-party tools.
Estimated LLM API Spend at This Volume
At 1,500M input + 225M output tokens per month on Claude Sonnet 4.6 ($3/1M input, $15/1M output as of 2026-05-16; source: anthropic.com/pricing), LLM API spend lands at roughly €7,300/month (~€87,600/year). Input costs 1,500M × $3/1M = $4,500 and output costs 225M × $15/1M = $3,375, for $7,875/mo in total; converted at 1.08 USD per EUR, that is $7,875 ÷ 1.08 ≈ €7,300/mo. The figure is identical in all three columns: it is a passthrough cost you pay your model provider regardless of platform.
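The same calculation as a small Python sketch, useful if you want to plug in your own token volumes or a different model's prices. The rates and the 1.08 USD/EUR conversion come from the paragraph above; the function and constant names are illustrative:

```python
INPUT_TOKENS = 1_500_000_000   # monthly input tokens from the scenario
OUTPUT_TOKENS = 225_000_000    # monthly output tokens from the scenario
USD_PER_M_INPUT = 3.0          # Claude Sonnet 4.6 input price per 1M tokens
USD_PER_M_OUTPUT = 15.0        # Claude Sonnet 4.6 output price per 1M tokens
USD_PER_EUR = 1.08             # conversion rate used in the text


def monthly_llm_spend_eur(input_tokens: int, output_tokens: int,
                          usd_per_eur: float = USD_PER_EUR) -> float:
    """Monthly LLM API spend in EUR for the given token volumes."""
    usd = (input_tokens / 1e6 * USD_PER_M_INPUT
           + output_tokens / 1e6 * USD_PER_M_OUTPUT)
    return usd / usd_per_eur


monthly_eur = monthly_llm_spend_eur(INPUT_TOKENS, OUTPUT_TOKENS)
print(f"LLM spend: EUR {monthly_eur:,.0f}/month")
```

Unrounded, the result is about €7,292/month; the post rounds this to €7,300 before annualizing to €87,600.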
What “Feature Parity” Means Here
The breakdown below bakes parity costs in from the start. Each column carries the engineering and tooling needed to hit the same feature surface — namely the capabilities Connic ships as platform primitives and every other approach makes you build or buy:
- Auto-scaling, no cluster tuning
- Eleven first-party connectors (cron, email, kafka, mcp, postgres CDC, s3, sqs, stripe, telegram, webhook, websocket) with signature verification, DLQs, retries — plus the Bridge connector for private networks without inbound ports
- Observability — traces, token tracking, cost-per-run, anomaly detection, custom dashboards
- A/B testing — traffic splitting with statistical significance
- Cost tracking — per-model, per-agent, per-tenant dashboards
- EU AI Act compliance — audit logs, risk classification, exportable trails
- Agent management — versioning, environments, deployment pipeline, rollback
- Built-in database for agent state with six pre-built CRUD tools
- Built-in RAG knowledge base, ingestion and retrieval
- Built-in LLM-as-judge evaluation framework
- HITL approvals with review queue and RBAC
- Integrated agent testing framework, regression harness, fixtures
- Live agent ops UI: observe and stop running agents in real time. This one capability is roughly six months of senior engineering on its own — real-time streaming, permission controls, step-through debugging.
Parity engineering is amortized over three years; parity tooling is monthly. The self-hosted column avoids double-counting Datadog and Sentry by adding only the net spend above what's already in bare observability. Inngest gets the full parity table because it ships none of these. Connic's parity adder is zero — everything is in the box.
Line-Item Cost Breakdown (Parity-Adjusted)
All figures in EUR. Sources linked inline. Parity engineering uses 3-year amortization.
Platform and Infrastructure (Monthly)
| Line item | Self-hosted | Connic | Inngest |
|---|---|---|---|
| EKS cluster (3 × m5.xlarge + control plane, eu-central-1). Includes EBS, NAT gateway, load balancer. Source: aws.amazon.com/eks/pricing, accessed 2026-05-16 | €500 | — | — |
| Data layer (RDS Postgres + S3 + vector store). AWS RDS db.t3.medium multi-AZ ~€80, S3 + vector store ~€50, eu-central-1 estimates | €130 | ~€1 | ~€50 |
| Networking (NAT + load balancer) | €60 | — | — |
| Platform subscription / base | — | €200 | ~€70 |
| Connic run fees (50,000 × €0.047/run). Uniform per-unit rate across all paid tiers. Source: connic.co/pricing | — | €2,350 | — |
| Connic compute (50,000 × 45s × €0.00042/sec). 2,250,000 total seconds × €0.00042 = €945. Uniform per-unit rate across all paid tiers. | — | €945 | — |
| Inngest execution overage. 50k runs/month is well within Pro's 1M included step executions. Source: inngest.com/pricing, accessed 2026-05-16 | — | — | €0 |
| Platform / infra subtotal | €690 | €3,295 | €120 |
At 50k runs/month the usage charges (€2,350 in run fees plus €945 in compute) far exceed the €200 Pro credit, so the subscription is absorbed into usage rather than added on top, and Connic billing tracks actual usage. The tier unlocks feature limits (10 parallel runs per agent, 30-min timeout, 5 environments, 90-day retention, 40 active connectors, custom domains, priority support), not discounted rates.
Observability Tooling (Monthly)
| Line item | Self-hosted | Connic | Inngest |
|---|---|---|---|
| Infrastructure metrics + APM (Datadog). Source: datadoghq.com/pricing, accessed 2026-05-16 | €400 | — | — |
| Error tracking (Sentry). Required on self-hosted; not needed on Connic (built-in) or Inngest | €80 | — | — |
| Agent-specific traces, token tracking, dashboards | Build separately | Included | Basic run logs only |
| Observability subtotal | €480 | €0 | €0 |
Engineering Time and Operations (Monthly)
Most cost models undercount this line. Self-hosted engineering time doesn't sit still; it scales with system complexity, incident frequency, and compliance scope. The figures below assume a fully loaded senior engineer at €110,000/year (€9,167/month) — DACH senior comp including employer social contributions. Source: Levels.fyi Germany senior software engineer data, accessed 2026-05-16.
| Line item | Self-hosted | Connic | Inngest |
|---|---|---|---|
| Ongoing ops (20% of senior eng FTE). Patching, scaling, upgrades, security hardening. 0.20 × €110,000/yr ÷ 12 = €1,833/mo | €1,833 | — | — |
| Engineering subtotal | €1,833 | €0 | €0 |
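The three self-hosted subtotals so far (platform, observability, ongoing ops) can be rolled up into one recurring figure. The sketch below is plain arithmetic on the numbers in the tables above; the variable names and dictionary keys are illustrative labels. The result matches the €36,036 self-hosted recurring line in the annual totals:

```python
SENIOR_ENGINEER_ANNUAL_EUR = 110_000  # fully loaded, from the assumptions
OPS_FTE_SHARE = 0.20                  # ongoing ops share of one senior FTE

# 0.20 x 110,000 / 12, rounded to whole euros as in the table.
ongoing_ops_monthly = round(OPS_FTE_SHARE * SENIOR_ENGINEER_ANNUAL_EUR / 12)

self_hosted_monthly_eur = {
    "platform_infra": 690,   # EKS + data layer + networking
    "observability": 480,    # Datadog + Sentry
    "ongoing_ops": ongoing_ops_monthly,
}

monthly_recurring = sum(self_hosted_monthly_eur.values())
annual_recurring = monthly_recurring * 12
print(f"Self-hosted recurring: EUR {monthly_recurring:,}/mo, "
      f"EUR {annual_recurring:,}/yr")
```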
Production-Wrapper Engineering (Amortized /3 yr)
Production agents need signed connectors, a live ops UI, HITL approvals, A/B testing, RAG, judges, compliance tooling, and agent management. Connic ships all of those. Everywhere else, you build or buy. The one-time build cost (€174,584 across ten capabilities) is amortized over three years, which comes to €58,196/yr on both self-hosted and Inngest. The Datadog and Sentry overlap on self-hosted is a tooling matter: it is deducted in the wrapper tooling table, not from this engineering figure.
| Wrapper item | One-time build | Amortized /yr (÷3) | Self-hosted | Connic | Inngest |
|---|---|---|---|---|---|
| Connector layer (cron / email / kafka / mcp / postgres CDC / s3 / sqs / stripe / telegram / webhook / websocket — signature verification, DLQs, retries) | €18,333 | €6,111 | €6,111 | included | €6,111 |
| Live agent ops UI (real-time view + stop button + step-through). ~6 months of senior engineering | €55,000 | €18,333 | €18,333 | included | €18,333 |
| HITL approval workflow + review queue + RBAC | €27,500 | €9,167 | €9,167 | included | €9,167 |
| A/B testing infrastructure (traffic splitting + analytics) | €9,167 | €3,056 | €3,056 | included | €3,056 |
| LLM-as-judge evaluation framework | €9,167 | €3,056 | €3,056 | included | €3,056 |
| Agent testing framework (regression harness, fixtures) | €9,167 | €3,056 | €3,056 | included | €3,056 |
| RAG / knowledge base (ingestion + retrieval) | €9,167 | €3,056 | €3,056 | included | €3,056 |
| EU AI Act compliance tooling (audit logs, risk classification, exportable trails) + legal review | €14,167 | €4,722 | €4,722 | included | €4,722 |
| Agent management (versioning, environments, deploy pipeline) | €18,333 | €6,111 | €6,111 | included | €6,111 |
| Cost tracking dashboards (per-model, per-agent, per-tenant) | €4,583 | €1,528 | €1,528 | included | €1,528 |
| Total wrapper engineering (amortized /yr) | €174,584 one-time | €58,196 /yr | €58,196 | €0 (included) | €58,196 |
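The amortization in this table is a straight division by three, rounded per row. A short Python sketch that reproduces the totals from the one-time build costs (the dictionary keys and the helper name are illustrative, the euro figures are the table's):

```python
# One-time build costs in EUR, copied from the table above.
WRAPPER_BUILD_EUR = {
    "connector_layer": 18_333,
    "live_ops_ui": 55_000,
    "hitl_approvals": 27_500,
    "ab_testing": 9_167,
    "llm_judge": 9_167,
    "agent_testing": 9_167,
    "rag_knowledge_base": 9_167,
    "eu_ai_act_compliance": 14_167,
    "agent_management": 18_333,
    "cost_dashboards": 4_583,
}
AMORTIZATION_YEARS = 3


def amortized_per_year(build_cost: int, years: int = AMORTIZATION_YEARS) -> int:
    """Straight-line amortization, rounded to whole euros per row."""
    return round(build_cost / years)


total_build = sum(WRAPPER_BUILD_EUR.values())
amortized_rows = {k: amortized_per_year(v) for k, v in WRAPPER_BUILD_EUR.items()}
total_amortized = sum(amortized_rows.values())
print(f"One-time: EUR {total_build:,}  Amortized: EUR {total_amortized:,}/yr")
```

Because each row is rounded before summing, the amortized total is €58,196 rather than the €58,195 you would get by dividing €174,584 by three directly.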
Production-Wrapper Tooling (Monthly)
| Tool | EUR/mo | Self-hosted (net add) | Connic | Inngest |
|---|---|---|---|---|
| LLM observability (Langfuse / Helicone) | €200 | €0 (Datadog overlap) | included | €200 |
| Vector DB (Pinecone starter) | €70 | €70 | included | €70 |
| A/B testing service (LaunchDarkly) | €100 | €100 | included | €100 |
| Maintenance of homegrown parts (0.15 FTE). 0.15 × €110,000/yr ÷ 12 = €1,375/mo | €1,375 | €1,035 (after deducting ~€340 Sentry/Datadog overlap) | included | €1,375 |
| Net wrapper tooling (monthly) | — | €1,205 | €0 (included) | €1,745 |
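The net-of-overlap logic in this table is easy to get wrong, so here is a hedged Python sketch of it. The prices are the table's; the variable names are illustrative, and the ~€340 overlap is taken as given from the maintenance row:

```python
SENIOR_ENGINEER_ANNUAL_EUR = 110_000
MAINTENANCE_FTE = 0.15
OVERLAP_EUR = 340  # Sentry/Datadog overlap deducted on self-hosted

# 0.15 x 110,000 / 12, rounded to whole euros as in the table.
maintenance_monthly = round(MAINTENANCE_FTE * SENIOR_ENGINEER_ANNUAL_EUR / 12)

# Inngest pays list price for every tool.
inngest_monthly = 200 + 70 + 100 + maintenance_monthly

# Self-hosted already pays for Datadog, so LLM observability adds nothing,
# and the maintenance line is reduced by the overlap.
self_hosted_monthly = 0 + 70 + 100 + (maintenance_monthly - OVERLAP_EUR)

print(f"Self-hosted net add: EUR {self_hosted_monthly:,}/mo "
      f"(EUR {self_hosted_monthly * 12:,}/yr)")
print(f"Inngest: EUR {inngest_monthly:,}/mo (EUR {inngest_monthly * 12:,}/yr)")
```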
LLM API Spend (Passthrough — Equal in All Columns)
| Line item | Self-hosted | Connic | Inngest |
|---|---|---|---|
| LLM API (1,500M input + 225M output tokens/mo, Claude Sonnet 4.6). $3/1M input, $15/1M output as of 2026-05-16. Source: anthropic.com/pricing, accessed 2026-05-16. 1,500M × $3/1M = $4,500; 225M × $15/1M = $3,375; total $7,875/mo ≈ €7,300/mo. | €7,300 | €7,300 | €7,300 |
Annual Totals — Parity-Adjusted (3-Year Amortization Basis)
The comparable number. Platform and wrapper costs sit above LLM passthrough so the platform savings show clearly. LLM spend is identical everywhere: a cost you pay your model provider, not a platform difference.
| Component | Self-hosted | Connic Pro | Inngest + wrapper |
|---|---|---|---|
| Platform / infra recurring (annual). Monthly recurring × 12: self-hosted (€690 + €480 + €1,833) × 12, Connic €3,295 × 12, Inngest €70 base × 12 | €36,036 | €39,540 | €840 |
| Engineering build, amortized /3 (annual). Initial agent runtime build + compliance review + incident buffer + training | €8,683 | €600 | €600 |
| Wrapper build, amortized /3 (annual). Connectors, live ops UI, HITL, A/B, judges, testing, RAG, compliance, agent mgmt, cost tracking | €58,196 | included | €58,196 |
| Wrapper tooling (annual). LLM observability, vector DB, A/B service, maintenance, net of overlap | €14,460 | included | €20,940 |
| Annual subtotal — platform + wrapper | ~€117,400 | ~€40,140 | ~€80,580 |
| LLM API passthrough (annual, identical across all platforms). Claude Sonnet 4.6 at $3/1M input, $15/1M output; a passthrough cost, not a platform differentiator | €87,600 | €87,600 | €87,600 |
| Total annual (3-year amortized, incl. LLM) | ~€205,000 | ~€127,700 | ~€168,200 |
Platform + wrapper only (no LLM): Connic is ~66% cheaper than self-hosted and ~50% cheaper than Inngest with wrapper tooling, an average of ~58%. All-in with LLM passthrough: ~38% cheaper than self-hosted and ~24% cheaper than Inngest. The smaller all-in margin reflects the LLM cost, which is the same on every platform.
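To check these percentages or rerun them with your own line items, the annual table can be expressed as a small Python sketch. The component figures are copied from the table; the dictionary layout and `savings_pct` helper are illustrative:

```python
# Annual components in EUR, from the parity-adjusted totals table.
ANNUAL_EUR = {
    "self_hosted": {"platform": 36_036, "eng_build": 8_683,
                    "wrapper_build": 58_196, "wrapper_tooling": 14_460},
    "connic_pro": {"platform": 39_540, "eng_build": 600,
                   "wrapper_build": 0, "wrapper_tooling": 0},
    "inngest": {"platform": 840, "eng_build": 600,
                "wrapper_build": 58_196, "wrapper_tooling": 20_940},
}
LLM_PASSTHROUGH_EUR = 87_600  # identical for every approach


def savings_pct(cheaper: float, baseline: float) -> float:
    """Percentage by which `cheaper` undercuts `baseline`."""
    return (1 - cheaper / baseline) * 100


subtotals = {name: sum(parts.values()) for name, parts in ANNUAL_EUR.items()}
all_in = {name: value + LLM_PASSTHROUGH_EUR for name, value in subtotals.items()}

for label, totals in (("platform + wrapper", subtotals), ("incl. LLM", all_in)):
    vs_self = savings_pct(totals["connic_pro"], totals["self_hosted"])
    vs_inngest = savings_pct(totals["connic_pro"], totals["inngest"])
    print(f"{label}: {vs_self:.1f}% vs self-hosted, {vs_inngest:.1f}% vs Inngest")
```

The unrounded subtotals (€117,375, €40,140, €80,576) differ slightly from the rounded figures in the table, but the savings percentages come out the same to the nearest point.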
Sensitivity: How Volume Changes the Comparison
TCO shifts with volume. 50k runs/month is the base case. Here is what the extremes look like:
When Self-Hosting Actually Wins
Self-hosting wins in narrow circumstances, all of which require either that the wrapper is being built for other reasons or that some of its features can genuinely be foregone:
For every other profile — agent as a product feature, no dedicated platform team, EU compliance in scope — the parity-adjusted model favors a managed platform. The right question isn't “what does bare execution cost?” It's “what does it cost to run an agent with the wrapper my production use case actually requires?”
When Connic Isn't the Right Answer
Connic Pro at 50k runs/month is the scenario where the numbers work out cleanly. A few cases where they don't:
Connic Pricing at Production Scale
Connic prices on a subscription-as-credit model with uniform per-unit rates. The monthly subscription (€200 on Pro, €40 on Developer) is a usage credit consumed at €0.047/run and €0.00042/sec compute. You pay the higher of subscription or actual usage. Not both.
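A minimal sketch of that billing rule in Python, assuming the published rates above and nothing else. `connic_monthly_bill` is a hypothetical helper, not part of any Connic SDK, and it ignores details the post does not describe, such as billing granularity or Enterprise rates:

```python
RUN_FEE_EUR = 0.047                 # per run, all paid tiers
COMPUTE_FEE_EUR_PER_SEC = 0.00042   # per compute second, all paid tiers
SUBSCRIPTION_EUR = {"developer": 40.0, "pro": 200.0}


def connic_monthly_bill(runs: int, avg_run_seconds: float,
                        tier: str = "pro") -> float:
    """Subscription-as-credit: pay the higher of subscription or usage."""
    usage = (runs * RUN_FEE_EUR
             + runs * avg_run_seconds * COMPUTE_FEE_EUR_PER_SEC)
    return max(SUBSCRIPTION_EUR[tier], usage)


bill = connic_monthly_bill(50_000, 45)
print(f"50k runs at 45s on Pro: EUR {bill:,.2f}/month")
print(f"100 runs at 45s on Pro: EUR {connic_monthly_bill(100, 45):,.2f}/month")
```

At a 45-second average run, each run costs about €0.066 in usage, so the €200 Pro credit is used up at roughly 3,000 runs per month. Above that, the bill is pure usage.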
Per-unit rates are the same on every paid tier. The tier you pick unlocks feature limits:
| Feature | Developer (€40/mo credit) | Pro (€200/mo credit) |
|---|---|---|
| Parallel runs per agent | 3 | 10 |
| Run timeout | 10 minutes | 30 minutes |
| Active connectors | 10 | 40 |
| Environments | 3 | 5 |
| Data retention | 30 days | 90 days |
| Custom domains | No | Yes |
| Priority support | No | Yes |
| Per-unit run rate (all paid tiers) | €0.047 | €0.047 |
| Per-unit compute rate (all paid tiers) | €0.00042/sec | €0.00042/sec |
At 50k runs/month with a 45s average duration, Pro fits: the 10 parallel runs limit handles peak concurrency (~5 concurrent), and the 30-min timeout covers longer agentic workflows. Published rates apply up to roughly 100,000 runs/month. Above that, customers move to a custom Enterprise contract with negotiated per-unit rates. The published figures are the public ceiling at smaller scale, not what Enterprise customers pay. See the Connic pricing page for current rates, contact sales for an Enterprise quote above 100k, or read the self-hosting cost narrative behind these numbers.
Methodology and Limitations
All sources are public as of 2026-05-16: AWS EKS pricing, Datadog pricing, Inngest pricing, Connic pricing, Anthropic model pricing, Levels.fyi Germany senior engineer compensation.
Frequently Asked Questions
Why does the framework I write my agent in not change the TCO?
What does it actually cost to run an AI agent at 50,000 runs/month?
What counts as 'production infrastructure' for an AI agent?
Does the TCO model change if I'm not in the EU?
How does the cost change if I run higher or lower volumes?
The Bottom Line
The agent code is the cheap part. The wrapper (observability, judges, guardrails, HITL, A/B, cost tracking, compliance, signed connectors) is where the budget goes. Excluding LLM passthrough and on a 3-year amortized basis, self-hosting runs ~€117,400/yr at 50k runs/month because you build the wrapper. Inngest runs ~€80,600/yr because the platform stops at durable execution and you assemble the wrapper from third-party tools. Connic Pro runs ~€40,100/yr because the wrapper is the platform.
See the Composer SDK for what writing an agent looks like, or start a Connic project in minutes.