Managed vs Self-Hosted AI Agents: TCO at 50,000 Runs/Month

"We'll just host it ourselves" sounds cheaper than it ever turns out to be, because the agent code is rarely what makes production hard. Write the agent in LangChain, Google ADK, CrewAI, Mastra, or Connic's Composer SDK and the file is portable. What differs across platforms is everything that wraps it at runtime: observability with token attribution, LLM-as-judge evaluation, real-time guardrails, HITL approvals, A/B testing, per-tenant cost dashboards, EU AI Act audit logs, signed connectors with DLQs and retries. This post puts numbers on the cost of building that wrapper yourself versus buying it bundled at 50,000 agent runs per month — a realistic DACH mid-market scale.

The Agent Code Is the Same. The Wrapper Isn't.

A production AI agent isn't a script that calls an LLM. It's a script that calls an LLM, plus the observability stack that shows you what the script did, plus the judge that grades its outputs, plus the guardrail that blocks bad ones, plus the approvals queue when something consequential is about to happen, plus the A/B harness when you want to change a prompt, plus the cost dashboard that tells you if any of this is profitable, plus the audit trail when a regulator asks.

Frameworks (LangChain, ADK, CrewAI, Mastra) stop at the first "plus." They ship the script. Connic ships the wrapper as platform primitives, so the only code you write is the agent itself. The TCO model below holds the agent constant and prices what it costs to surround it. The narrative companion post on the hidden costs of self-hosting AI agents covers the categories; this one puts line-item numbers on them.

Scenario Assumptions

Every figure below derives from these assumptions, which model a DACH mid-market enterprise running a production agent for a customer-facing or internal-ops use case.

Assumption	Value	Rationale
Runs per month	50,000	Scenario anchor. Adjust for your volume.
LLM calls per run	3	Mixed workload: query routing, tool-call decision, response generation.
Avg input tokens / call	10,000	System prompt + retrieved context + conversation history. Realistic for RAG and multi-turn agentic workloads.
Avg output tokens / call	1,500	Tool-call JSON, reasoning traces, or structured response. Higher than simple completion tasks.
Total input tokens / mo	1.5B	50,000 runs × 3 calls × 10,000 tokens.
Total output tokens / mo	225M	50,000 runs × 3 calls × 1,500 tokens.
Model	Claude Sonnet 4.6	Current production-grade reasoning model. Input $3/1M, output $15/1M as of 2026-05-16. Source: anthropic.com/pricing, accessed 2026-05-16. LLM spend is identical in all three columns — model costs pass through to your provider account regardless of platform.
Avg run duration	45 seconds	Typical for a 3-LLM-call agent with tool latency.
Compute-hours / mo	625 hours	50,000 runs × 45s = 2,250,000 sec = 625 hours. Avg concurrent ~0.86, peak ~5.
Self-hosted infra	3-node m5.xlarge Kubernetes cluster, eu-central-1	AWS EKS with m5.xlarge nodes (4 vCPU / 16 GB). Adequate headroom for peak concurrency of ~5 at this run volume.
Engineering salary (DACH)	€110,000/yr fully loaded	Senior software engineer, Germany. Fully loaded including employer social contributions. Source: Levels.fyi Germany senior software engineer data, accessed 2026-05-16.
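
The derived rows in the table (token totals, compute-hours, average concurrency) can be reproduced from the raw inputs. The sketch below is illustrative only: the function and variable names are hypothetical, and the length of a month is an assumption the table does not state.

```python
# Scenario inputs copied from the assumptions table above.
RUNS_PER_MONTH = 50_000
CALLS_PER_RUN = 3
INPUT_TOKENS_PER_CALL = 10_000
OUTPUT_TOKENS_PER_CALL = 1_500
RUN_SECONDS = 45

# Assumption (not stated in the table): an average month of 365 / 12 days,
# used only to turn total busy seconds into average concurrency.
SECONDS_PER_MONTH = 365 / 12 * 24 * 3600


def derive_volumes(runs=RUNS_PER_MONTH, run_seconds=RUN_SECONDS):
    """Return the monthly token and compute volumes implied by the inputs."""
    calls = runs * CALLS_PER_RUN
    compute_seconds = runs * run_seconds
    return {
        "input_tokens": calls * INPUT_TOKENS_PER_CALL,
        "output_tokens": calls * OUTPUT_TOKENS_PER_CALL,
        "compute_seconds": compute_seconds,
        "compute_hours": compute_seconds / 3600,
        "avg_concurrency": compute_seconds / SECONDS_PER_MONTH,
    }


volumes = derive_volumes()
for name, value in volumes.items():
    print(f"{name}: {value:,.2f}")
```

Calling `derive_volumes(runs=80_000)` or changing `run_seconds` is a quick way to re-derive the same rows for your own workload before reading the cost tables.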

The Three Approaches

Self-Hosted

Your agent (in any framework: LangChain, ADK, CrewAI, plain Python) on your own Kubernetes cluster (AWS EKS eu-central-1, 3 × m5.xlarge), your own LLM API keys, your own observability stack, your own connector layer, your own on-call. Maximum control and maximum ops burden: every line of the wrapper is yours to write and maintain.

Connic Pro

Agent-native managed runtime with EU-only hosting. Write the agent in YAML + Python (Composer SDK), Git-push to deploy. €200/mo subscription consumed as credit at uniform per-unit rates (€0.047/run + €0.00042/sec compute). Observability, judges, guardrails, HITL approvals, A/B testing, cost dashboards, RAG, pre-built connectors plus Bridge, and EU AI Act tooling come with the platform.

Inngest — Per-Execution

Inngest + AgentKit: durable workflow platform with an agent layer. Pay per step execution; Pro ($75/mo) includes 1M step executions. Retry and durability guarantees are strong. Observability, connectors, A/B, HITL, judges, guardrails, RAG, and compliance tooling are sourced separately; those are the parity costs in the line-item breakdown below.

LLM API costs are equal in all three columns

Every platform in this comparison supports BYOK: your model API keys, your provider bills. We include LLM spend in the totals so the overall picture is complete, but it is not a platform differentiator. Verify current token pricing with your model provider before using these estimates for budget planning.

Estimated LLM API Spend at This Volume

At 1,500M input + 225M output tokens per month on Claude Sonnet 4.6 ($3/1M input, $15/1M output as of 2026-05-16; source: anthropic.com/pricing), LLM API spend lands at roughly €7,300/month (~€87,600/year). The arithmetic: input is 1,500M × $3/1M = $4,500; output is 225M × $15/1M = $3,375; together $7,875/mo, which at 1.08 USD per EUR is ≈ €7,300/mo. The figure is identical in all three columns: a passthrough cost you pay your model provider regardless of platform.
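
As a rough check on the figure above, the spend can be recomputed in a few lines. This is a sketch under the article's stated prices; the function name is hypothetical, and the 1.08 USD-per-EUR rate is simply the divisor used in the arithmetic, not a live exchange rate.

```python
INPUT_USD_PER_MTOK = 3.0    # Claude Sonnet 4.6 input price quoted above
OUTPUT_USD_PER_MTOK = 15.0  # Claude Sonnet 4.6 output price quoted above
USD_PER_EUR = 1.08          # conversion rate implied by the article's arithmetic


def llm_spend_eur(input_tokens, output_tokens, usd_per_eur=USD_PER_EUR):
    """Monthly LLM API spend in EUR for the given token volumes."""
    usd = (
        input_tokens / 1e6 * INPUT_USD_PER_MTOK
        + output_tokens / 1e6 * OUTPUT_USD_PER_MTOK
    )
    return usd / usd_per_eur


monthly_eur = llm_spend_eur(1_500_000_000, 225_000_000)
print(f"monthly: EUR {monthly_eur:,.0f}")
print(f"annual:  EUR {monthly_eur * 12:,.0f}")
```

The unrounded monthly result sits slightly under €7,300; the article rounds the monthly figure first and multiplies by 12, which is where €87,600/year comes from.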

What “Feature Parity” Means Here

The breakdown below bakes parity costs in from the start. Each column carries the engineering and tooling needed to hit the same feature surface — namely the capabilities Connic ships as platform primitives and every other approach makes you build or buy:

Auto-scaling, no cluster tuning
First-party connectors (cron, email, kafka, mcp, postgres, s3, sqs, stripe, and more) with signature verification, DLQs, and retries, plus the Bridge connector for private networks without inbound ports
Observability — traces, token tracking, cost-per-run, anomaly detection, custom dashboards
A/B testing — traffic splitting with statistical significance
Cost tracking — per-model, per-agent, per-tenant dashboards
EU AI Act compliance — audit logs, risk classification, exportable trails
Agent management — versioning, environments, deployment pipeline, rollback
Built-in database for agent state with six pre-built CRUD tools
Built-in RAG knowledge base, ingestion and retrieval
Built-in LLM-as-judge evaluation framework
HITL approvals with review queue and RBAC
Integrated agent testing framework, regression harness, fixtures
Live agent ops UI: observe and stop running agents in real time. This one capability is roughly six months of senior engineering on its own — real-time streaming, permission controls, step-through debugging.

Parity engineering is amortized over three years; parity tooling is monthly. The self-hosted column avoids double-counting Datadog and Sentry by adding only the net spend above what's already in bare observability. Inngest gets the full parity table because it ships none of these. Connic's parity adder is zero — everything is in the box.

Line-Item Cost Breakdown (Parity-Adjusted)

All figures in EUR. Sources linked inline. Parity engineering uses 3-year amortization.

Platform and Infrastructure (Monthly)

Line item	Self-hosted	Connic	Inngest
EKS cluster (3 × m5.xlarge + control plane, eu-central-1). Includes EBS, NAT gateway, load balancer. Source: aws.amazon.com/eks/pricing, accessed 2026-05-16	€500	—	—
Data layer (RDS Postgres + S3 + vector store). AWS RDS db.t3.medium multi-AZ ~€80, S3 + vector store ~€50, eu-central-1 estimates	€130	~€1	~€50
Networking (NAT + load balancer)	€60	—	—
Platform subscription / base	—	€200	~€70
Connic run fees (50,000 × €0.047/run). Uniform per-unit rate across all paid tiers. Source: connic.co/pricing	—	€2,350	—
Connic compute (50,000 × 45s × €0.00042/sec). 2,250,000 total seconds × €0.00042 = €945. Uniform per-unit rate across all paid tiers.	—	€945	—
Inngest execution overage. 50k runs/month is well within Pro's 1M included step executions. Source: inngest.com/pricing, accessed 2026-05-16	—	—	€0
Platform / infra subtotal	€690	€3,295	€120

At 50k runs/month, usage (€2,350 run fees + €945 compute = €3,295) far exceeds the €200 Pro credit, so the subscription is fully consumed as credit and billing tracks actual usage; the €200 is not added on top of the Connic subtotal. The tier unlocks feature limits (10 parallel runs per agent, 30-min timeout, 5 environments, 90-day retention, 40 active connectors, custom domains, priority support), not discounted rates.
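
The billing logic can be sketched as follows, assuming the "higher of subscription or actual usage" rule described in the pricing section of this post. Function names are hypothetical; the rates are the published per-unit rates quoted in the table.

```python
RUN_RATE_EUR = 0.047                # per run
COMPUTE_RATE_EUR_PER_SEC = 0.00042  # per second of run time


def connic_usage_eur(runs, avg_run_seconds):
    """Metered usage: run fees plus compute seconds."""
    run_fees = runs * RUN_RATE_EUR
    compute = runs * avg_run_seconds * COMPUTE_RATE_EUR_PER_SEC
    return run_fees + compute


def connic_monthly_bill_eur(runs, avg_run_seconds, subscription_eur):
    """The subscription is a credit: you pay whichever is higher, not both."""
    return max(subscription_eur, connic_usage_eur(runs, avg_run_seconds))


base_case = connic_monthly_bill_eur(50_000, 45, subscription_eur=200)
print(f"50k runs at 45s on Pro: EUR {base_case:,.2f}")
```

At low volume the same function returns the subscription itself: 1,000 runs at 45s is under €200 of usage, so the Pro bill stays at the €200 credit.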

Observability Tooling (Monthly)

Line item	Self-hosted	Connic	Inngest
Infrastructure metrics + APM (Datadog). Source: datadoghq.com/pricing, accessed 2026-05-16	€400	—	—
Error tracking (Sentry). Required on self-hosted; not needed on Connic (built-in) or Inngest	€80	—	—
Agent-specific traces, token tracking, dashboards	Build separately	Included	Basic run logs only
Observability subtotal	€480	€0	€0

Engineering Time and Operations (Monthly)

Most cost models undercount this line. Self-hosted engineering time doesn't sit still; it scales with system complexity, incident frequency, and compliance scope. The figures below assume a fully loaded senior engineer at €110,000/year (€9,167/month) — DACH senior comp including employer social contributions. Source: Levels.fyi Germany senior software engineer data, accessed 2026-05-16.

Line item	Self-hosted	Connic	Inngest
Ongoing ops (20% of senior eng FTE). Patching, scaling, upgrades, security hardening. 0.20 × €110,000/yr ÷ 12 = €1,833/mo	€1,833	—	—
Engineering subtotal	€1,833	€0	€0

Production-Wrapper Engineering (Amortized /3 yr)

Production agents need signed connectors, a live ops UI, HITL approvals, A/B testing, RAG, judges, compliance tooling, and agent management. Connic ships all of those. Everywhere else, you build or buy. The one-time build cost (€174,584 across ten capabilities) is amortized over three years, which adds €58,196/yr to both the self-hosted and the Inngest columns. The Datadog and Sentry overlap already counted under observability is deducted from the wrapper tooling figures, not from this engineering total.

Wrapper item	One-time build	Amortized /yr (÷3)	Self-hosted	Connic	Inngest
Connector layer (cron / email / kafka / mcp / postgres CDC / s3 / sqs / stripe / telegram / webhook / websocket — signature verification, DLQs, retries)	€18,333	€6,111	€6,111	included	€6,111
Live agent ops UI (real-time view + stop button + step-through). ~6 months of senior engineering	€55,000	€18,333	€18,333	included	€18,333
HITL approval workflow + review queue + RBAC	€27,500	€9,167	€9,167	included	€9,167
A/B testing infrastructure (traffic splitting + analytics)	€9,167	€3,056	€3,056	included	€3,056
LLM-as-judge evaluation framework	€9,167	€3,056	€3,056	included	€3,056
Agent testing framework (regression harness, fixtures)	€9,167	€3,056	€3,056	included	€3,056
RAG / knowledge base (ingestion + retrieval)	€9,167	€3,056	€3,056	included	€3,056
EU AI Act compliance tooling (audit logs, risk classification, exportable trails) + legal review	€14,167	€4,722	€4,722	included	€4,722
Agent management (versioning, environments, deploy pipeline)	€18,333	€6,111	€6,111	included	€6,111
Cost tracking dashboards (per-model, per-agent, per-tenant)	€4,583	€1,528	€1,528	included	€1,528
Total wrapper engineering (amortized /yr)	€174,584 one-time	€58,196 /yr	€58,196	€0 (included)	€58,196
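
The totals row can be re-derived from the one-time build figures. The sketch below assumes straight-line amortization (one-time cost divided by the horizon in years), which is how the "÷3" column reads; the dictionary keys are shortened labels, not official names.

```python
# One-time build estimates (EUR) copied from the table above.
WRAPPER_BUILD_EUR = {
    "connector layer": 18_333,
    "live agent ops UI": 55_000,
    "HITL approvals": 27_500,
    "A/B testing": 9_167,
    "LLM-as-judge": 9_167,
    "agent testing": 9_167,
    "RAG / knowledge base": 9_167,
    "EU AI Act tooling": 14_167,
    "agent management": 18_333,
    "cost dashboards": 4_583,
}


def amortize(one_time_eur, years=3):
    """Straight-line amortization: spread a one-time cost evenly over `years`."""
    return one_time_eur / years


one_time_total = sum(WRAPPER_BUILD_EUR.values())
# The table rounds each row to whole euros before summing.
annual_total = sum(round(amortize(cost)) for cost in WRAPPER_BUILD_EUR.values())

print(f"one-time build: EUR {one_time_total:,}")
print(f"amortized /yr:  EUR {annual_total:,}")
```

Dividing the €174,584 total by three directly gives a figure about one euro lower than the table; the difference is per-row rounding, not a modelling choice.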

Production-Wrapper Tooling (Monthly)

Tool	EUR/mo	Self-hosted (net add)	Connic	Inngest
LLM observability (Langfuse / Helicone)	€200	€0 (Datadog overlap)	included	€200
Vector DB (Pinecone starter)	€70	€70	included	€70
A/B testing service (LaunchDarkly)	€100	€100	included	€100
Maintenance of homegrown parts (0.15 FTE). 0.15 × €110,000/yr ÷ 12 = €1,375/mo	€1,375	€1,035 (after deducting ~€340 Sentry/Datadog overlap)	included	€1,375
Net wrapper tooling (monthly)	—	€1,205	€0 (included)	€1,745
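
The net-add logic in the table can be written out explicitly. This is a sketch with hypothetical names: each tool carries its list price plus the portion the table treats as already covered by the self-hosted Datadog and Sentry spend.

```python
# tool -> (list price EUR/mo, overlap with existing self-hosted observability)
TOOLING = {
    "LLM observability": (200, 200),
    "vector DB": (70, 0),
    "A/B testing service": (100, 0),
    "maintenance (0.15 FTE)": (1_375, 340),
}


def monthly_tooling_eur(deduct_overlap):
    """Sum tooling cost, optionally netting out spend counted elsewhere."""
    total = 0
    for price, overlap in TOOLING.values():
        total += price - (overlap if deduct_overlap else 0)
    return total


self_hosted_net = monthly_tooling_eur(deduct_overlap=True)
inngest_full = monthly_tooling_eur(deduct_overlap=False)

print(f"self-hosted net add: EUR {self_hosted_net:,}/mo, EUR {self_hosted_net * 12:,}/yr")
print(f"Inngest full parity: EUR {inngest_full:,}/mo, EUR {inngest_full * 12:,}/yr")
```

If your team does not already pay for Datadog or Sentry, set the overlap values to zero and the self-hosted column converges on the Inngest figure.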

LLM API Spend (Passthrough — Equal in All Columns)

Line item	Self-hosted	Connic	Inngest
LLM API (1,500M input + 225M output tokens/mo, Claude Sonnet 4.6). $3/1M input, $15/1M output as of 2026-05-16. Source: anthropic.com/pricing, accessed 2026-05-16. Input 1,500M × $3/1M = $4,500; output 225M × $15/1M = $3,375; total $7,875/mo ≈ €7,300/mo.	€7,300	€7,300	€7,300

Annual Totals — Parity-Adjusted (3-Year Amortization Basis)

The comparable number. Platform and wrapper costs sit above LLM passthrough so the platform savings show clearly. LLM spend is identical everywhere: a cost you pay your model provider, not a platform difference.

Component	Self-hosted	Connic Pro	Inngest + wrapper
Platform / infra recurring (annual). Monthly recurring × 12; for self-hosted this is €690 infra + €480 observability + €1,833 ops = €3,003/mo	€36,036	€39,540	€840
Engineering build, amortized /3 (annual). Initial agent runtime build + compliance review + incident buffer + training	€8,683	€600	€600
Wrapper build, amortized /3 (annual). Connectors, live ops UI, HITL, A/B, judges, testing, RAG, compliance, agent mgmt, cost tracking	€58,196	included	€58,196
Wrapper tooling (annual). LLM observability, vector DB, A/B service, maintenance, net of overlap	€14,460	included	€20,940
Annual subtotal — platform + wrapper	~€117,400	~€40,140	~€80,580
LLM API passthrough (annual, identical across all platforms). Claude Sonnet 4.6 at $3/1M input, $15/1M output; a passthrough cost, not a platform differentiator	€87,600	€87,600	€87,600
Total annual (3-year amortized, incl. LLM)	~€205,000	~€127,700	~€168,200

Platform + wrapper only (no LLM): Connic is ~66% cheaper than self-hosted and ~50% cheaper than Inngest with wrapper tooling, or ~58% cheaper on average across the two. All-in with LLM passthrough: ~38% cheaper than self-hosted and ~24% cheaper than Inngest. The all-in margin is smaller because LLM cost is the same on every platform.
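
The subtotals and percentage savings follow mechanically from the table rows. The sketch below sums the four annual components per column and compares Connic against each alternative; names are hypothetical and the inputs are copied from the table as published.

```python
LLM_PASSTHROUGH_EUR = 87_600  # identical in every column

# Annual components per column, in table order:
# recurring platform/infra, engineering build, wrapper build, wrapper tooling.
COLUMNS = {
    "Self-hosted": (36_036, 8_683, 58_196, 14_460),
    "Connic Pro": (39_540, 600, 0, 0),
    "Inngest + wrapper": (840, 600, 58_196, 20_940),
}


def saving_pct(cheaper, pricier):
    """Percentage by which `cheaper` undercuts `pricier`."""
    return 100 * (1 - cheaper / pricier)


subtotals = {name: sum(parts) for name, parts in COLUMNS.items()}
totals = {name: sub + LLM_PASSTHROUGH_EUR for name, sub in subtotals.items()}

for rival in ("Self-hosted", "Inngest + wrapper"):
    platform_only = saving_pct(subtotals["Connic Pro"], subtotals[rival])
    all_in = saving_pct(totals["Connic Pro"], totals[rival])
    print(f"Connic vs {rival}: {platform_only:.0f}% platform + wrapper, {all_in:.0f}% all-in")
```

Replacing any tuple entry with your own estimate shows how sensitive the headline percentages are to a single line item, which is more useful than the rounded figures alone.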

Sensitivity: How Volume Changes the Comparison

TCO shifts with volume. 50k runs/month is the base case. Here is what the extremes look like:

At 5,000 runs/month (low end)

The right comparison is the Connic Developer plan (€40/mo). Self-hosted fixed costs (~€3,300+/mo) don't shrink with volume; the wrapper still has to be built. At low volume Connic wins by the widest margin on platform cost.

At ~80,000 runs/month (upper published rate)

Connic platform billing scales to ~€5,272/mo (run + compute). Wrapper savings stay fixed regardless of volume, so Connic still leads parity-adjusted. Past ~100k runs/month, customers move to a negotiated Enterprise contract.

Engineering salary lever

The 20% ops allocation on self-hosted scales with salary. At €140k/yr fully loaded (senior Berlin with equity), the ops line moves from €1,833 to €2,333/mo. Every upward salary revision widens the managed advantage.

Run duration lever

Connic compute scales linearly with run time at €0.00042/sec. Run duration at 90s instead of 45s doubles the compute line from €945 to €1,890/mo at 50k runs. Self-hosted compute also grows (larger cluster or more contention). Model your p95 duration, not your average.
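
The volume, salary, and run-duration levers above are all linear, so they can be explored with three small functions. This is a sketch with hypothetical names, using the per-unit rates and the 20% ops allocation quoted in this post.

```python
RUN_RATE_EUR = 0.047
COMPUTE_RATE_EUR_PER_SEC = 0.00042


def compute_line_eur(runs, avg_run_seconds):
    """Connic compute charge: linear in total run seconds."""
    return runs * avg_run_seconds * COMPUTE_RATE_EUR_PER_SEC


def connic_usage_eur(runs, avg_run_seconds):
    """Connic run fees plus compute at published rates."""
    return runs * RUN_RATE_EUR + compute_line_eur(runs, avg_run_seconds)


def ops_line_eur(salary_eur_per_year, fte_fraction=0.20):
    """Self-hosted ongoing ops: a fraction of one fully loaded salary, per month."""
    return salary_eur_per_year * fte_fraction / 12


print(f"80k runs at 45s:     EUR {connic_usage_eur(80_000, 45):,.0f}/mo")
print(f"compute at 45s, 90s: EUR {compute_line_eur(50_000, 45):,.0f} -> {compute_line_eur(50_000, 90):,.0f}/mo")
print(f"ops at 110k, 140k:   EUR {ops_line_eur(110_000):,.0f} -> {ops_line_eur(140_000):,.0f}/mo")
```

Feeding a p95 duration into `compute_line_eur` instead of the mean gives a conservative upper bound for the compute line, though actual billing depends on the full duration distribution.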

When Self-Hosting Actually Wins

Self-hosting wins in narrow circumstances, all of which require either that the wrapper is being built for other reasons or that some of its features can genuinely be forgone:

A platform team is already building the wrapper for other products

If a 3–5 engineer platform team treats agent infrastructure as a primary responsibility and is already building a live ops UI, connector layer, and compliance tooling for other systems, extending those to one more agent workload is cheaper than the full wrapper table. This rarely describes a team shipping its first production agent.

Regulators require on-premise execution

Some government and financial-services environments require workloads on hardware you own, regardless of data-hosting location. Uncommon, but real. For everyone else, EU-hosted platforms like Connic cover GDPR and EU AI Act without on-prem.

Surplus engineering capacity that would otherwise sit idle

If your team has capacity that can't be redirected to product work, the opportunity cost of building the wrapper is lower than this model shows. Note that this is a statement about your team, not about platform economics.

For every other profile — agent as a product feature, no dedicated platform team, EU compliance in scope — the parity-adjusted model favors a managed platform. The right question isn't “what does bare execution cost?” It's “what does it cost to run an agent with the wrapper my production use case actually requires?”

When Connic Isn't the Right Answer

Connic Pro at 50k runs/month is the scenario where the numbers work out cleanly. A few cases where they don't:

Tiny volume (under ~1k runs/month)

Below 1k runs/month and €20/month of usage, even the €40/mo Developer credit exceeds what the workload justifies. The platform comparison becomes moot. Between 1k and 5k runs/month, Developer plan feature limits (3 parallel runs per agent, 10-minute timeout, 10 active connectors) become the relevant comparison.

TypeScript-only stack with deep framework lock-in

Composer SDK is Python-first. If your agent is non-trivially coupled to a TypeScript-only framework (Mastra, Trigger.dev's TS path), the migration cost can outweigh the wrapper savings. Connic's value is highest when you pick a production stack before framework lock-in is deep.

Apache 2.0 self-hostability is a hard requirement

If procurement or compliance requires Apache 2.0-licensed, self-hostable software with no managed dependency, Connic doesn't qualify today. Self-host instead.

Strict on-premise requirements

If workloads have to run on hardware you own and control, no managed cloud (Connic included) meets the bar. Self-host instead.

Connic Pricing at Production Scale

Connic prices on a subscription-as-credit model with uniform per-unit rates. The monthly subscription (€200 on Pro, €40 on Developer) is a usage credit consumed at €0.047/run and €0.00042/sec compute. You pay the higher of subscription or actual usage. Not both.

Per-unit rates are the same on every paid tier. The tier you pick unlocks feature limits:

Feature	Developer (€40/mo credit)	Pro (€200/mo credit)
Parallel runs per agent	3	10
Run timeout	10 minutes	30 minutes
Active connectors	10	40
Environments	3	5
Data retention	30 days	90 days
Custom domains	No	Yes
Priority support	No	Yes
Per-unit run rate (all paid tiers)	€0.047	€0.047
Per-unit compute rate (all paid tiers)	€0.00042/sec	€0.00042/sec

At 50k runs/month with a 45s average duration, Pro fits: the 10 parallel runs limit handles peak concurrency (~5 concurrent), and the 30-min timeout covers longer agentic workflows. Published rates apply up to roughly 100,000 runs/month. Above that, customers move to a custom Enterprise contract with negotiated per-unit rates. The published figures are the public ceiling at smaller scale, not what Enterprise customers pay. See the Connic pricing page for current rates, contact sales for an Enterprise quote above 100k, or read the self-hosting cost narrative behind these numbers.
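
The tier-fit reasoning above can be expressed as a simple lookup against the feature table. This is an illustrative helper, not part of any Connic SDK: the class and function names are hypothetical, and it assumes a single agent, since the parallel-run limit is per agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    credit_eur: int
    parallel_runs: int
    timeout_minutes: int
    active_connectors: int
    environments: int


# Limits copied from the feature table above, ordered cheapest first.
TIERS = (
    Tier("Developer", 40, 3, 10, 10, 3),
    Tier("Pro", 200, 10, 30, 40, 5),
)


def smallest_fitting_tier(peak_concurrency, max_run_minutes, connectors=1, environments=1):
    """Return the cheapest published tier whose limits cover the workload, else None."""
    for tier in TIERS:
        if (
            peak_concurrency <= tier.parallel_runs
            and max_run_minutes <= tier.timeout_minutes
            and connectors <= tier.active_connectors
            and environments <= tier.environments
        ):
            return tier
    return None  # outside published limits: an Enterprise conversation


fit = smallest_fitting_tier(peak_concurrency=5, max_run_minutes=0.75)
print(fit.name if fit else "Enterprise")
```

A `None` result only means the workload exceeds the published limits in the table; it says nothing about what an Enterprise contract would include.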

Methodology and Limitations

These are estimates, not quotes

Cloud pricing moves. Engineering salaries swing by seniority, location, and company. Incident rates depend on system complexity. Treat this as an order-of-magnitude model, not a budget line item, and check every figure against the current pricing page before you commit.

LLM pricing is volatile

Claude Sonnet 4.6 sat at $3/1M input, $15/1M output on 2026-05-16 (source: anthropic.com/pricing). Model prices change often; verify against your provider before you budget.

Engineering costs are opportunity costs

These figures are salary allocation, not new headcount. If the DevOps work would otherwise sit idle, the marginal cost runs lower. If it displaces product engineering, the opportunity cost runs higher.

Connic Enterprise has negotiated rates

Above ~100k runs/month, Connic moves to a custom Enterprise contract with per-unit rates below the published ceiling. This post models the published Developer and Pro rates: the public maximum, not what Enterprise customers actually pay.

Wrapper amortization horizon

One-time engineering is amortized over three years, which matches how often these systems get rewritten or replaced. The annual adder is the one-time cost divided by the horizon, so it moves inversely with it: shorten the horizon to two years and the wrapper adder rises by half, from ~€58,200 to ~€87,300 per year. Lengthen it and the adder shrinks.
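
To see how much the horizon matters, the annual adder can be swept across a few candidate horizons. A minimal sketch, assuming straight-line amortization of the €174,584 one-time build estimate; the function name is hypothetical.

```python
WRAPPER_BUILD_TOTAL_EUR = 174_584  # one-time build estimate from this post


def annual_adder_eur(horizon_years, one_time_eur=WRAPPER_BUILD_TOTAL_EUR):
    """Straight-line amortization: one-time cost spread evenly over the horizon."""
    if horizon_years <= 0:
        raise ValueError("horizon_years must be positive")
    return one_time_eur / horizon_years


for years in (2, 3, 5):
    print(f"{years}-year horizon: EUR {annual_adder_eur(years):,.0f}/yr")
```

The three-year result differs from the table's €58,196 by about a euro because the table rounds each line item before summing.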

All sources are public as of 2026-05-16: AWS EKS pricing, Datadog pricing, Inngest pricing, Connic pricing, Anthropic model pricing, Levels.fyi Germany senior engineer compensation.

Frequently Asked Questions

Why does the framework I write my agent in not change the TCO?

The agent file you write in LangChain, Google ADK, CrewAI, Mastra, or Connic's Composer SDK is a small fraction of a production system. The real cost is the wrapper: observability with token attribution, LLM-as-judge evaluation, real-time guardrails, HITL approvals, A/B testing, cost dashboards, audit trails, signed connectors. Frameworks stop at the agent. Platforms differ in how much of the wrapper they ship. Connic ships all of it. Self-hosting and durable-workflow platforms like Inngest leave you to build it.

What does it actually cost to run an AI agent at 50,000 runs/month?

On a 3-year amortization basis, annual platform plus wrapper cost (excluding LLM passthrough, which is identical everywhere) is ~€117,400 self-hosted; ~€40,140 on Connic Pro, with connectors, observability, A/B, judges, HITL, RAG, live ops UI, and EU AI Act tooling included; and ~€80,580 on Inngest with parity tooling. LLM API spend on Claude Sonnet 4.6 (1.5B input + 225M output tokens per month) runs ~€87,600/yr in all three columns. On platform plus wrapper, Connic is ~66% cheaper than self-hosted and ~50% cheaper than Inngest; all-in with LLM spend, it is ~38% and ~24% cheaper respectively.

What counts as 'production infrastructure' for an AI agent?

Execution traces with per-agent and per-tenant token attribution, an LLM-as-judge harness, real-time guardrails (prompt injection, PII redaction, topic enforcement), HITL approvals with RBAC, A/B testing with statistical significance, per-tenant and per-agent cost dashboards, and an exportable audit trail keyed to EU AI Act deployer obligations. That's the minimum. Connic ships it on every paid plan as platform primitives, not third-party stitching.

Does the TCO model change if I'm not in the EU?

The platform and wrapper engineering math doesn't care about geography. What changes is the EU AI Act line: outside the EU, that specific obligation may not apply. Inside the EU it's non-negotiable, and the audit-trail, risk-classification, and human-oversight requirements add fixed cost on platforms that don't ship them. Connic is EU-hosted by default for residency. The model assumes DACH engineering salaries and AWS eu-central-1 for self-hosted infrastructure.

How does the cost change if I run higher or lower volumes?

Self-hosted fixed costs (cluster, ops FTE, wrapper engineering, monitoring) stay roughly flat from 5k to 80k runs/month: the cluster is sized for peak and the ops burden doesn't change. Connic scales linearly at uniform per-unit rates; above 100k runs/month, customers move to a negotiated Enterprise contract. Inngest scales with step executions and starts adding real cost past 1M/month. At low volume, the Connic Developer plan (€40/mo credit) is the right comparison and the wrapper advantage is largest; at high volume, the wrapper engineering amortizes over more runs and per-run economics take over.

The Bottom Line

The agent code is the cheap part. The wrapper (observability, judges, guardrails, HITL, A/B, cost tracking, compliance, signed connectors) is where the budget goes. On a 3-year amortized basis and before LLM passthrough, self-hosting runs ~€117,400/yr at 50k runs because you build the wrapper. Inngest runs ~€80,580/yr because the platform stops at durable execution and you assemble the wrapper from third-party tools. Connic Pro runs ~€40,140/yr because the wrapper is the platform.

See the Composer SDK for what writing an agent looks like, survey the wider 2026 runtime landscape, or start a Connic project in minutes.