
What We Shipped in May 2026

An agent testing framework, deploy gates with pull-request testing, deeper tracing for triggered and child runs, usage and budget dashboards, custom domains for connectors, and per-agent reasoning effort with cascading defaults.

June 1, 2026 · 7 min read

May was about confidence before you ship. A new testing framework lets you write and run tests for your agents, deploy gates hold a deployment until those tests pass, and pull-request testing catches regressions before they merge. We also went deeper on visibility, with tracing for triggered and child runs and reporting on token usage and budgets. Finally, we shipped custom domains for connectors and added per-agent reasoning effort with cascading project defaults.

The Agent Testing Framework

Agents are now testable like any other code. Define cases in a YAML file under tests/, give each one an input payload and a set of assertions, and Connic runs them against your agents and records the results.

tests/order-manager.yaml
tests:
  - name: refunds_small_orders
    payload: '{"message": "Refund order A123, $40"}'
    expected_result: output.status == "refunded"
    expected_tool_calls:
      - process_refund

  - name: escalates_large_refunds
    payload: '{"message": "Refund order B456, $500"}'
    expected_result: output.status == "pending_approval"
    expected_child_agents:        # assert on a triggered child agent
      manager-approval:
        expected_triggered: 1

  • Run them anywhere: Kick off ad-hoc test runs from the dashboard, or test interactively while you build with connic dev
  • Assert on triggered agents: Assertions can target the child agents a run triggers, not just the top-level response
  • Coverage reports: See which agents and tools your tests actually exercise
  • Run a subset: Use connic test --filter to run only the cases that match
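
To make the assertion fields above concrete, here is a minimal Python sketch of how a case like escalates_large_refunds could be checked against a recorded run. It is not Connic's implementation: the shape of the run record, the check_case helper, and the narrow expression form it accepts (output.<field> == "<string>") are all assumptions made for illustration.

```python
import re

# Hypothetical recorded run; the real result format is not described in this post.
RUN = {
    "output": {"status": "pending_approval"},
    "tool_calls": [],
    "child_agents": ["manager-approval"],  # one entry per triggered child run
}

# Mirrors the escalates_large_refunds case from tests/order-manager.yaml.
CASE = {
    "name": "escalates_large_refunds",
    "expected_result": 'output.status == "pending_approval"',
    "expected_child_agents": {"manager-approval": {"expected_triggered": 1}},
}

# This sketch only understands: output.<field> == "<string>"
_EQUALS = re.compile(r'output\.(\w+)\s*==\s*"([^"]*)"')


def check_case(case, run):
    """Return a list of failure messages; an empty list means the case passed."""
    failures = []

    match = _EQUALS.fullmatch(case["expected_result"])
    if match is None:
        raise ValueError(f"unsupported expression: {case['expected_result']}")
    field, want = match.groups()
    got = run["output"].get(field)
    if got != want:
        failures.append(f"output.{field}: expected {want!r}, got {got!r}")

    # Every listed tool must appear among the tools the run called.
    for tool in case.get("expected_tool_calls", []):
        if tool not in run["tool_calls"]:
            failures.append(f"expected tool call {tool!r} did not happen")

    # Each child agent must have been triggered exactly the expected number of times.
    for agent, spec in case.get("expected_child_agents", {}).items():
        triggered = run["child_agents"].count(agent)
        if triggered != spec["expected_triggered"]:
            failures.append(
                f"{agent}: expected {spec['expected_triggered']} trigger(s), got {triggered}"
            )

    return failures
```

Calling check_case(CASE, RUN) returns an empty list here. A run that refunded the order directly, without triggering manager-approval, would produce two failure messages.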

For the full walkthrough, read A Testing Framework for AI Agents.

Deploy Gates and Pull-Request Testing

Tests are only useful if something acts on them. With deploy gates, a deployment is held until its test suite passes, configured per environment so staging and production can be gated independently.

Connic also runs your tests on every pull request and reports the result back as a status check, with a link straight through to the deployment and its run traces.

Pull request checks
connic/pr-tests   All tests passed   Details

24 passed, 0 failed.

  • Per-environment control: Turn gating on or off per environment, and point its tests at a separate test environment when you want
  • PR status checks: A single connic/pr-tests check shows pass or fail on the pull request, linking through to the deployment behind it
  • Queued ahead of execution: Deploy-gate runs are queued before the deployment starts, so a bad build never reaches your environment
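
The gating rule described above can be sketched in a few lines of Python. This is only an illustration of the decision, assuming a gate is a per-environment switch and that a suite passes when it reports zero failures; SuiteResult and may_deploy are hypothetical names, not part of Connic's API.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SuiteResult:
    passed: int
    failed: int


def may_deploy(environment: str, gated: Set[str], result: Optional[SuiteResult]) -> bool:
    """Decide whether a deployment to `environment` may start."""
    # Ungated environments deploy regardless of test results.
    if environment not in gated:
        return True
    # Gated environments wait for a finished suite with no failures.
    return result is not None and result.failed == 0
```

With production gated and staging not, may_deploy("production", {"production"}, SuiteResult(24, 0)) allows the deployment, while a single failing test, or a suite that has not finished yet (None), holds it.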

Tracing Triggered and Child Runs

When one agent triggers another, you can now follow the whole chain. Open a child run in a drawer stacked over its parent, step back up the chain, and jump between a trigger and the run it started, all without losing your place in the run history.

  • Stacked drawers: Child runs open over their parent so you can drill down and back up in one view
  • Linked navigation: Every triggered run records the run that started it, with inline links in both directions
  • Trigger source: A new trigger source on every run shows at a glance whether it came from a connector, another agent, a manual run, or a deploy gate
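
The two-way links between runs amount to a parent pointer on each triggered run. The following Python sketch shows how such links support stepping up the chain and listing the runs a given run started. The field names (triggered_by, source) and the run IDs are assumptions for illustration, not Connic's data model.

```python
# Hypothetical run records keyed by run ID.
RUNS = {
    "run-1": {"source": "connector", "triggered_by": None},
    "run-2": {"source": "agent", "triggered_by": "run-1"},
    "run-3": {"source": "agent", "triggered_by": "run-2"},
    "run-4": {"source": "agent", "triggered_by": "run-1"},
}


def chain_to_root(runs, run_id):
    """Walk parent links from a run up to the run that started the chain."""
    chain = []
    seen = set()  # guards against a malformed cycle in the links
    current = run_id
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = runs[current]["triggered_by"]
    return chain


def children_of(runs, run_id):
    """List the runs directly triggered by `run_id`, sorted by ID."""
    return sorted(rid for rid, run in runs.items() if run["triggered_by"] == run_id)
```

Here chain_to_root(RUNS, "run-3") walks run-3, run-2, run-1, which is the path the stacked drawers let you step back through, and children_of(RUNS, "run-1") lists run-2 and run-4.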

Usage and Budget Dashboards

The usage dashboard now reports token consumption and spend per agent alongside budget reporting, so you can see where cost is going and which agents are driving it.

Judges gained filter expressions for picking exactly which runs to evaluate:

Judge run selection
context.tier == 'enterprise' and output.priority >= 3

  • Token and budget metrics: Per-agent token usage and spend, with agents that have no usage excluded from budget reports
  • Input widgets: Add dropdowns and inputs to a dashboard to filter every widget on it at once
  • Companion metrics: Show a secondary comparison value next to a widget's primary number for quick context
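
For readers who want to reason about what the judge filter expression above selects, here is the same condition written out as a plain Python predicate. It is an equivalent reading of that one expression, not the expression engine itself. The run shape is hypothetical, and treating a missing priority as a non-match is an assumption.

```python
def matches_filter(run):
    """Equivalent of: context.tier == 'enterprise' and output.priority >= 3"""
    context = run.get("context", {})
    output = run.get("output", {})
    priority = output.get("priority")
    # Assumption: a run with no priority field is not selected.
    return context.get("tier") == "enterprise" and priority is not None and priority >= 3


# Hypothetical runs to filter.
RUNS = [
    {"id": "a", "context": {"tier": "enterprise"}, "output": {"priority": 4}},
    {"id": "b", "context": {"tier": "enterprise"}, "output": {"priority": 2}},
    {"id": "c", "context": {"tier": "free"}, "output": {"priority": 5}},
    {"id": "d", "context": {"tier": "enterprise"}, "output": {}},
]

SELECTED = [run["id"] for run in RUNS if matches_filter(run)]
```

Only run "a" satisfies both halves of the condition, so it is the only one a judge with this filter would evaluate.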

Custom Domains for Connectors

Connector URLs no longer have to live on a Connic-hosted address. Point a domain you control at your connectors and Connic serves their webhook and websocket endpoints from it, so the URLs you hand out to partners match your own brand.

Connector URL
# Before — Connic-hosted
https://connect.connic.co/webhook/3b0a9d…

# After — your own domain
https://hooks.acme.com/webhook/3b0a9d…

  • Bring your own domain: Serve connector endpoints from a subdomain you own
  • Automatic verification: Add a domain and Connic verifies DNS and issues a certificate, with the status shown as it progresses
  • Per connector: Point individual connectors at the domain that fits
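
Judging by the before and after URLs above, only the host changes while the path stays the same. If you keep connector URLs in your own configuration or partner docs, a small Python helper can rewrite them. This sketch assumes exactly that host-only change; with_custom_domain is a hypothetical helper, and the elided connector ID is left as it appears in the example.

```python
from urllib.parse import urlsplit


def with_custom_domain(url: str, domain: str) -> str:
    """Swap the host of a connector URL, keeping scheme, path, and query."""
    return urlsplit(url)._replace(netloc=domain).geturl()


# The connector ID is elided in the post and is kept elided here.
BEFORE = "https://connect.connic.co/webhook/3b0a9d…"
AFTER = with_custom_domain(BEFORE, "hooks.acme.com")
```

AFTER comes out as the hooks.acme.com URL shown in the example, with the /webhook/ path untouched.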

Reasoning Effort and Cascading Defaults

Two additions to agent configuration. Set reasoning_effort per agent to trade latency for depth, and drop a _defaults.yaml in your agents/ folder to set common configuration once.

agents/_defaults.yaml
# Inherited by every agent in the project
reasoning_effort: medium
timeout: 120

Every agent inherits the defaults and overrides only what it needs, so a research agent can ask for reasoning_effort: high without repeating the rest of the project's settings.
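
The inheritance rule can be pictured as a dictionary merge in which the agent's own keys win. The Python sketch below assumes a shallow, key-by-key override; how Connic merges nested settings is not covered here, and resolve_config is a hypothetical name.

```python
# Values from agents/_defaults.yaml in the example above.
DEFAULTS = {"reasoning_effort": "medium", "timeout": 120}


def resolve_config(defaults: dict, agent: dict) -> dict:
    """Start from the project defaults, then apply the agent's own settings."""
    # Later keys win, so anything the agent sets overrides the default.
    return {**defaults, **agent}


# A research agent overrides one setting and inherits the rest.
RESEARCH = resolve_config(DEFAULTS, {"reasoning_effort": "high"})
```

RESEARCH ends up with reasoning_effort set to high and the inherited timeout of 120, and DEFAULTS itself is left unchanged for every other agent.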

More Improvements

  • Reusable connections: Define a connection once for centralized credential management and reference it from any agent or connector instead of repeating secrets
  • Conditional upserts: The new db_upsert tool inserts or updates a document in one call, and knowledge-base queries now support metadata filtering
  • Authenticated connectors: A new require_auth option locks down inbound webhook and websocket connectors
  • MCP header context: Interpolate run context into MCP server headers for per-request auth and routing
  • Clearer timeouts: More detailed timeout and cancellation messages, so a stopped run tells you exactly why it stopped
  • Redirect following: The web_read_page tool can now follow redirects through to the final URL