March was all about confidence and control. A/B testing lets you compare agent variants side by side, guardrails enforce safety rules on every run, and API spec tools turn any OpenAPI definition into callable agent tools. We also shipped dashboard templates with percentile metrics, a migration CLI, and a long list of improvements.
A/B Testing
You can now test agent variants against each other in a live environment. Deploy a variant alongside a base agent, split traffic between them, and compare results side by side in the dashboard.
Variants follow a simple naming convention. Create an agent file named {base}-test-{name} and the SDK links it to the base agent automatically:
order-processor.yaml # base agent
order-processor-test-faster-model.yaml # variant: "faster-model"
order-processor-test-new-prompt.yaml # variant: "new-prompt"

From the dashboard, open the base agent and click Manage A/B Tests to configure a test:
- Traffic split: Route a percentage of requests to the variant; the rest stays on control
- Minimum sample size: Set a threshold before results are considered meaningful
- Auto-rollback: Pause the test automatically if the variant failure rate exceeds a threshold within a rolling window
- Sticky sessions: The same user or chat thread always sees the same variant
The comparison view shows runs, average cost, P50 and P95 duration, average judge score, and success rate for control and variant side by side. For a deeper walkthrough, read A/B Testing for AI Agents.
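Sticky sessions boil down to deterministic bucketing: hash a stable key (user or thread ID) and compare it to the split percentage, so the same key always lands on the same side. A minimal sketch of the idea (the function name `assign_variant` and the 20% split are illustrative, not the SDK's API):

```python
import hashlib

def assign_variant(thread_id: str, variant_percent: int = 20) -> str:
    """Deterministically map a chat thread to 'variant' or 'control'.

    The same thread_id always hashes to the same bucket, so a user
    never flips between variants mid-conversation.
    """
    digest = hashlib.sha256(thread_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return "variant" if bucket < variant_percent else "control"

# Repeated calls with the same thread key are stable:
assert assign_variant("thread-42") == assign_variant("thread-42")
```

Because the assignment is pure hashing, no per-user state needs to be stored to keep sessions sticky.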
Agent Guardrails
Guardrails add a safety layer around your agents. Define input rules that run before the agent executes and output rules that check the response before it reaches the user.
guardrails:
  input:
    - type: prompt_injection
      mode: block
    - type: pii
      mode: redact
      config:
        entities: [email, phone, ssn, credit_card]
    - type: topic_restriction
      mode: block
      config:
        allowed_topics: [product support, billing, account help]
        off_topic_message: "I can only help with support and billing."
  output:
    - type: system_prompt_leakage
      mode: block
    - type: pii_leakage
      mode: block
      config:
        entities: [ssn, api_key]

Each rule has a mode that controls what happens on violation:
- block: Stop processing and return a rejection message
- warn: Log the violation in the trace but continue normally
- redact: Replace detected content with placeholders like [EMAIL_REDACTED]
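The redact mode amounts to pattern replacement: detected spans are swapped for placeholders instead of the whole response being blocked. A rough sketch of email redaction (the regex and placeholder format here are illustrative, not Connic's internals):

```python
import re

# Simplified email pattern for illustration only
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_emails(text: str) -> str:
    """Replace email addresses with a placeholder instead of blocking."""
    return EMAIL_RE.sub("[EMAIL_REDACTED]", text)

redact_emails("Contact jane@example.com for help")
# → 'Contact [EMAIL_REDACTED] for help'
```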
Built-in types include prompt_injection, pii, moderation, topic_restriction, regex, and custom. Custom guardrails point to your own Python functions for full flexibility. Every evaluation is recorded in the run trace so you can audit exactly what was checked and why. Read more in Agent Guardrails: Real-Time Safety and Secure AI Agents in Production.
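A custom guardrail is just a Python function you own. The exact signature Connic expects is not shown here, so the shape below is an assumption: a callable that receives the text and reports whether it passes and what it matched.

```python
import re

# Hypothetical custom guardrail: the function name, signature, and
# return shape are assumptions, not Connic's documented interface.
def no_internal_ticket_ids(text: str) -> dict:
    """Flag responses that leak internal ticket identifiers like INT-1234."""
    leaked = re.findall(r"\bINT-\d{4,}\b", text)
    return {
        "passed": not leaked,
        "violations": leaked,
    }

no_internal_ticket_ids("Escalated as INT-20481")
# → {'passed': False, 'violations': ['INT-20481']}
```

Returning the matched spans alongside the pass/fail verdict mirrors how trace-level auditing works: the run record can show exactly what tripped the rule.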
API Spec Tools
Agents can now call any API defined in an OpenAPI v3.x specification. Upload a spec to your project and reference its operations as API spec tools using the api: prefix. Wildcard matching lets you expose entire specs or subsets in a single line.
tools:
  - api:stripe.*              # all operations from the Stripe spec
  - api:hubspot.get_contact   # single operation
  - api:internal_api.list*    # wildcard: list_users, list_orders, etc.
  - billing.lookup_invoice    # file-based tool (unchanged)

Tool names, descriptions, and parameters are derived directly from the OpenAPI schema. File-based tools also support wildcards now: use support_tools.search_* to match every function in a module whose name starts with search_.
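The wildcard behavior resembles shell-style glob matching, as provided by Python's standard fnmatch module. A sketch of how a pattern like internal_api.list* could select operations (the operation names are made up for the example):

```python
from fnmatch import fnmatch

# Hypothetical operation names derived from uploaded specs
operations = [
    "internal_api.list_users",
    "internal_api.list_orders",
    "internal_api.delete_user",
    "stripe.create_charge",
]

def expand(pattern: str, ops: list[str]) -> list[str]:
    """Expand a wildcard tool reference against known operation names."""
    return [op for op in ops if fnmatch(op, pattern)]

expand("internal_api.list*", operations)
# → ['internal_api.list_users', 'internal_api.list_orders']
```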
Dashboard Templates & Metrics
Setting up observability dashboards no longer means configuring widgets one by one. Choose a pre-built template when creating a new dashboard and get a complete layout instantly.
- Overview: Total runs, success rate, costs, token usage, top agents, and model distribution
- Agent: Scoped to a single agent with runs, errors, judge scores, and model breakdown
- Cost: Total, input, output, thinking, and cached input costs with model-level breakdown
- Token / LLM: Token consumption patterns by type and model
Alongside templates, all run metrics now include P50 and P95 percentiles for duration, cost, and tokens per run. Averages hide outliers — percentiles show you the real picture. Cost tracking is also more granular: every run now shows a computed cost based on actual token usage and model pricing, broken down by input, output, thinking, and cached tokens. For a full guide, read Agent Observability: Track Costs, Tokens & Runs.
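P50 and P95 are simply the 50th and 95th percentiles of per-run values. Computing them from raw durations with the standard library shows why they beat averages when a single run is an outlier (the numbers below are made up):

```python
from statistics import mean, quantiles

# Nine typical runs plus one pathological outlier (seconds)
durations = [1.1, 1.2, 1.3, 1.2, 1.1, 1.4, 1.2, 1.3, 1.1, 30.0]

cuts = quantiles(durations, n=100)  # 99 percentile cut points
p50, p95 = cuts[49], cuts[94]

print(mean(durations))  # average is dragged far above typical runs
print(p50)              # P50 stays near what most users experience
print(p95)              # P95 surfaces the tail without hiding it
```

The average lands around 4 seconds even though nine of ten runs finish in well under 1.5 seconds; P50 stays at the typical value, which is the gap the percentile metrics are meant to expose.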
Migrate Command
Moving to Connic from another framework is now a single command. The new connic migrate CLI command scans your project, detects agents, tools, and models, and scaffolds a complete Connic project structure.
$ connic migrate ./my-langchain-project
Detected framework: LangChain
Found 3 agents, 7 tools, 2 models
Scaffolding project...
Created:
  agents/order-processor.yaml
  agents/support-agent.yaml
  agents/classifier.yaml
  tools/lookup_order.py
  tools/search_docs.py
  MIGRATION_REPORT.md

Supported frameworks:
- LangChain: Extracts agents from create_agent() and create_react_agent() calls
- Google ADK: Reads root_agent.yaml and Python agent classes, including sequential, parallel, and loop agents
A generated MIGRATION_REPORT.md lists everything that was converted, any unresolved references, and items that need manual review. For a step-by-step walkthrough, read Migrate from LangChain to Production.
More Improvements
- Bulk run actions: Rerun or cancel multiple agent runs at once from the dashboard
- Knowledge Base namespaces: The new kb_list_namespaces tool lets agents explore the hierarchical structure of a knowledge base
- Scheduled triggers: Use trigger_agent_at to schedule another agent to run at a specific future time
- Web search filters: The web_search tool now supports country and include_news parameters for more targeted results
- Nested project structures: Agent YAML files can now be organized in subdirectories under agents/
- Telegram allowlist: Restrict which Telegram user IDs can interact with your agent, plus configurable session TTL
- Judge improvements: Filter evaluations by low scores and see cost estimates before running bulk evaluations
- Fullscreen dashboard: Expand any dashboard widget to fullscreen for a closer look at your data