Route requests to different models

Outcome

Define named execution routes so different agents or requests use different backends — for example, a fast/cheap model for quick lookups and a strong model for complex reasoning — without repeating provider:model strings everywhere.

Prerequisites

1) Create config

Generate a gateway token:

export CARAPACE_GATEWAY_TOKEN="$(openssl rand -hex 32)"

Windows (PowerShell) alternative:

$bytes = [byte[]]::new(32)
[System.Security.Cryptography.RandomNumberGenerator]::Fill($bytes)
$env:CARAPACE_GATEWAY_TOKEN = [System.BitConverter]::ToString($bytes).Replace('-', '').ToLower()
{
  "gateway": {
    "bind": "loopback",
    "port": 18789,
    "auth": {
      "mode": "token",
      "token": "${CARAPACE_GATEWAY_TOKEN}"
    }
  },
  // Define routes once, reference by name
  "routes": {
    "fast":   { "model": "gemini:gemini-2.5-flash" },
    "strong": { "model": "anthropic:claude-opus-4-6" }
  },
  // Default all agents to the fast route
  "agents": {
    "defaults": {
      "route": "fast"
    }
  },
  "anthropic": {
    "apiKey": "${ANTHROPIC_API_KEY}"
  },
  "google": {
    "apiKey": "${GOOGLE_API_KEY}"
  }
}

Key points:

2) Run commands

CARAPACE_CONFIG_PATH=./carapace.json5 cara

In another terminal:

cara status --port 18789
cara chat --port 18789

3) Verify

Variations

Single provider, multiple models:

"routes": {
  // Date-suffixed IDs pin a specific stable snapshot; non-dated aliases
  // (e.g. `claude-opus-4-6`) track the latest snapshot for that family.
  "fast":   { "model": "anthropic:claude-haiku-4-5-20251001" },
  "strong": { "model": "anthropic:claude-opus-4-6" }
}

Local + cloud hybrid:

"routes": {
  "local":  { "model": "ollama:llama3" },
  "cloud":  { "model": "anthropic:claude-sonnet-4-6" }
}

Next step

Common failures and fixes

Need a recipe for your use case?

Tell us what outcome you want and we can prioritize a walkthrough.

Request a cookbook recipe