llmadapter

Provider Matrix

This matrix defines the v1 supported provider endpoint surface. It describes what llmadapter routes and smoke-tests today; it is not a promise that every upstream provider-specific field is implemented.

This is endpoint evidence, not workload approval. For agentic-coding suitability, see docs/USE_CASE_MATRIX.md.

For a machine-readable join of provider descriptors, endpoint evidence, warnings, and approved use-case rows, run:

go run ./cmd/llmadapter conformance --json
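
A minimal sketch of consuming that output from Go; the payload schema is deliberately not assumed here, so it is decoded generically:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os/exec"
)

func main() {
	// Run the conformance export from the repo root; this shells out to
	// the same command shown above.
	out, err := exec.Command("go", "run", "./cmd/llmadapter", "conformance", "--json").Output()
	if err != nil {
		log.Fatal(err)
	}
	// Decode generically: this sketch makes no assumption about the schema.
	var doc any
	if err := json.Unmarshal(out, &doc); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("decoded conformance payload of type %T\n", doc)
}
```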

Legend:

- live: the feature passed the live smoke matrix for that endpoint.
- fixture: the feature is exercised against recorded fixtures, not live smokes, in v1.
- n/a: the feature is not advertised by that endpoint in the smoke matrix.
- modeldb: the value is resolved from modeldb metadata rather than probed live.

Endpoints

| Provider endpoint | API kind | Family | Credentials | Default smoke model |
| --- | --- | --- | --- | --- |
| anthropic | anthropic.messages | anthropic.messages | ANTHROPIC_API_KEY | claude-haiku-4-5-20251001 |
| claude | anthropic.messages | anthropic.messages | ~/.claude/.credentials.json or CLAUDE_CONFIG_DIR | claude-haiku-4-5-20251001 |
| openai_chat | openai.chat_completions | openai.chat_completions | OPENAI_API_KEY or OPENAI_KEY | gpt-4.1-mini |
| openai_responses | openai.responses | openai.responses | OPENAI_API_KEY or OPENAI_KEY | gpt-4.1-mini |
| codex_responses | codex.responses | openai.responses | CODEX_ACCESS_TOKEN, CODEX_CODE_OAUTH_TOKEN, or ~/.codex/auth.json | provider default |
| bedrock_responses | bedrock.responses | openai.responses | BEDROCK_API_KEY or AWS_BEARER_TOKEN_BEDROCK | openai.gpt-oss-120b |
| bedrock_messages | bedrock.anthropic_messages | anthropic.messages | BEDROCK_API_KEY or AWS_BEARER_TOKEN_BEDROCK | anthropic.claude-opus-4-7 |
| bedrock_converse | bedrock.converse | bedrock.converse | AWS SDK default credential chain | anthropic.claude-sonnet-4-6 |
| openrouter_chat | openrouter.chat_completions | openai.chat_completions | OPENROUTER_API_KEY or OPENROUTER_KEY | openai/gpt-4.1-mini |
| openrouter_responses | openrouter.responses | openai.responses | OPENROUTER_API_KEY or OPENROUTER_KEY | openai/gpt-4.1-mini |
| openrouter_messages | openrouter.anthropic_messages | anthropic.messages | OPENROUTER_API_KEY or OPENROUTER_KEY | anthropic/claude-sonnet-4 |
| minimax_chat | minimax.chat_completions | openai.chat_completions | MINIMAX_API_KEY or MINIMAX_KEY | MiniMax-M2.7 |
| minimax_messages | minimax.anthropic_messages | anthropic.messages | MINIMAX_API_KEY or MINIMAX_KEY | MiniMax-M2.7 |
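
Several endpoints accept more than one credential source (for example OPENAI_API_KEY or OPENAI_KEY). A sketch of the first-match resolution a caller might implement; firstEnv is a hypothetical helper, not an llmadapter API:

```go
package main

import (
	"fmt"
	"os"
)

// firstEnv returns the value of the first environment variable in names
// that is set to a non-empty value. Hypothetical helper for illustration.
func firstEnv(names ...string) (string, bool) {
	for _, name := range names {
		if v := os.Getenv(name); v != "" {
			return v, true
		}
	}
	return "", false
}

func main() {
	// Mirrors the openai_chat row above: OPENAI_API_KEY or OPENAI_KEY.
	if key, ok := firstEnv("OPENAI_API_KEY", "OPENAI_KEY"); ok {
		fmt.Println("found an OpenAI credential of length", len(key))
	} else {
		fmt.Println("no OpenAI credential set; the live subtest would skip")
	}
}
```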

Continuation And Transport

| Provider endpoint | Consumer continuation | Internal continuation | Transport |
| --- | --- | --- | --- |
| anthropic | replay | replay | http_sse |
| claude | replay | replay | http_sse |
| openai_chat | replay | replay | http_sse |
| openai_responses | previous_response_id | previous_response_id | http_sse |
| codex_responses | replay | replay | http_sse |
| bedrock_responses | previous_response_id | previous_response_id | http_sse |
| bedrock_messages | replay | replay | http_sse |
| bedrock_converse | replay | replay | http_sse |
| openrouter_chat | replay | replay | http_sse |
| openrouter_responses | replay | replay | http_sse |
| openrouter_messages | replay | replay | http_sse |
| minimax_chat | replay | replay | http_sse |
| minimax_messages | replay | replay | http_sse |

Consumers should choose their public projection strategy from Consumer continuation, not from provider name, API family, Internal continuation, or Transport. For example, Codex Responses is OpenAI Responses-family but requires replay at the public boundary; WebSocket and internal previous_response_id reuse are provider optimizations, not caller contracts.
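
As a sketch of that rule, a consumer would branch only on the Consumer continuation value; the Endpoint type and field names below are hypothetical illustrations, not llmadapter's descriptor API:

```go
// Endpoint is an illustrative stand-in for a provider endpoint descriptor.
type Endpoint struct {
	Name                 string
	ConsumerContinuation string // "replay" or "previous_response_id"
}

// projectionStrategy derives the public continuation behavior from the
// Consumer continuation column alone, never from provider name, family,
// internal continuation, or transport.
func projectionStrategy(ep Endpoint) string {
	switch ep.ConsumerContinuation {
	case "previous_response_id":
		// In v1 only openai_responses and bedrock_responses advertise this.
		return "send previous_response_id plus the new input items"
	default:
		// Everything else, including codex_responses, replays full history.
		return "replay the full canonical history each turn"
	}
}
```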

Feature Coverage

| Provider endpoint | Text | Tools | Tool continuation | Parallel tools | Reasoning | Prompt cache accounting | Structured output | Vision | Usage | Pricing | Gateway |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| anthropic | live | live | live | n/a | live | live | n/a | fixture | live | modeldb | live |
| claude | live | live | live | n/a | live | live | n/a | fixture | live | modeldb | live |
| openai_chat | live | live | live | live | n/a | n/a | fixture | fixture | live | modeldb | live |
| openai_responses | live | live | live | live | live | live | fixture | fixture | live | modeldb | live |
| codex_responses | live | live | live | live | live | live | fixture | fixture | live | modeldb | live |
| bedrock_responses | live | live | live | n/a | live | n/a | fixture | n/a | live | modeldb | live |
| bedrock_messages | live | live | live | n/a | n/a | n/a | n/a | fixture | live | modeldb | live |
| bedrock_converse | live | live | live | n/a | live | live | n/a | n/a | live | modeldb | live |
| openrouter_chat | live | live | live | live | n/a | n/a | fixture | fixture | live | modeldb | live |
| openrouter_responses | live | live | live | live | live | live | fixture | fixture | live | modeldb | live |
| openrouter_messages | live | live | live | n/a | live | live | n/a | fixture | live | modeldb | live |
| minimax_chat | live | live | live | n/a | n/a | n/a | fixture | fixture | live | modeldb | live |
| minimax_messages | live | live | live | n/a | live | live | n/a | fixture | live | modeldb | live |

In the Prompt cache accounting column, live means the live smoke test asserts provider-reported cache write/read token counters; mapped means llmadapter maps the cache controls onto the provider wire shape, but the v1 smoke matrix does not assert provider-reported cache accounting for that endpoint.
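
A sketch of what the live assertion checks, assuming a usage shape with cache write/read counters; the field names are illustrative, not llmadapter's real usage type:

```go
package smokesketch

import "fmt"

// Usage is an illustrative shape for provider-reported token accounting.
type Usage struct {
	CacheWriteTokens int
	CacheReadTokens  int
}

// checkCacheAccounting mirrors the smoke contract: the first request
// should report a cache write, and the repeated identical request should
// report a cache read.
func checkCacheAccounting(first, repeat Usage) error {
	if first.CacheWriteTokens == 0 {
		return fmt.Errorf("first request reported no cache write tokens")
	}
	if repeat.CacheReadTokens == 0 {
		return fmt.Errorf("repeated request reported no cache read tokens")
	}
	return nil
}
```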

Codex Responses uses public replay semantics. One-shot requests use the Codex HTTP/SSE backend. Session-mode requests with an explicit session ID may prefer the Codex WebSocket transport, keep the WebSocket open for backend affinity, and use internal previous_response_id after lineage checks pass. Lineage requires exact canonical input-prefix matching, not just input length. HTTP/SSE fallback can happen before user-visible output starts; after output starts, a lost WebSocket fails the current turn and invalidates internal continuation state so the next request replays. previous_response_id is still not a public caller contract for codex_responses.
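
The lineage rule can be sketched as an exact item-by-item prefix comparison over canonical input; the function name and representation here are illustrative:

```go
// canReuseInternalContinuation reports whether the next turn's canonical
// input starts with exactly the input already acknowledged on the open
// session. Equality is checked item by item, not by input length.
func canReuseInternalContinuation(acked, next []string) bool {
	if len(next) < len(acked) {
		return false
	}
	for i, item := range acked {
		if next[i] != item {
			return false // lineage diverged: invalidate and replay
		}
	}
	return true // exact prefix match: internal previous_response_id may be reused
}
```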

Codex prompt-cache accounting is verified in two ways: the shared prompt-cache smoke checks provider-reported cache counters, and TestSmokeCodexWebSocketPromptCache specifically requires WebSocket transport plus cache-read accounting for a repeated cached request.

OpenAI platform Responses has an official WebSocket mode. Direct openai_responses clients can opt into it with responses.WithWebSocketMode(...); this matrix still marks openai_responses as HTTP/SSE because provider descriptors, JSON config, auto mux, and the live workload matrix default to HTTP/SSE unless an explicit direct-client option is used. openai_chat remains HTTP/SSE. The OpenAI Realtime API is a separate WebSocket/WebRTC surface and is not represented in this matrix yet.
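
A usage sketch of that opt-in; only responses.WithWebSocketMode is named by this document, so the constructor, the other option, and the argument shape are all assumptions:

```go
// Hypothetical wiring: NewClient and WithAPIKey are illustrative names,
// and the argument to WithWebSocketMode is left elided because its shape
// is not specified here.
client, err := responses.NewClient(
	responses.WithAPIKey(os.Getenv("OPENAI_API_KEY")), // hypothetical option
	responses.WithWebSocketMode( /* mode options, shape not shown here */ ),
)
if err != nil {
	log.Fatal(err)
}
_ = client // descriptors, JSON config, and auto mux still default to HTTP/SSE
```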

The default OpenAI Responses WebSocket transport enables compression and forces IPv4. OpenRouter Responses does not inherit that mode unless it opts in explicitly.
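
For illustration, compression plus forced IPv4 would look like the sketch below with github.com/gorilla/websocket; the document does not say which WebSocket library llmadapter actually uses, so the library choice is an assumption:

```go
package wssketch

import (
	"context"
	"net"

	"github.com/gorilla/websocket"
)

// newDialer builds a dialer with permessage-deflate compression enabled
// and the underlying TCP dial restricted to IPv4, mirroring the defaults
// the matrix describes for the OpenAI Responses WebSocket transport.
func newDialer() *websocket.Dialer {
	return &websocket.Dialer{
		EnableCompression: true,
		NetDialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "tcp4", addr) // force IPv4
		},
	}
}
```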

OpenAI Responses owns the base provider wire implementation for the Responses family. OpenRouter Responses wraps that base with OpenRouter-specific request extensions; Codex Responses wraps it with Codex auth/session/header behavior and Codex-specific unsupported-field handling.
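
The layering reads naturally as thin wrappers over one shared base wire; the interface and wrapper names below are hypothetical:

```go
// ResponsesWire is an illustrative stand-in for the base Responses-family
// wire implementation owned by OpenAI Responses.
type ResponsesWire interface {
	Send(req []byte) ([]byte, error)
}

// openRouterWire layers OpenRouter-specific request extensions on the base.
type openRouterWire struct{ base ResponsesWire }

func (w openRouterWire) Send(req []byte) ([]byte, error) {
	// apply OpenRouter request extensions to req here, then delegate
	return w.base.Send(req)
}

// codexWire layers Codex auth/session/header behavior and Codex-specific
// unsupported-field handling on the same base.
type codexWire struct{ base ResponsesWire }

func (w codexWire) Send(req []byte) ([]byte, error) {
	// attach Codex auth/session headers and strip unsupported fields,
	// then delegate to the shared base implementation
	return w.base.Send(req)
}
```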

Bedrock Responses uses Amazon Bedrock Mantle’s OpenAI-compatible /v1/responses endpoint. The default base URL is derived from AWS_REGION or AWS_DEFAULT_REGION, falling back to us-east-1; config base_url may also be set explicitly, with or without a trailing /v1. Authentication can use explicit Bedrock bearer keys or AWS SDK-loaded profile/region credentials to generate short-term Bedrock bearer tokens. Native Bedrock Converse and Bedrock Chat Completions are separate endpoint shapes and are not represented by bedrock_responses; Anthropic-compatible Bedrock access is represented by bedrock_messages.
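
A sketch of the documented derivation; the region fallback order and trailing-/v1 handling come from this document, while the host template is a placeholder, not a documented endpoint:

```go
package bedrocksketch

import (
	"fmt"
	"os"
	"strings"
)

// baseURL resolves the Bedrock Responses base URL: an explicit config
// value wins, normalized to end in /v1, otherwise the region comes from
// AWS_REGION, then AWS_DEFAULT_REGION, then the us-east-1 fallback.
func baseURL(configured string) string {
	if configured != "" {
		return strings.TrimSuffix(strings.TrimSuffix(configured, "/"), "/v1") + "/v1"
	}
	region := os.Getenv("AWS_REGION")
	if region == "" {
		region = os.Getenv("AWS_DEFAULT_REGION")
	}
	if region == "" {
		region = "us-east-1"
	}
	// Placeholder host format: the real Mantle hostname is an assumption.
	return fmt.Sprintf("https://bedrock-runtime.%s.amazonaws.com/v1", region)
}
```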

Bedrock Mantle’s /v1/models list is broader than the /v1/responses surface. On 2026-05-03 in us-east-1, the listed models that live-probed successfully through /v1/responses were openai.gpt-oss-120b and openai.gpt-oss-20b; all other listed Mantle IDs returned Bedrock validation errors stating that the model does not support /v1/responses.

Bedrock Messages uses Mantle’s Anthropic-compatible route at /anthropic/v1/messages. On 2026-05-03 in us-east-1, anthropic.claude-opus-4-7 and anthropic.claude-haiku-4-5 live-probed successfully through this endpoint; anthropic.claude-opus-4-6 returned a provider not-found error for that Mantle model ID. A separate /anthropic/v1/models listing is not exposed, so the broad Mantle /v1/models endpoint remains the discovery source even though it mixes API surfaces. The shared smoke matrix uses Haiku 4.5 for tool-continuation because Opus 4.7 intentionally refuses the synthetic prompt-injection-shaped tool result used by the generic continuation smoke. Bedrock Messages accepts adaptive-thinking-shaped requests, but the live stream did not expose reasoning deltas or reasoning token accounting, so this endpoint does not advertise reasoning yet.

Bedrock Converse uses the native AWS SDK Bedrock Runtime ConverseStream API and AWS SDK credentials rather than Mantle bearer tokens. It targets replay-based continuation, text, tools, prompt cache points, and Claude extended-thinking request fields. Modeldb-backed fixed and dynamic routes prefer RuntimeAccess.ResolvedWireID for region-specific inference profiles and fall back to provider-local prefixing only when runtime metadata is unavailable. On 2026-05-03 in us-east-1, the shared text, tool-use, tool-result continuation, reasoning, OpenAI Chat gateway, and Anthropic Messages gateway smokes passed with the default Sonnet 4.6 model; the agentic-coding compatibility run also passed cache-accounting checks for Haiku 4.5, Sonnet 4.6, Opus 4.6, and Opus 4.7 through Bedrock Converse.
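
A sketch of that resolution order; RuntimeAccess and ResolvedWireID are named above, but the struct shape and the fallback prefix are illustrative assumptions:

```go
// RuntimeAccess is an illustrative shape for modeldb runtime metadata.
type RuntimeAccess struct {
	ResolvedWireID string // region-specific inference profile, when known
}

// wireID prefers the resolved inference profile and falls back to
// provider-local prefixing only when runtime metadata is unavailable.
func wireID(access *RuntimeAccess, modelID string) string {
	if access != nil && access.ResolvedWireID != "" {
		return access.ResolvedWireID
	}
	return "us." + modelID // fallback prefix shown here is an assumption
}
```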

Live Smoke Commands

Full available matrix:

env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -count=1 -v

Focused slices:

env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeTextStream -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeToolUse -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeToolResultContinuation -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeReasoningStream -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokePromptCache -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeCodexWebSocketContinuation -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeCodexWebSocketPromptCache -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestSmokeOpenAIResponsesWebSocket -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestGatewaySmoke -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestAnthropicMessagesGatewaySmoke -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestResponsesGatewaySmoke -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 go test ./tests/e2e -run TestBedrockResponsesModelsList -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 BEDROCK_RESPONSES_PROBE_MODELS=openai.gpt-oss-120b,openai.gpt-oss-20b go test ./tests/e2e -run TestBedrockResponsesProbeModels -count=1 -v
env GOCACHE=/tmp/go-cache TEST_INTEGRATION=1 BEDROCK_MESSAGES_PROBE_MODELS=anthropic.claude-opus-4-7,anthropic.claude-haiku-4-5 go test ./tests/e2e -run TestBedrockMessagesProbeModels -count=1 -v

The e2e package skips cleanly when TEST_INTEGRATION is unset. Individual provider subtests skip when their credential env vars or local OAuth files are unavailable, or when a feature is not advertised by that endpoint in the smoke matrix.
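
The skip behavior follows the standard Go testing idiom; a minimal sketch (the env-var semantics come from this section, the helper itself is illustrative):

```go
package e2esketch

import (
	"os"
	"testing"
)

// skipUnlessIntegration mirrors the documented gating: everything skips
// without TEST_INTEGRATION, and a provider subtest skips when none of its
// credential env vars are present.
func skipUnlessIntegration(t *testing.T, credentialVars ...string) {
	t.Helper()
	if os.Getenv("TEST_INTEGRATION") == "" {
		t.Skip("TEST_INTEGRATION not set; skipping live smoke")
	}
	for _, v := range credentialVars {
		if os.Getenv(v) != "" {
			return // at least one credential source is available
		}
	}
	t.Skipf("no credentials in %v; skipping provider subtest", credentialVars)
}
```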

Latest V1 Track Result

On 2026-05-02, the full live command above was run with local credentials available for all v1 provider endpoints. Text, tools, tool continuation, and gateway routing passed for Anthropic, Claude-compatible access, OpenAI Chat, OpenAI Responses, Codex Responses, OpenRouter Chat, OpenRouter Responses, OpenRouter Messages, MiniMax Chat, and MiniMax Messages.

On 2026-04-27, the focused Codex WebSocket continuation and WebSocket prompt-cache smokes passed with TEST_INTEGRATION=1, including runtime transport=websocket evidence and provider-reported cache-read token accounting on the repeated cached request.

On 2026-04-27, llmadapter conformance was tightened so approved agentic_coding rows must also carry live evidence for every required workload feature and explicit continuation/transport evidence. The bundled docs/compatibility/agentic_coding.json passes that stricter contract.