Changelog

v3.8.0 — 2026-06-05

Added

  • Claude Opus 4.8 support — added vertex_ai.anthropic.claude-opus-4-8 (GENAI_SHARED_VERTEXAI_ANTHROPIC_CLAUDE_48_OPUS) and bedrock.anthropic.claude-opus-4-8 (GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_48_OPUS) to the model registry with 1M context support. Bare name claude-opus-4-8 defaults to Vertex AI, consistent with all other Anthropic models.

v3.7.2 — 2026-06-04

Changed

  • Claude Code manual configuration — the Claude Code setup guide now marks manual configuration as optional, since the installation scripts automatically configure Claude Code with all the required variables.

v3.7.1 — 2026-05-29

Added

  • One-shot Claude Code installation scripts — the Claude Code setup guide now links to downloadable install-claude.sh (macOS) and install-claude.ps1 (Windows) scripts that install Git, Claude Code, and any prerequisites in a single command. These scripts replace the previous manual winget / native-installer steps.

v3.7.0 — 2026-05-26

Added

  • suggested field on /models — each entry returned by GET /models now carries an optional suggested field whose value is one of OPUS, SONNET, HAIKU, CODEX, DEFAULT, or null. Populated manually from the new Firestore collection model_suggestions (document id = upstream model id, e.g. bedrock.anthropic.claude-opus-4-7). Refresh is piggybacked on the existing 1-hour models cache TTL; invalid or missing values are surfaced as null and /models continues to serve even if Firestore is unreachable.
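The validation rule for suggested can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the function names, the "id" key on each model entry, and the plain dict standing in for the model_suggestions collection are all assumptions.

```python
from typing import Optional

# Allowed values as listed in the changelog entry; anything else becomes null.
ALLOWED_SUGGESTIONS = {"OPUS", "SONNET", "HAIKU", "CODEX", "DEFAULT"}


def normalize_suggested(raw: object) -> Optional[str]:
    """Return raw if it is an allowed value, otherwise None (JSON null)."""
    if isinstance(raw, str) and raw in ALLOWED_SUGGESTIONS:
        return raw
    return None


def annotate_models(models: list[dict], suggestions: dict[str, object]) -> list[dict]:
    """Attach a `suggested` field to every model entry.

    `suggestions` is a hypothetical stand-in for the model_suggestions
    collection (upstream model id -> stored value). An empty dict models
    the case where Firestore is unreachable: every entry gets None.
    """
    return [
        {**model, "suggested": normalize_suggested(suggestions.get(model["id"]))}
        for model in models
    ]
```

With an empty suggestions mapping every entry still comes back, only with suggested set to None, which mirrors the "continues to serve" behaviour described above.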

v3.6.2 — 2026-05-22

Changed

  • Default GPT-5.5 routing to Azure — bare name gpt-5.5 now defaults to Azure (GENAI_SHARED_AZURE_OPENAI_GPT_55) instead of OpenAI. Users can still explicitly target OpenAI via openai.gpt-5.5.

v3.6.1 — 2026-05-21

Changed

  • Cline setup guide rewritten — authentication now uses the API Key field (apikey=<key>&tenantid=<id> format) instead of custom headers, avoiding a known Cline bug where custom headers are intermittently dropped.
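For illustration, a combined key in the apikey=<key>&tenantid=<id> format could be split as shown here. This is a sketch under assumptions: parse_combined_key is a hypothetical helper, and the gateway's real parsing rules (escaping, field order, extra fields) are not documented in this changelog.

```python
def parse_combined_key(value: str) -> tuple[str, str]:
    """Split 'apikey=<key>&tenantid=<id>' into (api_key, tenant_id).

    Hypothetical helper: splits on '&' and on the first '=' only, so a key
    that itself contains '=' is preserved.
    """
    fields: dict[str, str] = {}
    for part in value.split("&"):
        name, sep, val = part.partition("=")
        if not sep or not val:
            raise ValueError(f"malformed field: {part!r}")
        fields[name] = val
    missing = [name for name in ("apikey", "tenantid") if name not in fields]
    if missing:
        raise ValueError(f"missing field(s): {', '.join(missing)}")
    return fields["apikey"], fields["tenantid"]
```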

v3.6.0 — 2026-05-20

Added

  • Internal bridge support — new USE_UPSTREAM_BRIDGE and UPSTREAM_BRIDGE_URL env vars allow routing all upstream requests through the internal Azure bridge (LiteLLM proxy) instead of the public GenAI Shared Service endpoint. When USE_UPSTREAM_BRIDGE=true, the gateway reads bridgeUrl from Secret Manager or uses UPSTREAM_BRIDGE_URL if set.
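One plausible reading of the resolution order is sketched below: the explicit UPSTREAM_BRIDGE_URL wins when set, otherwise bridgeUrl is read from Secret Manager. The function name, the read_secret callback, and the URLs in the usage note are hypothetical; only the two env var names and the bridgeUrl secret name come from the entry above.

```python
from typing import Callable, Mapping, Optional


def resolve_upstream_url(
    env: Mapping[str, str],
    public_url: str,
    read_secret: Callable[[str], Optional[str]],
) -> str:
    """Pick the upstream base URL (a sketch, not the gateway's actual logic)."""
    # Bridge routing is only active when the flag is exactly "true" (case-insensitive).
    if env.get("USE_UPSTREAM_BRIDGE", "").strip().lower() != "true":
        return public_url
    # Assumption: an explicit env override takes precedence over the secret.
    override = env.get("UPSTREAM_BRIDGE_URL")
    if override:
        return override
    bridge_url = read_secret("bridgeUrl")
    if not bridge_url:
        raise RuntimeError("USE_UPSTREAM_BRIDGE=true but no bridge URL is configured")
    return bridge_url
```

In a deployment, env would typically be os.environ and read_secret a thin wrapper around the Secret Manager client; both are injected here so the precedence can be exercised without external services.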

v3.5.1 — 2026-05-19

Changed

  • Default Opus 4.7 routing to Vertex AI — bare name claude-opus-4-7 now defaults to Vertex AI instead of Bedrock, consistent with all other Anthropic models.
  • [QUOTA] log downgraded to DEBUG — remaining [QUOTA] tagged message moved from INFO to DEBUG for cleaner production logs.

v3.4.12 — 2026-05-19

Added

  • Streaming diagnostics — added explicit ERROR logs for downstream cancellations, abnormal stream closure, upstream 5xx responses, and incomplete chunked upstream streams, including redacted response headers and request summaries.
  • Claude Opus 4.7 Vertex AI support — added the Vertex AI Opus 4.7 model entry and 1M-context handling.

Changed

  • Long-running stream handling — disabled per-read upstream timeouts for streaming requests while keeping normal request timeouts for non-streaming calls.
  • Default output token ceiling — increased default max token handling to support long coding-agent requests.

v3.5.0 — 2026-05-19

Added

  • Vertex AI Opus 4.7 support — added vertex_ai.anthropic.claude-opus-4-7 (GENAI_SHARED_VERTEXAI_ANTHROPIC_CLAUDE_47_OPUS) to the model registry with 1M context support. Both Bedrock and Vertex variants of Opus 4.7 are now available.
  • Auto-inject 1M context beta — the gateway now automatically injects anthropic-beta: context-1m-2025-08-07 for all models that support 1M context (Opus 4.7, Opus 4.6, Sonnet 4 — both Bedrock and Vertex). No opt-in from the client is required.
  • [1m] suffix stripping — model names with a [1m]/[1M] suffix (e.g., claude-opus-4-7[1m], bedrock.anthropic.claude-opus-4-7[1m], GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_47_OPUS[1m]) are now correctly resolved. The suffix is stripped before lookup so all three naming conventions work.
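The suffix handling can be pictured with a small resolver. This is a sketch only: REGISTRY, BARE_DEFAULTS, and resolve_model are hypothetical names, the registry holds just the two Opus 4.7 entries named in this changelog, and the bare-name default shown is the Vertex AI routing introduced in v3.5.1.

```python
import re

# Matches a trailing [1m] or [1M] suffix.
_ONE_M_SUFFIX = re.compile(r"\[1m\]$", re.IGNORECASE)

# Upstream model id -> gateway convention name (entries taken from this changelog).
REGISTRY = {
    "vertex_ai.anthropic.claude-opus-4-7": "GENAI_SHARED_VERTEXAI_ANTHROPIC_CLAUDE_47_OPUS",
    "bedrock.anthropic.claude-opus-4-7": "GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_47_OPUS",
}

# Bare vendor name -> default upstream id (Vertex AI default, as of v3.5.1).
BARE_DEFAULTS = {"claude-opus-4-7": "vertex_ai.anthropic.claude-opus-4-7"}


def resolve_model(name: str) -> str:
    """Return the upstream model id for any of the three naming conventions."""
    base = _ONE_M_SUFFIX.sub("", name.strip())  # strip the suffix before lookup
    if base in REGISTRY:
        return base
    if base in BARE_DEFAULTS:
        return BARE_DEFAULTS[base]
    for upstream_id, gateway_name in REGISTRY.items():
        if gateway_name == base:
            return upstream_id
    raise KeyError(f"unknown model: {name!r}")
```

Stripping the suffix first means the lookup tables never need a separate [1m] variant of each name.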

Changed

  • Log noise reduction — [1M-CONTEXT], [TELEMETRY], and [CACHE TOKENS] tagged messages downgraded from INFO to DEBUG. Only [REQUEST COMPLETED] and [REQUEST FAILED] remain at INFO level for production observability.

v3.4.10 — 2026-05-08

Changed

  • Codex setup guide rewritten — simplified to a single-file configuration (config.toml only, auth.json no longer needed). Authentication now uses http_headers in the provider config. Updated recommended model to GPT-5.5.

v3.4.9 — 2026-05-07

Added

  • GPT-5.5 documentation — added GPT-5.5 to the supported models table with OpenAI and Azure provider rows.

v3.4.6 — 2026-04-23

Added

  • Claude Opus 4.7 documentation — added Opus 4.7 to the supported models table and updated the Claude Code setup guide to recommend it as the default pinned Opus model.
  • GPT-5.4 model family pricing update — corrected pricing and context windows for GPT-5.4, GPT-5.4 Mini, and GPT-5.4 Nano based on the latest PwC GenAI Shared Service EMEA model list.

v3.4.5 — 2026-04-21

Added

  • Claude Opus 4.7 support — added bedrock.anthropic.claude-opus-4-7 to the model registry with 1M context support. Bare name claude-opus-4-7 resolves to the Bedrock endpoint.

v3.4.4 — 2026-04-10

Added

  • GPT-5.4 model family support — added gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, and gpt-5.4-pro to the model registry. These models can now be used with any endpoint including /responses (Codex).

v3.4.2 — 2026-03-21

Added

  • Observability logs for 1M context and telemetry enforcement — tagged [1M-CONTEXT] and [TELEMETRY] INFO logs at every decision point for easier debugging in GCP.

Fixed

  • Telemetry no longer blocks returning users — users coming back after idle periods (minutes, hours, or days) now get a grace period instead of being immediately blocked with a stale-heartbeat error.
  • Increased telemetry staleness window from 30 seconds to 5 minutes to reduce false positives.

v3.4.1 — 2026-03-18

Fixed

  • Re-added /gpt4o legacy route — backwards-compatible alias for /chat/completions for reverse proxies that still reference the old path.

v3.4.0 — 2026-03-17

Added

  • Telemetry grace period for new sessions — new API keys now get a 120-second grace period before telemetry enforcement kicks in. This allows Claude Code's OTEL exporter time to send its first heartbeat batch, eliminating false 403 errors on session startup. The grace period is configurable via TELEMETRY_GRACE_PERIOD_SECONDS.
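The grace-period check might look roughly like this. It is a sketch under assumptions: the function names and the timestamp arguments are hypothetical, the 5-minute staleness window is taken from the v3.4.2 entry, and the separate grace handling for returning users is not modelled.

```python
from typing import Mapping, Optional

DEFAULT_GRACE_SECONDS = 120     # default from this entry
STALENESS_WINDOW_SECONDS = 300  # 5 minutes, per the v3.4.2 entry


def grace_period_seconds(env: Mapping[str, str]) -> float:
    """Read TELEMETRY_GRACE_PERIOD_SECONDS, falling back to the 120 s default."""
    return float(env.get("TELEMETRY_GRACE_PERIOD_SECONDS", DEFAULT_GRACE_SECONDS))


def telemetry_ok(
    now: float,
    key_first_seen: float,
    last_heartbeat: Optional[float],
    grace: float,
) -> bool:
    """Decide whether a request passes telemetry enforcement (all times in seconds)."""
    # A new key is allowed through until the grace period has elapsed.
    if now - key_first_seen < grace:
        return True
    # After the grace period a heartbeat must exist and must be recent.
    if last_heartbeat is None:
        return False
    return now - last_heartbeat <= STALENESS_WINDOW_SECONDS
```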

v3.3.0 — 2026-03-16

Added

  • 1M context window support — Claude Opus 4.6 and Claude Sonnet 4 now support a 1M-token context window (5x the standard 200K limit). To enable it, remove CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS from your settings.json and select "Opus (1M context)" in the model picker (/model). The gateway whitelists the context-1m-2025-08-07 beta header and forwards it to the upstream provider; all other beta headers are silently dropped.
  • Extended context pricing — requests exceeding 200K input tokens are automatically billed at 2x the standard input rate, consistent with Anthropic's pricing for 1M context. The cost multiplier applies to both regular and cache-aware cost calculations.
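As a worked illustration of the extended-context rule, the sketch below doubles the input rate once a request exceeds 200K input tokens. It assumes the multiplier applies to all input tokens of such a request; the function name and the rate used in the usage note are hypothetical and are not real prices.

```python
EXTENDED_CONTEXT_THRESHOLD = 200_000  # input tokens
EXTENDED_CONTEXT_MULTIPLIER = 2.0     # 2x the standard input rate


def input_cost(input_tokens: int, rate_per_million: float) -> float:
    """Cost of the input side of one request.

    Assumption: once the threshold is exceeded, the multiplier applies to the
    whole request's input tokens, not only to the tokens above 200K.
    """
    multiplier = (
        EXTENDED_CONTEXT_MULTIPLIER
        if input_tokens > EXTENDED_CONTEXT_THRESHOLD
        else 1.0
    )
    return input_tokens * rate_per_million * multiplier / 1_000_000
```

With an illustrative rate of 10.0 per million tokens, 100,000 input tokens cost 1.0, while 300,000 input tokens cost 6.0 rather than 3.0.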

Changed

  • CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS is now optional — this variable was previously required in the settings.json example. It is no longer needed for most users. The gateway selectively whitelists supported beta features instead of requiring clients to suppress all betas.
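Selective whitelisting of beta features can be sketched as a filter over the comma-separated anthropic-beta header value. The function name is hypothetical and the allow-list contains only the one beta named in this changelog; the gateway's real list may differ.

```python
from typing import Optional

# Only the beta named in this changelog; the real allow-list is an assumption.
ALLOWED_BETAS = frozenset({"context-1m-2025-08-07"})


def filter_beta_header(value: Optional[str]) -> Optional[str]:
    """Keep whitelisted betas from an anthropic-beta header, drop the rest.

    Returns None when nothing survives, meaning the header should be omitted
    from the upstream request.
    """
    if not value:
        return None
    kept = [
        beta
        for beta in (part.strip() for part in value.split(","))
        if beta in ALLOWED_BETAS
    ]
    return ",".join(kept) or None
```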

v3.2.4 — 2026-03-05

Fixed

  • Budget display showing 300€ instead of 200€ — the /quota/status endpoint was including quota rules from other organisations in a user's matching_entities when calculating the monthly limit. The previous fix (filter=False) was too broad: it stopped filtering out the global base rule (correct) but also stopped filtering out cross-org rules (incorrect). The aggregation now correctly includes only base rules (orgId=null), global rules (orgId="*"), and rules scoped to the user's current organisation.
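The corrected rule selection can be sketched as a filter on orgId. The rule shape, the "limit" field, and the idea that the monthly limit is the sum of the matching rules are assumptions made for illustration; only the three orgId cases come from the entry above.

```python
def applicable_rules(rules: list[dict], user_org: str) -> list[dict]:
    """Keep base rules (orgId None), global rules ("*"), and the user's own org."""
    return [
        rule
        for rule in rules
        if rule["orgId"] is None or rule["orgId"] in ("*", user_org)
    ]


def monthly_limit(rules: list[dict], user_org: str) -> float:
    """Assumed aggregation: sum the limits of every applicable rule."""
    return sum(rule["limit"] for rule in applicable_rules(rules, user_org))
```

With illustrative rules of 100 each for orgId None, "org-a", and "org-b", a user in org-a gets a limit of 200; the org-b rule no longer contributes.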

v3.2.3 — 2026-03-05

Changed

  • Default Opus/Sonnet routing to Vertex AI — bare model names (e.g. claude-opus-4-6) now default to Vertex AI instead of Bedrock. Bedrock often lags behind in supporting new Anthropic features (e.g. deferred tool loading). Users can still explicitly request a Bedrock model via the full upstream ID.

v3.2.2 — 2026-03-04

Fixed

  • Budget display showing 100€ instead of 200€ — the /quota/status endpoint (used by the status line and Budget MCP server) was incorrectly filtering out the global base rule when calculating the limit, showing half the actual cap. Quota enforcement was already correct; this was a display-only bug.

v3.2.1 — 2026-03-04

Fixed

  • Model resolution fallback — the /chat/completions endpoint now accepts the model field in the request body as a fallback when the model-name header is not provided, aligning with OpenAI API conventions.
  • Default max_tokens — when clients omit max_tokens or send 0, the gateway now defaults to 8192 tokens, preventing upstream validation errors.
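Both fixes can be summarised in one small sketch. The function name is hypothetical and header keys are assumed to be lower-cased already; the header name model-name, the body field model, and the 8192 default are taken from the entries above.

```python
from typing import Mapping

DEFAULT_MAX_TOKENS = 8192  # default introduced in this release


def resolve_request(headers: Mapping[str, str], body: Mapping[str, object]) -> tuple[str, int]:
    """Return (model, max_tokens) for a /chat/completions request."""
    # The model-name header wins; the body field is only a fallback.
    model = headers.get("model-name") or body.get("model")
    if not isinstance(model, str) or not model:
        raise ValueError("no model in model-name header or request body")
    # A missing value and an explicit 0 are both treated as "not set".
    max_tokens = body.get("max_tokens") or DEFAULT_MAX_TOKENS
    return model, int(max_tokens)
```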

Important: Enforcement Starting Monday 9 March 2026

Action Required — Access Will Be Blocked from Monday 9 March 2026

Starting Monday 9 March 2026, the PROD environment enforces the following policies. If you do not act before this date, your access will be blocked.

What changes

Condition | Result
Request made with a personal @pwc.com API key | 403 Forbidden — blocked immediately
Claude Code without telemetry configured (missing OTEL_RESOURCE_ATTRIBUTES) | 403 Telemetry not configured — blocked immediately

What you need to do

  1. Get an API key from the Get Access page — personal @pwc.com keys are no longer accepted.
  2. Configure telemetry — add your PwC email to OTEL_RESOURCE_ATTRIBUTES in ~/.claude/settings.json:
"OTEL_RESOURCE_ATTRIBUTES": "user.email=name.surname@pwc.com"

See the Claude Code guide for the full telemetry setup.

Already configured telemetry but still getting errors? Restart Claude Code — it may take a few seconds for the new session to register. If errors persist after restarting, double-check that OTEL_RESOURCE_ATTRIBUTES is correctly set and that you are using an API key from the Get Access page.


v3.2.0 — 2026-03-02

Added

  • Personal API-key blocking — the gateway can now reject requests made with personal @pwc.com API keys. Use an API key from the Get Access page to avoid disruption.
  • Telemetry enforcement for Claude Code — Claude Code users can be required to have an active telemetry configuration before accessing the gateway, helping ensure usage visibility and compliance.

Both features are opt-in and disabled by default; existing users are not affected.

Improved

  • Three model naming formats accepted — the gateway now documents and cross-references all three accepted model ID formats: vendor name (e.g., claude-sonnet-4-6), PwC GenAI Shared Service convention (e.g., bedrock.anthropic.claude-opus-4-6), and Coding Agents Gateway convention (e.g., GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_46_OPUS). See the Available Models page for the full list.
  • Simplified telemetry configuration — OTEL_RESOURCE_ATTRIBUTES now only requires user.email=name.surname@pwc.com. Team and department are resolved automatically; no manual team.id or department values needed.
  • Actionable telemetry error messages — when telemetry is missing or stale, the 403 response now tells you exactly what to set and that access will be restored within ~1 minute of restarting Claude Code.
  • Faster telemetry recovery — after configuring telemetry and restarting Claude Code, access is restored in approximately 1 minute (reduced from up to 5 minutes in previous versions).
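To check a telemetry setting locally, the OTEL_RESOURCE_ATTRIBUTES value can be parsed as comma-separated key=value pairs, which is the format OpenTelemetry defines for this variable. The helper names below are hypothetical, and the check only confirms that user.email is present; it does not reproduce the gateway's own validation.

```python
from typing import Optional


def parse_resource_attributes(value: str) -> dict[str, str]:
    """Parse 'k1=v1,k2=v2' into a dict, skipping malformed pairs."""
    attrs: dict[str, str] = {}
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = val.strip()
    return attrs


def telemetry_email(value: Optional[str]) -> Optional[str]:
    """Return the configured user.email, or None if it is missing or empty."""
    email = parse_resource_attributes(value or "").get("user.email", "")
    return email or None
```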

v3.1.0 — 2026-02-27

Added

  • Interactive API reference — browse all endpoints, view request/response schemas, and try calls directly in your browser at /docs (Swagger UI) or /redoc (ReDoc).
  • The Swagger "Authorize" dialog accepts both auth formats (Bearer token or separate api-key + tenant-id headers) so you can test calls without leaving the browser.

v3.0.2 — 2026-02-26

Improved

  • Model name resolution is now more flexible: the gateway correctly handles model IDs regardless of which naming convention your client sends.

v3.0.1 — 2026-02-25

Fixed

  • Some model IDs that were previously unrecognised by the gateway are now resolved correctly.

v3.0.0 — 2026-02-25

Changed

  • Reduced latency and improved reliability — internal routing optimisations eliminate a class of startup failures that affected some users.
  • The /v1/models endpoint now returns an accurate, up-to-date list of available models.