Changelog

v3.4.6 — 2026-04-23

Added

  • Claude Opus 4.7 documentation — added Opus 4.7 to the supported models table and updated the Claude Code setup guide to recommend it as the default pinned Opus model.
  • GPT-5.4 model family pricing update — corrected pricing and context windows for GPT-5.4, GPT-5.4 Mini, and GPT-5.4 Nano based on the latest PwC GenAI Shared Service EMEA model list.

v3.4.5 — 2026-04-21

Added

  • Claude Opus 4.7 support — added bedrock.anthropic.claude-opus-4-7 to the model registry with 1M context support. Bare name claude-opus-4-7 resolves to the Bedrock endpoint.

v3.4.4 — 2026-04-10

Added

  • GPT-5.4 model family support — added gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, and gpt-5.4-pro to the model registry. These models can now be used with any endpoint including /responses (Codex).

v3.4.2 — 2026-03-21

Added

  • Observability logs for 1M context and telemetry enforcement — tagged [1M-CONTEXT] and [TELEMETRY] INFO logs at every decision point for easier debugging in GCP.

Fixed

  • Telemetry no longer blocks returning users — users coming back after idle periods (minutes, hours, or days) now get a grace period instead of being immediately blocked with a stale-heartbeat error.
  • Increased telemetry staleness window from 30 seconds to 5 minutes to reduce false positives.
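The effect of the wider staleness window can be illustrated with a minimal sketch. The function name and the idea of comparing a single last-heartbeat timestamp are assumptions made for illustration; only the 30-second and 5-minute values come from the entry above.

```python
from datetime import datetime, timedelta, timezone

# Window from the changelog entry above; the previous value was 30 seconds.
STALENESS_WINDOW = timedelta(minutes=5)


def is_heartbeat_stale(last_heartbeat: datetime, now: datetime,
                       window: timedelta = STALENESS_WINDOW) -> bool:
    """Hypothetical check: a heartbeat is stale once it is older than the window."""
    return now - last_heartbeat > window


now = datetime(2026, 3, 21, 12, 0, 0, tzinfo=timezone.utc)
heartbeat = now - timedelta(seconds=45)

# A 45-second-old heartbeat is fresh under the 5-minute window...
print(is_heartbeat_stale(heartbeat, now))
# ...but would have counted as stale under the old 30-second window.
print(is_heartbeat_stale(heartbeat, now, timedelta(seconds=30)))
```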

v3.4.1 — 2026-03-18

Fixed

  • Re-added /gpt4o legacy route — backwards-compatible alias for /chat/completions for reverse proxies that still reference the old path.

v3.4.0 — 2026-03-17

Added

  • Telemetry grace period for new sessions — new API keys now get a 120-second grace period before telemetry enforcement kicks in. This allows Claude Code's OTEL exporter time to send its first heartbeat batch, eliminating false 403 errors on session startup. The grace period is configurable via TELEMETRY_GRACE_PERIOD_SECONDS.
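A rough sketch of how a configurable grace period like this might be applied. Only the variable name TELEMETRY_GRACE_PERIOD_SECONDS and the 120-second default come from the entry above; the function names, the fallback on invalid values, and the decision logic are assumptions, not the gateway's actual code.

```python
import os
from typing import Mapping

DEFAULT_GRACE_SECONDS = 120  # default stated in the changelog entry


def grace_period_seconds(env: Mapping[str, str] = os.environ) -> int:
    """Read TELEMETRY_GRACE_PERIOD_SECONDS, falling back to the default.

    Falling back on missing, non-numeric, or negative values is an assumption.
    """
    raw = env.get("TELEMETRY_GRACE_PERIOD_SECONDS", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_GRACE_SECONDS
    return value if value >= 0 else DEFAULT_GRACE_SECONDS


def should_block(key_first_seen: float, now: float,
                 has_heartbeat: bool, grace: int) -> bool:
    """Hypothetical rule: block only when no heartbeat has arrived
    and the grace period since the key was first seen has elapsed."""
    if has_heartbeat:
        return False
    return now - key_first_seen >= grace


grace = grace_period_seconds({})  # no override set, so the default applies
print(should_block(key_first_seen=0.0, now=60.0, has_heartbeat=False, grace=grace))
print(should_block(key_first_seen=0.0, now=180.0, has_heartbeat=False, grace=grace))
```

In this sketch the first call falls inside the grace period and the second falls outside it.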

v3.3.0 — 2026-03-16

Added

  • 1M context window support — Claude Opus 4.6 and Claude Sonnet 4 now support a 1M-token context window (5x the standard 200K limit). To enable it, remove CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS from your settings.json and select "Opus (1M context)" in the model picker (/model). The gateway whitelists the context-1m-2025-08-07 beta header and forwards it to the upstream provider; all other beta headers are silently dropped.
  • Extended context pricing — requests exceeding 200K input tokens are automatically billed at 2x the standard input rate, consistent with Anthropic's pricing for 1M context. The cost multiplier applies to both regular and cache-aware cost calculations.
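As a worked example of the extended context multiplier, the sketch below assumes that the 2x rate applies to the whole input once a request exceeds 200K input tokens. The rate of 10.0 per million tokens is a made-up figure for illustration, not a real price, and the function name is hypothetical.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens, from the entry above
LONG_CONTEXT_MULTIPLIER = 2.0     # 2x the standard input rate


def input_cost(input_tokens: int, rate_per_million: float) -> float:
    """Cost of the input side of one request.

    Assumption: the multiplier covers all input tokens of a request
    that exceeds the threshold, not only the tokens above it.
    """
    multiplier = (LONG_CONTEXT_MULTIPLIER
                  if input_tokens > LONG_CONTEXT_THRESHOLD else 1.0)
    return input_tokens * rate_per_million * multiplier / 1_000_000


RATE = 10.0  # hypothetical price per million input tokens

print(input_cost(200_000, RATE))  # at the threshold: standard rate
print(input_cost(300_000, RATE))  # above the threshold: doubled rate
```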

Changed

  • CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS is now optional — this variable was previously required in the settings.json example. It is no longer needed for most users. The gateway selectively whitelists supported beta features instead of requiring clients to suppress all betas.
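The selective whitelisting described above could look roughly like this. The sketch assumes the beta flags arrive as a comma-separated value in Anthropic's anthropic-beta request header; the function name and the second flag in the example are hypothetical.

```python
from typing import Optional

# The only beta flag this changelog names as whitelisted.
ALLOWED_BETAS = {"context-1m-2025-08-07"}


def filter_beta_header(value: str) -> Optional[str]:
    """Keep allow-listed flags from a comma-separated header value.

    Returns None when nothing survives, meaning the header would be
    dropped entirely rather than forwarded empty.
    """
    kept = [flag.strip() for flag in value.split(",")
            if flag.strip() in ALLOWED_BETAS]
    return ",".join(kept) if kept else None


# "some-other-beta" is a made-up flag used only for this example.
print(filter_beta_header("context-1m-2025-08-07, some-other-beta"))
print(filter_beta_header("some-other-beta"))
```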

v3.2.4 — 2026-03-05

Fixed

  • Budget display showing 300€ instead of 200€ — the /quota/status endpoint was including quota rules from other organisations in a user's matching_entities when calculating the monthly limit. The previous fix (filter=False) was too broad: it stopped filtering out the global base rule (correct) but also stopped filtering out cross-org rules (incorrect). The aggregation now correctly includes only base rules (orgId=null), global rules (orgId="*"), and rules scoped to the user's current organisation.
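The corrected rule selection can be sketched as follows. The data model, the function names, and the assumption that the monthly limit is the sum of matching rules are illustrative only; the 100€ amounts are chosen to reproduce the 200€ and 300€ figures mentioned in the entry.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class QuotaRule:
    org_id: Optional[str]      # None = base rule, "*" = global rule
    monthly_limit_eur: int


def applies_to(rule: QuotaRule, user_org_id: str) -> bool:
    """Include base rules, global rules, and rules for the user's own org."""
    return rule.org_id is None or rule.org_id in ("*", user_org_id)


def monthly_limit(rules: Iterable[QuotaRule], user_org_id: str) -> int:
    """Assumption: the displayed limit is the sum of all applicable rules."""
    return sum(r.monthly_limit_eur for r in rules if applies_to(r, user_org_id))


rules = [
    QuotaRule(org_id=None, monthly_limit_eur=100),     # base rule
    QuotaRule(org_id="org-a", monthly_limit_eur=100),  # user's organisation
    QuotaRule(org_id="org-b", monthly_limit_eur=100),  # another organisation
]

print(monthly_limit(rules, "org-a"))  # the org-b rule is excluded
```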

v3.2.3 — 2026-03-05

Changed

  • Default Opus/Sonnet routing to Vertex AI — bare model names (e.g. claude-opus-4-6) now default to Vertex AI instead of Bedrock. Bedrock often lags behind in supporting new Anthropic features (e.g. deferred tool loading). Users can still explicitly request a Bedrock model via the full upstream ID.

v3.2.2 — 2026-03-04

Fixed

  • Budget display showing 100€ instead of 200€ — the /quota/status endpoint (used by the status line and Budget MCP server) was incorrectly filtering out the global base rule when calculating the limit, showing half the actual cap. Quota enforcement was already correct; this was a display-only bug.

v3.2.1 — 2026-03-04

Fixed

  • Model resolution fallback — the /chat/completions endpoint now accepts the model field in the request body as a fallback when the model-name header is not provided, aligning with OpenAI API conventions.
  • Default max_tokens — when clients omit max_tokens or send 0, the gateway now defaults to 8192 tokens, preventing upstream validation errors.
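Both fixes amount to request normalisation, which might be sketched like this. The helper names are hypothetical, and the sketch assumes header names have already been lowercased by the web framework.

```python
from typing import Any, Mapping, Optional

DEFAULT_MAX_TOKENS = 8192  # default stated in the entry above


def resolve_model(headers: Mapping[str, str],
                  body: Mapping[str, Any]) -> Optional[str]:
    """Prefer the model-name header, then fall back to the body's model field."""
    return headers.get("model-name") or body.get("model")


def resolve_max_tokens(body: Mapping[str, Any]) -> int:
    """Use the default when max_tokens is missing or sent as 0."""
    value = body.get("max_tokens")
    return value if value else DEFAULT_MAX_TOKENS


body = {"model": "claude-opus-4-6", "max_tokens": 0}
print(resolve_model({}, body))     # header absent, so the body value is used
print(resolve_max_tokens(body))    # 0 is replaced by the default
```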

Important: Enforcement Starting Monday 9 March 2026

Action Required — Access Will Be Blocked from Monday 9 March 2026

Starting Monday 9 March 2026, the PROD environment enforces the following policies. If you do not act before this date, your access will be blocked.

What changes

Condition | Result
Request made with a personal @pwc.com API key | 403 Forbidden — blocked immediately
Claude Code without telemetry configured (missing OTEL_RESOURCE_ATTRIBUTES) | 403 Telemetry not configured — blocked immediately

What you need to do

  1. Get an API key from the Get Access page — personal @pwc.com keys are no longer accepted.
  2. Configure telemetry — add your PwC email to OTEL_RESOURCE_ATTRIBUTES in ~/.claude/settings.json:
     "OTEL_RESOURCE_ATTRIBUTES": "user.email=name.surname@pwc.com"

See the Claude Code guide for the full telemetry setup.

Already configured telemetry but still getting errors? Restart Claude Code — it may take a few seconds for the new session to register. If errors persist after restarting, double-check that OTEL_RESOURCE_ATTRIBUTES is correctly set and that you are using an API key from the Get Access page.
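One way to double-check the value locally is to parse it the way OpenTelemetry defines OTEL_RESOURCE_ATTRIBUTES, as comma-separated key=value pairs. The helper names below are hypothetical, and the check covers only the shape of the value, not whether the gateway accepts it.

```python
from typing import Dict


def parse_resource_attributes(raw: str) -> Dict[str, str]:
    """Split a comma-separated list of key=value pairs into a dict."""
    attrs = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = value.strip()
    return attrs


def has_pwc_email(raw: str) -> bool:
    """True when user.email is present and ends in @pwc.com."""
    email = parse_resource_attributes(raw).get("user.email", "")
    local, at, domain = email.partition("@")
    return bool(local) and at == "@" and domain.lower() == "pwc.com"


print(has_pwc_email("user.email=name.surname@pwc.com"))
print(has_pwc_email("team.id=example"))
```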


v3.2.0 — 2026-03-02

Added

  • Personal API-key blocking — the gateway can now reject requests made with personal @pwc.com API keys. Use an API key from the Get Access page to avoid disruption.
  • Telemetry enforcement for Claude Code — Claude Code users can be required to have an active telemetry configuration before accessing the gateway, helping ensure usage visibility and compliance.

Both features are opt-in and disabled by default; existing users are not affected.

Improved

  • Three model naming formats accepted — the gateway now documents and cross-references all three accepted model ID formats: vendor name (e.g., claude-sonnet-4-6), PwC GenAI Shared Service convention (e.g., bedrock.anthropic.claude-opus-4-6), and Coding Agents Gateway convention (e.g., GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_46_OPUS). See the Available Models page for the full list.
  • Simplified telemetry configuration — OTEL_RESOURCE_ATTRIBUTES now only requires user.email=name.surname@pwc.com. Team and department are resolved automatically; no manual team.id or department values needed.
  • Actionable telemetry error messages — when telemetry is missing or stale, the 403 response now tells you exactly what to set and that access will be restored within ~1 minute of restarting Claude Code.
  • Faster telemetry recovery — after configuring telemetry and restarting Claude Code, access is restored in approximately 1 minute (reduced from up to 5 minutes in previous versions).
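To tell the three model ID formats listed in the first item above apart, a client-side helper could use patterns like the ones below. The patterns are guesses inferred from the three example IDs in this changelog and are not the gateway's actual matching rules; the function and label names are hypothetical.

```python
import re

# Guessed shapes, inferred only from the example IDs in this changelog.
GATEWAY_FORMAT = re.compile(r"^GENAI_SHARED_[A-Z0-9_]+$")
SHARED_SERVICE_FORMAT = re.compile(r"^[a-z][a-z0-9-]*\.[a-z][a-z0-9-]*\..+$")


def classify_model_id(model_id: str) -> str:
    """Return which naming convention a model ID appears to follow."""
    if GATEWAY_FORMAT.match(model_id):
        return "gateway"
    if SHARED_SERVICE_FORMAT.match(model_id):
        return "shared-service"
    return "vendor"


for model_id in (
    "claude-sonnet-4-6",
    "bedrock.anthropic.claude-opus-4-6",
    "GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_46_OPUS",
):
    print(model_id, "->", classify_model_id(model_id))
```

Vendor names that contain a single dot, such as gpt-5.4, still fall through to the vendor case because the shared-service pattern requires two dot-separated prefixes.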

v3.1.0 — 2026-02-27

Added

  • Interactive API reference — browse all endpoints, view request/response schemas, and try calls directly in your browser at /docs (Swagger UI) or /redoc (ReDoc).
  • The Swagger "Authorize" dialog accepts both auth formats (Bearer token or separate api-key + tenant-id headers) so you can test calls without leaving the browser.
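Outside the browser, the same two auth formats translate into request headers along these lines. The header names api-key and tenant-id come from the entry above; the function names and placeholder values are hypothetical, and what exactly the Bearer token must contain is not specified here.

```python
from typing import Dict


def bearer_headers(token: str) -> Dict[str, str]:
    """Format 1: a single Authorization header carrying a Bearer token."""
    return {"Authorization": f"Bearer {token}"}


def key_headers(api_key: str, tenant_id: str) -> Dict[str, str]:
    """Format 2: separate api-key and tenant-id headers."""
    return {"api-key": api_key, "tenant-id": tenant_id}


# Placeholder credentials, for illustration only.
print(bearer_headers("EXAMPLE_TOKEN"))
print(key_headers("EXAMPLE_KEY", "EXAMPLE_TENANT"))
```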

v3.0.2 — 2026-02-26

Improved

  • Model name resolution is now more flexible: the gateway correctly handles model IDs regardless of which naming convention your client sends.

v3.0.1 — 2026-02-25

Fixed

  • Some model IDs that were previously unrecognised by the gateway are now resolved correctly.

v3.0.0 — 2026-02-25

Changed

  • Reduced latency and improved reliability — internal routing optimisations eliminate a class of startup failures that affected some users.
  • The /v1/models endpoint now returns an accurate, up-to-date list of available models.