Changelog
v3.4.6 — 2026-04-23
Added
- Claude Opus 4.7 documentation — added Opus 4.7 to the supported models table and updated the Claude Code setup guide to recommend it as the default pinned Opus model.
- GPT-5.4 model family pricing update — corrected pricing and context windows for GPT-5.4, GPT-5.4 Mini, and GPT-5.4 Nano based on the latest PwC GenAI Shared Service EMEA model list.
v3.4.5 — 2026-04-21
Added
- Claude Opus 4.7 support — added bedrock.anthropic.claude-opus-4-7 to the model registry with 1M context support. The bare name claude-opus-4-7 resolves to the Bedrock endpoint.
v3.4.4 — 2026-04-10
Added
- GPT-5.4 model family support — added gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, and gpt-5.4-pro to the model registry. These models can now be used with any endpoint, including /responses (Codex).
v3.4.2 — 2026-03-21
Added
- Observability logs for 1M context and telemetry enforcement — added INFO logs tagged [1M-CONTEXT] and [TELEMETRY] at every decision point, for easier debugging in GCP.
Fixed
- Telemetry no longer blocks returning users — users coming back after idle periods (minutes, hours, or days) now get a grace period instead of being immediately blocked with a stale-heartbeat error.
- Increased telemetry staleness window from 30 seconds to 5 minutes to reduce false positives.
v3.4.1 — 2026-03-18
Fixed
- Re-added /gpt4o legacy route — backwards-compatible alias for /chat/completions, for reverse proxies that still reference the old path.
v3.4.0 — 2026-03-17
Added
- Telemetry grace period for new sessions — new API keys now get a 120-second grace period before telemetry enforcement kicks in. This gives Claude Code's OTEL exporter time to send its first heartbeat batch, eliminating false 403 errors on session startup. The grace period is configurable via TELEMETRY_GRACE_PERIOD_SECONDS.
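The grace-period behaviour can be pictured with a small sketch. It is not the gateway's implementation: the function names, the parameters, and the idea of tracking when a key was first seen are assumptions made for illustration. Only the 120-second default and the TELEMETRY_GRACE_PERIOD_SECONDS variable come from the entry above.

```python
import os

# Default taken from the changelog entry above.
DEFAULT_GRACE_SECONDS = 120.0


def grace_period_seconds(env=os.environ):
    """Read the grace period, falling back to the documented 120 s default."""
    return float(env.get("TELEMETRY_GRACE_PERIOD_SECONDS", DEFAULT_GRACE_SECONDS))


def telemetry_allowed(now, key_first_seen, last_heartbeat, staleness_window, grace):
    """Hypothetical decision: True to allow the request, False to answer 403.

    All times are Unix timestamps in seconds; last_heartbeat is None when
    no heartbeat has arrived yet.
    """
    # A sufficiently recent heartbeat always passes.
    if last_heartbeat is not None and now - last_heartbeat <= staleness_window:
        return True
    # No fresh heartbeat: a new key is still allowed inside its grace period.
    return now - key_first_seen <= grace
```

With a 120-second grace period, a key first seen 50 seconds ago passes without any heartbeat, while a key first seen 200 seconds ago does not.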
v3.3.0 — 2026-03-16
Added
- 1M context window support — Claude Opus 4.6 and Claude Sonnet 4 now support a 1M-token context window (5x the standard 200K limit). To enable it, remove CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS from your settings.json and select "Opus (1M context)" in the model picker (/model). The gateway whitelists the context-1m-2025-08-07 beta header and forwards it to the upstream provider; all other beta headers are silently dropped.
- Extended context pricing — requests exceeding 200K input tokens are automatically billed at 2x the standard input rate, consistent with Anthropic's pricing for 1M context. The cost multiplier applies to both regular and cache-aware cost calculations.
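As a rough illustration of the extended context pricing rule, the sketch below computes the input-side cost of one request. The function name and the per-million-token rate are hypothetical, and it assumes the 2x multiplier covers all input tokens of a request that crosses the 200K threshold; the changelog does not spell out whether only the excess tokens are affected.

```python
EXTENDED_CONTEXT_THRESHOLD = 200_000  # input tokens, from the entry above
EXTENDED_CONTEXT_MULTIPLIER = 2.0     # 2x the standard input rate


def input_cost(input_tokens, rate_per_million):
    """Input-side cost of one request, in the currency of rate_per_million.

    Assumption: once a request exceeds the threshold, the multiplier applies
    to all of its input tokens, not only to the tokens above 200K.
    """
    if input_tokens > EXTENDED_CONTEXT_THRESHOLD:
        multiplier = EXTENDED_CONTEXT_MULTIPLIER
    else:
        multiplier = 1.0
    return input_tokens / 1_000_000 * rate_per_million * multiplier
```

Under these assumptions, a 400K-token request at a placeholder rate of 10 per million tokens costs 8 rather than 4.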
Changed
- CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS is now optional — this variable was previously required in the settings.json example. It is no longer needed for most users. The gateway selectively whitelists supported beta features instead of requiring clients to suppress all betas.
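Selective whitelisting might look roughly like the following. This is a sketch under assumptions, not the gateway's code: it assumes betas arrive as a comma-separated anthropic-beta header value, as in Anthropic's API, and the names ALLOWED_BETAS and filter_beta_header are hypothetical. The only beta named in this changelog is context-1m-2025-08-07.

```python
# Only this beta is named in the changelog; keeping it in a set is an assumption.
ALLOWED_BETAS = {"context-1m-2025-08-07"}


def filter_beta_header(value):
    """Return the header value to forward upstream, or None to omit the header."""
    # Split the comma-separated list and ignore empty entries.
    requested = [b.strip() for b in value.split(",") if b.strip()]
    # Keep whitelisted betas in their original order; drop everything else.
    kept = [b for b in requested if b in ALLOWED_BETAS]
    return ",".join(kept) if kept else None
```

A client that sends several betas still gets the 1M context beta forwarded, and a client that sends only unsupported betas has the header removed rather than the request rejected.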
v3.2.4 — 2026-03-05
Fixed
- Budget display showing 300€ instead of 200€ — the /quota/status endpoint was including quota rules from other organisations in a user's matching_entities when calculating the monthly limit. The previous fix (filter=False) was too broad: it stopped filtering out the global base rule (correct) but also stopped filtering out cross-org rules (incorrect). The aggregation now correctly includes only base rules (orgId=null), global rules (orgId="*"), and rules scoped to the user's current organisation.
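The corrected aggregation can be sketched as a filter followed by a sum. The rule shape, the field name limit_eur, the organisation IDs, and the amounts are all illustrative, and the sketch assumes the displayed limit is the sum of the applicable rules; only the three inclusion criteria come from the entry above.

```python
def applicable_rules(rules, user_org_id):
    """Keep base rules (orgId None), global rules ("*"), and the user's own org."""
    return [r for r in rules if r["orgId"] in (None, "*", user_org_id)]


def monthly_limit(rules, user_org_id):
    """Assumption: the displayed limit is the sum of the applicable rule amounts."""
    return sum(r["limit_eur"] for r in applicable_rules(rules, user_org_id))


# Illustrative data only; real rule shapes and amounts are not in the changelog.
rules = [
    {"orgId": None, "limit_eur": 100},     # base rule
    {"orgId": "org-a", "limit_eur": 100},  # user's current organisation
    {"orgId": "org-b", "limit_eur": 100},  # another organisation: ignored
]
print(monthly_limit(rules, "org-a"))  # 200
```

Dropping the orgId check entirely would add the org-b rule and produce 300 for this data, which mirrors the symptom described in the entry.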
v3.2.3 — 2026-03-05
Changed
- Default Opus/Sonnet routing to Vertex AI — bare model names (e.g. claude-opus-4-6) now default to Vertex AI instead of Bedrock. Bedrock often lags behind in supporting new Anthropic features (e.g. deferred tool loading). Users can still explicitly request a Bedrock model via the full upstream ID.
v3.2.2 — 2026-03-04
Fixed
- Budget display showing 100€ instead of 200€ — the /quota/status endpoint (used by the status line and Budget MCP server) was incorrectly filtering out the global base rule when calculating the limit, showing half the actual cap. Quota enforcement was already correct; this was a display-only bug.
v3.2.1 — 2026-03-04
Fixed
- Model resolution fallback — the /chat/completions endpoint now accepts the model field in the request body as a fallback when the model-name header is not provided, aligning with OpenAI API conventions.
- Default max_tokens — when clients omit max_tokens or send 0, the gateway now defaults to 8192 tokens, preventing upstream validation errors.
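Both fixes amount to simple fallbacks, sketched here under assumptions. The function names are hypothetical, and headers is assumed to be a plain dict with lower-case keys; the header name model-name, the body field model, and the 8192 default come from the entries above.

```python
DEFAULT_MAX_TOKENS = 8192  # default stated in the entry above


def resolve_model(headers, body):
    """Prefer the model-name header; fall back to the body's model field."""
    return headers.get("model-name") or body.get("model")


def effective_max_tokens(body):
    """Use the client's max_tokens unless it is missing or 0."""
    return body.get("max_tokens") or DEFAULT_MAX_TOKENS
```

A request with no model-name header and a body of {"model": "gpt-5.4-mini"} resolves to gpt-5.4-mini, and a body with "max_tokens": 0 is treated the same as one without the field.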
Important: Enforcement Starting Monday 9 March 2026
Action Required — Access Will Be Blocked from Monday 9 March 2026
Starting Monday 9 March 2026, the PROD environment enforces the following policies. If you do not act before this date, your access will be blocked.
What changes
| Condition | Result |
|---|---|
| Request made with a personal @pwc.com API key | 403 Forbidden — blocked immediately |
| Claude Code without telemetry configured (missing OTEL_RESOURCE_ATTRIBUTES) | 403 Telemetry not configured — blocked immediately |
What you need to do
- Get an API key from the Get Access page — personal @pwc.com keys are no longer accepted.
- Configure telemetry — add your PwC email to OTEL_RESOURCE_ATTRIBUTES in ~/.claude/settings.json:
"OTEL_RESOURCE_ATTRIBUTES": "user.email=name.surname@pwc.com"
See the Claude Code guide for the full telemetry setup.
Already configured telemetry but still getting errors? Restart Claude Code. It may take a few seconds for the new session to register. If errors persist after restarting, double-check that OTEL_RESOURCE_ATTRIBUTES is correctly set and that you are using an API key from the Get Access page.
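To double-check the value locally, a small helper along these lines may be useful. It is a sketch, not an official tool: the function names are hypothetical, and the check that the address ends in @pwc.com is an assumption based on the example above. The comma-separated key=value layout follows the OpenTelemetry convention for OTEL_RESOURCE_ATTRIBUTES.

```python
def parse_resource_attributes(value):
    """Parse a comma-separated key=value string into a dict."""
    attrs = {}
    for pair in value.split(","):
        key, sep, val = pair.strip().partition("=")
        # Skip fragments without "=" or with an empty key.
        if sep and key:
            attrs[key] = val
    return attrs


def has_pwc_email(value):
    """True if user.email is present and looks like name@pwc.com."""
    email = parse_resource_attributes(value).get("user.email", "")
    local, _, domain = email.partition("@")
    return bool(local) and domain == "pwc.com"
```

For the value shown above, has_pwc_email("user.email=name.surname@pwc.com") returns True; a value that lacks user.email returns False.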
v3.2.0 — 2026-03-02
Added
- Personal API-key blocking — the gateway can now reject requests made with personal @pwc.com API keys. Use an API key from the Get Access page to avoid disruption.
- Telemetry enforcement for Claude Code — Claude Code users can be required to have an active telemetry configuration before accessing the gateway, helping ensure usage visibility and compliance.
Both features are opt-in and disabled by default; existing users are not affected.
Improved
- Three model naming formats accepted — the gateway now documents and cross-references all three accepted model ID formats: vendor name (e.g., claude-sonnet-4-6), PwC GenAI Shared Service convention (e.g., bedrock.anthropic.claude-opus-4-6), and Coding Agents Gateway convention (e.g., GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_46_OPUS). See the Available Models page for the full list.
- Simplified telemetry configuration — OTEL_RESOURCE_ATTRIBUTES now only requires user.email=name.surname@pwc.com. Team and department are resolved automatically; no manual team.id or department values needed.
- Actionable telemetry error messages — when telemetry is missing or stale, the 403 response now tells you exactly what to set and that access will be restored within ~1 minute of restarting Claude Code.
- Faster telemetry recovery — after configuring telemetry and restarting Claude Code, access is restored in approximately 1 minute (reduced from up to 5 minutes in previous versions).
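The three naming formats listed above differ enough in shape that a client could guess which one a given ID follows. The heuristic below is only an illustration built from the three examples in this entry; the function name and the returned labels are hypothetical, and the gateway's actual resolution logic is not described in this changelog.

```python
import re


def naming_convention(model_id):
    """Guess which of the three documented formats a model ID follows."""
    # Upper-case words joined by underscores, e.g. GENAI_SHARED_..._OPUS.
    if re.fullmatch(r"[A-Z0-9]+(_[A-Z0-9]+)+", model_id):
        return "coding-agents-gateway"
    # provider.vendor.model, e.g. bedrock.anthropic.claude-opus-4-6.
    if re.fullmatch(r"[a-z0-9]+\.[a-z0-9]+\.[a-z0-9.\-]+", model_id):
        return "pwc-genai-shared-service"
    # Anything else is treated as a bare vendor name, e.g. claude-sonnet-4-6.
    return "vendor"
```

Vendor names that contain a dot, such as gpt-5.4, still fall through to the vendor case because their first dot-separated segment contains a hyphen.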
v3.1.0 — 2026-02-27
Added
- Interactive API reference — browse all endpoints, view request/response schemas, and try calls directly in your browser at /docs (Swagger UI) or /redoc (ReDoc).
- The Swagger "Authorize" dialog accepts both auth formats (Bearer token or separate api-key + tenant-id headers) so you can test calls without leaving the browser.
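Outside the browser, the two auth formats translate into two different header sets. The helpers below are a sketch with hypothetical names; the api-key and tenant-id header names come from the entry above, the Authorization: Bearer form is the standard HTTP scheme, and what exactly the bearer token must contain is not stated in this changelog.

```python
def bearer_headers(token):
    """Headers for the Bearer-token format (standard Authorization scheme)."""
    return {"Authorization": f"Bearer {token}"}


def key_headers(api_key, tenant_id):
    """Headers for the separate api-key + tenant-id format."""
    return {"api-key": api_key, "tenant-id": tenant_id}
```

Either dictionary can be passed as the headers argument of an HTTP client when calling a gateway endpoint.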
v3.0.2 — 2026-02-26
Improved
- Model name resolution is now more flexible: the gateway correctly handles model IDs regardless of which naming convention your client sends.
v3.0.1 — 2026-02-25
Fixed
- Some model IDs that were previously unrecognised by the gateway are now resolved correctly.
v3.0.0 — 2026-02-25
Changed
- Reduced latency and improved reliability — internal routing optimisations eliminate a class of startup failures that affected some users.
- The /v1/models endpoint now returns an accurate, up-to-date list of available models.