Changelog
v3.8.0 — 2026-06-05
Added
- Claude Opus 4.8 support — added `vertex_ai.anthropic.claude-opus-4-8` (`GENAI_SHARED_VERTEXAI_ANTHROPIC_CLAUDE_48_OPUS`) and `bedrock.anthropic.claude-opus-4-8` (`GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_48_OPUS`) to the model registry with 1M context support. The bare name `claude-opus-4-8` defaults to Vertex AI, consistent with all other Anthropic models.
v3.7.2 — 2026-06-04
Changed
- Claude Code manual configuration — the Claude Code setup guide now marks manual configuration as optional, since the installation scripts automatically configure Claude Code with all required variables.
v3.7.1 — 2026-05-29
Added
- One-shot Claude Code installation scripts — the Claude Code setup guide now links to downloadable `install-claude.sh` (macOS) and `install-claude.ps1` (Windows) scripts that install Git, Claude Code, and any prerequisites in a single command. Replaces the previous manual `winget` / native-installer steps.
v3.7.0 — 2026-05-26
Added
- `suggested` field on `/models` — each entry returned by `GET /models` now carries an optional `suggested` field whose value is one of `OPUS`, `SONNET`, `HAIKU`, `CODEX`, `DEFAULT`, or `null`. Populated manually from the new Firestore collection `model_suggestions` (document id = upstream model id, e.g. `bedrock.anthropic.claude-opus-4-7`). Refresh is piggybacked on the existing 1-hour models cache TTL; invalid or missing values are surfaced as `null`, and `/models` continues to serve even if Firestore is unreachable.
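The fallback behaviour described above could look roughly like the following sketch. The function names and the shape of the `models` and `suggestions` arguments are hypothetical, and exact, case-sensitive matching of the labels is an assumption; only the allowed values and the "invalid or missing becomes null" rule come from the entry itself.

```python
from typing import Any, Dict, List, Optional

# Allowed labels, as listed in the changelog entry.
ALLOWED_SUGGESTED = {"OPUS", "SONNET", "HAIKU", "CODEX", "DEFAULT"}


def normalize_suggested(raw: Any) -> Optional[str]:
    """Return the label if it is an allowed value, otherwise None (JSON null)."""
    if isinstance(raw, str) and raw in ALLOWED_SUGGESTED:
        return raw
    return None


def attach_suggestions(
    models: List[Dict[str, Any]],
    suggestions: Optional[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Add a `suggested` field to each model entry.

    `suggestions` maps upstream model id -> raw label. Passing None stands in
    for the "Firestore unreachable" case (hypothetical convention): every
    entry still gets served, with suggested = None.
    """
    lookup = suggestions or {}
    return [
        {**model, "suggested": normalize_suggested(lookup.get(model["id"]))}
        for model in models
    ]
```

With this shape, a lookup failure degrades to `suggested: null` on every entry instead of failing the whole `/models` response.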
v3.6.2 — 2026-05-22
Changed
- Default GPT-5.5 routing to Azure — bare name `gpt-5.5` now defaults to Azure (`GENAI_SHARED_AZURE_OPENAI_GPT_55`) instead of OpenAI. Users can still explicitly target OpenAI via `openai.gpt-5.5`.
v3.6.1 — 2026-05-21
Changed
- Cline setup guide rewritten — authentication now uses the API Key field (`apikey=<key>&tenantid=<id>` format) instead of custom headers, avoiding a known Cline bug where custom headers are intermittently dropped.
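To illustrate the combined key format, here is a minimal sketch of how such a string could be split back into its two parts. The function name is hypothetical and this is not the gateway's actual parser; it only assumes the `apikey=...&tenantid=...` layout shown above.

```python
from typing import Dict, Tuple


def parse_combined_key(value: str) -> Tuple[str, str]:
    """Split 'apikey=<key>&tenantid=<id>' into (api_key, tenant_id).

    Splits manually instead of using URL query parsing so that characters
    such as '+' inside a key are preserved as-is.
    """
    fields: Dict[str, str] = {}
    for part in value.split("&"):
        name, sep, val = part.partition("=")
        if sep:
            fields[name.strip().lower()] = val.strip()
    for required in ("apikey", "tenantid"):
        if not fields.get(required):
            raise ValueError(f"missing field: {required}")
    return fields["apikey"], fields["tenantid"]
```

For example, `parse_combined_key("apikey=abc123&tenantid=t-42")` returns `("abc123", "t-42")`, and a string without `tenantid` raises `ValueError`.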
v3.6.0 — 2026-05-20
Added
- Internal bridge support — new `USE_UPSTREAM_BRIDGE` and `UPSTREAM_BRIDGE_URL` env vars allow routing all upstream requests through the internal Azure bridge (LiteLLM proxy) instead of the public GenAI Shared Service endpoint. When `USE_UPSTREAM_BRIDGE=true`, the gateway reads `bridgeUrl` from Secret Manager or uses `UPSTREAM_BRIDGE_URL` if set.
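One way to read the selection logic above is sketched below. The function name, the `public_url` argument, and the `read_secret_bridge_url` callback are hypothetical stand-ins; the sketch also assumes that an explicit `UPSTREAM_BRIDGE_URL` takes precedence over the `bridgeUrl` secret, which is an interpretation of the entry rather than confirmed behaviour.

```python
from typing import Callable, Mapping


def resolve_upstream_url(
    env: Mapping[str, str],
    public_url: str,
    read_secret_bridge_url: Callable[[], str],
) -> str:
    """Pick the upstream base URL from the two env vars named in the entry."""
    # Bridge routing is off unless the flag is explicitly "true".
    if env.get("USE_UPSTREAM_BRIDGE", "").strip().lower() != "true":
        return public_url
    # Assumption: an explicit URL override wins over the stored secret.
    override = env.get("UPSTREAM_BRIDGE_URL", "").strip()
    if override:
        return override
    # Otherwise fall back to the bridgeUrl value from the secret store.
    return read_secret_bridge_url()
```

Passing the secret lookup as a callback keeps the sketch free of any real Secret Manager client and means the secret is only read when it is actually needed.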
v3.5.1 — 2026-05-19
Changed
- Default Opus 4.7 routing to Vertex AI — bare name `claude-opus-4-7` now defaults to Vertex AI instead of Bedrock, consistent with all other Anthropic models.
- `[QUOTA]` log downgraded to DEBUG — the remaining `[QUOTA]` tagged message moved from INFO to DEBUG for cleaner production logs.
v3.4.12 — 2026-05-19
Added
- Streaming diagnostics — added explicit ERROR logs for downstream cancellations, abnormal stream closure, upstream 5xx responses, and incomplete chunked upstream streams, including redacted response headers and request summaries.
- Claude Opus 4.7 Vertex AI support — added the Vertex AI Opus 4.7 model entry and 1M-context handling.
Changed
- Long-running stream handling — disabled per-read upstream timeouts for streaming requests while keeping normal request timeouts for non-streaming calls.
- Default output token ceiling — increased default max token handling to support long coding-agent requests.
v3.5.0 — 2026-05-19
Added
- Vertex AI Opus 4.7 support — added `vertex_ai.anthropic.claude-opus-4-7` (`GENAI_SHARED_VERTEXAI_ANTHROPIC_CLAUDE_47_OPUS`) to the model registry with 1M context support. Both Bedrock and Vertex variants of Opus 4.7 are now available.
- Auto-inject 1M context beta — the gateway now automatically injects `anthropic-beta: context-1m-2025-08-07` for all models that support 1M context (Opus 4.7, Opus 4.6, and Sonnet 4, on both Bedrock and Vertex). No opt-in from the client is required.
- `[1m]` suffix stripping — model names with a `[1m]` / `[1M]` suffix (e.g., `claude-opus-4-7[1m]`, `bedrock.anthropic.claude-opus-4-7[1m]`, `GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_47_OPUS[1m]`) are now correctly resolved. The suffix is stripped before lookup so all three naming conventions work.
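The suffix handling in the last item above can be sketched as follows. This is an illustrative helper with a hypothetical name, not the gateway's code; it assumes the suffix only ever appears at the very end of the model name.

```python
import re
from typing import Tuple

# Matches a trailing "[1m]" or "[1M]" only.
_ONE_M_SUFFIX = re.compile(r"\[1m\]$", re.IGNORECASE)


def strip_1m_suffix(model: str) -> Tuple[str, bool]:
    """Return (model name without the suffix, whether a suffix was present)."""
    stripped, count = _ONE_M_SUFFIX.subn("", model.strip())
    return stripped, count == 1
```

Because the suffix is removed before any registry lookup, the same helper works for the vendor name, the `bedrock.`/`vertex_ai.` form, and the `GENAI_SHARED_...` form.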
Changed
- Log noise reduction — `[1M-CONTEXT]`, `[TELEMETRY]`, and `[CACHE TOKENS]` tagged messages downgraded from INFO to DEBUG. Only `[REQUEST COMPLETED]` and `[REQUEST FAILED]` remain at INFO level for production observability.
v3.4.10 — 2026-05-08
Changed
- Codex setup guide rewritten — simplified to a single-file configuration (`config.toml` only, `auth.json` no longer needed). Authentication now uses `http_headers` in the provider config. Updated recommended model to GPT-5.5.
v3.4.9 — 2026-05-07
Added
- GPT-5.5 documentation — added GPT-5.5 to the supported models table with OpenAI and Azure provider rows.
v3.4.6 — 2026-04-23
Added
- Claude Opus 4.7 documentation — added Opus 4.7 to the supported models table and updated the Claude Code setup guide to recommend it as the default pinned Opus model.
- GPT-5.4 model family pricing update — corrected pricing and context windows for GPT-5.4, GPT-5.4 Mini, and GPT-5.4 Nano based on the latest PwC GenAI Shared Service EMEA model list.
v3.4.5 — 2026-04-21
Added
- Claude Opus 4.7 support — added `bedrock.anthropic.claude-opus-4-7` to the model registry with 1M context support. Bare name `claude-opus-4-7` resolves to the Bedrock endpoint.
v3.4.4 — 2026-04-10
Added
- GPT-5.4 model family support — added `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, and `gpt-5.4-pro` to the model registry. These models can now be used with any endpoint including `/responses` (Codex).
v3.4.2 — 2026-03-21
Added
- Observability logs for 1M context and telemetry enforcement — tagged `[1M-CONTEXT]` and `[TELEMETRY]` INFO logs at every decision point for easier debugging in GCP.
Fixed
- Telemetry no longer blocks returning users — users coming back after idle periods (minutes, hours, or days) now get a grace period instead of being immediately blocked with a stale-heartbeat error.
- Increased telemetry staleness window from 30 seconds to 5 minutes to reduce false positives.
v3.4.1 — 2026-03-18
Fixed
- Re-added `/gpt4o` legacy route — backwards-compatible alias for `/chat/completions` for reverse proxies that still reference the old path.
v3.4.0 — 2026-03-17
Added
- Telemetry grace period for new sessions — new API keys now get a 120-second grace period before telemetry enforcement kicks in. This allows Claude Code's OTEL exporter time to send its first heartbeat batch, eliminating false 403 errors on session startup. The grace period is configurable via `TELEMETRY_GRACE_PERIOD_SECONDS`.
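A minimal sketch of the grace-period check, assuming timestamps are plain seconds and that the gateway records when a key was first seen. Both function names and the `key_first_seen` concept are hypothetical; only the 120-second default and the `TELEMETRY_GRACE_PERIOD_SECONDS` variable come from the entry.

```python
from typing import Mapping

DEFAULT_GRACE_SECONDS = 120.0  # default stated in the changelog entry


def grace_seconds_from_env(env: Mapping[str, str]) -> float:
    """Read the grace period, falling back to the documented default."""
    raw = env.get("TELEMETRY_GRACE_PERIOD_SECONDS")
    return float(raw) if raw else DEFAULT_GRACE_SECONDS


def within_grace_period(key_first_seen: float, now: float, grace_seconds: float) -> bool:
    """True while telemetry enforcement should still be skipped for a new key."""
    return (now - key_first_seen) < grace_seconds
```

In use, a request from a key first seen 60 seconds ago would pass without a heartbeat, while one first seen 120 seconds ago or more would fall through to normal telemetry enforcement.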
v3.3.0 — 2026-03-16
Added
- 1M context window support — Claude Opus 4.6 and Claude Sonnet 4 now support a 1M-token context window (5x the standard 200K limit). To enable it, remove `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS` from your `settings.json` and select "Opus (1M context)" in the model picker (`/model`). The gateway whitelists the `context-1m-2025-08-07` beta header and forwards it to the upstream provider; all other beta headers are silently dropped.
- Extended context pricing — requests exceeding 200K input tokens are automatically billed at 2x the standard input rate, consistent with Anthropic's pricing for 1M context. The cost multiplier applies to both regular and cache-aware cost calculations.
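As a rough illustration of the extended context pricing rule above, the sketch below applies the 2x multiplier to the input cost of a single request. The function name and the per-million-token rate are hypothetical, and the sketch assumes the multiplier applies to all input tokens of a request that crosses the threshold, not only to the tokens above 200K.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens, from the entry above
LONG_CONTEXT_MULTIPLIER = 2.0     # "2x the standard input rate"


def input_cost(input_tokens: int, rate_per_million: float) -> float:
    """Input cost for one request, in the same currency as the rate."""
    exceeds = input_tokens > LONG_CONTEXT_THRESHOLD
    multiplier = LONG_CONTEXT_MULTIPLIER if exceeds else 1.0
    return input_tokens / 1_000_000 * rate_per_million * multiplier
```

With an illustrative rate of 5.0 per million tokens, a 100K-token request costs about 0.5 and a 400K-token request costs about 4.0 instead of 2.0.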
Changed
- `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS` is now optional — this variable was previously required in the `settings.json` example. It is no longer needed for most users. The gateway selectively whitelists supported beta features instead of requiring clients to suppress all betas.
v3.2.4 — 2026-03-05
Fixed
- Budget display showing 300€ instead of 200€ — the `/quota/status` endpoint was including quota rules from other organisations in a user's `matching_entities` when calculating the monthly limit. The previous fix (`filter=False`) was too broad: it stopped filtering out the global base rule (correct) but also stopped filtering out cross-org rules (incorrect). The aggregation now correctly includes only base rules (`orgId=null`), global rules (`orgId="*"`), and rules scoped to the user's current organisation.
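The corrected filter can be sketched as below. The rule dictionaries, the `limit` field, and the function name are hypothetical, and summing the matching limits is an inference from the 300€ versus 200€ figures rather than documented behaviour; only the three `orgId` cases come from the entry.

```python
from typing import Any, Iterable, Mapping


def monthly_limit(rules: Iterable[Mapping[str, Any]], user_org_id: str) -> float:
    """Sum the limits of rules that apply to a user in `user_org_id`.

    A rule applies when it is a base rule (orgId is None), a global rule
    (orgId == "*"), or scoped to the user's current organisation.
    """
    applicable = (None, "*", user_org_id)
    return float(sum(rule["limit"] for rule in rules if rule.get("orgId") in applicable))
```

For a base rule of 100, an `org-a` rule of 100, and an `org-b` rule of 100, a user in `org-a` gets 200: the cross-org rule that previously inflated the display to 300 is excluded.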
v3.2.3 — 2026-03-05
Changed
- Default Opus/Sonnet routing to Vertex AI — bare model names (e.g. `claude-opus-4-6`) now default to Vertex AI instead of Bedrock. Bedrock often lags behind in supporting new Anthropic features (e.g. deferred tool loading). Users can still explicitly request a Bedrock model via the full upstream ID.
v3.2.2 — 2026-03-04
Fixed
- Budget display showing 100€ instead of 200€ — the `/quota/status` endpoint (used by the status line and Budget MCP server) was incorrectly filtering out the global base rule when calculating the limit, showing half the actual cap. Quota enforcement was already correct; this was a display-only bug.
v3.2.1 — 2026-03-04
Fixed
- Model resolution fallback — the `/chat/completions` endpoint now accepts the `model` field in the request body as a fallback when the `model-name` header is not provided, aligning with OpenAI API conventions.
- Default `max_tokens` — when clients omit `max_tokens` or send 0, the gateway now defaults to 8192 tokens, preventing upstream validation errors.
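The two fallbacks above could be expressed along these lines. The function names are hypothetical, and the sketch assumes header names have already been lower-cased into a plain mapping; the header name, the body field names, and the 8192 default come from the entries.

```python
from typing import Any, Mapping, Optional

DEFAULT_MAX_TOKENS = 8192  # default stated in the entry above


def resolve_model(headers: Mapping[str, str], body: Mapping[str, Any]) -> Optional[str]:
    """Prefer the `model-name` header, then fall back to the body's `model`."""
    return headers.get("model-name") or body.get("model")


def resolve_max_tokens(body: Mapping[str, Any]) -> int:
    """Use the client's `max_tokens` unless it is missing, null, or 0."""
    value = body.get("max_tokens")
    return value if value else DEFAULT_MAX_TOKENS
```

A request with no `model-name` header and a body of `{"model": "gpt-5.4", "max_tokens": 0}` would resolve to model `gpt-5.4` with 8192 output tokens.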
Important: Enforcement Starting Monday 9th March 2026
Action Required — Access Will Be Blocked from Monday 9 March 2026
Starting Monday 9 March 2026, the PROD environment enforces the following policies. If you do not act before this date, your access will be blocked.
What changes
| Condition | Result |
|---|---|
| Request made with a personal `@pwc.com` API key | 403 Forbidden — blocked immediately |
| Claude Code without telemetry configured (missing `OTEL_RESOURCE_ATTRIBUTES`) | 403 Telemetry not configured — blocked immediately |
What you need to do
- Get an API key from the Get Access page — personal `@pwc.com` keys are no longer accepted.
- Configure telemetry — add your PwC email to `OTEL_RESOURCE_ATTRIBUTES` in `~/.claude/settings.json`:
"OTEL_RESOURCE_ATTRIBUTES": "user.email=name.surname@pwc.com"
See the Claude Code guide for the full telemetry setup.
Already configured telemetry but still getting errors? Restart Claude Code; it may take a few seconds for the new session to register. If errors persist after restarting, double-check that `OTEL_RESOURCE_ATTRIBUTES` is correctly set and that you are using an API key from the Get Access page.
v3.2.0 — 2026-03-02
Added
- Personal API-key blocking — the gateway can now reject requests made with personal `@pwc.com` API keys. Use an API key from the Get Access page to avoid disruption.
- Telemetry enforcement for Claude Code — Claude Code users can be required to have an active telemetry configuration before accessing the gateway, helping ensure usage visibility and compliance.
Both features are opt-in and disabled by default; existing users are not affected.
Improved
- Three model naming formats accepted — the gateway now documents and cross-references all three accepted model ID formats: vendor name (e.g., `claude-sonnet-4-6`), PwC GenAI Shared Service convention (e.g., `bedrock.anthropic.claude-opus-4-6`), and Coding Agents Gateway convention (e.g., `GENAI_SHARED_BEDROCK_ANTHROPIC_CLAUDE_46_OPUS`). See the Available Models page for the full list.
- Simplified telemetry configuration — `OTEL_RESOURCE_ATTRIBUTES` now only requires `user.email=name.surname@pwc.com`. Team and department are resolved automatically; no manual `team.id` or `department` values needed.
- Actionable telemetry error messages — when telemetry is missing or stale, the 403 response now tells you exactly what to set and that access will be restored within ~1 minute of restarting Claude Code.
- Faster telemetry recovery — after configuring telemetry and restarting Claude Code, access is restored in approximately 1 minute (reduced from up to 5 minutes in previous versions).
v3.1.0 — 2026-02-27
Added
- Interactive API reference — browse all endpoints, view request/response schemas, and try calls directly in your browser at `/docs` (Swagger UI) or `/redoc` (ReDoc).
- The Swagger "Authorize" dialog accepts both auth formats (Bearer token or separate `api-key` + `tenant-id` headers) so you can test calls without leaving the browser.
v3.0.2 — 2026-02-26
Improved
- Model name resolution is now more flexible: the gateway correctly handles model IDs regardless of which naming convention your client sends.
v3.0.1 — 2026-02-25
Fixed
- Some model IDs that were previously unrecognised by the gateway are now resolved correctly.
v3.0.0 — 2026-02-25
Changed
- Reduced latency and improved reliability — internal routing optimisations eliminate a class of startup failures that affected some users.
- The `/v1/models` endpoint now returns an accurate, up-to-date list of available models.