The gateway is configured through a single file, gateway.yaml. The file defines everything the gateway does: where it listens, how developers sign in, where inference goes, and which policies and telemetry apply. This page is the reference for every option in that file. To write your first one, start from the quickstart, which builds a minimal working config and runs it; once you have a config you’re happy with, the deployment guide covers containerizing and hosting it on Kubernetes, Cloud Run, or your own platform.
The gateway reads the file once, at startup, with claude gateway --config /path/to/gateway.yaml. Every option is validated against a schema at boot, so a malformed config fails at start with a field-level error rather than at first use.
The complete example at the end of this page exercises every section.
File structure
Five sections are required. Every other section is optional, and an omitted section takes its defaults. Unknown keys fail boot, so a typo surfaces as a named error rather than a silently ignored setting.
Required sections:
- listen: bind address, public URL, TLS termination
- oidc: your identity provider (IdP), including issuer, client, claim mapping, and who may sign in
- session: the bearer tokens the gateway mints, with secret and lifetime
- store: PostgreSQL, for device grants and rate-limit counters
- upstreams: where inference goes, whether Anthropic, Bedrock, Agent Platform, or Foundry
Optional sections:
- admin: Admin API auth and retention for spend limits
- enforcement: spend-limit fail-open or fail-closed behavior
- models and auto_include_builtin_models: admin-curated model list and per-upstream IDs
- managed: managed settings policies by IdP group
- telemetry: OTLP forwarding to your observability stack
- access_control, limits, timeouts, rate_limits: IP allow/deny, request size caps, upstream time-to-first-byte, and per-IP sign-in limits
Secret expansion
Don’t write secrets such as client_secret, jwt_secret, or postgres_url directly in gateway.yaml. Reference them with one of the forms below, and the gateway resolves the value at boot from an environment variable or a file:
| Form | Resolves to | Use for |
|---|---|---|
| ${VAR} | The environment variable VAR. Boot fails if undefined. | Container environment variables, AWS Secrets Manager via env injection |
| ${file:/path} | File contents, trimmed | Kubernetes Secret volume mounts, Vault Agent, SOPS |
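The two reference forms can be illustrated with a short Python sketch. This is not the gateway's implementation: the function name expand_secret, the regular expression, and the choice of KeyError are assumptions made for illustration, and only the behavior in the table (undefined variable fails, file contents are trimmed) comes from this page.

```python
import os
import re
from pathlib import Path

# Hypothetical pattern: matches ${VAR} and ${file:/path}.
_REF = re.compile(r"\$\{(file:)?([^}]+)\}")


def expand_secret(value: str) -> str:
    """Resolve ${VAR} and ${file:/path} references inside a config string."""

    def resolve(match: re.Match) -> str:
        is_file, target = match.group(1), match.group(2)
        if is_file:
            # File form: the file's contents, trimmed of surrounding whitespace.
            return Path(target).read_text().strip()
        if target not in os.environ:
            # Env form: an undefined variable is an error, not an empty string.
            raise KeyError(f"environment variable {target} is not set")
        return os.environ[target]

    return _REF.sub(resolve, value)
```

Raising on an undefined variable mirrors the "boot fails if undefined" rule: a missing secret is surfaced at start rather than becoming an empty credential.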
Required sections
listen
The listen block controls where the gateway serves: the bind address and port, the externally visible origin, and optional TLS termination.
| Field | Required | Description |
|---|---|---|
| host | No | Bind address. Default 0.0.0.0. |
| port | No | Bind port. Default 8080. |
| public_url | Behind a proxy | The externally visible https:// origin, used to build the IdP redirect_uri and discovery metadata. Required behind any TLS-terminating proxy such as an ALB, Ingress, or Cloud Run, because the gateway doesn’t trust X-Forwarded-* headers when constructing its own origin; they are client-spoofable. trusted_proxies below governs client-IP resolution only. Also required to enable telemetry, because the gateway builds the OTLP endpoint it pushes to clients from this URL. |
| tls.cert / tls.key | No | PEM paths if the gateway terminates TLS itself |
| trusted_proxies | No | CIDRs or IPs of load balancers in front of the gateway. When set, the gateway trusts X-Forwarded-For only from these peers and records the real client IP for per-IP rate limiting and audit. Equivalent to nginx set_real_ip_from. |
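To make the trusted_proxies rule concrete, here is a minimal Python sketch of client-IP resolution. The function name client_ip and the right-to-left walk over X-Forwarded-For are assumptions; this page states only that the header is trusted when the connecting peer is one of the listed proxies.

```python
from ipaddress import ip_address, ip_network
from typing import Optional


def client_ip(peer: str, xff: Optional[str], trusted_proxies: list[str]) -> str:
    """Return the address to use for rate limiting and audit."""
    nets = [ip_network(cidr, strict=False) for cidr in trusted_proxies]

    def trusted(addr: str) -> bool:
        ip = ip_address(addr)
        return any(ip in net for net in nets)

    # The header is ignored unless the directly connected peer is a trusted proxy.
    if not xff or not trusted(peer):
        return peer

    hops = [hop.strip() for hop in xff.split(",") if hop.strip()]
    # Assumption: walk from the right and take the first hop that is not itself
    # a trusted proxy, so entries a client prepends to the header are ignored.
    for hop in reversed(hops):
        if not trusted(hop):
            return hop
    return peer
```

For instance, with trusted_proxies set to 10.0.0.0/8, a request arriving from 10.0.0.5 with the header value "198.51.100.7, 10.0.0.9" resolves to 198.51.100.7, while the same header from an untrusted peer is ignored.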
oidc
OpenID Connect (OIDC) is the SSO protocol the gateway uses with your identity provider; see Identity provider setup for what to register on the IdP side. The oidc block connects the gateway to your identity provider and decides who can sign in. It names the issuer and OAuth client, maps the claims that carry email and groups, and restricts sign-in by email domain or group.
| Field | Required | Description |
|---|---|---|
| issuer | Yes | OIDC discovery base. Must serve discovery at /.well-known/openid-configuration. Use HTTPS in production; the gateway accepts an http:// issuer. A loopback issuer such as http://localhost:8081 is rejected by the SSRF guard unless CLAUDE_GATEWAY_ALLOW_LOOPBACK=1 is set in the gateway’s environment. |
| client_id / client_secret | Yes | From your OAuth client registration |
| allowed_email_domains | No | Reject id_tokens whose email claim isn’t in one of these domains, case-insensitive. Defense-in-depth against multi-tenant IdP misconfiguration. Independent of this setting, an id_token whose email_verified claim is explicitly false is always rejected. |
| allowed_groups | No | Restrict sign-in to members of these IdP groups, matched against groups_claim. A user in an allowed email domain but in none of these groups is rejected. Requires the IdP to emit the groups claim. |
| groups_claim | No | Which id_token claim carries group membership. Default groups. Microsoft Entra emits app roles under roles. Accepts a flat key or an RFC 6901 JSON Pointer such as /resource_access/gateway/roles for nested claims. |
| google_groups | No | Look up the signed-in user’s groups through the Google Workspace Admin SDK Directory API, because Google’s id_token carries no groups claim. Set service_account_json_path to a service-account key file with domain-wide delegation on the https://www.googleapis.com/auth/admin.directory.group.readonly scope, and admin_email to a Workspace administrator the service account impersonates; the Directory API requires a real admin subject. Each user’s group email addresses become their groups claim, so allowed_groups and managed.policies.match.groups match on group emails. |
| email_claim | No | Which id_token claim carries the user’s email. Default email. Some IdPs, such as ADFS and Entra B2C, emit upn or preferred_username instead. Accepts a flat key, a JSON Pointer, or a list of fallback keys where the first present key is used. |
| scopes | No | Full override of the OIDC scopes the gateway requests. Default [openid, profile, email, offline_access]. Set when your IdP rejects scopes it doesn’t recognize, or requires a custom scope to emit groups or email. Must include openid. Dropping offline_access disables refresh tokens, so developers re-run the browser login every session.ttl_hours. See Identity provider setup for per-IdP scope recipes such as Google’s refresh-token flow. |
| extra_auth_params | No | Extra query parameters appended to the IdP authorization request, verbatim. This is the override mechanism for IdP-specific behavior, such as access_type: offline for Google refresh tokens, domain_hint for some Entra tenants, or acr_values for step-up flows. Cannot override the gateway-managed protocol params: state, nonce, redirect_uri, PKCE, scope, response_type, response_mode, and client_id. |
| userinfo_fallback | No | When the id_token omits email or groups, fetch them from /userinfo. Needed for Keycloak lightweight access tokens, the Okta org server, and ADFS minimal tokens. The id_token stays authoritative; userinfo only fills gaps. Default false. |
| use_pkce | No | Send a PKCE (S256) challenge on the authorization request. Default true. Set false only if your IdP rejects PKCE for this confidential client. |
| clock_skew_seconds | No | Tolerate clock drift when validating id_token time claims. Default 0, which is strict. Raise if you see “token expired / not yet valid” errors right after sign-in due to host/IdP clock skew. |
| token_endpoint_auth_method | No | Override the token-endpoint auth method. Accepts client_secret_basic or client_secret_post. Auto-negotiated by default. |
| id_token_signed_response_alg | No | Expected id_token signing algorithm. Default RS256. Set for IdPs that sign with ES256, PS256, or EdDSA. |
| additional_authorized_parties | No | Extra azp values to accept beyond client_id, for Keycloak broker and token-exchange flows |
| discovery_url | No | Fetch the discovery document from this URL instead of deriving it from issuer, for IdPs behind a proxy that rewrites the issuer host. The path must contain /.well-known/. |
| form_action_origins | No | Additional origins for the /device page’s Content-Security-Policy: form-action directive. The gateway already allows 'self' and the discovered authorization_endpoint origin, but Chrome enforces form-action against the entire redirect chain. If your IdP redirects through a second host, such as Azure AD federated to ADFS, hub-spoke Okta, or a corporate SSO interceptor, list every origin the authorization request may redirect through. |
| ca_cert_pem | No | PEM CA cert that replaces the system trust store for IdP requests only. Use for Keycloak or Dex behind corporate PKI. |
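The three accepted shapes for groups_claim and email_claim (flat key, RFC 6901 JSON Pointer, list of fallback keys) can be sketched in Python as follows. The helper names resolve_claim and email_domain_allowed are hypothetical, array indices in pointers are not handled, and a claim whose value is null is treated as absent; none of this is taken from the gateway's code.

```python
from typing import Any, Optional, Union


def resolve_claim(claims: dict, spec: Union[str, list]) -> Optional[Any]:
    """Look up a claim by flat key, JSON Pointer, or list of fallback keys."""
    if isinstance(spec, list):
        # Fallback list: the first key that yields a value wins.
        for key in spec:
            value = resolve_claim(claims, key)
            if value is not None:
                return value
        return None
    if not spec.startswith("/"):
        return claims.get(spec)  # flat key
    node: Any = claims
    for token in spec[1:].split("/"):
        # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if not isinstance(node, dict) or token not in node:
            return None
        node = node[token]
    return node


def email_domain_allowed(email: str, allowed_domains: list[str]) -> bool:
    """Case-insensitive check of the part after the last @."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in {d.lower() for d in allowed_domains}
```

With a token that nests roles under resource_access.gateway.roles, resolve_claim(claims, "/resource_access/gateway/roles") returns the role list, and resolve_claim(claims, ["email", "upn"]) falls back to upn when email is missing.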
session
The session block shapes the bearer tokens the gateway mints after sign-in: the secret that signs them and how long they live.
| Field | Required | Description |
|---|---|---|
| jwt_secret | Yes | At least 32 bytes of entropy, for example from openssl rand -base64 32. Signs the gateway’s HS256 bearer tokens. Accepts a single string or an array for rotation: index 0 signs and all entries verify. To rotate, prepend a new secret, wait ttl_hours, then drop the old one. |
| ttl_hours | No | Gateway bearer token lifetime. Default 1. The CLI silently refreshes before expiry when the IdP issues refresh tokens. A shorter lifetime deprovisions faster; a longer one makes fewer IdP round-trips. If your IdP can’t issue refresh tokens because offline_access is unavailable, there is no silent refresh, so raise this to 8 or 12 to avoid sending developers back to the browser login every hour. |
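The rotation rule for jwt_secret (index 0 signs, every entry verifies) is easier to see in code. The following Python sketch builds HS256 tokens with the standard library only; the function names are hypothetical, expiry checks are omitted, and it is meant to show why waiting ttl_hours before dropping the old secret keeps outstanding tokens valid, not how the gateway itself is written.

```python
import base64
import hashlib
import hmac
import json


def _b64(data: bytes) -> str:
    # base64url without padding, as used by JWTs.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _signature(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64(digest)


def sign(payload: dict, secrets: list[str]) -> str:
    """Sign with the first secret in the list (index 0)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}"
    return f"{signing_input}.{_signature(signing_input, secrets[0])}"


def verify(token: str, secrets: list[str]) -> bool:
    """Accept a token signed by any secret in the list."""
    signing_input, _, signature = token.rpartition(".")
    return any(
        hmac.compare_digest(_signature(signing_input, secret), signature)
        for secret in secrets
    )
```

A token minted under [old] still verifies under [new, old], and stops verifying once the list is reduced to [new]. By then every token signed with the old secret has expired, provided at least ttl_hours have passed.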
store
The store block points the gateway at its PostgreSQL database, which holds device grants and rate-limit counters.
| Field | Required | Description |
|---|---|---|
| postgres_url | Yes | postgres:// or postgresql:// URL. Required: the device-grant rendezvous, where the browser callback writes and the polling CLI reads, needs cross-replica state. The gateway runs its own schema migrations at boot, so the role needs CREATE TABLE on the target schema. If your security policy prohibits DDL from the application role, run the migrations with an admin role, initially and again whenever a new release ships migrations, and grant the app role SELECT, INSERT, UPDATE, DELETE on the gateway’s tables. See Upgrades and Postgres. |
| username | No | Overrides the user in postgres_url |
| password | No | Database credential. Set it here rather than in postgres_url so the credential stays out of the URL. Accepts any characters and takes precedence over URL credentials. |
| max_connections | No | Postgres connection-pool size per replica. Default 5, which is conservative and friendly to shared databases. With spend limits enabled, the hot path does a few operations per inference request, so raise it for a dedicated database under load, and keep replicas × this below the database’s max_connections. |
For local testing, point postgres_url at a throwaway Postgres container, for example docker run --rm -p 5432:5432 -e POSTGRES_HOST_AUTH_METHOD=trust postgres.
upstreams
upstreams is an ordered list. The gateway forwards inference to the first upstream that resolves the requested model. On 5xx, 429, or timeout it fails over to the next; other 4xx doesn’t, because those errors are attributable to the request rather than the upstream. Multiple upstreams of the same provider must set a distinct name:.
Bedrock, Agent Platform, and Foundry clients are built once at startup, and their SDKs refresh credentials internally, so rotating cloud credentials doesn’t require a restart. Static Anthropic API keys and bearers are read at startup; see Anthropic API.
Anthropic API
The minimal Anthropic upstream is an API key from the Claude Console. The credential fields:
- api_key: sends x-api-key. Rotate it in the Claude Console and update the env var.
- oauth_token: sends Authorization: Bearer. Use the bearer form when your org issues short-lived tokens instead of long-lived API keys. The bearer is read once at startup, so refresh by remounting the secret and restarting.
Amazon Bedrock
For the client-side Bedrock deployment that the gateway replaces or fronts, see Claude Code on Amazon Bedrock. On the gateway side, the upstream’s auth block uses the AWS SDK’s default credential chain: env vars, ~/.aws/credentials, ECS task role, EC2 instance metadata, or IRSA on EKS. In production, give the gateway pod an IAM role instead of embedding static keys in a container image.
| Setup | How |
|---|---|
| IAM permissions | Grant the gateway’s principal bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream on both the inference-profile ARNs and the underlying foundation-model ARNs. For the built-in catalog in US regions: arn:aws:bedrock:<region>:<account>:inference-profile/us.anthropic.* and arn:aws:bedrock:*::foundation-model/anthropic.*. |
| Model access | In the Bedrock console, per region, request and enable model access for the Claude models you want. Cross-region inference profiles (us.anthropic.*) require model access in each region the profile spans. |
| EKS (IRSA) | Create an IAM role with the policy above and a trust policy for your cluster’s OIDC provider scoped to the gateway’s service account. Annotate the service account with eks.amazonaws.com/role-arn: arn:aws:iam::<acct>:role/claude-gateway. auth: {} picks it up. |
| ECS / EC2 | Attach the IAM role to the task definition or instance profile. auth: {} picks it up. |
| Anywhere else | Pass credentials via the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN env vars, or set them explicitly in auth: with ${VAR} expansion |
| Region | region: is the API endpoint region. Cross-region inference profiles route across the geo (US, EU, APAC) regardless of which one you pick. For non-US regions or provisioned-throughput ARNs, add a models: block with the right per-upstream IDs. |
Google Cloud Agent Platform
For the equivalent client-side setup, see Claude Code on Google Cloud. On the gateway side, the upstream’s auth block uses Application Default Credentials: GOOGLE_APPLICATION_CREDENTIALS, GCE metadata, or GKE Workload Identity. Service-account JSON key files are supported but discouraged; use Workload Identity or attach a service account to the GCE or Cloud Run instance.
Set region: global to use Agent Platform’s global endpoint instead of a regional one. Google then routes each request to an available region, so you don’t track per-region model availability. Setting a specific region pins every request to it.
| Setup | How |
|---|---|
| IAM permissions | Grant the gateway’s service account roles/aiplatform.user on the project, or a custom role with aiplatform.endpoints.predict. Enable the Agent Platform API (aiplatform.googleapis.com). |
| Model access | In Model Garden, enable the Claude models for your project. They publish to specific regions; check the model card for supported regions. |
| GKE (Workload Identity) | Bind a GCP service account to the gateway’s Kubernetes service account and annotate the KSA with iam.gke.io/gcp-service-account: claude-gateway@<proj>.iam.gserviceaccount.com. auth: {} picks it up. |
| Cloud Run / GCE | Set the service’s service account to one with roles/aiplatform.user. auth: {} picks it up. |
| Anywhere else | auth: { service_account_json: /secrets/sa.json }, the path to a JSON key file mounted as a secret. The field takes a file path, not the key contents, so no ${file:…} expansion is involved. |
Microsoft Foundry
For the client-side Foundry deployment, see Claude Code on Microsoft Foundry. On the gateway side, use_azure_ad: true resolves through DefaultAzureCredential: Managed Identity on AKS, ACI, or App Service; the Azure CLI; or environment credentials. API keys work but are project-wide and don’t rotate automatically. Foundry’s endpoint is derived from resource:; set the optional base_url to override it for sovereign clouds such as Azure Government.
| Setup | How |
|---|---|
| RBAC | Grant the gateway’s identity Azure AI User or Cognitive Services User on the Foundry resource |
| Deployments | Foundry uses admin-chosen deployment names, not canonical model IDs. Add a models: block mapping each canonical ID to your deployment name. |
| AKS (workload identity) | Federate a User-Assigned Managed Identity with the cluster’s OIDC issuer and bind it to the gateway’s service account. use_azure_ad: true picks it up via WorkloadIdentityCredential. |
| ACI / App Service | Enable system-assigned or user-assigned managed identity on the resource. use_azure_ad: true picks it up. |
| Anywhere else | auth: { api_key: "${FOUNDRY_API_KEY}" }. Quote ${…} inside { }. |
Multiple upstreams
The same provider can appear more than once with a distinct name:. This covers different regions, different accounts via different credential chains, provisioned throughput versus on-demand, and cross-provider fallback.
The gateway tries upstreams in order. 5xx, 429, timeouts, and missing-endpoint (501) fail over; other 4xx doesn’t. 429 is per-upstream capacity, so provisioned-throughput (PT) exhaustion fails over to on-demand. An upstream that can’t resolve the requested model is skipped without a network round-trip.
For example, an upstream list can route a provisioned-throughput Bedrock allotment first, overflow to on-demand and a second account, and fall back to the Anthropic API last. The levers for building such a list:
| Lever | How |
|---|---|
| Different regions | One Bedrock upstream per region, each with its own region:. With auto_include_builtin_models: true the cross-region inference profiles route automatically; for region-pinned deployments use a models: block. |
| Different accounts | One Bedrock upstream per account, each with its own credentials in auth:. The default chain (auth: {}) uses the pod’s identity; for a second account, set explicit credentials or a bearer token. |
| Provisioned throughput | Map the model to the provisioned-throughput ARN in models: for that upstream’s name. Other upstreams keep the on-demand ID, so PT capacity is exhausted before failing over. |
| VPC / FIPS endpoints | Set base_url: on the upstream to your VPC endpoint or FIPS endpoint URL |
| Model-scoped routing | Omit an upstream from a model’s upstream_model: map and that upstream is skipped for that model. For example, route Opus to provisioned throughput and Sonnet and Haiku to on-demand. |
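The failover rules in this section can be sketched in Python. The data shapes here are assumptions for illustration: each upstream is a dict with a name and a models map from the requested model to that upstream's ID, send is a caller-supplied function returning an HTTP status, and UpstreamTimeout is a hypothetical exception. Only the decision rules come from this page.

```python
from typing import Callable, Optional


class UpstreamTimeout(Exception):
    """Hypothetical marker for an upstream that did not answer in time."""


def should_fail_over(status: int) -> bool:
    # 5xx (including 501 missing-endpoint) and 429 move to the next upstream.
    # Other 4xx responses are attributable to the request, so they are returned.
    return status == 429 or 500 <= status <= 599


def route(
    model: str,
    upstreams: list[dict],
    send: Callable[[str, str], int],
) -> tuple[Optional[str], Optional[int]]:
    """Try upstreams in order; return (upstream name, status) of the answer."""
    last_status = None
    for upstream in upstreams:
        upstream_model = upstream["models"].get(model)
        if upstream_model is None:
            continue  # cannot resolve the model: skipped with no network round-trip
        try:
            status = send(upstream["name"], upstream_model)
        except UpstreamTimeout:
            continue  # a timeout also fails over
        if should_fail_over(status):
            last_status = status
            continue
        return upstream["name"], status
    return None, last_status  # every candidate failed or none resolved the model
```

Leaving a model out of one upstream's map is what makes model-scoped routing work in this sketch: that upstream is never contacted for the model.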
Optional sections
admin
Optional. Enables /v1/organizations/spend_limits, which mirrors Anthropic’s public Admin API, and per-developer spend enforcement on /v1/messages. See Spend limits for how caps are set and enforced; this section covers the gateway.yaml keys that turn the feature on and tune it.
| Field | Required | Description |
|---|---|---|
| write_keys | No | Array of {id, key}. An x-api-key matching one of these can list, set, and delete spend limits. Key values must be at least 32 characters; ids must be unique across read_keys and write_keys. |
| read_keys | No | Array of {id, key}. Read-only: every GET endpoint, including listing caps, fetching one by ID, and reading /effective and /audit. |
| admin_groups | No | IdP group names. A gateway JWT whose groups claim includes one of these has full admin access, read and write, and audits as oidc:<sub>. Use this for human admins; use API keys for machines. |
| blocked_message | No | Appended verbatim to the 429 billing_error a blocked developer sees. Write the whole instruction, such as a URL or a Slack channel. Unset, the error is spend limit reached. |
| audit_retention_days | No | Default 365. Older admin_audit rows are swept. |
| spend_retention_months | No | Default 13. spend counter rows older than this are swept. The default keeps a full year plus the current partial month for year-over-year reporting. |
| identity_retention_days | No | Default 90. Last-seen TTL for principal_emails rows, which hold each developer’s email, display name, and groups (PII). Deliberately shorter than spend retention so a deprovisioned identity ages out while its anonymous spend counters remain. |
| group_limit_mode | No | min (default) or max. When a developer is in several groups with caps, min enforces the most restrictive and max the least. Used by both enforcement and /effective. |
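As a small illustration of group_limit_mode, the Python sketch below picks the effective cap for a developer who belongs to several capped groups. The function name, the dict of caps, and the numbers in the usage note are hypothetical; the min and max semantics are the ones described in the table.

```python
from typing import Optional


def effective_cap(
    group_caps: dict[str, float],
    user_groups: list[str],
    mode: str = "min",
) -> Optional[float]:
    """Return the cap that applies, or None when no group cap matches."""
    if mode not in ("min", "max"):
        raise ValueError(f"unknown group_limit_mode: {mode}")
    caps = [group_caps[g] for g in user_groups if g in group_caps]
    if not caps:
        return None
    # min enforces the most restrictive cap, max the least restrictive.
    return min(caps) if mode == "min" else max(caps)
```

With caps of 500 for eng and 50 for interns, a developer in both groups is held to 50 under min and to 500 under max.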
enforcement
The enforcement block controls how spend-limit checks behave when the store is unavailable.
| Field | Required | Description |
|---|---|---|
| fail_closed_on_error | No | Default false. Spend enforcement fails open on a Postgres outage, so inference stays up. Set true to fail closed: over-cap developers are blocked, but so is everyone else if the store is unreachable. Has no effect without an admin: block. |
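The trade-off behind fail_closed_on_error can be sketched in a few lines of Python. The names allow_request, read_spend, and StoreUnavailable are hypothetical, and treating a developer as blocked once spend reaches the cap is an assumption about the boundary.

```python
from typing import Callable


class StoreUnavailable(Exception):
    """Hypothetical error raised when the spend store cannot be reached."""


def allow_request(
    read_spend: Callable[[], float],
    cap: float,
    fail_closed_on_error: bool = False,
) -> bool:
    """Decide whether an inference request may proceed under a spend cap."""
    try:
        spent = read_spend()
    except StoreUnavailable:
        # Fail open (default): keep inference up during an outage.
        # Fail closed: block everyone while spend cannot be checked.
        return not fail_closed_on_error
    return spent < cap
```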
models
The models block is an optional admin-curated model list, served at /v1/models and used to translate model IDs per upstream. It is required for non-US Bedrock regions, Bedrock provisioned-throughput ARNs, and Foundry deployment names.
managed
The managed block defines role-based access policies keyed on IdP groups or email domain. Policies are evaluated in order; the first match is selected, then merged onto the match: {} catch-all base described below. They are served per-user at GET /managed/settings with ETag/304 caching.
The match: {} catch-all, conventionally listed last, is treated as a base layer. Every other policy inherits any key it doesn’t set from the catch-all, so per-role entries only need to list what differs from the org default. The merge rules depend on the key type:
- Allow-lists: availableModels and permissions.allow. A specific policy’s list fully replaces the base’s.
- Deny-lists and hook arrays: permissions.deny, permissions.ask, disabledMcpjsonServers, deniedMcpServers, blockedMarketplaces, and every hooks event-type array. These take the union of base and policy, so an org-wide deny or audit hook can’t be accidentally dropped by a per-role override.
- Record-typed keys: env, modelOverrides, and skillOverrides. These shallow-merge, so a per-role env block overrides keys it sets and inherits the rest from the base.
availableModels is also enforced server-side at /v1/messages, so a denied model returns 400 regardless of what the client sends.
| Matcher | Behavior |
|---|---|
| match: {} | Matches every authenticated user. Start with one of these and add group-scoped policies above it later. |
| match: { groups: [a, b] } | Matches if the JWT’s groups claim contains any of the listed groups. Case-sensitive: groups must match the IdP’s exact casing. |
| match: { email_domain: example.com } | Matches the part after the last @ in the JWT’s email claim, case-insensitive. Accepts one domain per policy. |
| match: { groups: [a], email_domain: example.com } | Both conditions must match |
Put a match: {} catch-all last if you want a guaranteed default policy.
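Policy selection and the merge onto the catch-all base can be sketched together in Python. This is an illustration under assumptions, not the gateway's code: policies are modeled as a list of dicts with match and cli keys, the permission rule strings are placeholders, and inheriting unset keys inside permissions and hooks from the base is assumed.

```python
from typing import Optional

UNION_KEYS = {"disabledMcpjsonServers", "deniedMcpServers", "blockedMarketplaces"}
RECORD_KEYS = {"env", "modelOverrides", "skillOverrides"}


def _union(base: list, extra: list) -> list:
    return list(base) + [item for item in extra if item not in base]


def matches(match: dict, groups: list[str], email: str) -> bool:
    """An empty match matches everyone; otherwise every condition must hold."""
    if "groups" in match and not set(match["groups"]) & set(groups):
        return False  # group names are compared case-sensitively
    if "email_domain" in match:
        domain = email.rsplit("@", 1)[-1].lower()
        if domain != match["email_domain"].lower():
            return False
    return True


def merge(base: dict, policy: dict) -> dict:
    out = dict(base)
    for key, value in policy.items():
        if key in UNION_KEYS:
            out[key] = _union(base.get(key, []), value)
        elif key in RECORD_KEYS:
            out[key] = {**base.get(key, {}), **value}  # shallow merge
        elif key == "permissions":
            perms = dict(base.get("permissions", {}))
            for rule, items in value.items():
                # deny and ask are unioned; allow is replaced by the policy's list.
                perms[rule] = _union(perms.get(rule, []), items) if rule in ("deny", "ask") else items
            out["permissions"] = perms
        elif key == "hooks":
            hooks = dict(base.get("hooks", {}))
            for event, entries in value.items():
                hooks[event] = _union(hooks.get(event, []), entries)
            out["hooks"] = hooks
        else:
            out[key] = value  # allow-lists such as availableModels, and scalars
    return out


def settings_for(policies: list[dict], groups: list[str], email: str) -> Optional[dict]:
    """First matching policy wins, merged onto the match: {} catch-all if present."""
    base = next((p["cli"] for p in policies if p["match"] == {}), {})
    for policy in policies:
        if matches(policy["match"], groups, email):
            return policy["cli"] if policy["match"] == {} else merge(base, policy["cli"])
    return None
```

Because deny lists are unioned while allow lists are replaced, a per-role policy in this sketch can narrow or widen what is allowed but can never drop an org-wide deny.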
The gateway keeps no user directory of its own. It authorizes each request from the user’s IdP token, reading group membership from the token’s groups claim and evaluating policies against it. There is no roster to enumerate and no accounts to pre-create, and therefore no SCIM endpoint, because there is nothing for SCIM to sync into.
Run user and group lifecycle management at the source of truth, which is your IdP’s native SCIM provisioning or a dedicated identity-governance platform. Membership and deprovisioning governed there flow into the gateway automatically through the token. If you want SCIM provisioning of Claude accounts themselves, that is a Claude for Enterprise capability.
Two propagation clocks apply:
- Policy contents: editing a policy and redeploying reaches connected clients on their next managed-settings poll, within an hour
- Group membership: changing a user’s group membership changes which policy matches them. This takes effect on the next session re-mint, meaning the next silent refresh, bounded by session.ttl_hours.
What goes in cli
Each cli value is a complete Claude Code managed-settings.json document, the same schema you would deploy via MDM or /etc/claude-code/managed-settings.json, expressed here as YAML. The CLI applies the delivered document at the managed tier, above user and project settings.
The gateway validates each document against the CLI’s settings schema at boot, so an unrecognized top-level key or a recognized key with a malformed value fails boot with an error naming every offending key. Deliberately open parts of the schema still accept arbitrary values, because newer clients may recognize entries the gateway’s schema doesn’t. These open keys are env, pluginConfigs, and keys nested under permissions.
Because validation uses the schema bundled with the gateway’s installed version, putting a top-level settings key introduced by a newer Claude Code release into managed config requires upgrading the gateway first. Smoke-test a new policy on one client before rolling it out.
The full key reference is in Claude Code settings. The keys most operators reach for first:
| Key | Enforced by | Effect |
|---|---|---|
| availableModels | Gateway + CLI | Model allowlist. Also checked at /v1/messages, so a patched client can’t bypass it. |
| permissions.allow / .deny | CLI | Tool and command rules. See Permissions. |
| permissions.disableBypassPermissionsMode | CLI | Set to disable to block bypassPermissions, the mode that auto-approves every tool call, and the --dangerously-skip-permissions flag |
| allowManagedPermissionRulesOnly | CLI | When true, user and project permission rules are ignored; only rules from this document apply |
| env | CLI | Environment variables merged into the CLI process. Use for telemetry, auto-update, and model-name overrides. |
| hooks | CLI | Org-wide hooks |
The CLI asks the developer for approval, through a security approval dialog, before applying a policy that contains any of the following:
- hooks
- env variables that aren’t on the CLI’s built-in safe list
- shell-execution settings such as apiKeyHelper and statusLine
- managed CLAUDE.md content
Only env variables on the safe list apply without approval:
- On the safe list: auto-update and model-name vars
- Not on the safe list: proxy vars, base-URL vars, and OTEL_EXPORTER_OTLP_ENDPOINT
Enabling telemetry pushes OTEL_EXPORTER_OTLP_ENDPOINT, so setting telemetry.forward_to triggers the dialog on each interactive client. Non-interactive runs with the -p flag skip the dialog and apply settings without approval. The dialog protects the developer’s machine from a compromised or hostile gateway, not the organization from the developer, so the -p skip is intentional rather than a gap.
If a developer declines, Claude Code exits rather than applying the policy. Pushing a new hook or non-safe env var to a broad policy therefore means an approval prompt on every matching developer’s next startup.
The cli key was named settings in earlier releases. That spelling is still accepted as an alias, but new deployments should use cli.
Precedence with other managed sources
If a device also has a local managed-settings.json or MDM-delivered policy, the managed sources don’t merge. The highest-priority source provides all policy settings, ranked in this order with highest priority first:
- The policy helper
- Gateway-delivered settings
- MDM, via the HKLM registry on Windows or a plist on macOS
- The managed-settings.json file
- The HKCU registry, on Windows only
Separate from this ranking is the managedSettings option. It is ignored by default and applies only when a managed source opts in with parentSettingsBehavior: "merge", filtered so it can tighten policy but not loosen it.
The exception is a small set of cross-source keys, honored when any admin source sets them; the user-writable HKCU tier is excluded:
- sandbox.network.allowManagedDomainsOnly and sandbox.filesystem.allowManagedReadPathsOnly: when locked, the corresponding allowlists are unioned across sources
- allowAllClaudeAiMcps: allow-only override for the claude.ai MCP server allowlist
- sandbox.bwrapPath and sandbox.socatPath: filesystem paths to the sandbox helper binaries
allowManagedPermissionRulesOnly and disableBypassPermissionsMode are not cross-source, so only the winning source’s value applies.
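The "highest-priority source wins, no merging" rule can be sketched in Python. The source labels in PRIORITY are hypothetical names for the five tiers in the ranked list, and the sketch deliberately ignores the cross-source keys and the opt-in managedSettings behavior described above.

```python
from typing import Optional

# Hypothetical labels for the ranked sources, highest priority first.
PRIORITY = ["policy_helper", "gateway", "mdm", "managed_settings_file", "hkcu"]


def winning_source(present: dict[str, dict]) -> tuple[Optional[str], dict]:
    """Return the highest-priority source that is present, with all its settings.

    Lower-priority sources contribute nothing; the sources are not merged.
    """
    for name in PRIORITY:
        if name in present:
            return name, present[name]
    return None, {}
```

A device with both a local managed-settings.json and gateway-delivered settings therefore runs entirely on the gateway's document in this sketch, even for keys that only the local file sets.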
Gateway policies apply to every Claude Code invocation on the machine, including non-interactive claude -p runs and sessions spawned by the Agent SDK. If the gateway is unreachable at startup, signed-in sessions exit with an error rather than running without their policy.
telemetry
The CLI sends OpenTelemetry Protocol (OTLP) over HTTP metrics, logs, and, when enabled, traces to the gateway, which relays them verbatim to each configured destination. See Monitoring usage for the metrics and events the CLI emits.
The CLI stamps each export with the authenticated user’s identity, read from the gateway-issued JWT: the user.id, user.email, and user.groups attributes. Per-developer cost and usage attribution therefore works with no developer-side configuration.
Setting telemetry.forward_to together with listen.public_url turns telemetry on. The gateway pushes five env vars to every connected client through /managed/settings:
- CLAUDE_CODE_ENABLE_TELEMETRY=1
- OTEL_METRICS_EXPORTER=otlp
- OTEL_LOGS_EXPORTER=otlp
- OTEL_TRACES_EXPORTER=otlp
- OTEL_EXPORTER_OTLP_ENDPOINT=<public_url>
These pushed values take precedence over any OTEL_* variables a developer sets locally.
Traces additionally require CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1 on each client. The gateway doesn’t push that variable, so set it through a managed policy’s env block. It isn’t on the CLI’s safe list, so delivering it through a policy is covered by the same security approval dialog that the pushed OTLP endpoint already triggers.
Both protobuf and JSON OTLP encodings are relayed, and any OpenTelemetry-compatible backend works as a destination.
HTTP tuning
Four optional top-level blocks, access_control, limits, timeouts, and rate_limits, tune the HTTP surface. The defaults suit most deployments.
| Block | Key | Default | Description |
|---|---|---|---|
| access_control | allow_cidrs / deny_cidrs | empty | Inbound IP allow/deny by client address, after trusted_proxies resolution. deny_cidrs is checked first; a client it matches is rejected even if allow_cidrs also matches. If allow_cidrs is non-empty the gateway is default-deny. /healthz and /readyz are exempt from allow_cidrs. |
| limits | max_request_bytes | 32 MiB | Max inbound request body; oversize requests get 413 before the body is buffered. Raise for large file or image requests. |
| limits | max_request_header_bytes | unset | When set, oversize headers return 431 |
| limits | max_url_length | unset | When set, an over-long URL returns 414 |
| timeouts | upstream_ttfb_ms | 120000 | Max wait for the upstream’s response headers (time to first byte). The response body then streams with no wall-clock cap. Applies to the direct Anthropic upstream path; Bedrock, Agent Platform, and Foundry are bounded by their provider SDK’s own timeout. |
| rate_limits | device_authorization.max / .window_seconds | 30 / 600 | Per-IP rate limit on the unauthenticated device-authorization endpoint. Raise for a large org behind a shared egress IP or NAT. These limits apply only to the device-grant sign-in flow, not to /v1/messages inference. See User-code brute-force resistance. |
| rate_limits | device_verify.max / .window_seconds | 10 / 600 | Per-IP rate limit on user_code submissions at /device |
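The access_control evaluation order in the table can be sketched in Python. The function name admitted is hypothetical, and applying deny_cidrs to the health endpoints is an assumption: the table says only that /healthz and /readyz are exempt from allow_cidrs.

```python
from ipaddress import ip_address, ip_network
from typing import Sequence

HEALTH_PATHS = {"/healthz", "/readyz"}


def admitted(
    client: str,
    path: str,
    allow_cidrs: Sequence[str] = (),
    deny_cidrs: Sequence[str] = (),
) -> bool:
    """Decide whether a client address (after proxy resolution) may reach a path."""
    ip = ip_address(client)

    def in_any(cidrs: Sequence[str]) -> bool:
        return any(ip in ip_network(cidr, strict=False) for cidr in cidrs)

    if in_any(deny_cidrs):
        return False  # deny is checked first and wins over allow
    if path in HEALTH_PATHS:
        return True  # health endpoints are exempt from allow_cidrs
    if allow_cidrs:
        return in_any(allow_cidrs)  # a non-empty allow list means default-deny
    return True  # no allow list: everything not denied is admitted
```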
Complete example
This full reference config exercises every core section; the HTTP tuning blocks keep their defaults. Copy it, delete what you don’t need, and fill in your values. The config in the Quickstart is a minimal version of this.
Client-side managed settings
Everything above configures the gateway server. Pointing developer machines at it is configured separately, on each device, through Claude Code’s managed settings. The gateway can’t push these keys itself, because they’re what tell the client where the gateway is. For the CLI, set both keys, forceLoginMethod (with the value "gateway") and forceLoginGatewayUrl, in the per-OS managed-settings.json:
| Platform | Path |
|---|---|
| macOS | /Library/Application Support/ClaudeCode/managed-settings.json, or the com.anthropic.claudecode managed preferences domain |
| Linux and WSL | /etc/claude-code/managed-settings.json |
| Windows | C:\Program Files\ClaudeCode\managed-settings.json, or Group Policy via the HKLM registry |
forceLoginGatewayUrl, and the "gateway" value of forceLoginMethod, are honored only from the admin-controlled managed tier. A developer setting them in their own ~/.claude/settings.json has no effect.
Related
- Claude apps gateway overview: quickstart and developer connection
- Deployment guide: IdP setup, container image, Kubernetes and Cloud Run, and operations
- Spend limits: per-developer caps and the Admin API