Skip to main content
A Claude apps gateway deployment is configured by one YAML file, conventionally gateway.yaml. The file defines everything the gateway does: where it listens, how developers sign in, where inference goes, and which policies and telemetry apply. This page is the reference for every option in that file. To write your first one, start from the quickstart, which builds a minimal working config and runs it; once you have a config you’re happy with, the deployment guide covers containerizing and hosting it on Kubernetes, Cloud Run, or your own platform. The gateway reads the file once, at startup, with claude gateway --config /path/to/gateway.yaml. Every option is validated against a schema at boot, so a malformed config fails at start with a field-level error rather than at first use. The complete example at the end of this page exercises every section.

File structure

Five sections are required. Every other section is optional, and an omitted section takes its defaults. Unknown keys fail boot, so a typo surfaces as a named error rather than a silently ignored setting. Required sections:
  • listen: bind address, public URL, TLS termination
  • oidc: your identity provider (IdP), including issuer, client, claim mapping, and who may sign in
  • session: the bearer tokens the gateway mints, with secret and lifetime
  • store: PostgreSQL, for device grants and rate-limit counters
  • upstreams: where inference goes, whether Anthropic, Bedrock, Agent Platform, or Foundry
Optional sections:
  • admin: Admin API auth and retention for spend limits
  • enforcement: spend-limit fail-open or fail-closed behavior
  • models and auto_include_builtin_models: admin-curated model list and per-upstream IDs
  • managed: managed settings policies by IdP group
  • telemetry: OTLP forwarding to your observability stack
  • access_control, limits, timeouts, rate_limits: IP allow/deny, request size caps, upstream time-to-first-byte, and per-IP sign-in limits

Secret expansion

Don’t write secrets such as client_secret, jwt_secret, or postgres_url directly in gateway.yaml. Reference them with one of the forms below, and the gateway resolves the value at boot from an environment variable or a file:
FormResolves toUse for
${VAR}The environment variable VAR. Boot fails if undefined.Container environment variables, AWS Secrets Manager via env injection
${file:/path}File contents, trimmedKubernetes Secret volume mounts, Vault Agent, SOPS

Required sections

listen

The listen block controls where the gateway serves: the bind address and port, the externally visible origin, and optional TLS termination.
FieldRequiredDescription
hostNoBind address. Default 0.0.0.0.
portNoBind port. Default 8080.
public_urlBehind a proxyThe externally visible https:// origin, used to build the IdP redirect_uri and discovery metadata. Required behind any TLS-terminating proxy such as an ALB, Ingress, or Cloud Run, because the gateway doesn’t trust X-Forwarded-* headers when constructing its own origin; they are client-spoofable. trusted_proxies below governs client-IP resolution only. Also required to enable telemetry, because the gateway builds the OTLP endpoint it pushes to clients from this URL.
tls.cert / tls.keyNoPEM paths if the gateway terminates TLS itself
trusted_proxiesNoCIDRs or IPs of load balancers in front of the gateway. When set, the gateway trusts X-Forwarded-For only from these peers and records the real client IP for per-IP rate limiting and audit. Equivalent to nginx set_real_ip_from.

oidc

OpenID Connect (OIDC) is the SSO protocol the gateway uses with your identity provider; see Identity provider setup for what to register on the IdP side. The oidc block connects the gateway to your identity provider and decides who can sign in. It names the issuer and OAuth client, maps the claims that carry email and groups, and restricts sign-in by email domain or group.
FieldRequiredDescription
issuerYesOIDC discovery base. Must serve discovery at /.well-known/openid-configuration. Use HTTPS in production; the gateway accepts an http:// issuer. A loopback issuer such as http://localhost:8081 is rejected by the SSRF guard unless CLAUDE_GATEWAY_ALLOW_LOOPBACK=1 is set in the gateway’s environment.
client_id / client_secretYesFrom your OAuth client registration
allowed_email_domainsNoReject id_tokens whose email claim isn’t in one of these domains, case-insensitive. Defense-in-depth against multi-tenant IdP misconfiguration. Independent of this setting, an id_token whose email_verified claim is explicitly false is always rejected.
allowed_groupsNoRestrict sign-in to members of these IdP groups, matched against groups_claim. A user in an allowed email domain but in none of these groups is rejected. Requires the IdP to emit the groups claim.
groups_claimNoWhich id_token claim carries group membership. Default groups. Microsoft Entra emits app roles under roles. Accepts a flat key or an RFC 6901 JSON Pointer such as /resource_access/gateway/roles for nested claims.
google_groupsNoLook up the signed-in user’s groups through the Google Workspace Admin SDK Directory API, because Google’s id_token carries no groups claim. Set service_account_json_path to a service-account key file with domain-wide delegation on the https://www.googleapis.com/auth/admin.directory.group.readonly scope, and admin_email to a Workspace administrator the service account impersonates; the Directory API requires a real admin subject. Each user’s group email addresses become their groups claim, so allowed_groups and managed.policies.match.groups match on group emails.
email_claimNoWhich id_token claim carries the user’s email. Default email. Some IdPs, such as ADFS and Entra B2C, emit upn or preferred_username instead. Accepts a flat key, a JSON Pointer, or a list of fallback keys where the first present key is used.
scopesNoFull override of the OIDC scopes the gateway requests. Default [openid, profile, email, offline_access]. Set when your IdP rejects scopes it doesn’t recognize, or requires a custom scope to emit groups or email. Must include openid. Dropping offline_access disables refresh tokens, so developers re-run the browser login every session.ttl_hours. See Identity provider setup for per-IdP scope recipes such as Google’s refresh-token flow.
extra_auth_paramsNoExtra query parameters appended to the IdP authorization request, verbatim. This is the override mechanism for IdP-specific behavior, such as access_type: offline for Google refresh tokens, domain_hint for some Entra tenants, or acr_values for step-up flows. Cannot override the gateway-managed protocol params: state, nonce, redirect_uri, PKCE, scope, response_type, response_mode, and client_id.
userinfo_fallbackNoWhen the id_token omits email or groups, fetch them from /userinfo. Needed for Keycloak lightweight access tokens, the Okta org server, and ADFS minimal tokens. The id_token stays authoritative; userinfo only fills gaps. Default false.
use_pkceNoSend a PKCE (S256) challenge on the authorization request. Default true. Set false only if your IdP rejects PKCE for this confidential client.
clock_skew_secondsNoTolerate clock drift when validating id_token time claims. Default 0, which is strict. Raise if you see “token expired / not yet valid” errors right after sign-in due to host/IdP clock skew.
token_endpoint_auth_methodNoOverride the token-endpoint auth method. Accepts client_secret_basic or client_secret_post. Auto-negotiated by default.
id_token_signed_response_algNoExpected id_token signing algorithm. Default RS256. Set for IdPs that sign with ES256, PS256, or EdDSA.
additional_authorized_partiesNoExtra azp values to accept beyond client_id, for Keycloak broker and token-exchange flows
discovery_urlNoFetch the discovery document from this URL instead of deriving it from issuer, for IdPs behind a proxy that rewrites the issuer host. The path must contain /.well-known/.
form_action_originsNoAdditional origins for the /device page’s Content-Security-Policy: form-action directive. The gateway already allows 'self' and the discovered authorization_endpoint origin, but Chrome enforces form-action against the entire redirect chain. If your IdP redirects through a second host, such as Azure AD federated to ADFS, hub-spoke Okta, or a corporate SSO interceptor, list every origin the authorization request may redirect through.
ca_cert_pemNoPEM CA cert that replaces the system trust store for IdP requests only. Use for Keycloak or Dex behind corporate PKI.

session

The session block shapes the bearer tokens the gateway mints after sign-in: the secret that signs them and how long they live.
FieldRequiredDescription
jwt_secretYesAt least 32 bytes of entropy, for example from openssl rand -base64 32. Signs the gateway’s HS256 bearer tokens. Accepts a single string or an array for rotation: index 0 signs and all entries verify. To rotate, prepend a new secret, wait ttl_hours, then drop the old one.
ttl_hoursNoGateway bearer token lifetime. Default 1. The CLI silently refreshes before expiry when the IdP issues refresh tokens. A shorter lifetime deprovisions faster; a longer one makes fewer IdP round-trips. If your IdP can’t issue refresh tokens because offline_access is unavailable, there is no silent refresh, so raise this to 8 or 12 to avoid sending developers back to the browser login every hour.

store

The store block points the gateway at its PostgreSQL database, which holds device grants and rate-limit counters.
FieldRequiredDescription
postgres_urlYespostgres:// or postgresql:// URL. Required: the device-grant rendezvous, where the browser callback writes and the polling CLI reads, needs cross-replica state. The gateway runs its own schema migrations at boot, so the role needs CREATE TABLE on the target schema. If your security policy prohibits DDL from the application role, run the migrations with an admin role, initially and again whenever a new release ships migrations, and grant the app role SELECT, INSERT, UPDATE, DELETE on the gateway’s tables. See Upgrades and Postgres.
usernameNoOverrides the user in postgres_url
passwordNoDatabase credential. Set it here rather than in postgres_url so the credential stays out of the URL. Accepts any characters and takes precedence over URL credentials.
max_connectionsNoPostgres connection-pool size per replica. Default 5, which is conservative and friendly to shared databases. With spend limits enabled, the hot path does a few operations per inference request, so raise it for a dedicated database under load, and keep replicas × this below the database’s max_connections.
For local development, point postgres_url at a throwaway Postgres container, for example docker run --rm -p 5432:5432 -e POSTGRES_HOST_AUTH_METHOD=trust postgres.

upstreams

upstreams is an ordered list. The gateway forwards inference to the first upstream that resolves the requested model. On 5xx, 429, or timeout it fails over to the next; other 4xx doesn’t, because those errors are attributable to the request rather than the upstream. Multiple upstreams of the same provider must set a distinct name:. Bedrock, Agent Platform, and Foundry clients are built once at startup, and their SDKs refresh credentials internally, so rotating cloud credentials doesn’t require a restart. Static Anthropic API keys and bearers are read at startup; see Anthropic API.

Anthropic API

The minimal Anthropic upstream is an API key from the Claude Console:
upstreams:
  - provider: anthropic
    auth:
      api_key: ${ANTHROPIC_API_KEY}
    # OR an OAuth bearer (e.g. a Workload-Identity-Federation-exchanged token):
    #   oauth_token: ${file:/var/run/secrets/anthropic-oauth-token}
    # base_url: https://api.anthropic.com   # default; override for a forward proxy
The two credential forms differ in the header they send:
  • api_key: sends x-api-key. Rotate it in the Claude Console and update the env var.
  • oauth_token: sends Authorization: Bearer. Use the bearer form when your org issues short-lived tokens instead of long-lived API keys. The bearer is read once at startup, so refresh by remounting the secret and restarting.
Instead of a static key or bearer, you can use Workload Identity Federation. Create a federation rule by following the Workload Identity Federation guide, then mount your workload’s OIDC JWT as a file, such as a Kubernetes projected service-account token or a CI platform’s id-token. The gateway exchanges the JWT for a short-lived bearer and refreshes it automatically. The token file is re-read on every exchange, so rotated projected tokens are picked up without a restart.
upstreams:
  - provider: anthropic
    auth:
      federation_rule_id: ${ANTHROPIC_FEDERATION_RULE_ID}
      organization_id: ${ANTHROPIC_ORGANIZATION_ID}
      identity_token_file: /var/run/secrets/anthropic/id-token
      # workspace_id: wrkspc_...       # required if the rule covers >1 workspace
      # service_account_id: svac_...   # optional expected-target check

Amazon Bedrock

For the client-side Bedrock deployment that the gateway replaces or fronts, see Claude Code on Amazon Bedrock. The gateway-side upstream:
upstreams:
  - provider: bedrock
    region: us-east-1
    auth: {}                           # preferred: AWS default credential chain
    # OR explicit credentials:
    # auth:
    #   aws_access_key_id: ${AWS_AKID}
    #   aws_secret_access_key: ${AWS_SK}
    #   aws_session_token: ${AWS_ST}
    # OR a Bedrock API bearer token:
    # auth:
    #   aws_bearer_token: ${AWS_BEARER_TOKEN}
    # Override the bedrock-runtime endpoint for FIPS or VPC-endpoint deployments:
    # base_url: https://bedrock-runtime-fips.us-east-1.amazonaws.com
An empty auth block uses the AWS SDK’s default credential chain: env vars, ~/.aws/credentials, ECS task role, EC2 instance metadata, or IRSA on EKS. In production, give the gateway pod an IAM role instead of embedding static keys in a container image.
SetupHow
IAM permissionsGrant the gateway’s principal bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream on both the inference-profile ARNs and the underlying foundation-model ARNs. For the built-in catalog in US regions: arn:aws:bedrock:<region>:<account>:inference-profile/us.anthropic.* and arn:aws:bedrock:*::foundation-model/anthropic.*.
Model accessIn the Bedrock console, per region, request and enable model access for the Claude models you want. Cross-region inference profiles (us.anthropic.*) require model access in each region the profile spans.
EKS (IRSA)Create an IAM role with the policy above and a trust policy for your cluster’s OIDC provider scoped to the gateway’s service account. Annotate the service account with eks.amazonaws.com/role-arn: arn:aws:iam::<acct>:role/claude-gateway. auth: {} picks it up.
ECS / EC2Attach the IAM role to the task definition or instance profile. auth: {} picks it up.
Anywhere elsePass credentials via the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN env vars, or set them explicitly in auth: with ${VAR} expansion
Regionregion: is the API endpoint region. Cross-region inference profiles route across the geo (US, EU, APAC) regardless of which one you pick. For non-US regions or provisioned-throughput ARNs, add a models: block with the right per-upstream IDs.

Google Cloud Agent Platform

For the equivalent client-side setup, see Claude Code on Google Cloud. The gateway-side upstream:
upstreams:
  - provider: vertex
    region: us-east5
    project_id: example-prod
    auth: {}                           # preferred: Application Default Credentials
    # OR a service account key file:
    # auth: { service_account_json: /secrets/sa.json }
    # Override the aiplatform endpoint for Private Service Connect:
    # base_url: https://us-east5-aiplatform.p.googleapis.com
An empty auth block uses Application Default Credentials: GOOGLE_APPLICATION_CREDENTIALS, GCE metadata, or GKE Workload Identity. Service-account JSON key files are supported but discouraged; use Workload Identity or attach a service account to the GCE or Cloud Run instance. Set region: global to use Agent Platform’s global endpoint instead of a regional one. Google then routes each request to an available region, so you don’t track per-region model availability. Setting a specific region pins every request to it.
SetupHow
IAM permissionsGrant the gateway’s service account roles/aiplatform.user on the project, or a custom role with aiplatform.endpoints.predict. Enable the Agent Platform API (aiplatform.googleapis.com).
Model accessIn Model Garden, enable the Claude models for your project. They publish to specific regions; check the model card for supported regions.
GKE (Workload Identity)Bind a GCP service account to the gateway’s Kubernetes service account and annotate the KSA with iam.gke.io/gcp-service-account: claude-gateway@<proj>.iam.gserviceaccount.com. auth: {} picks it up.
Cloud Run / GCESet the service’s service account to one with roles/aiplatform.user. auth: {} picks it up.
Anywhere elseauth: { service_account_json: /secrets/sa.json }, the path to a JSON key file mounted as a secret. The field takes a file path, not the key contents, so no ${file:…} expansion is involved.

Microsoft Foundry

For the client-side Foundry deployment, see Claude Code on Microsoft Foundry. The gateway-side upstream:
upstreams:
  - provider: foundry
    resource: example-foundry              # https://example-foundry.services.ai.azure.com
    auth: { use_azure_ad: true }        # preferred: DefaultAzureCredential / Managed Identity
    # OR an API key:
    # auth:
    #   api_key: ${FOUNDRY_API_KEY}
use_azure_ad: true resolves through DefaultAzureCredential: Managed Identity on AKS, ACI, or App Service; the Azure CLI; or environment credentials. API keys work but are project-wide and don’t rotate automatically. Foundry’s endpoint is derived from resource:; set the optional base_url to override it for sovereign clouds such as Azure Government.
SetupHow
RBACGrant the gateway’s identity Azure AI User or Cognitive Services User on the Foundry resource
DeploymentsFoundry uses admin-chosen deployment names, not canonical model IDs. Add a models: block mapping each canonical ID to your deployment name.
AKS (workload identity)Federate a User-Assigned Managed Identity with the cluster’s OIDC issuer and bind it to the gateway’s service account. use_azure_ad: true picks it up via WorkloadIdentityCredential.
ACI / App ServiceEnable system-assigned or user-assigned managed identity on the resource. use_azure_ad: true picks it up.
Anywhere elseauth: { api_key: "${FOUNDRY_API_KEY}" }. Quote ${…} inside { }.

Multiple upstreams

The same provider can appear more than once with a distinct name:. This covers different regions, different accounts via different credential chains, provisioned throughput versus on-demand, and cross-provider fallback. The gateway tries upstreams in order. 5xx, 429, timeouts, and missing-endpoint (501) fail over; other 4xx doesn’t. 429 is per-upstream capacity, so provisioned-throughput (PT) exhaustion fails over to on-demand. An upstream that can’t resolve the requested model is skipped without a network round-trip. This example routes a provisioned-throughput Bedrock allotment first, overflows to on-demand and a second account, and falls back to the Anthropic API last:
upstreams:
  # Primary: provisioned throughput in your home region.
  - name: bedrock-pt
    provider: bedrock
    region: us-east-1
    auth: {}
  # Overflow: on-demand cross-region.
  - name: bedrock-od
    provider: bedrock
    region: us-west-2
    auth: {}
  # Different account: a separate Bedrock allotment via assumed-role creds.
  - name: bedrock-acct2
    provider: bedrock
    region: us-east-1
    auth:
      aws_access_key_id: ${ACCT2_AKID}
      aws_secret_access_key: ${ACCT2_SK}
  # Last resort: direct Anthropic API.
  - name: anthropic-fallback
    provider: anthropic
    auth:
      api_key: ${ANTHROPIC_API_KEY}

# Per-upstream model IDs are keyed on the upstream's `name:`; an upstream
# without a `name:` defaults to its provider string (e.g. `bedrock`). Any
# upstream not listed for a model is skipped, which is how you route a model
# to provisioned throughput while everything else stays on-demand.
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    upstream_model:
      bedrock-pt: arn:aws:bedrock:us-east-1:111111111111:provisioned-model/abcdef
      bedrock-od: us.anthropic.claude-opus-4-8
      bedrock-acct2: us.anthropic.claude-opus-4-8
      anthropic-fallback: claude-opus-4-8
LeverHow
Different regionsOne Bedrock upstream per region, each with its own region:. With auto_include_builtin_models: true the cross-region inference profiles route automatically; for region-pinned deployments use a models: block.
Different accountsOne Bedrock upstream per account, each with its own credentials in auth:. The default chain (auth: {}) uses the pod’s identity; for a second account, set explicit credentials or a bearer token.
Provisioned throughputMap the model to the provisioned-throughput ARN in models: for that upstream’s name. Other upstreams keep the on-demand ID, so PT capacity is exhausted before failing over.
VPC / FIPS endpointsSet base_url: on the upstream to your VPC endpoint or FIPS endpoint URL
Model-scoped routingOmit an upstream from a model’s upstream_model: map and that upstream is skipped for that model. For example, route Opus to provisioned throughput and Sonnet and Haiku to on-demand.
Failing over between cloud providers, or to the direct Anthropic API, changes which agreement, geography, and other terms govern the request. The CLI applies the same feature gating to gateways regardless of which upstream serves a given request, so failover doesn’t send a body field an upstream would reject.

Optional sections

admin

Optional. Enables /v1/organizations/spend_limits, which mirrors Anthropic’s public Admin API, and per-developer spend enforcement on /v1/messages. See Spend limits for how caps are set and enforced; this section covers the gateway.yaml keys that turn the feature on and tune it.
admin:
  # Named static API keys for the admin endpoints, sent as x-api-key.
  # The id appears in the audit log as admin-key:<id> so each key is
  # attributable. Array for rotation: add the new key, roll clients,
  # remove the old.
  write_keys:
    - { id: terraform, key: "${GATEWAY_ADMIN_WRITE_KEY_TF}" }
    - { id: ci,        key: "${GATEWAY_ADMIN_WRITE_KEY_CI}" }
  read_keys:
    - { id: reporting, key: "${GATEWAY_ADMIN_READ_KEY}" }
  # IdP groups granted full admin via the normal gateway JWT (no API key).
  admin_groups: [platform-finops]
  blocked_message: request an increase at https://go.example.com/claude-limits
FieldRequiredDescription
write_keysNoArray of {id, key}. An x-api-key matching one of these can list, set, and delete spend limits. Key values must be at least 32 characters; ids must be unique across read_keys and write_keys.
read_keysNoArray of {id, key}. Read-only: every GET endpoint, including listing caps, fetching one by ID, and reading /effective and /audit.
admin_groupsNoIdP group names. A gateway JWT whose groups claim includes one of these has full admin access, read and write, and audits as oidc:<sub>. Use this for human admins; use API keys for machines.
blocked_messageNoAppended verbatim to the 429 billing_error a blocked developer sees. Write the whole instruction, such as a URL or a Slack channel. Unset, the error is spend limit reached.
audit_retention_daysNoDefault 365. Older admin_audit rows are swept.
spend_retention_monthsNoDefault 13. spend counter rows older than this are swept. The default keeps a full year plus the current partial month for year-over-year reporting.
identity_retention_daysNoDefault 90. Last-seen TTL for principal_emails rows, which hold each developer’s email, display name, and groups (PII). Deliberately shorter than spend retention so a deprovisioned identity ages out while its anonymous spend counters remain.
group_limit_modeNomin (default) or max. When a developer is in several groups with caps, min enforces the most restrictive and max the least. Used by both enforcement and /effective.

enforcement

The enforcement block controls how spend-limit checks behave when the store is unavailable.
FieldRequiredDescription
fail_closed_on_errorNoDefault false. Spend enforcement fails open on a Postgres outage, so inference stays up. Set true to fail closed: over-cap developers are blocked, but so is everyone else if the store is unreachable. Has no effect without an admin: block.

models

The models block is an optional admin-curated model list, served at /v1/models and used to translate model IDs per upstream. It is required for non-US Bedrock regions, Bedrock provisioned-throughput ARNs, and Foundry deployment names.
auto_include_builtin_models: true   # false: expose only the list below
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    # description: optional text shown in clients that surface it
    upstream_model:
      anthropic: claude-opus-4-8
      bedrock: us.anthropic.claude-opus-4-8   # or an inference-profile ARN
      foundry: your-opus-deployment-name

managed

The managed block defines role-based access policies keyed on IdP groups or email domain. Policies are evaluated in order; the first match is selected, then merged onto the match: {} catch-all base described below. They are served per-user at GET /managed/settings with ETag/304 caching.
managed:
  policies:
    # Specific groups first.
    - match: { groups: [eng-contractors] }
      cli:
        availableModels: [claude-sonnet-4-6]
        permissions: { deny: ["WebFetch", "WebSearch"] }
    # Default catch-all last: matches everyone who authenticated.
    - match: {}
      cli:
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]
A match: {} catch-all, conventionally listed last, is treated as a base layer. Every other policy inherits any key it doesn’t set from the catch-all, so per-role entries only need to list what differs from the org default. The merge rules depend on the key type:
  • Allow-lists: availableModels and permissions.allow. A specific policy’s list fully replaces the base’s.
  • Deny-lists and hook arrays: permissions.deny, permissions.ask, disabledMcpjsonServers, deniedMcpServers, blockedMarketplaces, and every hooks event-type array. These take the union of base and policy, so an org-wide deny or audit hook can’t be accidentally dropped by a per-role override.
  • Record-typed keys: env, modelOverrides, and skillOverrides. These shallow-merge, so a per-role env block overrides keys it sets and inherits the rest from the base.
availableModels is also enforced server-side at /v1/messages, so a denied model returns 400 regardless of what the client sends.
MatcherBehavior
match: {}Matches every authenticated user. Start with one of these and add group-scoped policies above it later.
match: { groups: [a, b] }Matches if the JWT’s groups claim contains any of the listed groups. Case-sensitive: groups must match the IdP’s exact casing.
match: { email_domain: example.com }Matches the part after the last @ in the JWT’s email claim, case-insensitive. Accepts one domain per policy.
match: { groups: [a], email_domain: example.com }Both conditions must match
An authenticated user who matches no policy gets the gateway’s defaults, which means every model in the catalog and no managed settings. Add a match: {} catch-all last if you want a guaranteed default policy.
The gateway keeps no user directory of its own. It authorizes each request from the user’s IdP token, reading group membership from the token’s groups claim and evaluating policies against it. There is no roster to enumerate and no accounts to pre-create, and therefore no SCIM endpoint, because there is nothing for SCIM to sync into.Run user and group lifecycle management at the source of truth, which is your IdP’s native SCIM provisioning or a dedicated identity-governance platform. Membership and deprovisioning governed there flow into the gateway automatically through the token. If you want SCIM provisioning of Claude accounts themselves, that is a Claude for Enterprise capability.Two propagation clocks apply:
  • Policy contents: editing a policy and redeploying reaches connected clients on their next managed-settings poll, within an hour
  • Group membership: changing a user’s group membership changes which policy matches them. This takes effect on the next session re-mint, meaning the next silent refresh, bounded by session.ttl_hours.

What goes in cli

Each cli value is a complete Claude Code managed-settings.json document, the same schema you would deploy via MDM or /etc/claude-code/managed-settings.json, expressed here as YAML. The CLI applies the delivered document at the managed tier, above user and project settings. The gateway validates each document against the CLI’s settings schema at boot, so an unrecognized top-level key or a recognized key with a malformed value fails boot with an error naming every offending key. Deliberately open parts of the schema still accept arbitrary values, because newer clients may recognize entries the gateway’s schema doesn’t. These open keys are env, pluginConfigs, and keys nested under permissions. Because validation uses the schema bundled with the gateway’s installed version, putting a top-level settings key introduced by a newer Claude Code release into managed config requires upgrading the gateway first. Smoke-test a new policy on one client before rolling it out. The full key reference is in Claude Code settings. The keys most operators reach for first:
managed:
  policies:
    - match: {}
      cli:
        # Model access (also enforced server-side at /v1/messages)
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]

        # Permission policy
        permissions:
          deny:
            - "WebFetch"
            - "Read(./.env)"
            - "Read(./secrets/**)"
          disableBypassPermissionsMode: disable   # blocks --dangerously-skip-permissions
        allowManagedPermissionRulesOnly: true     # ignore user/project permission rules

        # Environment pushed into the CLI process. DISABLE_UPDATES blocks
        # background and manual updates; DISABLE_AUTOUPDATER stops only
        # background updates.
        env:
          DISABLE_UPDATES: "1"                    # pin versions via your own distribution

        # Org-wide hooks. Hook commands run on developer machines, not the
        # gateway, so the path must exist on every client OS in the policy.
        hooks:
          PostToolUse:
            - matcher: "Edit|Write"
              hooks:
                - { type: command, command: /usr/local/bin/audit-edit.sh }
KeyEnforced byEffect
availableModelsGateway + CLIModel allowlist. Also checked at /v1/messages, so a patched client can’t bypass it.
permissions.allow / .denyCLITool and command rules. See Permissions.
permissions.disableBypassPermissionsModeCLISet to disable to block bypassPermissions, the mode that auto-approves every tool call, and the --dangerously-skip-permissions flag
allowManagedPermissionRulesOnlyCLIWhen true, user and project permission rules are ignored; only rules from this document apply
envCLIEnvironment variables merged into the CLI process. Use for telemetry, auto-update, and model-name overrides.
hooksCLIOrg-wide hooks
Because these settings arrive over the network, the CLI shows each developer a one-time security approval dialog before applying anything that can run a shell command or alter where traffic goes. The dialog covers:
  • hooks
  • env variables that aren’t on the CLI’s built-in safe list
  • shell-execution settings such as apiKeyHelper and statusLine
  • managed CLAUDE.md content
The safe list determines which env variables apply without approval:
  • On the safe list: auto-update and model-name vars
  • Not on the safe list: proxy vars, base-URL vars, and OTEL_EXPORTER_OTLP_ENDPOINT
The gateway’s telemetry configuration pushes OTEL_EXPORTER_OTLP_ENDPOINT, so setting telemetry.forward_to triggers the dialog on each interactive client. Non-interactive runs with the -p flag skip the dialog and apply settings without approval. The dialog protects the developer’s machine from a compromised or hostile gateway, not the organization from the developer, so the -p skip is intentional rather than a gap. If a developer declines, Claude Code exits rather than applying the policy. Pushing a new hook or non-safe env var to a broad policy therefore means an approval prompt on every matching developer’s next startup. The cli key was named settings in earlier releases. That spelling is still accepted as an alias, but new deployments should use cli.

Precedence with other managed sources

If a device also has a local managed-settings.json or MDM-delivered policy, the managed sources don’t merge. The highest-priority source provides all policy settings, ranked in this order with highest priority first:
  1. The policy helper
  2. Gateway-delivered settings
  3. MDM, via the HKLM registry on Windows or a plist on macOS
  4. The managed-settings.json file
  5. The HKCU registry, on Windows only
Embedding hosts can supply policy through the SDK managedSettings option. It is ignored by default and applies only when a managed source opts in with parentSettingsBehavior: "merge", filtered so it can tighten policy but not loosen it. The exception is a small set of cross-source keys, honored when any admin source sets them; the user-writable HKCU tier is excluded:
  • sandbox.network.allowManagedDomainsOnly and sandbox.filesystem.allowManagedReadPathsOnly: when locked, the corresponding allowlists are unioned across sources
  • allowAllClaudeAiMcps: allow-only override for the claude.ai MCP server allowlist
  • sandbox.bwrapPath and sandbox.socatPath: filesystem paths to the sandbox helper binaries
allowManagedPermissionRulesOnly and disableBypassPermissionsMode are not cross-source, so only the winning source’s value applies. Gateway policies apply to every Claude Code invocation on the machine, including non-interactive claude -p runs and sessions spawned by the Agent SDK. If the gateway is unreachable at startup, signed-in sessions exit with an error rather than running without their policy.
mcpServers inside a policy’s cli block is rejected at gateway boot. Per-group MCP distribution is not available; deploy MCP servers via the file-based managed-mcp.json on each device or let developers add them locally.

telemetry

The CLI sends OpenTelemetry Protocol (OTLP) over HTTP metrics, logs, and, when enabled, traces to the gateway, which relays them verbatim to each configured destination. See Monitoring usage for the metrics and events the CLI emits. The CLI stamps each export with the authenticated user’s identity, read from the gateway-issued JWT: the user.id, user.email, and user.groups attributes. Per-developer cost and usage attribution therefore works with no developer-side configuration.
telemetry:
  forward_to:
    - url: https://otel-collector.internal.example.com
      headers:
        Authorization: ${OTLP_TOKEN}
      # Per-signal opt-in. Default: metrics only.
      metrics: true
      logs: false
      traces: false
    - url: https://api.datadoghq.com/api/v2/otlp
      headers:
        DD-API-KEY: ${DD_API_KEY}
Each destination opts into metrics, logs, and traces independently, and the default is metrics only. The signals differ in sensitivity:
  • Metrics: aggregate counters such as token counts, request counts, and latency
  • Logs and traces: can carry full bash commands, tool inputs, and file paths, covering anything Claude Code does on a developer’s machine
Enable logs and traces only on destinations with the access controls and retention policy that data warrants.
Telemetry is off in the CLI by default. Configuring telemetry.forward_to together with listen.public_url turns it on. The gateway pushes five env vars to every connected client through /managed/settings:
  • CLAUDE_CODE_ENABLE_TELEMETRY=1
  • OTEL_METRICS_EXPORTER=otlp
  • OTEL_LOGS_EXPORTER=otlp
  • OTEL_TRACES_EXPORTER=otlp
  • OTEL_EXPORTER_OTLP_ENDPOINT=<public_url>
The pushed endpoint is built from the public URL, so metrics and logs need no OTEL configuration from developers or policies. The pushed configuration is applied at the managed tier, overriding OTEL_* variables a developer sets locally. Traces additionally require CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1 on each client. The gateway doesn’t push that variable, so set it through a managed policy’s env block. It isn’t on the CLI’s safe list, so delivering it through a policy is covered by the same security approval dialog that the pushed OTLP endpoint already triggers. Both protobuf and JSON OTLP encodings are relayed, and any OpenTelemetry-compatible backend works as a destination.

HTTP tuning

Four optional top-level blocks, access_control, limits, timeouts, and rate_limits, tune the HTTP surface. The defaults suit most deployments.
BlockKeyDefaultDescription
access_controlallow_cidrs / deny_cidrsemptyInbound IP allow/deny by client address, after trusted_proxies resolution. deny_cidrs is checked first; a client it matches is rejected even if allow_cidrs also matches. If allow_cidrs is non-empty the gateway is default-deny. /healthz and /readyz are exempt from allow_cidrs.
limitsmax_request_bytes32 MiBMax inbound request body; oversize requests get 413 before the body is buffered. Raise for large file or image requests.
limitsmax_request_header_bytesunsetWhen set, oversize headers return 431
limitsmax_url_lengthunsetWhen set, an over-long URL returns 414
timeoutsupstream_ttfb_ms120000Max wait for the upstream’s response headers (time to first byte). The response body then streams with no wall-clock cap. Applies to the direct Anthropic upstream path; Bedrock, Agent Platform, and Foundry are bounded by their provider SDK’s own timeout.
rate_limitsdevice_authorization.max / .window_seconds30 / 600Per-IP rate limit on the unauthenticated device-authorization endpoint. Raise for a large org behind a shared egress IP or NAT. These limits apply only to the device-grant sign-in flow, not to /v1/messages inference. See User-code brute-force resistance.
rate_limitsdevice_verify.max / .window_seconds10 / 600Per-IP rate limit on user_code submissions at /device

Complete example

This full reference config exercises every core section; the HTTP tuning blocks keep their defaults. Copy it, delete what you don’t need, and fill in your values. The config in the Quickstart is a minimal version of this.
gateway.yaml
# Run with:
#   claude gateway --config gateway.yaml
#
# Operational log verbosity is controlled by the CLAUDE_GATEWAY_LOG_LEVEL
# environment variable (info | warn | error; default info). It does not
# affect audit events, which are always emitted.

listen:
  host: 0.0.0.0
  port: 8080
  public_url: https://claude-gateway.internal.example.com
  # Omit the tls block when running behind a TLS-terminating ingress.
  # tls:
  #   cert: /certs/gateway.crt
  #   key: /certs/gateway.key
  # trusted_proxies:
  #   - 10.0.0.0/8

oidc:
  issuer: https://example.okta.com
  client_id: 0oa1example2
  client_secret: ${OIDC_CLIENT_SECRET}
  allowed_email_domains:
    - example.com
  # Required when the issuer is the Okta org server, whose id_tokens
  # can omit email and groups; the gateway fills them from /userinfo.
  userinfo_fallback: true
  # allowed_groups: [claude-code-users]
  # Okta emits groups only when the `groups` scope is requested and the
  # app's groups claim filter allows them. The contractors policy below
  # matches on groups, so the scope is requested here.
  scopes: [openid, profile, email, offline_access, groups]
  # extra_auth_params: { access_type: offline, prompt: consent }  # Google
  # groups_claim: groups          # Entra app roles: use `roles`
  # email_claim: email

session:
  jwt_secret: ${GATEWAY_JWT_SECRET}   # openssl rand -base64 32
  # ttl_hours: 1

store:
  postgres_url: ${GATEWAY_POSTGRES_URL}
  # max_connections: 5

# Enables /v1/organizations/spend_limits (mirrors the Anthropic Admin API)
# and per-developer spend enforcement on /v1/messages. Omit to disable.
# Caps themselves are set via the admin API, not here.
# admin:
#   write_keys:
#     - { id: terraform, key: "${GATEWAY_ADMIN_WRITE_KEY_TF}" }
#   read_keys:
#     - { id: reporting, key: "${GATEWAY_ADMIN_READ_KEY}" }
#   admin_groups: [platform-finops]
#   blocked_message: request an increase at https://go.example.com/claude-limits
#   # audit_retention_days: 365
#   # spend_retention_months: 13
#   # identity_retention_days: 90
#   # group_limit_mode: min

# enforcement:
#   fail_closed_on_error: false

upstreams:
  - provider: anthropic
    auth:
      api_key: ${ANTHROPIC_API_KEY}

  # - provider: bedrock
  #   region: us-east-1
  #   auth: {}

  # - provider: vertex
  #   region: us-east5
  #   project_id: example-prod
  #   auth: {}

  # - provider: foundry
  #   resource: example-foundry
  #   auth: { use_azure_ad: true }

auto_include_builtin_models: true
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    upstream_model:
      anthropic: claude-opus-4-8
      # bedrock: us.anthropic.claude-opus-4-8
      # vertex: claude-opus-4-8
      # foundry: <your-opus-deployment-name>
  - id: claude-sonnet-4-6
    label: Claude Sonnet 4.6
    upstream_model:
      anthropic: claude-sonnet-4-6
  - id: claude-haiku-4-5
    label: Claude Haiku 4.5
    upstream_model:
      anthropic: claude-haiku-4-5

managed:
  policies:
    - match: { groups: [contractors] }
      cli:
        availableModels: [claude-haiku-4-5]
        # Constrain the Default picker option to availableModels instead of
        # the tier default, so contractors don't get a 400 on the default.
        enforceAvailableModels: true
        # allow auto-approves these tools; it does not block the rest.
        # Add deny rules to restrict tools.
        permissions: { allow: [Read, Grep] }
    - match: {}
      cli:
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]
        permissions:
          allow: [Read, Grep, Bash, Edit]
          deny: ["WebFetch"]
        env: { HTTP_PROXY: http://proxy.example.com:8080 }

telemetry:
  forward_to:
    - url: https://otel.internal.example.com:4318
      headers:
        Authorization: Bearer ${OTEL_TOKEN}

Client-side managed settings

Everything above configures the gateway server. Pointing developer machines at it is configured separately, on each device, through Claude Code’s managed settings. The gateway can’t push these keys itself, because they’re what tell the client where the gateway is. For the CLI, set both keys in the per-OS managed-settings.json:
{
  "forceLoginMethod": "gateway",
  "forceLoginGatewayUrl": "https://claude-gateway.internal.example.com"
}
Deploy that file to each device, typically via your MDM platform. The file path differs by platform:
PlatformPath
macOS/Library/Application Support/ClaudeCode/managed-settings.json, or the com.anthropic.claudecode managed preferences domain
Linux and WSL/etc/claude-code/managed-settings.json
WindowsC:\Program Files\ClaudeCode\managed-settings.json, or Group Policy via the HKLM registry
forceLoginGatewayUrl, and the "gateway" value of forceLoginMethod, are honored only from the admin-controlled managed tier. A developer setting them in their own ~/.claude/settings.json has no effect.