prep release: v2.1.0 #7109

abernix · 2025-03-25T12:28:47Z

Note

When approved, this PR will merge into the 2.1.0 branch which will — upon being approved itself — merge into dev.

Things to review in this PR:

Changelog correctness (there is a preview below, but it is not necessarily up to date; see Files Changed for the authoritative version).

Version bumps

That it targets the right release branch (2.1.0 in this case!).

🚀 Features

Add metric to measure cardinality overflow frequency (PR #6998)

Adds a new counter metric, apollo.router.telemetry.metrics.cardinality_overflow, that is incremented whenever opentelemetry-rust emits its cardinality overflow log. This log indicates that a metric in a batch has exceeded a cardinality of 2000 and that any excess attributes will be ignored.

By @rregitsky in #6998

Introduce PQ manifest `hot_reload` option for local manifests (PR #6987)

This change introduces a persisted_queries.hot_reload configuration option to allow the router to hot reload local PQ manifest changes.

If you configure local_manifests, you can set hot_reload to true to automatically reload manifest files whenever they change. This lets you update local manifest files without restarting the router.

persisted_queries:
  enabled: true
  local_manifests:
    - ./path/to/persisted-query-manifest.json
  hot_reload: true

Note: This change explicitly does not piggyback on the existing --hot-reload flag.

By @trevor-scheer in #6987

Add metrics for value completion errors (PR #6905)

When the router encounters a value completion error, it is not included in the GraphQL errors array, making it harder to observe. To surface this issue more clearly, the router now counts value completion errors via the metric instruments apollo.router.graphql.error and apollo.router.operations.error, distinguishable via the code attribute with value RESPONSE_VALIDATION_FAILED.

By @timbotnik in #6905

Changes to experimental error metrics (PR #6966)

In 2.0.0, an experimental metric telemetry.apollo.errors.experimental_otlp_error_metrics was introduced to track errors with additional attributes. A few related changes are included here:

- Sending these metrics now also respects the subgraph's send flag, e.g. telemetry.apollo.errors.subgraph.[all|(subgraph name)].send.
- A new configuration option, telemetry.apollo.errors.subgraph.[all|(subgraph name)].redaction_policy, has been added. This option only applies when redact is set to true. When set to ErrorRedactionPolicy.Strict, error redaction behaves as it has in the past. Setting it to ErrorRedactionPolicy.Extended allows the extensions.code value from subgraph errors to pass through redaction and be sent to Studio.
- The warning about incompatibility of error telemetry with connectors is suppressed when this feature is enabled, since the new mode does support connectors.
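
As a rough illustration of the options described above, a configuration that sends errors for all subgraphs with redaction enabled and the extended policy might look like the following. This is only a sketch: the key paths come from the entry above, while the lowercase value `extended` is an assumption about how ErrorRedactionPolicy.Extended is spelled in YAML.

```yaml
telemetry:
  apollo:
    errors:
      subgraph:
        all:
          send: true                  # also respected by the extended error metrics
          redact: true                # redaction_policy only applies when this is true
          redaction_policy: extended  # assumed YAML spelling of ErrorRedactionPolicy.Extended
```

Replacing `all` with a subgraph name should scope the same settings to that subgraph, per the `[all|(subgraph name)]` notation above.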

By @timbotnik in #6966

Add router config validate subcommand (PR #7016)

Adds a new router config validate subcommand that validates a router config file without fully starting up the router.

./router config validate <path-to-config-file.yaml>

By @andrewmcgivery in #7016

Support traffic shaping for connectors (PR #6737)

Traffic shaping is now supported for connectors. To target a specific source, use subgraph_name.source_name as the key under the new connector.sources property of traffic_shaping. Settings under connector.all apply to all connectors. deduplicate_query is not supported at this time.

Example config:

traffic_shaping:
  connector:
    all:
      timeout: 5s
    sources:
      connector-graph.random_person_api:
        global_rate_limit:
          capacity: 20
          interval: 1s
        experimental_http2: http2only
        timeout: 1s

By @andrewmcgivery in #6737

Add `apollo.router.pipelines` metrics (PR #6967)

When the router reloads, either via schema change or config change, a new request pipeline is created.
Existing request pipelines are closed once their requests finish. However, this may not happen if there are ongoing long requests that do not finish, such as Subscriptions.

To enable debugging when request pipelines are being kept around, a new gauge metric has been added:

apollo.router.pipelines - The number of request pipelines active in the router
- schema.id - The Apollo Studio schema hash associated with the pipeline.
- launch.id - The Apollo Studio launch id associated with the pipeline (optional).
- config.hash - The hash of the configuration

By @BrynCooke in #6967

Update JWT handling (PR #6930)

This PR updates JWT handling in the AuthenticationPlugin:

Users may now set a new config option config.authentication.router.jwt.on_error.
- When set to the default Error, JWT-related errors will be returned to users (the current behavior).
- When set to Continue, JWT errors will instead be ignored, and JWT claims will not be set in the request context.

When JWTs are processed, whether processing succeeds or fails, the request context will contain a new variable, apollo::authentication::jwt_status, which notes the result of processing.
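
A minimal sketch of how the new option might be set, assuming the `config.` prefix above refers to the root of the router configuration file. The JWKS URL is a placeholder and not part of this change.

```yaml
authentication:
  router:
    jwt:
      jwks:
        - url: https://example.com/.well-known/jwks.json  # placeholder JWKS endpoint
      on_error: Continue  # the default is Error, which returns JWT errors to users
```

With Continue, downstream plugins or scripts would presumably need to consult apollo::authentication::jwt_status in the request context to tell a failed JWT apart from a missing one.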

By @Velfi in #6930

Add support to get/set URI scheme in Rhai (Issue #6897)

This adds support for reading and writing the scheme via request.uri.scheme and request.subgraph.uri.scheme in Rhai, making it possible to switch between http and https for subgraph fetches. For example:

fn subgraph_service(service, subgraph){
    service.map_request(|request|{
        log_info(`${request.subgraph.uri.scheme}`);
        if request.subgraph.uri.scheme == {} {
            log_info("Scheme is not explicitly set");
        }
        request.subgraph.uri.scheme = "https";
        request.subgraph.uri.host = "api.apollographql.com";
        request.subgraph.uri.path = "/api/graphql";
        request.subgraph.uri.port = 1234;
        log_info(``);
    });
}

By @starJammer in #6906

Add `apollo.router.open_connections` metric (PR #7023)

To help users diagnose when connections are keeping pipelines hanging around, the following metric has been added:

apollo.router.open_connections - The number of connections open in the router
- schema.id - The Apollo Studio schema hash associated with the pipeline.
- launch.id - The Apollo Studio launch id associated with the pipeline (optional).
- config.hash - The hash of the configuration.
- server.address - The address that the router is listening on.
- server.port - The port that the router is listening on if not a unix socket.
- http.connection.state - Either active or terminating.

You can use this metric to monitor when connections are held open by long-running requests or keepalive messages.
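
One way to inspect this metric locally is through the router's Prometheus exporter. The sketch below uses example values for the listen address and path. It assumes the metric is then exposed with dots replaced by underscores, which is an assumption about the exporter's naming rather than something stated in this entry.

```yaml
telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        listen: 127.0.0.1:9090  # example address
        path: /metrics          # example path
```

Filtering the resulting series on http.connection.state should then separate active connections from terminating ones.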

By @BrynCooke in #7023

Add `batching.maximum_size` configuration option to limit maximum client batch size (PR #7005)

Add an optional maximum_size parameter to the batching configuration.

When specified, the router will reject requests which contain more than maximum_size queries in the client batch.
When unspecified, the router performs no size checking (the current behavior).

If the number of queries provided exceeds the maximum batch size, the entire batch fails with error code 422 (Unprocessable Content). For example:

{
  "errors": [
    {
      "message": "Invalid GraphQL request",
      "extensions": {
        "details": "Batch limits exceeded: you provided a batch with 3 entries, but the configured maximum router batch size is 2",
        "code": "BATCH_LIMIT_EXCEEDED"
      }
    }
  ]
}
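
A configuration that would produce the response above for a three-query batch might look like this sketch. The `enabled` and `mode` settings are the existing batching options; the limit of 2 is chosen only to match the example error message.

```yaml
batching:
  enabled: true
  mode: batch_http_link
  maximum_size: 2  # client batches with more than 2 queries are rejected with 422
```

Omitting maximum_size keeps the previous behavior of performing no size check.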

By @carodewig in #7005

Support TLS configuration for connectors (PR #6995)

Connectors now support TLS configuration for custom certificate authorities and client certificate authentication.

tls:
  connector:
    sources:
      connector-graph.random_person_api:
        certificate_authorities: 
        client_authentication:
          certificate_chain: 
          key:
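
The values are left empty in the snippet above. As a hedged illustration, they could be filled in the same way as the router's existing subgraph TLS settings, using file expansion. All paths below are hypothetical, and the assumption is that connector sources accept the same value format as subgraphs.

```yaml
tls:
  connector:
    sources:
      connector-graph.random_person_api:
        certificate_authorities: "${file./path/to/ca.crt}"             # hypothetical path
        client_authentication:
          certificate_chain: "${file./path/to/certificate_chain.pem}"  # hypothetical path
          key: "${file./path/to/key.pem}"                              # hypothetical path
```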

By @andrewmcgivery in #6995

Enable remote proxy downloads of the Router

This enables users without direct download access to specify a remote proxy mirror location for the GitHub download of Apollo Router releases.

By @LongLiveCHIEF in #6667

Add span events to error spans for connectors and demand control plugin (PR #6727)

New span events have been added to trace spans which include errors. These span events include the GraphQL error code that relates to the error. So far, this only includes errors generated by connectors and the demand control plugin.

By @bonnici in #6727

🐛 Fixes

Export gauge instruments (Issue #6859)

Previously in router 2.x, when using the router's OTel meter_provider() to report metrics from Rust plugins, gauge instruments such as those created using .u64_gauge() weren't exported. The router now exports these instruments.

By @yanns in #6865

Use `batch_processor` config for Apollo metrics `PeriodicReader` (PR #7024)

The Apollo OTLP batch_processor configurations telemetry.apollo.batch_processor.scheduled_delay and telemetry.apollo.batch_processor.max_export_timeout now also control the Apollo OTLP PeriodicReader export interval and timeout, respectively. This update brings parity between Apollo OTLP metrics and non-Apollo OTLP exporter metrics.
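
For illustration, the two settings named above sit under telemetry.apollo.batch_processor. The durations in this sketch are example values, not recommendations or documented defaults.

```yaml
telemetry:
  apollo:
    batch_processor:
      scheduled_delay: 5s      # now also the Apollo OTLP PeriodicReader export interval
      max_export_timeout: 30s  # now also the Apollo OTLP PeriodicReader export timeout
```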

By @rregitsky in #7024

Reduce Brotli encoding compression level (Issue #6857)

The Brotli encoding compression level has been changed from 11 to 4 to improve performance and to match the fast setting of other compression algorithms. This level is also far more reasonable for dynamic workloads.

By @carodewig in #7007

CPU count inference improvements for `cgroup` environments (PR #6787)

This fixes an issue where the fleet_detector plugin would not correctly infer the CPU limits on systems that use cgroup or cgroup2.

By @nmoutschen in #6787

Separate entity keys and representation variables in entity cache key (Issue #6673)

This fix separates the entity keys and representation variable values in the cache key to avoid issues with, for example, @requires.

Important

If you have enabled Distributed query plan caching, this release contains changes which necessarily alter the hashing algorithm used for the cache keys. On account of this, you should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.

By @bnjjj in #6888

Replace Rhai-specific hot-reload functionality with general hot-reload (PR #6950)

In Router 2.0, the Rhai hot-reload capability was not working. This was because architectural improvements to the router meant that the entire service stack was no longer re-created for each request.

The fix adds Rhai source files to the primary list of elements (configuration, schema, etc.) watched by the router, and removes the old Rhai-specific file watching logic.

If --hot-reload is enabled, the router will reload on changes to Rhai source code just like it would for changes to configuration, for example.
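
As a small sketch of the setup this applies to, a router configuration that loads Rhai scripts might look like the following. The directory and file names are hypothetical.

```yaml
# router.yaml; start the router with: ./router --hot-reload --config router.yaml
rhai:
  scripts: ./rhai  # hypothetical directory containing the Rhai sources
  main: main.rhai  # hypothetical entry-point script
```

With --hot-reload set, editing files under the scripts directory should now trigger the same reload path as editing router.yaml itself.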

By @garypen in #6950

📃 Configuration

Make experimental OTLP error metrics feature flag non-experimental (PR #7033)

Because the OTLP error metrics feature is being promoted to preview from experimental, this change updates its feature flag name from experimental_otlp_error_metrics to preview_extended_error_metrics.
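
Assuming the flag lives under telemetry.apollo.errors as described for 2.0.0, the rename would look roughly like this in a configuration file. The value `enabled` is an assumption and is not stated in this entry.

```yaml
telemetry:
  apollo:
    errors:
      # before (2.0.x): experimental_otlp_error_metrics: enabled
      preview_extended_error_metrics: enabled  # the value `enabled` is an assumption
```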

By @merylc in #7033

Tip

All notable changes to Router v2.x after its initial release will be documented in this file. To see previous history, see the changelog prior to v2.0.0.

svc-apollo-docs · 2025-03-25T12:29:32Z

⚠️ Docs preview not attached to branch

The preview was not built because the PR's base branch 2.1.0 is not in the list of sources.

An Apollo team member can comment one of the following commands to dictate which branch to attach the preview to:

!docs set-base-branch 1.x
!docs set-base-branch dev

Build ID: 219c602f47a6957ce69f30d4

CHANGELOG.md

carodewig

Various nit-picky suggestions to improve the consistency and clarity of the changelog. Please don't hesitate to reject any/all - I may have inadvertently just imposed my own style preferences rather than making meaningful contributions 😅

NB: I do think it might be helpful to reorganize some of the changes - perhaps picking the top few features and then bucketing the rest by rough categories would improve the flow?


abernix added 2 commits March 25, 2025 14:23

Add Mise configuration for v2.x

1ce45ea

prep release: v2.1.0

e9f2ce0

abernix requested review from a team as code owners March 25, 2025 12:28

abernix commented Mar 25, 2025


lennyburdette reviewed Mar 25, 2025


Remove WIP from CHANGELOG.md

90fb1d4

abernix requested a review from a team March 25, 2025 18:14

carodewig reviewed Mar 25, 2025


Velfi and others added 2 commits March 25, 2025 15:56

Update CHANGELOG.md

9d5dd83

Co-authored-by: Caroline Rodewig <[email protected]>

Update CHANGELOG.md

9ec3615

Co-authored-by: Caroline Rodewig <[email protected]>

shorgi reviewed Mar 25, 2025


abernix commented Mar 26, 2025


abernix and others added 4 commits March 26, 2025 11:31

Apply suggestions from code review

53bf9e9

Co-authored-by: Caroline Rodewig <[email protected]> Co-authored-by: Edward Huang <[email protected]>

Apply suggestions from code review

a0eebb5

Co-authored-by: Caroline Rodewig <[email protected]>

Update CHANGELOG.md

21e2884

Co-authored-by: Caroline Rodewig <[email protected]>

Merge branch '2.1.0' into prep-2.1.0

fd358a7

BrynCooke previously approved these changes Mar 26, 2025


lrlna previously approved these changes Mar 26, 2025


garypen previously approved these changes Mar 26, 2025


abernix commented Mar 26, 2025


Ordering

a9b6b34

abernix dismissed stale reviews from garypen, lrlna, and BrynCooke via a9b6b34 March 26, 2025 11:10

abernix merged commit cfd1cce into 2.1.0 Mar 26, 2025
9 of 10 checks passed

abernix deleted the prep-2.1.0 branch March 26, 2025 11:16

abernix added a commit that referenced this pull request Mar 26, 2025

Do better at changelog ordering

3cb18f2

Fixes the mistakes I made while landing "Ordering" in the prep PR for 2.1.0: #7109. Ref: a9b6b34
