prep release: v2.1.0 #7109
Various nit-picky suggestions to improve the consistency and clarity of the changelog. Please don't hesitate to reject any/all - I may have inadvertently just imposed my own style preferences rather than making meaningful contributions 😅
NB: I do think it might be helpful to reorganize some of the changes - perhaps picking the top few features and then bucketing the rest by rough categories would improve the flow?
Co-authored-by: Caroline Rodewig <[email protected]>
Co-authored-by: Edward Huang <[email protected]>
🚀 Features
Add metric to measure cardinality overflow frequency (PR #6998)
Adds a new counter metric, `apollo.router.telemetry.metrics.cardinality_overflow`, that is incremented when the cardinality overflow log from opentelemetry-rust occurs. This log means that a metric in a batch has reached a cardinality greater than 2000 and that any excess attributes will be ignored.
By @rregitsky in #6998
Introduce PQ manifest `hot_reload` option for local manifests (PR #6987)
This change introduces a `persisted_queries.hot_reload` configuration option to allow the router to hot reload local PQ manifest changes.
If you configure `local_manifests`, you can set `hot_reload` to `true` to automatically reload manifest files whenever they change. This lets you update local manifest files without restarting the router.
Note: This change explicitly does not piggyback on the existing `--hot-reload` flag.
By @trevor-scheer in #6987
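A minimal configuration sketch of the option described in this entry. Only `local_manifests` and `hot_reload` come from the entry itself; the manifest path is hypothetical, and `enabled: true` is an assumption about what is needed to turn persisted queries on.

```yaml
persisted_queries:
  enabled: true                          # assumption: persisted queries must be switched on
  local_manifests:
    - ./persisted-query-manifest.json    # hypothetical path to a local manifest
  hot_reload: true                       # reload manifest files whenever they change
```

Because the option is independent of the `--hot-reload` flag, a sketch like this should not require starting the router with that flag.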
Add metrics for value completion errors (PR #6905)
When the router encounters a value completion error, it is not included in the GraphQL errors array, making it harder to observe. To surface this issue in a more obvious way, the router now counts value completion errors via the metric instruments `apollo.router.graphql.error` and `apollo.router.operations.error`, distinguishable via the `code` attribute with value `RESPONSE_VALIDATION_FAILED`.
By @timbotnik in #6905
Changes to experimental error metrics (PR #6966)
In 2.0.0, an experimental metric `telemetry.apollo.errors.experimental_otlp_error_metrics` was introduced to track errors with additional attributes. A few related changes are included here:
- These metrics now respect the subgraph `send` flag, e.g. `telemetry.apollo.errors.subgraph.[all|(subgraph name)].send`.
- A new configuration option, `telemetry.apollo.errors.subgraph.[all|(subgraph name)].redaction_policy`, has been added. This flag only applies when `redact` is set to `true`. When set to `ErrorRedactionPolicy.Strict`, error redaction will behave as it has in the past. Setting this to `ErrorRedactionPolicy.Extended` will allow the `extensions.code` value from subgraph errors to pass through redaction and be sent to Studio.
By @timbotnik in #6966
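As a rough illustration of where these settings sit, the sketch below applies them to all subgraphs. The nesting follows the paths quoted in this entry; the lowercase `extended` value is an assumption about how `ErrorRedactionPolicy.Extended` is spelled in YAML and should be checked against the configuration reference.

```yaml
telemetry:
  apollo:
    errors:
      subgraph:
        all:
          send: true                   # subgraph errors are sent
          redact: true                 # redaction_policy only applies when redact is true
          redaction_policy: extended   # assumption: YAML spelling of ErrorRedactionPolicy.Extended
```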
Add router config validate subcommand (PR #7016)
Adds a new `router config validate` subcommand to allow validation of a router config file without fully starting up the router.
By @andrewmcgivery in #7016
Support traffic shaping for connectors (PR #6737)
Traffic shaping is now supported for connectors. To target a specific source, use the `subgraph_name.source_name` key under the new `connector.sources` property of `traffic_shaping`. Settings under `connector.all` will apply to all connectors. `deduplicate_query` is not supported at this time.
By @andrewmcgivery in #6737
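A minimal sketch of the layout described in this entry. The `my_subgraph.my_source` key is a hypothetical `subgraph_name.source_name` pair, and the use of `timeout` assumes that this existing traffic shaping setting is among those supported for connectors.

```yaml
traffic_shaping:
  connector:
    all:
      timeout: 5s               # assumed setting; applies to all connectors
    sources:
      my_subgraph.my_source:    # hypothetical subgraph_name.source_name
        timeout: 1s             # setting for this one source
```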
Add `apollo.router.pipelines` metrics (PR #6967)
When the router reloads, either via schema change or config change, a new request pipeline is created.
Existing request pipelines are closed once their requests finish. However, this may not happen if there are ongoing long requests that do not finish, such as Subscriptions.
To enable debugging when request pipelines are being kept around, a new gauge metric has been added:
- `apollo.router.pipelines` - The number of request pipelines active in the router
  - `schema.id` - The Apollo Studio schema hash associated with the pipeline.
  - `launch.id` - The Apollo Studio launch id associated with the pipeline (optional).
  - `config.hash` - The hash of the configuration.
By @BrynCooke in #6967
Update JWT handling (PR #6930)
This PR updates JWT handling in the `AuthenticationPlugin`:
- A new config option, `config.authentication.router.jwt.on_error`, has been added.
  - When set to `Error`, JWT-related errors will be returned to users (the current behavior).
  - When set to `Continue`, JWT errors will instead be ignored, and JWT claims will not be set in the request context.
- The request context now includes `apollo::authentication::jwt_status`, which notes the result of processing.
By @Velfi in #6930
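A minimal sketch of the `on_error` option in context. The JWKS URL is hypothetical, and the surrounding `jwks` list is assumed from the router's existing JWT authentication configuration rather than taken from this entry.

```yaml
authentication:
  router:
    jwt:
      jwks:
        - url: https://example.com/.well-known/jwks.json   # hypothetical JWKS endpoint
      on_error: Continue   # ignore JWT errors; claims are not set in the request context
```

With `Continue`, a later stage could read `apollo::authentication::jwt_status` to decide how to treat a request whose token failed processing.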
Add support to get/set URI scheme in Rhai (Issue #6897)
This adds support for reading and writing the URI scheme via `request.uri.scheme`/`request.subgraph.uri.scheme` in Rhai, making it possible to switch between `http` and `https` for subgraph fetches.
By @starJammer in #6906
Add `apollo.router.open_connections` metric (PR #7023)
To help users diagnose when connections are keeping pipelines hanging around, the following metric has been added:
- `apollo.router.open_connections` - The number of connections open in the router
  - `schema.id` - The Apollo Studio schema hash associated with the pipeline.
  - `launch.id` - The Apollo Studio launch id associated with the pipeline (optional).
  - `config.hash` - The hash of the configuration.
  - `server.address` - The address that the router is listening on.
  - `server.port` - The port that the router is listening on if not a unix socket.
  - `http.connection.state` - Either `active` or `terminating`.
You can use this metric to monitor when connections are open via long-running requests or keepalive messages.
By @bryncooke in #7023
Add `batching.maximum_size` configuration option to limit maximum client batch size (PR #7005)
Add an optional `maximum_size` parameter to the batching configuration. When specified, the router rejects requests that contain more than `maximum_size` queries in the client batch.
If the number of queries provided exceeds the maximum batch size, the entire batch fails with error code 422 (`Unprocessable Content`).
By @carodewig in #7005
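A minimal sketch of a batching configuration that uses the new limit. The value `10` is arbitrary, and `enabled` and `mode: batch_http_link` are assumed from the router's existing batching settings rather than from this entry.

```yaml
batching:
  enabled: true            # assumption: existing switch for client batching
  mode: batch_http_link    # assumption: existing batching mode
  maximum_size: 10         # batches with more than 10 queries fail with 422
```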
Support TLS configuration for connectors (PR #6995)
Connectors now support TLS configuration for using custom certificate authorities and client certificate authentication.
By @andrewmcgivery in #6995
Enable remote proxy downloads of the Router
This enables users without direct download access to specify a remote proxy mirror location for the GitHub download of the Apollo Router releases.
By @LongLiveCHIEF in #6667
Add span events to error spans for connectors and demand control plugin (PR #6727)
New span events have been added to trace spans which include errors. These span events include the GraphQL error code that relates to the error. So far, this only includes errors generated by connectors and the demand control plugin.
By @bonnici in #6727
🐛 Fixes
Export gauge instruments (Issue #6859)
Previously in router 2.x, when using the router's OTel `meter_provider()` to report metrics from Rust plugins, gauge instruments such as those created using `.u64_gauge()` weren't exported. The router now exports these instruments.
By @yanns in #6865
Use `batch_processor` config for Apollo metrics `PeriodicReader` (PR #7024)
The Apollo OTLP `batch_processor` configurations `telemetry.apollo.batch_processor.scheduled_delay` and `telemetry.apollo.batch_processor.max_export_timeout` now also control the Apollo OTLP `PeriodicReader` export interval and timeout, respectively. This update brings parity between Apollo OTLP metrics and non-Apollo OTLP exporter metrics.
By @rregitsky in #7024
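For orientation, a sketch of the two settings named in this entry. The duration values are arbitrary placeholders, not recommended or default values.

```yaml
telemetry:
  apollo:
    batch_processor:
      scheduled_delay: 5s        # now also the PeriodicReader export interval
      max_export_timeout: 30s    # now also the PeriodicReader export timeout
```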
Reduce Brotli encoding compression level (Issue #6857)
The Brotli encoding compression level has been changed from `11` to `4` to improve performance and mimic other compression algorithms' `fast` setting. This level is also much more reasonable for dynamic workloads.
By @carodewig in #7007
CPU count inference improvements for `cgroup` environments (PR #6787)
This fixes an issue where the `fleet_detector` plugin would not correctly infer the CPU limits for a system which used `cgroup` or `cgroup2`.
By @nmoutschen in #6787
Separate entity keys and representation variables in entity cache key (Issue #6673)
This fix separates the entity keys and representation variable values in the cache key to avoid issues with, for example, `@requires`.
Important
If you have enabled Distributed query plan caching, this release contains changes which necessarily alter the hashing algorithm used for the cache keys. On account of this, you should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.
By @bnjjj in #6888
Replace Rhai-specific hot-reload functionality with general hot-reload (PR #6950)
In Router 2.0, the Rhai hot-reload capability was not working. This was because of architectural improvements to the router which meant that the entire service stack was no longer re-created for each request.
The fix adds the Rhai source files to the primary list of elements watched by the router (configuration, schema, etc.) and removes the old Rhai-specific file watching logic.
If `--hot-reload` is enabled, the router will reload on changes to Rhai source code just like it would for changes to configuration, for example.
By @garypen in #6950
📃 Configuration
Make experimental OTLP error metrics feature flag non-experimental (PR #7033)
Because the OTLP error metrics feature is being promoted to `preview` from `experimental`, this change updates its feature flag name from `experimental_otlp_error_metrics` to `preview_extended_error_metrics`.
By @merylc in #7033
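A sketch of what the rename could look like in a config file. The location under `telemetry.apollo.errors` follows the path quoted for the old flag in the Features section; the value `enabled` is an assumption about how the flag is switched on.

```yaml
telemetry:
  apollo:
    errors:
      # previously: experimental_otlp_error_metrics
      preview_extended_error_metrics: enabled   # assumption: flag value spelling
```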
Tip
All notable changes to Router v2.x after its initial release will be documented in this file. To see previous history, see the changelog prior to v2.0.0.