chore(config): Convert top-level transforms enum to typetag #16572

Merged
bruceg merged 5 commits into master from bruceg/dynamic-transforms on Feb 27, 2023

Conversation

bruceg
Member

@bruceg bruceg commented Feb 23, 2023

This replaces the top-level `enum Transforms` with a boxed transform type. Serialization and deserialization are handled by `typetag`, and the `configurable_component` macro is enhanced to build the table entries needed to generate the schema dynamically from all of the components that are compiled into the current configuration.

Note that the same approach is now possible for sources and sinks as well. This PR focuses on transforms as the smallest step to prove out the approach for the other component types. Making it work for either sources or sinks should be scoped to just the sources and sinks themselves, without needing to touch support code as this PR does.
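
As a rough illustration of the idea (not the code in this PR), the sketch below shows a boxed trait object selected by a `type` tag through a lookup table instead of a closed enum. It uses only the standard library, so the table is filled by hand; with `typetag`, registration and the serde glue are generated for each implementation. Every name here (`TransformConfig`, `Filter`, `Sample`, `registry`, `deserialize`) is hypothetical and chosen only for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a transform configuration trait.
trait TransformConfig {
    fn type_name(&self) -> &'static str;
    fn describe(&self) -> String;
}

struct Filter {
    condition: String,
}

struct Sample {
    rate: u64,
}

impl TransformConfig for Filter {
    fn type_name(&self) -> &'static str {
        "filter"
    }
    fn describe(&self) -> String {
        format!("filter where {}", self.condition)
    }
}

impl TransformConfig for Sample {
    fn type_name(&self) -> &'static str {
        "sample"
    }
    fn describe(&self) -> String {
        format!("sample 1 in {}", self.rate)
    }
}

/// A builder turns the raw key/value fields into a boxed trait object.
type Builder = fn(&HashMap<String, String>) -> Result<Box<dyn TransformConfig>, String>;

fn build_filter(fields: &HashMap<String, String>) -> Result<Box<dyn TransformConfig>, String> {
    let condition = fields.get("condition").ok_or("filter: missing `condition`")?.clone();
    Ok(Box::new(Filter { condition }))
}

fn build_sample(fields: &HashMap<String, String>) -> Result<Box<dyn TransformConfig>, String> {
    let raw = fields.get("rate").ok_or("sample: missing `rate`")?;
    let rate = raw.parse::<u64>().map_err(|e| e.to_string())?;
    Ok(Box::new(Sample { rate }))
}

/// Hand-written table of known components, keyed by tag value.
/// (Assumption for this sketch: only two components exist.)
fn registry() -> HashMap<&'static str, Builder> {
    let mut table: HashMap<&'static str, Builder> = HashMap::new();
    table.insert("filter", build_filter as Builder);
    table.insert("sample", build_sample as Builder);
    table
}

/// Look up the `type` tag and dispatch to the matching builder.
fn deserialize(fields: &HashMap<String, String>) -> Result<Box<dyn TransformConfig>, String> {
    let tag = fields.get("type").ok_or("missing `type` tag")?;
    let builder = registry()
        .get(tag.as_str())
        .copied()
        .ok_or_else(|| format!("unknown transform type `{}`", tag))?;
    builder(fields)
}
```

Calling `deserialize` with fields `type = "sample"` and `rate = "10"` yields a boxed `Sample`, while an unregistered tag returns an error. Adding a component means adding a table entry rather than a new enum variant.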

@bruceg bruceg added the type: tech debt, domain: config, and domain: transforms labels Feb 23, 2023
@bruceg bruceg requested review from tobz and a team February 23, 2023 20:36
@netlify

netlify bot commented Feb 23, 2023

Deploy Preview for vector-project ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | f682889 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/vector-project/deploys/63f938cf4a1a0c000756d49e |
| 😎 Deploy Preview | https://deploy-preview-16572--vector-project.netlify.app |

@netlify

netlify bot commented Feb 23, 2023

Deploy Preview for vrl-playground ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | f682889 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/vrl-playground/deploys/63f938cf528a090008ab886d |
| 😎 Deploy Preview | https://deploy-preview-16572--vrl-playground.netlify.app |

@github-actions github-actions bot added the domain: topology label Feb 23, 2023
@bruceg bruceg added the ci-condition: integration tests enable label Feb 23, 2023
@github-actions

Regression Detector Results

Run ID: a18da62b-ba7c-4389-8d1a-a605112daf5a
Baseline: ab45939
Comparison: 6725661
Total vector CPUs: 7

Explanation

A regression test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across the SHAs given above. The goal of these tests is to determine quickly whether vector performance is changed by a pull request, and to what degree.

The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values mean that the baseline is faster; positive values mean that the comparison is faster. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed.
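
The thresholds described above can be sketched as a small filter. This is only an illustration under stated assumptions, not the detector's actual code: it assumes Δ mean % is the change relative to the baseline mean, and that the coefficient of variation is the standard deviation divided by the mean. Both assumptions agree with the `file_to_blackhole` figures in this report (6.4 to 5.91 MiB/CPU-s is roughly -7.66%, and 4.2 / 6.4 is roughly 0.656). All type and function names are hypothetical.

```rust
/// Hypothetical summary of one experiment; means share one unit (e.g. MiB/CPU-s).
struct Experiment {
    baseline_mean: f64,
    comparison_mean: f64,
    /// Confidence in percent, e.g. 99.77.
    confidence_pct: f64,
}

/// Assumed definition: change relative to the baseline mean, in percent.
/// Negative means the baseline is faster.
fn delta_mean_pct(e: &Experiment) -> f64 {
    (e.comparison_mean - e.baseline_mean) / e.baseline_mean * 100.0
}

/// Reported in the abbreviated table when confidence >= 90% and |Δ mean %| >= 5%.
fn is_significant(e: &Experiment) -> bool {
    e.confidence_pct >= 90.0 && delta_mean_pct(e).abs() >= 5.0
}

/// Coefficient of variation: standard deviation divided by the mean.
fn coefficient_of_variation(mean: f64, stdev: f64) -> f64 {
    stdev / mean
}

/// An experiment is erratic if its coefficient of variation is greater than 0.1.
fn is_erratic(cov: f64) -> bool {
    cov > 0.1
}
```

With these definitions, a 2.48% change at 100% confidence is not reported because it falls under the 5% threshold, while a -7.66% change at 99.77% confidence is.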

Changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%:

| experiment | goal | Δ mean | Δ mean % | confidence |
| --- | --- | --- | --- | --- |
| file_to_blackhole | egress throughput | -502.0KiB/CPU-s | -7.66 | 99.77% |
Fine details of change detection per experiment.
| experiment | goal | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| http_text_to_http_json | ingress throughput | 624.49KiB/CPU-s | 2.48 | 100.00% | 24.58MiB/CPU-s | 555.13KiB/CPU-s | 6.83KiB/CPU-s | 0.0 | 0.022055 | 25.19MiB/CPU-s | 664.41KiB/CPU-s | 8.18KiB/CPU-s | 0.0 | 0.025757 | False | False |
| syslog_regex_logs2metric_ddmetrics | ingress throughput | 38.84KiB/CPU-s | 1.05 | 100.00% | 3.62MiB/CPU-s | 373.39KiB/CPU-s | 4.59KiB/CPU-s | 0.0 | 0.100808 | 3.65MiB/CPU-s | 387.24KiB/CPU-s | 4.76KiB/CPU-s | 0.0 | 0.103462 | True | False |
| syslog_log2metric_splunk_hec_metrics | ingress throughput | 60.8KiB/CPU-s | 0.64 | 100.00% | 9.22MiB/CPU-s | 238.14KiB/CPU-s | 2.93KiB/CPU-s | 0.0 | 0.025229 | 9.28MiB/CPU-s | 212.29KiB/CPU-s | 2.61KiB/CPU-s | 0.0 | 0.022347 | False | False |
| splunk_hec_route_s3 | ingress throughput | 59.03KiB/CPU-s | 0.50 | 100.00% | 11.58MiB/CPU-s | 561.9KiB/CPU-s | 6.91KiB/CPU-s | 0.0 | 0.047381 | 11.64MiB/CPU-s | 543.31KiB/CPU-s | 6.69KiB/CPU-s | 0.0 | 0.045586 | False | False |
| enterprise_http_to_http | ingress throughput | 5.48KiB/CPU-s | 0.04 | 87.10% | 13.62MiB/CPU-s | 249.79KiB/CPU-s | 3.07KiB/CPU-s | 0.0 | 0.017911 | 13.62MiB/CPU-s | 153.79KiB/CPU-s | 1.89KiB/CPU-s | 0.0 | 0.011023 | False | False |
| datadog_agent_remap_blackhole_acks | ingress throughput | 8.64KiB/CPU-s | 0.03 | 34.81% | 31.38MiB/CPU-s | 1.18MiB/CPU-s | 14.83KiB/CPU-s | 0.0 | 0.037503 | 31.38MiB/CPU-s | 984.66KiB/CPU-s | 12.12KiB/CPU-s | 0.0 | 0.030637 | False | False |
| http_to_http_noack | ingress throughput | 3.55KiB/CPU-s | 0.03 | 42.14% | 13.61MiB/CPU-s | 383.47KiB/CPU-s | 4.72KiB/CPU-s | 0.0 | 0.027521 | 13.61MiB/CPU-s | 349.98KiB/CPU-s | 4.31KiB/CPU-s | 0.0 | 0.025111 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | ingress throughput | 183.44B/CPU-s | 0.00 | 2.31% | 13.61MiB/CPU-s | 351.31KiB/CPU-s | 4.32KiB/CPU-s | 0.0 | 0.025199 | 13.61MiB/CPU-s | 358.71KiB/CPU-s | 4.41KiB/CPU-s | 0.0 | 0.025729 | False | False |
| fluent_elasticsearch | ingress throughput | -154.02B/CPU-s | -0.00 | 22.85% | 45.41MiB/CPU-s | 29.88KiB/CPU-s | 372.39B/CPU-s | 0.0 | 0.000642 | 45.41MiB/CPU-s | 30.3KiB/CPU-s | 377.57B/CPU-s | 0.0 | 0.000651 | False | False |
| splunk_hec_indexer_ack_blackhole | ingress throughput | -2.26KiB/CPU-s | -0.02 | 38.13% | 13.62MiB/CPU-s | 255.82KiB/CPU-s | 3.15KiB/CPU-s | 0.0 | 0.018346 | 13.61MiB/CPU-s | 266.03KiB/CPU-s | 3.27KiB/CPU-s | 0.0 | 0.019082 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -3.05KiB/CPU-s | -0.02 | 48.69% | 13.62MiB/CPU-s | 256.45KiB/CPU-s | 3.15KiB/CPU-s | 0.0 | 0.018391 | 13.61MiB/CPU-s | 278.46KiB/CPU-s | 3.42KiB/CPU-s | 0.0 | 0.019974 | False | False |
| socket_to_socket_blackhole | ingress throughput | -20.67KiB/CPU-s | -0.15 | 97.30% | 13.32MiB/CPU-s | 617.06KiB/CPU-s | 7.59KiB/CPU-s | 0.0 | 0.045231 | 13.3MiB/CPU-s | 443.27KiB/CPU-s | 5.45KiB/CPU-s | 0.0 | 0.032542 | False | False |
| datadog_agent_remap_datadog_logs_acks | ingress throughput | -54.94KiB/CPU-s | -0.16 | 99.64% | 32.91MiB/CPU-s | 1.18MiB/CPU-s | 14.8KiB/CPU-s | 0.0 | 0.035711 | 32.86MiB/CPU-s | 951.56KiB/CPU-s | 11.71KiB/CPU-s | 0.0 | 0.02828 | False | False |
| datadog_agent_remap_datadog_logs | ingress throughput | -169.96KiB/CPU-s | -0.50 | 100.00% | 33.13MiB/CPU-s | 1.01MiB/CPU-s | 12.72KiB/CPU-s | 0.0 | 0.030459 | 32.96MiB/CPU-s | 1.03MiB/CPU-s | 12.97KiB/CPU-s | 0.0 | 0.031215 | False | False |
| syslog_loki | ingress throughput | -52.8KiB/CPU-s | -0.60 | 100.00% | 8.58MiB/CPU-s | 244.13KiB/CPU-s | 3.0KiB/CPU-s | 0.0 | 0.02778 | 8.53MiB/CPU-s | 302.69KiB/CPU-s | 3.72KiB/CPU-s | 0.0 | 0.034652 | False | False |
| http_to_http_json | ingress throughput | -84.83KiB/CPU-s | -0.61 | 100.00% | 13.62MiB/CPU-s | 215.33KiB/CPU-s | 2.65KiB/CPU-s | 0.0 | 0.015437 | 13.54MiB/CPU-s | 373.44KiB/CPU-s | 4.59KiB/CPU-s | 0.0 | 0.026935 | False | False |
| datadog_agent_remap_blackhole | ingress throughput | -233.53KiB/CPU-s | -0.75 | 100.00% | 30.5MiB/CPU-s | 1.48MiB/CPU-s | 18.68KiB/CPU-s | 0.0 | 0.048605 | 30.28MiB/CPU-s | 1.22MiB/CPU-s | 15.43KiB/CPU-s | 0.0 | 0.040458 | False | False |
| http_to_http_acks | ingress throughput | -70.43KiB/CPU-s | -1.29 | 84.85% | 5.32MiB/CPU-s | 2.75MiB/CPU-s | 34.65KiB/CPU-s | 0.0 | 0.517039 | 5.25MiB/CPU-s | 2.76MiB/CPU-s | 34.8KiB/CPU-s | 0.0 | 0.526166 | True | False |
| otlp_grpc_to_blackhole | ingress throughput | -14.09KiB/CPU-s | -1.33 | 100.00% | 1.04MiB/CPU-s | 48.64KiB/CPU-s | 612.82B/CPU-s | 0.0 | 0.045878 | 1.02MiB/CPU-s | 45.58KiB/CPU-s | 574.38B/CPU-s | 0.0 | 0.043573 | False | False |
| otlp_http_to_blackhole | ingress throughput | -21.08KiB/CPU-s | -1.34 | 100.00% | 1.53MiB/CPU-s | 113.28KiB/CPU-s | 1.39KiB/CPU-s | 0.0 | 0.07207 | 1.51MiB/CPU-s | 118.88KiB/CPU-s | 1.46KiB/CPU-s | 0.0 | 0.076656 | False | False |
| syslog_log2metric_humio_metrics | ingress throughput | -101.58KiB/CPU-s | -1.64 | 100.00% | 6.06MiB/CPU-s | 252.27KiB/CPU-s | 3.11KiB/CPU-s | 0.0 | 0.04063 | 5.96MiB/CPU-s | 343.51KiB/CPU-s | 4.23KiB/CPU-s | 0.0 | 0.056244 | False | False |
| syslog_humio_logs | ingress throughput | -165.79KiB/CPU-s | -1.79 | 100.00% | 9.02MiB/CPU-s | 211.46KiB/CPU-s | 2.6KiB/CPU-s | 0.0 | 0.022885 | 8.86MiB/CPU-s | 297.08KiB/CPU-s | 3.65KiB/CPU-s | 0.0 | 0.032739 | False | False |
| syslog_splunk_hec_logs | ingress throughput | -166.49KiB/CPU-s | -1.83 | 100.00% | 8.86MiB/CPU-s | 252.04KiB/CPU-s | 3.1KiB/CPU-s | 0.0 | 0.027764 | 8.7MiB/CPU-s | 207.77KiB/CPU-s | 2.56KiB/CPU-s | 0.0 | 0.023316 | False | False |
| file_to_blackhole | egress throughput | -502.0KiB/CPU-s | -7.66 | 99.77% | 6.4MiB/CPU-s | 4.2MiB/CPU-s | 122.09KiB/CPU-s | 2.896219 | 0.656833 | 5.91MiB/CPU-s | 4.17MiB/CPU-s | 110.39KiB/CPU-s | 0.0 | 0.704916 | True | False |

Contributor

@tobz tobz left a comment

Generally looks good to me, with a few nits/questions.

        &self,
        gen: &RefCell<SchemaGenerator>,
    ) -> Result<SchemaObject, GenerateError> {
        let tag_schema = schema::generate_internal_tagged_variant_schema("type".to_string(), {
Contributor

This is fine for now, but one thing that makes me uneasy here is that we're hardcoding the code generation, essentially, to match what we have... but it could easily diverge from future refactoring aimed at making the output better for enums overall.

That, and we're hardcoding the tag field.

Contributor

This is all to say that it's fine for now, but it sure would be nice if we could figure out some way to drive more of this hand-written schema generation from the typetag-related data, if that makes sense.

Member Author

I'm open to that, but this seemed the best first path to solving it.
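
To make the concern in this thread concrete, here is a minimal sketch of schema generation driven by a table of registered components, with the tag field passed in rather than hardcoded. It is an assumption-laden illustration, not the PR's implementation: `ComponentEntry`, `variant_schema`, and `tagged_union_schema` are hypothetical names, and the JSON is built as plain strings to keep the example free of external crates.

```rust
/// Hypothetical description of one registered component: its tag value and
/// the (name, JSON type) pairs of its own settings.
struct ComponentEntry {
    tag: &'static str,
    fields: &'static [(&'static str, &'static str)],
}

/// Schema for one variant: an object whose tag field is pinned to the
/// component's tag value, followed by the component's own properties.
fn variant_schema(tag_field: &str, entry: &ComponentEntry) -> String {
    let mut props = vec![format!("\"{}\":{{\"const\":\"{}\"}}", tag_field, entry.tag)];
    for (name, ty) in entry.fields {
        props.push(format!("\"{}\":{{\"type\":\"{}\"}}", name, ty));
    }
    format!(
        "{{\"type\":\"object\",\"properties\":{{{}}},\"required\":[\"{}\"]}}",
        props.join(","),
        tag_field
    )
}

/// Schema for the whole internally tagged union: one `oneOf` branch per
/// registered component, all sharing the same tag field name.
fn tagged_union_schema(tag_field: &str, entries: &[ComponentEntry]) -> String {
    let variants: Vec<String> = entries
        .iter()
        .map(|entry| variant_schema(tag_field, entry))
        .collect();
    format!("{{\"oneOf\":[{}]}}", variants.join(","))
}
```

In this sketch the tag name and the set of variants both come from data, so a change to either would not require editing the generator itself.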

@bruceg bruceg enabled auto-merge (squash) February 24, 2023 18:03
@github-actions

Regression Detector Results

Run ID: 34d5f844-8677-4b40-820c-e5fba4fd4984
Baseline: 63e5068
Comparison: f682889
Total vector CPUs: 7

Changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%:

| experiment | goal | Δ mean | Δ mean % | confidence |
| --- | --- | --- | --- | --- |
| file_to_blackhole | egress throughput | -1.37MiB/CPU-s | -19.36 | 100.00% |
Fine details of change detection per experiment.
| experiment | goal | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| syslog_log2metric_humio_metrics | ingress throughput | 192.03KiB/CPU-s | 3.16 | 100.00% | 5.94MiB/CPU-s | 254.51KiB/CPU-s | 3.13KiB/CPU-s | 0.0 | 0.041848 | 6.13MiB/CPU-s | 207.12KiB/CPU-s | 2.55KiB/CPU-s | 0.0 | 0.033013 | False | False |
| otlp_http_to_blackhole | ingress throughput | 40.81KiB/CPU-s | 2.66 | 100.00% | 1.5MiB/CPU-s | 125.67KiB/CPU-s | 1.55KiB/CPU-s | 0.0 | 0.08179 | 1.54MiB/CPU-s | 111.21KiB/CPU-s | 1.37KiB/CPU-s | 0.0 | 0.070508 | False | False |
| otlp_grpc_to_blackhole | ingress throughput | 24.58KiB/CPU-s | 2.36 | 100.00% | 1.02MiB/CPU-s | 54.18KiB/CPU-s | 682.56B/CPU-s | 0.0 | 0.052036 | 1.04MiB/CPU-s | 41.96KiB/CPU-s | 528.79B/CPU-s | 0.0 | 0.039369 | False | False |
| socket_to_socket_blackhole | ingress throughput | 156.52KiB/CPU-s | 1.16 | 100.00% | 13.23MiB/CPU-s | 537.15KiB/CPU-s | 6.61KiB/CPU-s | 0.0 | 0.039636 | 13.39MiB/CPU-s | 282.96KiB/CPU-s | 3.48KiB/CPU-s | 0.0 | 0.020641 | False | False |
| syslog_regex_logs2metric_ddmetrics | ingress throughput | 22.14KiB/CPU-s | 0.63 | 99.99% | 3.44MiB/CPU-s | 364.3KiB/CPU-s | 4.48KiB/CPU-s | 0.0 | 0.103341 | 3.46MiB/CPU-s | 301.09KiB/CPU-s | 3.71KiB/CPU-s | 0.0 | 0.084878 | True | False |
| syslog_log2metric_splunk_hec_metrics | ingress throughput | 55.85KiB/CPU-s | 0.61 | 100.00% | 9.01MiB/CPU-s | 379.02KiB/CPU-s | 4.66KiB/CPU-s | 0.0 | 0.041098 | 9.06MiB/CPU-s | 377.76KiB/CPU-s | 4.65KiB/CPU-s | 0.0 | 0.040715 | False | False |
| http_to_http_acks | ingress throughput | 25.69KiB/CPU-s | 0.48 | 40.16% | 5.25MiB/CPU-s | 2.74MiB/CPU-s | 34.59KiB/CPU-s | 0.0 | 0.522985 | 5.27MiB/CPU-s | 2.73MiB/CPU-s | 34.4KiB/CPU-s | 0.0 | 0.517696 | True | False |
| syslog_splunk_hec_logs | ingress throughput | 28.62KiB/CPU-s | 0.32 | 100.00% | 8.69MiB/CPU-s | 295.8KiB/CPU-s | 3.64KiB/CPU-s | 0.0 | 0.033258 | 8.71MiB/CPU-s | 225.39KiB/CPU-s | 2.77KiB/CPU-s | 0.0 | 0.02526 | False | False |
| datadog_agent_remap_datadog_logs_acks | ingress throughput | 47.6KiB/CPU-s | 0.14 | 97.81% | 32.55MiB/CPU-s | 1.15MiB/CPU-s | 14.54KiB/CPU-s | 0.0 | 0.035467 | 32.59MiB/CPU-s | 1.18MiB/CPU-s | 14.83KiB/CPU-s | 0.0 | 0.036104 | False | False |
| enterprise_http_to_http | ingress throughput | 5.29KiB/CPU-s | 0.04 | 79.92% | 13.62MiB/CPU-s | 277.62KiB/CPU-s | 3.42KiB/CPU-s | 0.0 | 0.019909 | 13.62MiB/CPU-s | 189.88KiB/CPU-s | 2.34KiB/CPU-s | 0.0 | 0.013612 | False | False |
| http_to_http_noack | ingress throughput | 5.32KiB/CPU-s | 0.04 | 60.43% | 13.61MiB/CPU-s | 383.3KiB/CPU-s | 4.71KiB/CPU-s | 0.0 | 0.027509 | 13.61MiB/CPU-s | 335.83KiB/CPU-s | 4.13KiB/CPU-s | 0.0 | 0.024093 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -221.86B/CPU-s | -0.00 | 2.60% | 13.61MiB/CPU-s | 389.99KiB/CPU-s | 4.8KiB/CPU-s | 0.0 | 0.027973 | 13.61MiB/CPU-s | 375.28KiB/CPU-s | 4.62KiB/CPU-s | 0.0 | 0.026918 | False | False |
| fluent_elasticsearch | ingress throughput | -6.36KiB/CPU-s | -0.01 | 80.93% | 45.41MiB/CPU-s | 30.1KiB/CPU-s | 375.05B/CPU-s | 0.0 | 0.000647 | 45.41MiB/CPU-s | 397.9KiB/CPU-s | 4.84KiB/CPU-s | 0.0 | 0.008557 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -2.98KiB/CPU-s | -0.02 | 50.51% | 13.62MiB/CPU-s | 240.97KiB/CPU-s | 2.96KiB/CPU-s | 0.0 | 0.017278 | 13.62MiB/CPU-s | 260.95KiB/CPU-s | 3.21KiB/CPU-s | 0.0 | 0.018715 | False | False |
| splunk_hec_indexer_ack_blackhole | ingress throughput | -3.04KiB/CPU-s | -0.02 | 51.23% | 13.62MiB/CPU-s | 243.68KiB/CPU-s | 3.0KiB/CPU-s | 0.0 | 0.017473 | 13.62MiB/CPU-s | 258.95KiB/CPU-s | 3.19KiB/CPU-s | 0.0 | 0.018572 | False | False |
| http_to_http_json | ingress throughput | -28.83KiB/CPU-s | -0.21 | 100.00% | 13.62MiB/CPU-s | 220.1KiB/CPU-s | 2.71KiB/CPU-s | 0.0 | 0.01578 | 13.59MiB/CPU-s | 257.55KiB/CPU-s | 3.17KiB/CPU-s | 0.0 | 0.018503 | False | False |
| datadog_agent_remap_datadog_logs | ingress throughput | -226.38KiB/CPU-s | -0.66 | 100.00% | 33.25MiB/CPU-s | 985.34KiB/CPU-s | 12.12KiB/CPU-s | 0.0 | 0.028939 | 33.03MiB/CPU-s | 955.42KiB/CPU-s | 11.76KiB/CPU-s | 0.0 | 0.028249 | False | False |
| datadog_agent_remap_blackhole_acks | ingress throughput | -429.94KiB/CPU-s | -1.34 | 100.00% | 31.23MiB/CPU-s | 1.17MiB/CPU-s | 14.7KiB/CPU-s | 0.0 | 0.037366 | 30.81MiB/CPU-s | 1.03MiB/CPU-s | 13.0KiB/CPU-s | 0.0 | 0.033486 | False | False |
| splunk_hec_route_s3 | ingress throughput | -211.76KiB/CPU-s | -1.77 | 100.00% | 11.67MiB/CPU-s | 529.24KiB/CPU-s | 6.51KiB/CPU-s | 0.0 | 0.044268 | 11.47MiB/CPU-s | 670.89KiB/CPU-s | 8.25KiB/CPU-s | 0.0 | 0.057129 | False | False |
| datadog_agent_remap_blackhole | ingress throughput | -594.0KiB/CPU-s | -1.86 | 100.00% | 31.18MiB/CPU-s | 1.18MiB/CPU-s | 14.82KiB/CPU-s | 0.0 | 0.037748 | 30.6MiB/CPU-s | 1.11MiB/CPU-s | 14.0KiB/CPU-s | 0.0 | 0.036302 | False | False |
| syslog_loki | ingress throughput | -193.93KiB/CPU-s | -2.20 | 100.00% | 8.62MiB/CPU-s | 206.6KiB/CPU-s | 2.54KiB/CPU-s | 0.0 | 0.023411 | 8.43MiB/CPU-s | 265.0KiB/CPU-s | 3.26KiB/CPU-s | 0.0 | 0.030704 | False | False |
| http_text_to_http_json | ingress throughput | -726.42KiB/CPU-s | -2.86 | 100.00% | 24.81MiB/CPU-s | 575.54KiB/CPU-s | 7.08KiB/CPU-s | 0.0 | 0.02265 | 24.1MiB/CPU-s | 1007.0KiB/CPU-s | 12.39KiB/CPU-s | 0.0 | 0.040797 | False | False |
| syslog_humio_logs | ingress throughput | -283.5KiB/CPU-s | -3.10 | 100.00% | 8.93MiB/CPU-s | 175.6KiB/CPU-s | 2.16KiB/CPU-s | 0.0 | 0.019193 | 8.66MiB/CPU-s | 354.6KiB/CPU-s | 4.36KiB/CPU-s | 0.0 | 0.039997 | False | False |
| file_to_blackhole | egress throughput | -1.37MiB/CPU-s | -19.36 | 100.00% | 7.09MiB/CPU-s | 3.95MiB/CPU-s | 128.97KiB/CPU-s | 0.0 | 0.556987 | 5.72MiB/CPU-s | 4.27MiB/CPU-s | 112.31KiB/CPU-s | 0.0 | 0.746478 | True | False |

@bruceg bruceg merged commit b809df1 into master Feb 27, 2023
@bruceg bruceg deleted the bruceg/dynamic-transforms branch February 27, 2023 14:41
Labels
- ci-condition: integration tests enable (Run integration tests on this PR)
- domain: config (Anything related to configuring Vector)
- domain: topology (Anything related to Vector's topology code)
- domain: transforms (Anything related to Vector's transform components)
- type: tech debt (A code change that does not add user value.)
2 participants