[Question] Increased metric pull rates from v49 to v60 #1622

Open
boyansiromahov-WM opened this issue Jan 14, 2025 · 6 comments

Comments

@boyansiromahov-WM

Hi,
We recently updated our yace version from v0.49.0 to v0.60.0 and noticed that our cost has increased by about 50% using the same configs.
I'm wondering if there have been any changes that caused yace to pull more metrics. AWS confirmed that we are making more GetMetricData (GMD) calls and requesting more metrics overall.

In our configs we run:

  • 60s scrape interval
  • 60s period, 60s length, 120-300s delay.

We can see that if we revert our version back to v0.49.0, the costs go back down. Any help on this topic would be greatly appreciated.
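For context on how the bill relates to the settings above: CloudWatch bills GetMetricData per metric requested, so a job's monthly cost scales roughly with resources × statistics × scrapes. A back-of-envelope sketch with hypothetical numbers, assuming the commonly quoted ~$0.01 per 1,000 metrics requested (check current pricing for your region):

# hypothetical fleet size and job shape; substitute your own numbers
instances=100            # resources matched by the discovery job
stats_per_metric=2       # e.g. Average + Maximum
scrapes_per_month=43200  # 60s scrape interval over 30 days
echo "$instances $stats_per_metric $scrapes_per_month" |
  awk '{printf "~$%.2f/month in GetMetricData charges\n", $1 * $2 * $3 / 1000 * 0.01}'

Any change that makes the exporter request more metrics per scrape shows up linearly in that figure, which is why comparing the requested-metric counts between versions is the quickest way to localize a cost regression.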

@kgeckhart
Contributor

There have been a rather substantial number of changes between those releases: v0.49.0...v0.60.0.

There was a substantial refactor in the way resources are associated with metrics. It's very possible the old version was filtering out a lot of resources that didn't match. If you could try some versions in between, it would help narrow things down a bit. Perhaps v0.58.0, which is before some refactoring to how queries are batched? It also includes a new yace_cloudwatch_getmetricdata_metrics_requested_total metric, which attempts to give more insight into the total number of metrics requested.
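A minimal way to read that counter off a running exporter (this assumes yace's default listen address of :5000; adjust for your deployment):

# dump yace's own telemetry and pull out the GetMetricData counters
curl -s http://localhost:5000/metrics | grep yace_cloudwatch_getmetricdata

Watching yace_cloudwatch_getmetricdata_metrics_requested_total grow over a few scrape cycles on each version gives a direct comparison without waiting for the AWS bill.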

@fabiiw05
Contributor

fabiiw05 commented Feb 7, 2025

Hi,

We were using v0.57.1, but after switching to Grafana Alloy’s prometheus.exporter.cloudwatch, we observed the same cost increase.

Currently, we are using Grafana Alloy v1.5.0, which includes yace v0.61.0.

v0.57.1...v0.60.0

@kgeckhart
Contributor

kgeckhart commented Feb 7, 2025

Can you provide some more info on the configuration you're using? I ran v0.57.1 and v0.61.0 with

apiVersion: v1alpha1
discovery:
  jobs:
    - type: AWS/EC2
      regions: [us-east-2]
      includeContextOnInfoMetrics: true
      metrics:
        - name: CPUUtilization
          statistics:
            - Average

and both produce the same number of metrics requested and API calls:

cat v0.61.0.txt v0.57.1.txt | grep yace_cloudwatch_getmetricdata
# HELP yace_cloudwatch_getmetricdata_metrics_requested_total Number of metrics requested from the CloudWatch GetMetricData API which is how AWS bills
# TYPE yace_cloudwatch_getmetricdata_metrics_requested_total counter
yace_cloudwatch_getmetricdata_metrics_requested_total 362
# HELP yace_cloudwatch_getmetricdata_requests_total DEPRECATED: replaced by yace_cloudwatch_requests_total with api_name label
# TYPE yace_cloudwatch_getmetricdata_requests_total counter
yace_cloudwatch_getmetricdata_requests_total 1
# HELP yace_cloudwatch_getmetricdata_metrics_total Help is not implemented yet.
# TYPE yace_cloudwatch_getmetricdata_metrics_total counter
yace_cloudwatch_getmetricdata_metrics_total 362
# HELP yace_cloudwatch_getmetricdata_requests_total Help is not implemented yet.
# TYPE yace_cloudwatch_getmetricdata_requests_total counter
yace_cloudwatch_getmetricdata_requests_total 1

@boyansiromahov-WM
Author

boyansiromahov-WM commented Feb 11, 2025

Hi, the config we are currently using is this:

- type: AWS/EC2
  addCloudwatchTimestamp: true
  regions:
    - us-east-1
  roles:
    - roleArn:
  period: 60
  length: 60
  delay: 180
  metrics:
    - name: CPUUtilization
      statistics:
        - Average
        - Maximum

We've noticed the increase across all metrics. All of our configs follow a similar pattern to the one above. We also tried v0.58.0, but that kept the costs the same as v0.60.0. I will try v0.57.1 this week to see if that drops the cost back down.
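A sketch of how the two versions could be compared directly instead of waiting on the bill, assuming two exporter instances run side by side with the same config on hypothetical ports 5000 and 5001 (note that v0.57.1 names the counter yace_cloudwatch_getmetricdata_metrics_total rather than ..._metrics_requested_total, as in the output above):

# let both instances run for a few scrape cycles, then snapshot their counters
curl -s http://localhost:5000/metrics | grep yace_cloudwatch_getmetricdata > v0.57.1.txt
curl -s http://localhost:5001/metrics | grep yace_cloudwatch_getmetricdata > v0.60.0.txt
diff v0.57.1.txt v0.60.0.txt

A materially higher count on one side points at that version as the source of the extra GetMetricData volume.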

@kgeckhart
Contributor

🤔 v0.58.0 was before any of the changes that were intended to change how the requests are batched. There's PR #1325, but it only moved existing logic.

@boyansiromahov-WM
Author

I've had v0.57.1 running for a few days now and there hasn't been any change in the cost compared to v0.58.0. I'll keep trying versions this week to see if I can find out when the metrics go up.
