Duplicated sample points returned by query-frontend #3920

Closed
alvinlin123 opened this issue Mar 8, 2021 · 12 comments

alvinlin123 (Contributor) commented Mar 8, 2021

Describe the bug

I have a continuous test running that queries Cortex. After upgrading from commit af9e20c to dd6dbf9, we observed that around 16:00:00 Pacific time our test would fail because duplicate sample points were returned:

curl --location -g --request POST 'http://localhost:8080/prometheus/api/v1/query_range?query=my_metrics_name&start=1614556800&end=1614557160&step=60s'
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "my_metrics_name"
        },
        "values": [
          [
            1614556800,
            "96"
          ],
          [
            1614556860,
            "141"
          ],
          [
            1614556800,
            "96"
          ],
          [
            1614556860,
            "141"
          ],
          [
            1614556920,
            "141"
          ],
          [
            1614556980,
            "141"
          ],
          [
            1614557040,
            "141"
          ],
          [
            1614557100,
            "141"
          ],
          [
            1614557160,
            "185"
          ]
        ]
      }
    ]
  }
} 

Notice how the sample points with timestamps 1614556800 and 1614556860 are duplicated.

Our continuous test is quite simple: it runs periodically to push some data to and query some data from Cortex. Our test is not using HA mode.

To Reproduce
I am not sure how to reproduce it, but one pattern is that it seems to happen every day around 16:00:00 Pacific time (00:00:00 GMT). The queries I had issues with are queries for past data points, such as the following (a sketch of issuing such a query follows the list):

  • query data from 24 hours ago to 24 hours + 6 minutes ago (this query will land in the ingesters).
  • query data from 48 hours ago to 48 hours + 6 minutes ago (this query will not land in the ingesters).
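For reference, here is a minimal sketch (in Go, with a placeholder metric name and endpoint taken from the curl examples in this issue, not our actual test code) of issuing such a query:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
	"time"
)

func main() {
	// Query a 6-minute window ending 24 hours ago; changing 24 to 48 gives the
	// second case, which no longer hits the ingesters.
	end := time.Now().Add(-24 * time.Hour).Truncate(time.Minute)
	start := end.Add(-6 * time.Minute)

	form := url.Values{}
	form.Set("query", "my_metrics_name") // placeholder metric name
	form.Set("start", fmt.Sprintf("%d", start.Unix()))
	form.Set("end", fmt.Sprintf("%d", end.Unix()))
	form.Set("step", "60s")

	// Placeholder endpoint, matching the curl examples above.
	req, err := http.NewRequest(http.MethodPost,
		"http://localhost:8080/prometheus/api/v1/query_range",
		strings.NewReader(form.Encode()))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	req.Header.Set("X-Scope-OrgID", "tenant_id") // only needed when auth is enabled

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```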

Also, if I keep making the same query multiple times, the issue goes away after a few tries for that specific query; the problem comes back the next day at 16:00:00 Pacific time.

Expected behavior
I don't expect sample points to be duplicated because:

  1. Cortex rejects duplicated sample points on the write path when not using HA mode.
  2. The query path should deduplicate results from different sources (see the sketch after this list).
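To illustrate point 2, here is a minimal sketch of what I mean by timestamp-based de-duplication when merging samples from two sources; the types and function are simplified stand-ins, not Cortex's actual implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a simplified (timestamp, value) pair, analogous to one entry in
// the "values" arrays above.
type sample struct {
	ts  int64
	val string
}

// mergeDedup merges two already-sorted sample slices (e.g. results from the
// ingesters and from long-term storage) and keeps one sample per timestamp.
// This only sketches the expected behaviour.
func mergeDedup(a, b []sample) []sample {
	merged := append(append([]sample{}, a...), b...)
	sort.Slice(merged, func(i, j int) bool { return merged[i].ts < merged[j].ts })

	out := merged[:0]
	for _, s := range merged {
		if len(out) == 0 || out[len(out)-1].ts != s.ts {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	ingester := []sample{{1614556800, "96"}, {1614556860, "141"}}
	store := []sample{{1614556800, "96"}, {1614556860, "141"}, {1614556920, "141"}}
	fmt.Println(mergeDedup(ingester, store)) // one sample per timestamp
}
```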

Environment:

  • Infrastructure: k8s
  • Deployment tool: helm
  • I am using query front-end

Storage Engine

  • Blocks
  • Chunks

bboreham (Contributor) commented Mar 8, 2021

Could you clarify whether you are using the query-frontend component, and if so, whether the symptom persists if you call the querier directly?

jeromeinsf (Contributor) commented Mar 8, 2021

Yes, query-frontend is being used.

alvinlin123 (Contributor, Author) commented:

> Could you clarify whether you are using the query-frontend component, and if so, whether the symptom persists if you call the querier directly?

I tried querying all the queriers directly after I saw the bug through the query-frontend, but was not able to reproduce the bug.

alvinlin123 (Contributor, Author) commented Mar 10, 2021

One more data point. I was able to reproduce the issue only when querying the query-frontend. As soon as I saw duplicate sample points from the query-frontend, I made the exact same query against all the queriers, but the queriers never returned duplicated samples. So basically here is what I did:

  1. port-forward to query-frontend-x
  2. make a query_range call against the query-frontend; got duplicate samples
  3. make the same query_range call against ALL the queriers; didn't get duplicate samples
  4. make the same query_range call against the same query-frontend; got duplicate samples again

Below are the curl commands I ran. Port 7777 goes to a querier, whereas port 9999 goes to a single query-frontend pod.

▶ curl --location -g --request POST "http://localhost:9999/prometheus/api/v1/query_range?query=my_metrics&start=1615161360&end=1615161720&step=60s" -H "X-Scope-OrgId: tenant_id"  -d '{}' -v  | jq

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "my_metrics"
        },
        "values": [
          [
            1615161360,
            "242"
          ],
          [
            1615161420,
            "29"
          ],
          [
            1615161480,
            "29"
          ],
          [
            1615161540,
            "29"
          ],
          [
            1615161600,
            "29"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ]
        ]
      }
    ]
  }
}

▶ curl --location -g --request POST "http://localhost:7777/prometheus/api/v1/query_range?query=my_metrics&start=1615161360&end=1615161720&step=60s" -H "X-Scope-OrgId: tenant_id"  -d '{}' -v  | jq

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "my_metrics"
        },
        "values": [
          [
            1615161360,
            "242"
          ],
          [
            1615161420,
            "29"
          ],
          [
            1615161480,
            "29"
          ],
          [
            1615161540,
            "29"
          ],
          [
            1615161600,
            "29"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ]
        ]
      }
    ]
  }
}

▶ kill -3 $(pgrep -f '^kubectl.* port-forward .* 7777')
▶ curl --location -g --request POST "http://localhost:7777/prometheus/api/v1/query_range?query=my_metrics&start=1615161360&end=1615161720&step=60s" -H "X-Scope-OrgId: tenant_id"  -d '{}' -v  | jq

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "my_metrics"
        },
        "values": [
          [
            1615161360,
            "242"
          ],
          [
            1615161420,
            "29"
          ],
          [
            1615161480,
            "29"
          ],
          [
            1615161540,
            "29"
          ],
          [
            1615161600,
            "29"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ]
        ]
      }
    ]
  }
}

▶ kill -3 $(pgrep -f '^kubectl.* port-forward .* 7777')

▶ curl --location -g --request POST "http://localhost:7777/prometheus/api/v1/query_range?query=my_metrics&start=1615161360&end=1615161720&step=60s" -H "X-Scope-OrgId: tenant_id"  -d '{}' -v  | jq

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "my_metrics"
        },
        "values": [
          [
            1615161360,
            "242"
          ],
          [
            1615161420,
            "29"
          ],
          [
            1615161480,
            "29"
          ],
          [
            1615161540,
            "29"
          ],
          [
            1615161600,
            "29"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ]
        ]
      }
    ]
  }
}

▶ curl --location -g --request POST "http://localhost:9999/prometheus/api/v1/query_range?query=my_metrics&start=1615161360&end=1615161720&step=60s" -H "X-Scope-OrgId: tenant_id"  -d '{}' -v  | jq

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "my_metrics"
        },
        "values": [
          [
            1615161360,
            "242"
          ],
          [
            1615161420,
            "29"
          ],
          [
            1615161480,
            "29"
          ],
          [
            1615161540,
            "29"
          ],
          [
            1615161600,
            "29"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ],
          [
            1615161660,
            "29"
          ],
          [
            1615161720,
            "73"
          ]
        ]
      }
    ]
  }
}

@gouthamve changed the title from "Duplicated sample points returned by querier" to "Duplicated sample points returned by query-frontend" on Mar 11, 2021

jeromeinsf (Contributor) commented:

From the Cortex community call, wondering if this would be in or around https://github.com/cortexproject/cortex/blob/master/pkg/querier/queryrange/split_by_interval.go#L43

alvinlin123 (Contributor, Author) commented:

OK, one more data point I just discovered: if I set cache_results to true I get the duplicate samples; if I set cache_results to false, I can't reproduce the duplicate sample issue. Hmmm.

alvinlin123 (Contributor, Author) commented Mar 17, 2021

I just noticed something weird. With the old commit af9e20c the size of the cache entry doesn't change. But after I upgraded to commit dd6dbf9, I started to see the cache item size grow every time I make the same query:

key=cc4fd9127b36005e exp=-1 la=1615952893 cas=1181 fetch=no cls=21 size=7806
key=cc4fdc127b360577 exp=-1 la=1615952893 cas=1182 fetch=no cls=22 size=10308
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952896 cas=1183 fetch=no cls=21 size=7860
key=cc4fdc127b360577 exp=-1 la=1615952896 cas=1184 fetch=no cls=22 size=10380
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952896 cas=1183 fetch=no cls=21 size=7860
key=cc4fdc127b360577 exp=-1 la=1615952896 cas=1184 fetch=no cls=22 size=10380
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952902 cas=1188 fetch=no cls=21 size=7968
key=cc4fdc127b360577 exp=-1 la=1615952902 cas=1187 fetch=no cls=22 size=10524
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952902 cas=1188 fetch=no cls=21 size=7968
key=cc4fdc127b360577 exp=-1 la=1615952902 cas=1187 fetch=no cls=22 size=10524
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952908 cas=1191 fetch=no cls=21 size=8076
key=cc4fdc127b360577 exp=-1 la=1615952908 cas=1192 fetch=no cls=22 size=10668
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952908 cas=1191 fetch=no cls=21 size=8076
key=cc4fdc127b360577 exp=-1 la=1615952908 cas=1192 fetch=no cls=22 size=10668
END
lru_crawler metadump all
key=cc4fd9127b36005e exp=-1 la=1615952911 cas=1193 fetch=no cls=21 size=8130
key=cc4fdc127b360577 exp=-1 la=1615952911 cas=1194 fetch=no cls=22 size=10740
END

Note that when cas changes, size increases. I don't think this is normal, is it? I suspect that commit 50ab740 introduced the growing cache entry size behaviour, but I need to take a closer look.

The range query I am making has start=1614556800 and end=1614556860, which crosses a day boundary in GMT.

With the above in mind, I was somehow able to reproduce the duplicate sample issue when I downgraded from the newer commit to the older commit; I guess that is because the cached entry contains lots of duplicate samples?

pracucci (Contributor) commented:

> I suspect that commit 50ab740 introduced the growing cache entry size behaviour, but I need to take a closer look.

We noticed another issue with the behaviour introduced by 50ab740, and Goutham fixed it in #3818. Could you test whether #3818 solves your issue too? I would just like to rule out that the issue has already been (unintentionally) fixed in master.

alvinlin123 (Contributor, Author) commented:

I actually tested with #3818 included. Specifically, I was testing with e02797a and still saw the cache entry size increase.

alvinlin123 (Contributor, Author) commented Mar 17, 2021

OK, so after looking closer, 50ab740 did indeed cause the issue where the cache entry size keeps increasing. Consider the following:

We are making a range query where

  • start: 1615161360000
  • end: 1615161720000
  • step: 60000 (1 minute)

Because the above time range crosses a day boundary in GMT, the query splitter will split it into two queries (a sketch of this splitting follows the list):

  • Start: 1615161360000, End: 1615161540000 (3-minute interval)
  • Start: 1615161600000, End: 1615161720000 (2-minute interval)
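Here is a rough sketch of that day-boundary splitting (not the actual split_by_interval code; timestamps in milliseconds, splits aligned to the step). Running it with the query above reproduces the two sub-ranges:

```go
package main

import "fmt"

// splitByDay sketches the splitting described above: a [start, end] range
// query is cut at UTC day boundaries, with each sub-query's end aligned to
// the step. It is not the real Cortex implementation.
func splitByDay(start, end, step int64) [][2]int64 {
	const day = 24 * 60 * 60 * 1000
	var splits [][2]int64
	for start <= end {
		// Last step-aligned timestamp strictly before the next day boundary.
		boundary := (start/day + 1) * day
		subEnd := boundary - ((boundary - start) % step)
		if subEnd >= boundary {
			subEnd = boundary - step
		}
		if subEnd > end {
			subEnd = end
		}
		splits = append(splits, [2]int64{start, subEnd})
		start = subEnd + step
	}
	return splits
}

func main() {
	// The query from this comment: step = 60s.
	fmt.Println(splitByDay(1615161360000, 1615161720000, 60000))
	// => [[1615161360000 1615161540000] [1615161600000 1615161720000]]
}
```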

Note that both queries cover an interval of less than 5 minutes, so these 2 queries get passed further down to the results cache middleware. Let's assume we have made the same query before, so there are result cache entries for the 2 queries.

Both queries will go through the handleHit method with the existing cached data, and then through the partition method.

Because the queries cover less than 5 minutes, this line of code will not use the cached data, and partition will return a Request object covering each full query.

handleHit then goes ahead and executes the Request returned by partition and appends the result to the existing cached data. But here comes the problem: the existing cached data already contains exactly the same data, so by appending the result of the Request we create a duplicate Extent in the cache entry. The more we execute the same query, the more duplicate Extents get added to the cache entry; hence the ever-growing cache entry size.

Before 50ab740, I think the expectation was that the partition method returns requests only for data that is not in the cache. Commit 50ab740 breaks that expectation because, after the change, partition can return requests for data that is already in the cache.
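To make this concrete, here is a simplified sketch of the failure mode, using toy extent/request types and a hard-coded 5-minute threshold rather than the real results-cache structures:

```go
package main

import "fmt"

// extent is a simplified stand-in for a cached sub-range of query results.
type extent struct{ start, end int64 }

// request is a simplified stand-in for a query_range sub-request.
type request struct{ start, end int64 }

const minCacheExtentMs = 5 * 60 * 1000 // extents shorter than this are ignored

// partition mimics the post-50ab740 behaviour described above: cached extents
// shorter than the 5-minute threshold are skipped, so the whole request is
// returned to be re-executed even though the cache already covers it.
func partition(req request, cached []extent) []request {
	start := req.start
	var requests []request
	for _, ext := range cached {
		if ext.end-ext.start < minCacheExtentMs {
			continue // cached data ignored; nothing marks this range as covered
		}
		if ext.end < start || ext.start > req.end {
			continue
		}
		if ext.start > start {
			requests = append(requests, request{start, ext.start})
		}
		start = ext.end
	}
	if start < req.end {
		requests = append(requests, request{start, req.end})
	}
	return requests
}

func main() {
	// A 3-minute sub-query (below the 5-minute threshold) that is already cached.
	req := request{1615161360000, 1615161540000}
	cached := []extent{{1615161360000, 1615161540000}}

	// handleHit (simplified): execute whatever partition returns and append it
	// to the cached extents. Because partition ignored the short extent, every
	// repeat of the same query appends another identical extent.
	for i := 0; i < 3; i++ {
		for _, r := range partition(req, cached) {
			cached = append(cached, extent{r.start, r.end})
		}
	}
	fmt.Println(len(cached), cached) // the cache entry keeps growing
}
```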

I am working on a PR to fix this.

bboreham (Contributor) commented:

What is left to fix here after #3968?

alvinlin123 (Contributor, Author) commented Aug 20, 2021

I am not seeing any more duplicate sample points now, so I will go ahead and resolve this issue, but I am still not sure how it was fixed :) Something must have changed in the de-duplication logic, together with #3968, that fixes this duplicate sample issue.
