Allow skip cache factor to be updated dynamically #14412
Conversation
I left some comments, can you add a CHANGES entry too so that I can merge?
@jpountz Added a CHANGES entry.
I am wondering if we should also make …
What is your use case for tuning it? Related to my comment on the issue linked to this PR, I worry a bit about optimizing too much for query caching vs. making the query cache less relevant.

To expand a bit more on this: for "good" queries, like term queries or FieldExistsQuery, which produce good iterators with a low up-front cost, memory would be better spent on the filesystem cache than on the query cache. Boolean queries recently had big improvements that make the query cache less relevant (annotations HS, HW and HX at https://benchmarks.mikemccandless.com/CountAndHighHigh.html and https://benchmarks.mikemccandless.com/CountOrHighHigh.html). So that makes the query cache mostly relevant for queries that have a high up-front cost, like point queries and multi-term queries. Rather than investing in the query cache for these queries, I would rather invest in making them behave better, e.g. recent work on vectorizing this up-front cost for …
Thank you for your reply. I understand your perspective: you aim to optimize execution efficiency rather than focus on cache optimization. In certain rare scenarios, the query cache memory becomes full, and to rectify the situation we are currently compelled to restart the JVM, which differs from your line of thought. I'm wondering whether it would be feasible to expose all query cache APIs and eliminate any cache strategies within Lucene itself. If Lucene already has a query cache, then upper-level services like Elasticsearch or OpenSearch could refrain from implementing their own query cache. Please ignore this suggestion if it doesn't seem viable.
Sorry, I'm not sure I get your suggestion.
@jpountz Thanks for expanding on the reasoning above w.r.t. query cache usage. I was working on refactoring the query cache so that it is not a bottleneck itself when utilized well. Maybe we can add an additional caching policy that takes the up-front cost or query latency into account, so that we only cache queries that actually give us a benefit. Let me know if it's still worth a shot; meanwhile, we can work on optimizing the queries themselves. I can open a PR for that soon.
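Not something proposed in this PR, but as a rough illustration of the kind of cost-aware policy suggested above, here is a minimal sketch of a custom `QueryCachingPolicy` that only considers query types with a high up-front cost and defers frequency tracking to the default policy. The class name and the choice of query types are assumptions for illustration only:

```java
import java.io.IOException;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.PointRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryCachingPolicy;
import org.apache.lucene.search.UsageTrackingQueryCachingPolicy;

/**
 * Illustrative policy: only consider caching queries with a high up-front cost
 * (point and multi-term queries) and delegate the usual frequency-based
 * decision to UsageTrackingQueryCachingPolicy.
 */
public class HighUpfrontCostCachingPolicy implements QueryCachingPolicy {
  private final UsageTrackingQueryCachingPolicy delegate = new UsageTrackingQueryCachingPolicy();

  @Override
  public void onUse(Query query) {
    delegate.onUse(query);
  }

  @Override
  public boolean shouldCache(Query query) throws IOException {
    boolean highUpfrontCost =
        query instanceof PointRangeQuery || query instanceof MultiTermQuery;
    return highUpfrontCost && delegate.shouldCache(query);
  }
}
```

Such a policy could then be installed per searcher via `IndexSearcher#setQueryCachingPolicy`.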
Description
Related issue - #14183
This change allows the skip cache factor to be updated dynamically within the LRU query cache. This can be done via a getter and setter.
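A minimal usage sketch of the change, assuming the accessors are named `getSkipCacheFactor`/`setSkipCacheFactor` as the description implies (the cache sizes and factor value below are arbitrary examples):

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.LRUQueryCache;

public class DynamicSkipCacheFactorExample {
  public static void main(String[] args) {
    // Cache at most 1000 queries and at most 32 MB of cache entries.
    LRUQueryCache queryCache = new LRUQueryCache(1000, 32 * 1024 * 1024);
    IndexSearcher.setDefaultQueryCache(queryCache);

    // Assumed accessors added by this change: inspect the current skip cache
    // factor and raise it at runtime (e.g. when the cache is under memory
    // pressure) without recreating the cache or restarting the JVM.
    float current = queryCache.getSkipCacheFactor();
    queryCache.setSkipCacheFactor(current * 2);
  }
}
```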