Add 3rd minimal whoosh index for performance #1877
My guess is that you considered removing the content from the latest_revs index. Is it easier to create a third index, or did you do some benchmarking?
Some time ago I tested removing the content; doing so affects searching and search results. IMO the different NGRAM fields have the biggest impact on the index size. This solution with a third index should not change the search functionality.
With the latest commit the recommended changes have been applied. Running
@ThomasWaldmann, @RogerHaase, may I ask you to review this PR? IMO this change is urgently needed for large wikis (e.g. python.org) to get reasonable response times. Thank you.
Sorry for the delay. Busy with other things, no time for moin. For my wiki on Windows with 900+ items, response times for the +index view dropped from about 5 seconds to about 2 seconds. Very nice fix.
Related to #1725.
After updating to this version, the indexes need to be dropped and rebuilt:
This will take some time for large wikis.
The index-build subcommand will create a new third index called LATEST_META. This index will be much smaller than LATEST_REVS and will only contain the usual metadata fields, but no content or content ngrams.
Many queries to check the existence of an item or to check ACL rights only require a few metadata fields. Queries against this small index are very fast and improve response times for large wikis.
In the code, the new index is referred to as LATEST_META, and the parameter that selects it in various methods and functions is called “short”.
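To illustrate the idea, here is a minimal, stdlib-only sketch of how a “short” flag might route metadata-only lookups to the smaller index. It is not the code from this PR: the `ToyIndexes` class, its `add` and `document` methods, and the `CONTENT_FIELDS` set are hypothetical names chosen for the example, and the in-memory lists merely stand in for the real whoosh indexes.

```python
from dataclasses import dataclass, field

# Index names as used in this PR; everything else below is hypothetical.
LATEST_REVS = "latest_revs"
LATEST_META = "latest_meta"

# Assumed split: these fields are stored in LATEST_REVS only.
CONTENT_FIELDS = {"content", "content_ngram"}


@dataclass
class ToyIndexes:
    """In-memory stand-in for the two 'latest' indexes (not moin's real classes)."""

    docs: dict = field(
        default_factory=lambda: {LATEST_REVS: [], LATEST_META: []}
    )

    def add(self, **fields):
        # The full document goes into LATEST_REVS ...
        self.docs[LATEST_REVS].append(dict(fields))
        # ... and a copy without content fields goes into LATEST_META.
        meta_only = {k: v for k, v in fields.items() if k not in CONTENT_FIELDS}
        self.docs[LATEST_META].append(meta_only)

    def document(self, short=False, **query):
        # short=True selects the small metadata-only index.
        name = LATEST_META if short else LATEST_REVS
        for doc in self.docs[name]:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None


indexes = ToyIndexes()
indexes.add(name="Home", acl="All:read", content="Welcome to the wiki")

# An existence or ACL check only needs metadata, so it can use short=True.
home_meta = indexes.document(short=True, name="Home")
# A caller that needs the content leaves short at its default.
home_full = indexes.document(name="Home")
```

In this sketch, `home_meta` carries only the `name` and `acl` fields, while `home_full` also includes `content`. Callers that need content or full-text search keep using the default, which is why search behaviour should be unaffected.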
If you revert to a version prior to this change, you should delete and recreate the indexes. Advanced users can remove the new index files (latest_meta) in the wiki/index directory instead.