Clustering performance improvements #5319
Thanks so much for your contribution @KristinnVikar, we appreciate it! We also have a Discord server, which you're more than welcome to join. It's a great place to connect with fellow contributors and stay updated with the latest developments!
Using HashMap-based clustering reduces clustering time even further! However, I'm seeing a slight regression with the change: I get "Templates clustered: 1544" with it and "Templates clustered: 1550" without it, so 6 templates no longer cluster together. I also did some further measurements: clustering originally took 14000ms (a single-threaded bottleneck) and now takes ~18ms, about 777x faster.
LGTM!
Nice work @KristinnVikar, this reduced clustering time by a lot. After adding time.Now() to benchmark that particular function on my setup, I found:
Before
[FTL] Clustering took 3.30753675s: total: 6675
After
[FTL] Clustering took 8.236709ms: total: 6679
That's almost 402x faster!
Note
There are more clusters compared to dev/latest, but that's OK (and arguably better). I also added more clustering conditions to ssl to avoid issues.
Proposed changes
Reduce clustering from `O(n^2)`, where n is the total number of templates loaded, to `O(n)`.
On my machine, this reduces total nuclei startup time from 19s -> 8.5s (2.23x faster!) (Edit: re-did timing tests)
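To illustrate the general idea behind going from pairwise comparison to a single pass, here is a minimal sketch of map-based grouping. It is not the code in this PR: the `request` type, `clusterKey`, and `clusterByKey` are hypothetical names, and the assumption that method plus path decide whether two requests can be clustered is made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// request is a hypothetical, simplified stand-in for a template's request.
type request struct {
	ID     string
	Method string
	Path   string
}

// clusterKey returns a string that is equal for two requests exactly when
// they may share a cluster (assumption: method and path decide this).
func clusterKey(r request) string {
	return strings.ToUpper(r.Method) + " " + r.Path
}

// clusterByKey groups request IDs in one pass over the input: n map
// insertions instead of comparing every pair of requests.
func clusterByKey(reqs []request) map[string][]string {
	clusters := make(map[string][]string, len(reqs))
	for _, r := range reqs {
		k := clusterKey(r)
		clusters[k] = append(clusters[k], r.ID)
	}
	return clusters
}

func main() {
	reqs := []request{
		{"a", "GET", "/"},
		{"b", "get", "/"},
		{"c", "POST", "/login"},
	}
	clusters := clusterByKey(reqs)

	// Sort keys so the printed order is deterministic.
	keys := make([]string, 0, len(clusters))
	for k := range clusters {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, clusters[k])
	}
}
```

One design consequence: a key-based scheme only groups items whose keys are exactly equal, so it can disagree with a pairwise predicate if the key captures slightly more or less than that predicate checked. That is one plausible, unconfirmed explanation for small differences in cluster counts.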
Checklist