-
Notifications
You must be signed in to change notification settings - Fork 6
Description
Most of the time both during searches and graph creation
is spent within distance
computation and insert!
& pop_nearest!
of the NeighborSet
s.
To reach the speed of the C++ implementation ( https://github.com/nmslib/hnsw )
both parts need to be sped up.
AFAICT the algorithm does the same as the C++ implementation so
my version shouldn't be doing more unnecessary iterations which
would have been the simplest explanation.
One way to speed up the process should be to use PriorityQueue
s
instead of my NeighborSet
s as the C++ uses these. All attempts with PriorityQueue
have so far not been faster though. ( I used the PriorityQueue
from DataStructures.jl
)
The second point is the distance computation. I have no idea if it is possible or desirable to optimize this further. ATM we use the Distances.jl
library.
The relevant code in C++ is here: https://github.com/nmslib/hnsw/blob/master/hnswlib/space_l2.h
EDIT: I investigated the @simd
optimizations:
in hnsw the runtime is >2× longer for dim % 16 != 0 && dim % 4 != 0
as they implement simd
optimization for those two cases only.
In Julia I do not see this behaviour at all. Is the @simd
macro so powerful that it that it works with other dimensions as well? (by adding 0
s at the end?)
EDIT: PriorityQueue
from DataStructures.jl
is very old code in Julian terms (according to the git blame). Maybe it could be optimized now?