Skip to content

Performance Bottle Necks #10

@JonasIsensee

Description

@JonasIsensee

Most of the time both during searches and graph creation
is spent within distance computation and insert! & pop_nearest!
of the NeighborSets.

To reach the speed of the C++ implementation ( https://github.com/nmslib/hnsw )
both parts need to be sped up.
AFAICT the algorithm does the same as the C++ implementation so
my version shouldn't be doing more unnecessary iterations which
would have been the simplest explanation.

One way to speed up the process should be to use PriorityQueues
instead of my NeighborSets as the C++ uses these. All attempts with PriorityQueue have so far not been faster though. ( I used the PriorityQueue from DataStructures.jl )

The second point is the distance computation. I have no idea if it is possible or desirable to optimize this further. ATM we use the Distances.jl library.
The relevant code in C++ is here: https://github.com/nmslib/hnsw/blob/master/hnswlib/space_l2.h

EDIT: I investigated the @simd optimizations:
in hnsw the runtime is >2× longer for dim % 16 != 0 && dim % 4 != 0 as they implement simd
optimization for those two cases only.
In Julia I do not see this behaviour at all. Is the @simd macro so powerful that it that it works with other dimensions as well? (by adding 0s at the end?)

EDIT: PriorityQueue from DataStructures.jl is very old code in Julian terms (according to the git blame). Maybe it could be optimized now?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions