Fast nearest-neighbor queries in Julia using NearestNeighbors.jl
12-01, 09:55–10:25 (Europe/Amsterdam), Ernst-Curie

NearestNeighbors.jl is a Julia package that offers fast nearest-neighbor queries in Julia by using space-partitioning data structures such as kd-trees and ball trees. This talk will give a short background on these tree data structures as well as show some specific implementation details that have been used to further improve performance.


NearestNeighbors.jl is a Julia package that offers fast nearest-neighbor queries in Julia by using space-partitioning data structures such as kd-trees and ball trees. These types of data structures are useful when you want to find for example the k nearest neighbors to some point (knn) or find all points within a certain distance from some other point.

Even though the data structure itself can improve performance by algorithmic improvements, e.g. by making a code go from a time complexity of O(n^2) to O(nlog(n)) in the input size n, careful considerations of the implementation can still have a large impact on the end performance.

Among others, these are some of the optimizations that NearestNeighbors.jl uses:

  • Spatial locality: data that is likely to be accessed together is also stored in memory together, improving cache usage.
  • Avoiding unnecessary computations: if all you need to know is that one distance is larger than the other you can sometimes avoid computing the actual distance, for example leaving out a square root computation in the Euclidean distance.
  • Specializing on input dimensions: By compiling specialized functions for points of certain dimensions the compiler can take advantage of that and generate optimized code.

This talk will give a short background on these tree data structures as well as show some specific implementation details that have been used to further improve performance.

Software engineer at JuliaHub
Long time contributor to Julia and the Julia package ecosystem.