dillondaudert/NearestNeighborDescent.jl

NearestNeighborDescent.jl

A Julia implementation of Nearest Neighbor Descent.

Dong, Wei et al. Efficient K-Nearest Neighbor Graph Construction for Generic Similarity Measures. WWW (2011).

Overview

Nearest Neighbor Descent (NNDescent) is an approximate K-nearest neighbor graph construction algorithm that has several useful properties:

  • general: works with arbitrary dissimilarity functions
  • scalable: has an empirical complexity of O(n^1.14) pairwise comparisons for a dataset of size n
  • space efficient: the only data structure required is an approximate KNN graph which is operated on in-place and is also the final output
  • accurate: converges to above 90% recall while only comparing each data point to a small percentage of the whole dataset on average

NNDescent is based on the heuristic argument that a neighbor of a neighbor is also likely to be a neighbor. That is, given a list of approximate nearest neighbors to a point, we can improve that list by exploring the neighbors of each point in the list. The algorithm is in essence the repeated application of this principle.
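
The "neighbor of a neighbor" principle above can be sketched in a few lines of plain Julia. This is a deliberately simplified illustration, not the package's implementation and not the full algorithm from Dong et al.: the names `naive_nndescent`, `try_insert!`, and `sqeuclid` are hypothetical, the distance is fixed to squared Euclidean, and the refinements a practical implementation would need for speed are left out.

```julia
using Random

# Squared Euclidean distance; any dissimilarity function could be swapped in.
sqeuclid(a, b) = sum(abs2, a .- b)

# Try to place candidate `j` (at distance `d`) into one point's neighbor list,
# which is kept sorted by distance. Returns true if the list changed.
function try_insert!(ids::Vector{Int}, dists::Vector{Float64}, j::Int, d::Float64)
    (d >= dists[end] || j in ids) && return false
    pos = searchsortedfirst(dists, d)
    insert!(ids, pos, j)
    insert!(dists, pos, d)
    pop!(ids)      # drop the former worst neighbor
    pop!(dists)
    return true
end

function naive_nndescent(data::Vector{<:AbstractVector}, k::Int;
                         max_iters::Int=10, rng=Random.default_rng())
    n = length(data)
    ids = Vector{Vector{Int}}(undef, n)
    dists = Vector{Vector{Float64}}(undef, n)
    # Start from k random neighbors per point, excluding the point itself.
    for i in 1:n
        cand = shuffle(rng, setdiff(1:n, i))[1:k]
        d = [Float64(sqeuclid(data[i], data[j])) for j in cand]
        p = sortperm(d)
        ids[i], dists[i] = cand[p], d[p]
    end
    for _ in 1:max_iters
        updates = 0
        for i in 1:n
            # Snapshot i's neighbors, then explore the neighbors of each one.
            for j in copy(ids[i])
                for l in copy(ids[j])
                    l == i && continue
                    d = Float64(sqeuclid(data[i], data[l]))
                    updates += try_insert!(ids[i], dists[i], l, d)
                    updates += try_insert!(ids[l], dists[l], i, d)
                end
            end
        end
        updates == 0 && break   # no list improved: converged
    end
    return ids, dists
end
```

Each pass can only replace a point's worst neighbor with a closer one, so the lists improve monotonically and the loop stops once a full pass makes no update.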

Installation

]add NearestNeighborDescent

Basic Usage

Approximate kNN graph construction on a dataset:

using NearestNeighborDescent
using Distances
data = [rand(20) for _ in 1:1000]
n_neighbors = 10
metric = Euclidean()
graph = nndescent(data, n_neighbors, metric)

The approximate kNNs of the original dataset can be retrieved from the resulting graph with:

# Return the approximate kNNs as KxN matrices of indices and distances, where
# indices[j, i] and distances[j, i] are the index of and distance to node i's
# jth nearest neighbor, respectively.
indices, distances = knn_matrices(graph)
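
One way to sanity-check a result in this KxN layout is to compare it against an exact brute-force baseline and compute recall. The sketch below uses only Base Julia. `brute_force_knn` and `recall` are hypothetical helpers, not part of the package, and the baseline assumes that a point is not counted as its own neighbor.

```julia
# Exact kNN by brute force, returned in the same KxN layout: column i holds
# the indices of point i's k nearest neighbors, nearest first.
function brute_force_knn(data::Vector{<:AbstractVector}, k::Int)
    n = length(data)
    indices = Matrix{Int}(undef, k, n)
    for i in 1:n
        # Squared Euclidean distances; Inf keeps a point out of its own list.
        d = [j == i ? Inf : sum(abs2, data[i] .- data[j]) for j in 1:n]
        indices[:, i] = partialsortperm(d, 1:k)
    end
    return indices
end

# Fraction of true neighbors that the approximate result recovered,
# averaged over all points.
function recall(approx::AbstractMatrix{<:Integer}, exact::AbstractMatrix{<:Integer})
    size(approx) == size(exact) ||
        throw(DimensionMismatch("matrices must have the same size"))
    k, n = size(exact)
    hits = sum(length(intersect(view(approx, :, i), view(exact, :, i))) for i in 1:n)
    return hits / (k * n)
end
```

With the variables from the example above, `recall(indices, brute_force_knn(data, n_neighbors))` would give the fraction of exact neighbors found. Squared Euclidean distance ranks neighbors in the same order as `Euclidean()`, so the baseline is comparable. The brute-force pass is quadratic in the dataset size, so it is only practical for small checks.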

To find the approximate neighbors for new points with respect to an already constructed graph:

queries = [rand(20) for _ in 1:20]
n_neighbors = 5
indices, distances = search(graph, queries, n_neighbors)
