For each point of a set (subsetPoints), find its nearest neighbor in another set (subsetNeighbors). Distances are given in a p' x n' matrix (distances), with p' >= p (p being the number of points) and n' >= n (n being the number of neighbors). If points and neighbors come from the same set, set inner to TRUE to ensure that the nearest neighbor of a point is not the point itself.

Usage

nearest_neighbor(
  distances,
  inner = FALSE,
  contiguity = NULL,
  subsetPoints = seq_len(nrow(distances)),
  subsetNeighbors = seq_len(ncol(distances))
)

Arguments

distances

Distances between points and neighbors. Each row corresponds to a point and each column to a neighbor. If a vector of length k > 0 is given, it is treated as a matrix of distances with p' = k points and n' = 1 neighbor.
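The vector case can be pictured with base R alone. The following is a minimal sketch, assuming the vector is simply reshaped into a one-column matrix; the variable names are illustrative and nothing here is taken from the function's actual implementation.

```r
# A vector of k = 3 distances (illustrative values).
d_vec <- c(2.5, 0.7, 1.9)

# Assumed equivalent matrix form: p' = 3 points (rows), n' = 1 neighbor (column).
d_mat <- matrix(d_vec, ncol = 1)

dim(d_mat)
```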

inner

Flag indicating whether points and neighbors come from the same set, distances then being the distance matrix of that set. In that case the diagonal of distances is ignored, ensuring that the nearest neighbor of a point is not the point itself. FALSE by default (different sets).
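The effect of the inner flag described above can be sketched in base R. This is only an illustration under the assumption that ignoring the diagonal amounts to treating it as missing; it does not call nearest_neighbor and is not its implementation.

```r
# Symmetric distance matrix of a single set of 3 points (illustrative values).
d <- matrix(c(0, 4, 1,
              4, 0, 3,
              1, 3, 0), nrow = 3, byrow = TRUE)

# Without masking, each point is its own nearest neighbor (distance 0).
naive <- apply(d, 1, which.min)

# Sketch of inner = TRUE: ignore the diagonal before taking the row-wise minimum.
d_inner <- d
diag(d_inner) <- NA          # which.min() discards missing values
masked <- apply(d_inner, 1, which.min)
```

Here naive points every row to itself, whereas masked selects the closest other point of the set.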

contiguity

Connectivity constraints between points and neighbors, if any. If an element of subsetPoints and an element of subsetNeighbors are not contiguous, the latter cannot be a neighbor of the former. NULL by default (no connectivity constraint). If specified, must be a logical matrix with the same dimensions as distances.
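A connectivity constraint of this kind can be illustrated in base R by discarding the distances of non-contiguous pairs. The sketch below assumes that a FALSE entry simply excludes the corresponding pair; the names d, contig and d_allowed are hypothetical and the code is not the function's implementation.

```r
# Distances from 2 points (rows) to 3 candidate neighbors (columns).
d <- matrix(c(1, 5, 2,
              4, 2, 6), nrow = 2, byrow = TRUE)

# Logical contiguity matrix with the same dimensions as d:
# TRUE means the pair (point, neighbor) is contiguous.
contig <- matrix(c(FALSE, TRUE, TRUE,
                   TRUE, FALSE, TRUE), nrow = 2, byrow = TRUE)
stopifnot(is.logical(contig), identical(dim(contig), dim(d)))

# Non-contiguous pairs cannot be selected: mask them before the minimum.
d_allowed <- d
d_allowed[!contig] <- NA
nearest <- apply(d_allowed, 1, which.min)
```

Without the constraint the nearest neighbors would be columns 1 and 2; with it, they become columns 3 and 1.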

subsetPoints

A vector of strictly positive integers giving the positions of the points whose nearest neighbor must be found. Default is all points (p = p'). If distances has row names, these can be used instead (as characters). If no points are given, an empty vector is returned. Values can be duplicated, but this increases the computational cost.

subsetNeighbors

A vector of strictly positive integers giving the positions of the elements that can be considered as a neighbor for each point of subsetPoints. Default is all neighbors (n = n'). If distances has column names, these can be used instead (as characters). If no neighbors are given, an error is raised. Values are expected to be unique.

Value

A vector of length p giving, for each point, the position of its nearest neighbor (NaN if it does not exist). A point does not have a nearest neighbor if all the distances to its candidate neighbors are missing. If distances has column names, the positions of the neighbors are replaced by their names. The returned vector is named with the IDs of the points, or with their names if distances has row names.
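The shape of the returned vector can be mimicked with a short base R sketch. The helper nn_sketch below is hypothetical and is not the function's implementation; it assumes that returned positions refer to columns of distances (not to positions within subsetNeighbors), and it handles integer positions only, leaving aside row and column names.

```r
# Hypothetical helper mimicking the documented return value.
nn_sketch <- function(distances, subsetPoints, subsetNeighbors) {
  # Keep only the requested points (rows) and candidate neighbors (columns).
  sub <- distances[subsetPoints, subsetNeighbors, drop = FALSE]
  res <- vapply(seq_len(nrow(sub)), function(i) {
    j <- which.min(sub[i, ])   # integer(0) when every distance is missing
    if (length(j) == 0L) NaN else as.numeric(subsetNeighbors[j])
  }, numeric(1))
  names(res) <- subsetPoints   # named with the IDs of the points
  res
}

d <- matrix(c(3,  1,  2,
              NA, NA, NA,
              5,  4,  0.5), nrow = 3, byrow = TRUE)

res <- nn_sketch(d, subsetPoints = c(1, 2, 3), subsetNeighbors = c(2, 3))
```

In this example the second point has only missing distances, so its entry is NaN, while the first and third points are matched to columns 2 and 3 respectively.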