Английская Википедия:Farthest-first traversal

Материал из Онлайн справочника
Перейти к навигацииПерейти к поиску

Шаблон:Good article Шаблон:Short description

Файл:Farthest-first traversal.svg
The first five points in a farthest-first traversal of a planar point set. The first point is chosen arbitrarily and each successive point is as far as possible from all previously chosen points.

In computational geometry, the farthest-first traversal of a compact metric space is a sequence of points in the space, where the first point is selected arbitrarily and each successive point is as far as possible from the set of previously-selected points. The same concept can also be applied to a finite set of geometric points, by restricting the selected points to belong to the set or equivalently by considering the finite metric space generated by these points.Шаблон:R For a finite metric space or finite set of geometric points, the resulting sequence forms a permutation of the points, also known as the greedy permutation.Шаблон:R

Every prefix of a farthest-first traversal provides a set of points that is widely spaced and close to all remaining points. More precisely, no other set of equally many points can be spaced more than twice as widely, and no other set of equally many points can be less than half as far to its farthest remaining point. In part because of these properties, farthest-point traversals have many applications, including the approximation of the traveling salesman problem and the [[metric k-center|metric Шаблон:Mvar-center]] problem. They may be constructed in polynomial time, or (for low-dimensional Euclidean spaces) approximated in near-linear time.

Definition and properties

A farthest-first traversal is a sequence of points in a compact metric space, with each point appearing at most once. If the space is finite, each point appears exactly once, and the traversal is a permutation of all of the points in the space. The first point of the sequence may be any point in the space. Each point Шаблон:Mvar after the first must have the maximum possible distance to the set of points earlier than Шаблон:Mvar in the sequence, where the distance from a point to a set is defined as the minimum of the pairwise distances to points in the set. A given space may have many different farthest-first traversals, depending both on the choice of the first point in the sequence (which may be any point in the space) and on ties for the maximum distance among later choices.Шаблон:R

Farthest-point traversals may be characterized by the following properties. Fix a number Шаблон:Mvar, and consider the prefix formed by the first Шаблон:Mvar points of the farthest-first traversal of any metric space. Let Шаблон:Mvar be the distance between the final point of the prefix and the other points in the prefix. Then this subset has the following two properties:

  • All pairs of the selected points are at distance at least Шаблон:Mvar from each other, and
  • All points of the metric space are at distance at most Шаблон:Mvar from the subset.

Conversely any sequence having these properties, for all choices of Шаблон:Mvar, must be a farthest-first traversal. These are the two defining properties of a Delone set, so each prefix of the farthest-first traversal forms a Delone set.Шаблон:R

Applications

Шаблон:Harvtxt used the farthest-first traversal to define the farthest-insertion heuristic for the travelling salesman problem. This heuristic finds approximate solutions to the travelling salesman problem by building up a tour on a subset of points, adding one point at a time to the tour in the ordering given by a farthest-first traversal. To add each point to the tour, one edge of the previous tour is broken and replaced by a pair of edges through the added point, in the cheapest possible way. Although Rosenkrantz et al. prove only a logarithmic approximation ratio for this method, they show that in practice it often works better than other insertion methods with better provable approximation ratios.Шаблон:R

Later, the same sequence of points was popularized by Шаблон:Harvtxt, who used it as part of greedy approximation algorithms for two problems in clustering, in which the goal is to partition a set of points into Шаблон:Mvar clusters. One of the two problems that Gonzalez solve in this way seeks to minimize the maximum diameter of a cluster, while the other, known as the [[metric k-center|metric Шаблон:Mvar-center]] problem, seeks to minimize the maximum radius, the distance from a chosen central point of a cluster to the farthest point from it in the same cluster. For instance, the Шаблон:Mvar-center problem can be used to model the placement of fire stations within a city, in order to ensure that every address within the city can be reached quickly by a fire truck. For both clustering problems, Gonzalez chooses a set of Шаблон:Mvar cluster centers by selecting the first Шаблон:Mvar points of a farthest-first traversal, and then creates clusters by assigning each input point to the nearest cluster center. If Шаблон:Mvar is the distance from the set of Шаблон:Mvar selected centers to the next point at position Шаблон:Math in the traversal, then with this clustering every point is within distance Шаблон:Mvar of its center and every cluster has diameter at most Шаблон:Math. However, the subset of Шаблон:Mvar centers together with the next point are all at distance at least Шаблон:Mvar from each other, and any Шаблон:Mvar-clustering would put some two of these points into a single cluster, with one of them at distance at least Шаблон:Math from its center and with diameter at least Шаблон:Mvar. Thus, Gonzalez's heuristic gives an approximation ratio of 2 for both clustering problems.Шаблон:R

Gonzalez's heuristic was independently rediscovered for the metric Шаблон:Mvar-center problem by Шаблон:Harvtxt, who applied it more generally to weighted Шаблон:Mvar-center problems.Шаблон:R Another paper on the Шаблон:Mvar-center problem from the same time, Шаблон:Harvtxt, achieves the same approximation ratio of 2,Шаблон:R but its techniques are different.Шаблон:R Nevertheless, Gonzalez's heuristic, and the name "farthest-first traversal", are often incorrectly attributed to Hochbaum and Shmoys.Шаблон:R For both the min-max diameter clustering problem and the metric Шаблон:Mvar-center problem, these approximations are optimal: the existence of a polynomial-time heuristic with any constant approximation ratio less than 2 would imply that P = NP.Шаблон:R

As well as for clustering, the farthest-first traversal can also be used in another type of facility location problem, the max-min facility dispersion problem, in which the goal is to choose the locations of Шаблон:Mvar different facilities so that they are as far apart from each other as possible. More precisely, the goal in this problem is to choose Шаблон:Mvar points from a given metric space or a given set of candidate points, in such a way as to maximize the minimum pairwise distance between the selected points. Again, this can be approximated by choosing the first Шаблон:Mvar points of a farthest-first traversal. If Шаблон:Mvar denotes the distance of the Шаблон:Mvarth point from all previous points, then every point of the metric space or the candidate set is within distance Шаблон:Mvar of the first Шаблон:Math points. By the pigeonhole principle, some two points of the optimal solution (whatever it is) must both be within distance Шаблон:Mvar of the same point among these first Шаблон:Math chosen points, and (by the triangle inequality) within distance Шаблон:Math of each other. Therefore, the heuristic solution given by the farthest-first traversal is within a factor of two of optimal.Шаблон:R

Other applications of the farthest-first traversal include color quantization (clustering the colors in an image to a smaller set of representative colors),Шаблон:R progressive scanning of images (choosing an order to display the pixels of an image so that prefixes of the ordering produce good lower-resolution versions of the whole image rather than filling in the image from top to bottom),Шаблон:R point selection in the probabilistic roadmap method for motion planning,Шаблон:R simplification of point clouds,Шаблон:R generating masks for halftone images,Шаблон:R hierarchical clustering,Шаблон:R finding the similarities between polygon meshes of similar surfaces,Шаблон:R choosing diverse and high-value observation targets for underwater robot exploration,Шаблон:R fault detection in sensor networks,Шаблон:R modeling phylogenetic diversity,Шаблон:R matching vehicles in a heterogenous fleet to customer delivery requests,Шаблон:R uniform distribution of geodetic observatories on the Earth's surfaceШаблон:R or of other types of sensor network,Шаблон:R generation of virtual point lights in the instant radiosity computer graphics rendering method,Шаблон:R and geometric range searching data structures.Шаблон:R

Algorithms

Greedy exact algorithm

The farthest-first traversal of a finite point set may be computed by a greedy algorithm that maintains the distance of each point from the previously selected points, performing the following steps:Шаблон:R

  • Initialize the sequence of selected points to the empty sequence, and the distances of each point to the selected points to infinity.
  • While not all points have been selected, repeat the following steps:

For a set of Шаблон:Mvar points, this algorithm takes Шаблон:Math steps and Шаблон:Math distance computations.Шаблон:R

Approximations

A faster approximation algorithm, given by Шаблон:Harvtxt, applies to any subset of points in a metric space with bounded doubling dimension, a class of spaces that include the Euclidean spaces of bounded dimension. Their algorithm finds a sequence of points in which each successive point has distance within a Шаблон:Math factor of the farthest distance from the previously-selected point, where Шаблон:Math can be chosen to be any positive number. It takes time <math>O(n\log n)</math>.Шаблон:R

The results for bounded doubling dimension do not apply to high-dimensional Euclidean spaces, because the constant factor in the big O notation for these algorithms depends on the dimension. Instead, a different approximation method based on the Johnson–Lindenstrauss lemma and locality-sensitive hashing has running time <math>O(\varepsilon^{-2} n^{1+1/(1+\varepsilon)^2+o(1)}).</math> For metrics defined by shortest paths on weighted undirected graphs, a randomized incremental construction based on Dijkstra's algorithm achieves time <math>O(\varepsilon^{-1} m\log n\log\tfrac{n}{\varepsilon})</math>, where Шаблон:Mvar and Шаблон:Mvar are the numbers of vertices and edges of the input graph, respectively.Шаблон:R

Incremental Voronoi insertion

For selecting points from a continuous space such as the Euclidean plane, rather than from a finite set of candidate points, these methods will not work directly, because there would be an infinite number of distances to maintain. Instead, each new point should be selected as the center of the largest empty circle defined by the previously-selected point set.Шаблон:R This center will always lie on a vertex of the Voronoi diagram of the already selected points, or at a point where an edge of the Voronoi diagram crosses the domain boundary. In this formulation the method for constructing farthest-first traversals has also been called incremental Voronoi insertion.Шаблон:R It is similar to Delaunay refinement for finite element mesh generation, but differs in the choice of which Voronoi vertex to insert at each step.Шаблон:R

See also

  • Lloyd's algorithm, a different method for generating evenly spaced points in geometric spaces

References

Шаблон:Reflist