Connected Components#

Connected components algorithm receives an undirected graph \(G\) as an input and searches for connected components in \(G\). For each vertex in \(G\), the algorithm returns the label of the component this vertex belongs to. The result of the algorithm is a set of labels for all vertices in \(G\).


Computational methods

Programming Interface






Mathematical formulation#


Given an undirected graph \(G\), the problem is to find connected components in \(G\), determine their quantity, and label vertices so that vertices from the same component have the same label.


Сomponents are labeled from \(0\) to \(k-1\), where \(k\) is the number of components. For the example above, the labels for vertices are [0, 1, 1, 1, 2, 0, 1, 3, 4, 4, 4, 4, 4].

This notation means that:

  • vertices with ids 0 and 5 belong to the connected component with id 0

  • vertices with ids 1, 2, 3, and 6 belong to the connected component with id 1

  • vertex with id 4 belongs to the connected component with id 2

  • vertex with id 7 belongs to the connected component with id 3

  • vertices with ids 8, 9, 10, 11, and 12 belong to the connected component with id 4

Computation method: afforest#

The method defines Afforest algorithm and solves the problem of сonnected components identification in an undirected graph.

This algorithm expands the Shiloach-Vishkin connected components algorithm and uses component approximation to decrease redundant edge processing. The method consists of the following steps:

  1. Process a fixed number of edges for each vertex (Vertex Neighbor Sampling optimization).

  2. Identify the largest intermediate component using probabilistic method.

  3. Process the rest of the neighborhoods only for the vertices that do not belong to the largest component (Large Component Skipping optimization).

For more details, see [Sutton2018].

Programming Interface#

Refer to API Reference: Connected Components.