K-Means initialization¶
The K-Means initialization algorithm receives \(n\) feature vectors as input and chooses \(k\) initial centroids. After initialization, the K-Means algorithm uses the initialization result to partition the input data into \(k\) clusters.
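The relationship between initialization and the first partitioning step can be sketched in standalone C++. This is a minimal illustration that does not use the library: the helper names (`init_first_k`, `squared_distance`, `assign_nearest`) are hypothetical, and taking the first \(k\) feature vectors as centroids is a simple strategy assumed here for clarity, not a statement of what any particular method in the library does.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Row = std::vector<double>;

// Illustrative strategy only (an assumption): use the first k feature vectors as centroids.
std::vector<Row> init_first_k(const std::vector<Row>& data, std::size_t k) {
    assert(k > 0 && k <= data.size());
    return std::vector<Row>(data.begin(), data.begin() + static_cast<std::ptrdiff_t>(k));
}

// Squared Euclidean distance between two vectors of equal length.
double squared_distance(const Row& a, const Row& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Assign each feature vector to the index of its nearest centroid.
std::vector<std::size_t> assign_nearest(const std::vector<Row>& data,
                                        const std::vector<Row>& centroids) {
    std::vector<std::size_t> labels(data.size(), 0);
    for (std::size_t i = 0; i < data.size(); ++i) {
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            const double d = squared_distance(data[i], centroids[c]);
            if (d < best) {
                best = d;
                labels[i] = c;
            }
        }
    }
    return labels;
}
```

The initial centroids only determine the starting partition; a full K-Means run would then alternate between recomputing centroids and reassigning points.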
Operation | Computational methods | Programming Interface
Mathematical formulation¶
Refer to Developer Guide: K-Means Initialization.
Programming Interface¶
All types and functions in this section are declared in the oneapi::dal::kmeans_init namespace and are available via inclusion of the oneapi/dal/algo/kmeans_init.hpp header file.
Descriptor¶
-
template<typename Float = float, typename Method = method::by_default, typename Task = task::by_default>
class descriptor¶ - Template Parameters
Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
Method – Tag-type that specifies an implementation of the K-Means initialization algorithm.
Task – Tag-type that specifies the type of the problem to solve. Can be task::init.
Constructors
-
descriptor(std::int64_t cluster_count = 2)¶
Creates a new instance of the class with the given cluster_count.
Properties
-
auto &local_trials_count¶
Number of attempts to find the best sample in terms of potential value. If the value is equal to -1, the number of trials is 2 + int(log(cluster_count)). Default value: -1.
- Getter & Setter
template <typename M = Method, typename None = detail::v1::enable_if_plus_plus_dense<M>> auto & get_local_trials_count() const
template <typename M = Method, typename None = detail::v1::enable_if_plus_plus_dense<M>> auto & set_local_trials_count(std::int64_t value=-1)
- Invariants
- local_trials_count > 0 or local_trials_count = -1
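The default rule for the number of trials can be worked through with a small standalone helper. This is a sketch, not library code: `resolve_local_trials` is a hypothetical name, and the formula is read here as using the natural logarithm, which is an assumption because the text above does not state the base.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: resolve the effective number of trials from the property value.
// Assumption: log(...) in the formula is the natural logarithm.
std::int64_t resolve_local_trials(std::int64_t local_trials_count, std::int64_t cluster_count) {
    assert(cluster_count > 0);
    assert(local_trials_count > 0 || local_trials_count == -1);
    if (local_trials_count == -1) {
        // Default: 2 + int(log(cluster_count)), truncating the logarithm toward zero.
        return 2 + static_cast<std::int64_t>(std::log(static_cast<double>(cluster_count)));
    }
    return local_trials_count;
}
```

Under that assumption, cluster_count = 10 with the default value of -1 gives 2 + int(2.30...) = 4 trials, while an explicit positive value is used unchanged.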
-
std::int64_t cluster_count¶
The number of clusters k. Default value: 2.
- Getter & Setter
std::int64_t get_cluster_count() const
auto & set_cluster_count(std::int64_t value)
- Invariants
- cluster_count > 0
-
auto &seed¶
- Getter & Setter
template <typename M = Method, typename None = detail::v1::enable_if_not_default_dense<M>> auto & get_seed() const
template <typename M = Method, typename None = detail::v1::enable_if_not_default_dense<M>> auto & set_seed(std::int64_t value)
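The seed property has no description in this section. A common reason for exposing a seed is to make randomized choices reproducible, and the standalone sketch below illustrates that idea only. It does not use the library; `pick_indices` is a hypothetical helper, and the choice of `std::mt19937_64` with a partial Fisher-Yates shuffle is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: choose k distinct row indices out of row_count, driven by a seed.
std::vector<std::size_t> pick_indices(std::size_t row_count, std::size_t k, std::int64_t seed) {
    assert(k > 0 && k <= row_count);
    std::mt19937_64 engine(static_cast<std::uint64_t>(seed));
    std::vector<std::size_t> indices(row_count);
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    // Partial Fisher-Yates shuffle: only the first k positions need to be randomized.
    for (std::size_t i = 0; i < k; ++i) {
        std::uniform_int_distribution<std::size_t> pick(i, row_count - 1);
        std::swap(indices[i], indices[pick(engine)]);
    }
    indices.resize(k);
    return indices;
}
```

Calling the helper twice with the same seed on the same platform returns the same indices, which is the property a seed is usually meant to provide.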
Computing compute(...)¶
Input¶
-
template<typename Task = task::by_default>
class compute_input¶ - Template Parameters
Task – Tag-type that specifies the type of the problem to solve. Can be task::init.
Constructors
Properties
Result¶
-
template<typename Task = task::by_default>
class compute_result¶ - Template Parameters
Task – Tag-type that specifies the type of the problem to solve. Can be oneapi::dal::kmeans::task::clustering.
Constructors
-
compute_result()¶
Creates a new instance of the class with the default property values.
Properties
Operation¶
-
template<typename Descriptor>
kmeans_init::compute_result compute(const Descriptor &desc, const kmeans_init::compute_input &input)¶ - Parameters
desc – K-Means initialization algorithm descriptor kmeans_init::descriptor
input – Input data for the computing operation
- Preconditions
- Postconditions
Usage Example¶
Computing¶
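#include "oneapi/dal/algo/kmeans_init.hpp"

using namespace oneapi::dal;

// print_table is assumed to be a helper defined elsewhere; it is not declared in this section.
table run_compute(const table& data) {
    // Configure K-Means initialization to choose 10 initial centroids.
    const auto kmeans_desc =
        kmeans_init::descriptor<float, kmeans_init::method::dense>{}.set_cluster_count(10);
    const auto result = compute(kmeans_desc, data);
    print_table("centroids", result.get_centroids());
    return result.get_centroids();
}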
Examples¶
Batch Processing: