k-Nearest Neighbors Classification (k-NN)

The \(k\)-NN classification algorithm infers the class of a new feature vector by computing the majority vote of the \(k\) nearest observations from the training set.
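The majority-vote rule described above can be illustrated with a small standalone sketch. It uses only the C++ standard library and is not the oneDAL implementation: the function name `knn_classify`, the squared Euclidean distance, and the tie-breaking rule (smallest label wins) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not part of oneDAL: classifies one query point by
// majority vote among its k nearest training points, using squared
// Euclidean distance (an assumption made for this sketch).
std::int64_t knn_classify(const std::vector<std::vector<double>>& train_x,
                          const std::vector<std::int64_t>& train_y,
                          const std::vector<double>& query,
                          std::int64_t class_count,
                          std::int64_t k) {
    assert(train_x.size() == train_y.size());
    assert(k > 0 && static_cast<std::size_t>(k) <= train_x.size());

    // Pair each training point's squared distance to the query with its label.
    std::vector<std::pair<double, std::int64_t>> dist;
    dist.reserve(train_x.size());
    for (std::size_t i = 0; i < train_x.size(); ++i) {
        double d = 0.0;
        for (std::size_t j = 0; j < query.size(); ++j) {
            const double diff = train_x[i][j] - query[j];
            d += diff * diff;
        }
        dist.emplace_back(d, train_y[i]);
    }

    // Bring the k nearest observations to the front of the vector.
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());

    // Count votes per class; on a tie the smallest label wins.
    std::vector<std::int64_t> votes(static_cast<std::size_t>(class_count), 0);
    for (std::int64_t i = 0; i < k; ++i) {
        ++votes[static_cast<std::size_t>(dist[static_cast<std::size_t>(i)].second)];
    }
    return std::max_element(votes.begin(), votes.end()) - votes.begin();
}
```

This corresponds to the brute-force method: every training observation is compared with the query. A k-d tree would replace the distance loop with a spatial search but leave the voting step unchanged.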

Operation    Computational methods    Programming Interface
Training     Brute-force, k-d tree    train(…), train_input, train_result
Inference    Brute-force, k-d tree    infer(…), infer_input, infer_result

Mathematical formulation

Refer to Developer Guide: k-Nearest Neighbors Classification.

Programming Interface

All types and functions in this section are declared in the oneapi::dal::knn namespace and are available via inclusion of the oneapi/dal/algo/knn.hpp header file.

Descriptor

template<typename Float = float, typename Method = method::by_default, typename Task = task::by_default>
class descriptor
Template Parameters
  • Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

  • Method – Tag-type that specifies an implementation of algorithm. Can be method::brute_force or method::kd_tree.

  • Task – Tag-type that specifies type of the problem to solve. Can be task::classification.

Constructors

descriptor(std::int64_t class_count, std::int64_t neighbor_count)

Creates a new instance of the class with the given class_count and neighbor_count property values.

Properties

std::int64_t neighbor_count

The number of neighbors k.

Getter & Setter
std::int64_t get_neighbor_count() const
auto & set_neighbor_count(std::int64_t value)
Invariants
std::int64_t class_count

The number of classes c.

Getter & Setter
std::int64_t get_class_count() const
auto & set_class_count(std::int64_t value)
Invariants

Method tags

struct brute_force

Tag-type that denotes brute-force computational method.

struct kd_tree

Tag-type that denotes k-d tree computational method.

using by_default = brute_force

Alias tag-type for brute-force computational method.

Task tags

struct classification

Tag-type that parameterizes entities used for solving classification problem.

using by_default = classification

Alias tag-type for classification task.
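The way tag types and their by_default aliases select an implementation at compile time can be sketched as follows. This is a simplified standalone model, not the library's source: the `sketch` namespace and the helpers `method_name` and `method_name_of` are hypothetical, and only the names of the tags, the template parameters, and the getters and setters mirror the declarations above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

namespace sketch {  // hypothetical namespace; mirrors the documented names only

namespace method {
struct brute_force {};
struct kd_tree {};
using by_default = brute_force;
}  // namespace method

namespace task {
struct classification {};
using by_default = classification;
}  // namespace task

template <typename Float = float,
          typename Method = method::by_default,
          typename Task = task::by_default>
class descriptor {
    static_assert(std::is_same<Float, float>::value || std::is_same<Float, double>::value,
                  "Float must be float or double");

public:
    descriptor(std::int64_t class_count, std::int64_t neighbor_count)
            : class_count_(class_count),
              neighbor_count_(neighbor_count) {}

    std::int64_t get_class_count() const { return class_count_; }
    std::int64_t get_neighbor_count() const { return neighbor_count_; }

    // Setters return a reference to *this so that calls can be chained.
    auto& set_class_count(std::int64_t value) { class_count_ = value; return *this; }
    auto& set_neighbor_count(std::int64_t value) { neighbor_count_ = value; return *this; }

private:
    std::int64_t class_count_;
    std::int64_t neighbor_count_;
};

// Overloads chosen at compile time by the tag type passed in.
inline std::string method_name(method::brute_force) { return "brute_force"; }
inline std::string method_name(method::kd_tree) { return "kd_tree"; }

// Hypothetical helper: reports which method tag a descriptor was built with.
template <typename Float, typename Method, typename Task>
std::string method_name_of(const descriptor<Float, Method, Task>&) {
    return method_name(Method{});
}

}  // namespace sketch
```

In this sketch, `sketch::descriptor<>` resolves to the float, brute-force, classification combination, while `sketch::descriptor<double, sketch::method::kd_tree>` selects the k-d tree overload without any run-time branching.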

Model

template<typename Task = task::by_default>
class model
Template Parameters

Task – Tag-type that specifies type of the problem to solve. Can be task::classification.

Constructors

model()

Creates a new instance of the class with the default property values.

Training train(...)

Input

template<typename Task = task::by_default>
class train_input
Template Parameters

Task – Tag-type that specifies type of the problem to solve. Can be task::classification.

Constructors

train_input(const table &data, const table &labels)

Creates a new instance of the class with the given data and labels property values.

Properties

const table &data

The training set X. Default value: table{}.

Getter & Setter
const table & get_data() const
auto & set_data(const table &data)
const table &labels

Vector of labels y for the training set X. Default value: table{}.

Getter & Setter
const table & get_labels() const
auto & set_labels(const table &labels)

Result

template<typename Task = task::by_default>
class train_result
Template Parameters

Task – Tag-type that specifies type of the problem to solve. Can be task::classification.

Constructors

train_result()

Creates a new instance of the class with the default property values.

Properties

const model<Task> &model

The trained k-NN model. Default value: model<Task>{}.

Getter & Setter
const model< Task > & get_model() const
auto & set_model(const model< Task > &value)

Operation

template<typename Descriptor>
knn::train_result train(const Descriptor &desc, const knn::train_input &input)
Parameters
  • desc – k-NN algorithm descriptor knn::descriptor

  • input – Input data for the training operation

Preconditions
input.data.has_data == true
input.labels.has_data == true
input.data.row_count == input.labels.row_count
input.labels.column_count == 1
input.labels[i] >= 0
input.labels[i] < desc.class_count
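The training preconditions listed above can be expressed as a single check. The sketch below is standalone and does not use the oneDAL table type: `simple_table` and `training_preconditions_hold` are hypothetical names, and the definition of `has_data` as "at least one row and one column" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a table: row-major values plus the table shape.
struct simple_table {
    std::vector<double> values;
    std::int64_t row_count = 0;
    std::int64_t column_count = 0;

    // Assumption for this sketch: a table has data if it has a non-empty shape.
    bool has_data() const { return row_count > 0 && column_count > 0; }
};

// Returns true if every documented precondition of train(...) is satisfied.
bool training_preconditions_hold(const simple_table& data,
                                 const simple_table& labels,
                                 std::int64_t class_count) {
    // Sanity check of the stand-in type itself, not a documented precondition.
    assert(data.values.size() == static_cast<std::size_t>(data.row_count * data.column_count));
    assert(labels.values.size() == static_cast<std::size_t>(labels.row_count * labels.column_count));

    if (!data.has_data() || !labels.has_data()) return false;  // has_data == true
    if (data.row_count != labels.row_count) return false;      // one label per row
    if (labels.column_count != 1) return false;                // labels form a column
    for (double label : labels.values) {
        // 0 <= labels[i] < class_count
        if (label < 0 || label >= static_cast<double>(class_count)) return false;
    }
    return true;
}
```

Running such a check before calling the training operation makes violations easier to diagnose, for example a label equal to class_count when labels are expected to start at zero.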

Inference infer(...)

Input

template<typename Task = task::by_default>
class infer_input
Template Parameters

Task – Tag-type that specifies type of the problem to solve. Can be task::classification.

Constructors

infer_input(const table &data, const model<Task> &model)

Creates a new instance of the class with the given data and model property values.

Properties

const table &data

The dataset for inference \(X'\). Default value: table{}.

Getter & Setter
const table & get_data() const
auto & set_data(const table &data)
const model<Task> &model

The trained k-NN model. Default value: model<Task>{}.

Getter & Setter
const model< Task > & get_model() const
auto & set_model(const model< Task > &m)

Result

template<typename Task = task::by_default>
class infer_result
Template Parameters

Task – Tag-type that specifies type of the problem to solve. Can be task::classification.

Constructors

infer_result()

Creates a new instance of the class with the default property values.

Properties

const table &labels

The predicted labels. Default value: table{}.

Getter & Setter
const table & get_labels() const
auto & set_labels(const table &value)

Operation

template<typename Descriptor>
knn::infer_result infer(const Descriptor &desc, const knn::infer_input &input)
Parameters
  • desc – k-NN algorithm descriptor knn::descriptor

  • input – Input data for the inference operation

Preconditions
input.data.has_data == true
Postconditions
result.labels.row_count == input.data.row_count
result.labels.column_count == 1
result.labels[i] >= 0
result.labels[i] < desc.class_count

Usage example

Training

knn::model<> run_training(const table& data,
                        const table& labels) {
   const std::int64_t class_count = 10;
   const std::int64_t neighbor_count = 5;
   const auto knn_desc = knn::descriptor<float>{class_count, neighbor_count};

   const auto result = train(knn_desc, data, labels);

   return result.get_model();
}

Inference

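table run_inference(const knn::model<>& model,
                  const table& new_data) {
   const std::int64_t class_count = 10;
   const std::int64_t neighbor_count = 5;
   const auto knn_desc = knn::descriptor<float>{class_count, neighbor_count};

   // Arguments after the descriptor follow the infer_input constructor order: data, then model.
   const auto result = infer(knn_desc, new_data, model);

   print_table("labels", result.get_labels());

   // The function is declared to return a table, so return the predicted labels.
   return result.get_labels();
}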

Examples

Batch Processing: