Title: Adaptive k-Nearest Neighbor Classifier Based on Local Curvature Estimation
Version: 0.1.0
Description: Implements the kK-NN algorithm, an adaptive k-nearest neighbor classifier that adjusts the neighborhood size based on local data curvature. The method estimates local Gaussian curvature by approximating the shape operator of the data manifold. This approach aims to improve classification performance, particularly in datasets with limited samples.
License: MIT + file LICENSE
URL: https://github.com/Gabrielforest/LCCkNN
Encoding: UTF-8
Imports: FNN, caret, MLmetrics, stats, class
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-08-21 20:42:36 UTC; gabrielf
Author: Gabriel Pereira [aut, cre]
Maintainer: Gabriel Pereira <gabrielfreitaspereira10@gmail.com>
Repository: CRAN
Date/Publication: 2025-08-27 12:10:02 UTC
Computes balanced accuracy.
Description
This function requires the 'caret' package.
Usage
balanced_accuracy_score(true_labels, predicted_labels)
Arguments
true_labels | The true class labels.
predicted_labels | The predicted class labels.
Value
The balanced accuracy score.
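For readers who want to see what such a score measures, the following is a minimal base-R sketch of one common definition of balanced accuracy: the unweighted mean of per-class recall. The helper name `balanced_accuracy_sketch` is hypothetical and is not part of LCCkNN; the package relies on 'caret', whose computation may differ, especially for more than two classes.

```r
# Hypothetical helper (not part of LCCkNN): balanced accuracy taken as the
# unweighted mean of per-class recall.
balanced_accuracy_sketch <- function(true_labels, predicted_labels) {
  # Share one set of levels so the confusion table is square
  lev <- sort(unique(c(as.character(true_labels),
                       as.character(predicted_labels))))
  truth <- factor(as.character(true_labels), levels = lev)
  pred  <- factor(as.character(predicted_labels), levels = lev)
  cm <- table(truth, pred)        # rows: true class, columns: predicted class
  support <- rowSums(cm)          # number of true samples per class
  recall <- diag(cm) / support    # per-class recall
  mean(recall[support > 0])       # skip classes absent from the true labels
}

truth <- c(1, 1, 1, 1, 2, 2, 3, 3)
pred  <- c(1, 1, 1, 2, 2, 2, 3, 1)
balanced_accuracy_sketch(truth, pred)   # 0.75
```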
Computes the curvatures of all samples in the training set.
Description
Computes the curvatures of all samples in the training set.
Usage
curvature_estimation(data, k)
Arguments
data | A numeric matrix or data frame of the training data.
k | The number of neighbors for the initial k-NN graph.
Value
A numeric vector of curvatures for each sample.
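The curvature estimate starts from a k-NN graph over the training samples. The sketch below illustrates only that first step in base R, plus a local covariance matrix of the kind such estimators typically build on; it does not reproduce the package's shape-operator computation. The helper `knn_indices_sketch` is a hypothetical name, and Euclidean distance is an assumption.

```r
# Hypothetical helper (not part of LCCkNN): indices of the k nearest
# neighbors of every row, using Euclidean distance. Assumes k >= 2.
knn_indices_sketch <- function(data, k) {
  d <- as.matrix(dist(as.matrix(data)))   # pairwise Euclidean distances
  diag(d) <- Inf                          # a sample is not its own neighbor
  # For each row keep the k closest samples; the result is n x k
  t(apply(d, 1, function(row) order(row)[seq_len(k)]))
}

x <- scale(as.matrix(iris[, 1:4]))
nbrs <- knn_indices_sketch(x, k = 7)
dim(nbrs)                                 # 150 samples, 7 neighbors each

# Local patch around the first sample (the point plus its neighbors)
patch <- x[c(1, nbrs[1, ]), ]
local_cov <- cov(patch)                   # 4 x 4 local covariance matrix
```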
Computes the F1-score.
Description
Computes the F1-score.
Usage
f1_score(true_labels, predicted_labels, average = "weighted")
Arguments
true_labels | The true class labels.
predicted_labels | The predicted class labels.
average | The type of averaging to apply (default is 'weighted').
Value
The F1-score.
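As an illustration of what 'weighted' averaging usually means, here is a minimal base-R sketch that computes a per-class F1 and weights each class by its number of true samples. The helper `f1_weighted_sketch` is hypothetical; it is not the package's implementation, which may rely on 'MLmetrics' or 'caret' and may handle edge cases differently.

```r
# Hypothetical helper (not part of LCCkNN): support-weighted F1.
f1_weighted_sketch <- function(true_labels, predicted_labels) {
  lev <- sort(unique(c(as.character(true_labels),
                       as.character(predicted_labels))))
  truth <- factor(as.character(true_labels), levels = lev)
  pred  <- factor(as.character(predicted_labels), levels = lev)
  cm <- table(truth, pred)          # rows: true class, columns: predicted
  tp <- diag(cm)
  # Classes with no predictions or no true samples get a score of 0
  precision <- ifelse(colSums(cm) > 0, tp / colSums(cm), 0)
  recall    <- ifelse(rowSums(cm) > 0, tp / rowSums(cm), 0)
  f1 <- ifelse(precision + recall > 0,
               2 * precision * recall / (precision + recall), 0)
  sum(f1 * rowSums(cm)) / sum(cm)   # weight each class by its support
}

truth <- c(1, 1, 1, 1, 2, 2, 3, 3)
pred  <- c(1, 1, 1, 2, 2, 2, 3, 1)
f1_weighted_sketch(truth, pred)
```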
Adaptive k-Nearest Neighbor Classifier
Description
Implements the adaptive k-nearest neighbor (kK-NN) algorithm, which adjusts the neighborhood size for each sample based on a local curvature estimate. This method aims to improve classification performance, particularly in datasets with limited training samples.
Usage
kKNN(train, test, train_target, k, func = "log", quantize_method = "paper")
Arguments
train | A numeric matrix or data frame of the training data.
test | A numeric matrix or data frame of the test data.
train_target | A numeric or factor vector of class labels for the training data.
k | The number of neighbors for the initial k-NN graph.
func | The transformation function for curvatures ('log', 'cubic_root', or 'sigmoid').
quantize_method | The quantization method to use: 'paper' (10 levels, default) or 'log2n' (k levels, where k = log2(n)).
Value
A numeric or factor vector of predicted class labels for the test data.
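To make the idea of a curvature-dependent neighborhood size concrete, the sketch below turns a vector of curvature values into a per-sample k. Everything specific here is an assumption made for illustration: the `log1p(abs(.))` transform, the min-max rescaling, the rule `k_i = max(1, k - level_i)`, and the premise that higher curvature should shrink the neighborhood. The helper `adaptive_k_sketch` is hypothetical and is not the package's algorithm.

```r
# Hypothetical helper (not part of LCCkNN): map curvatures to per-sample k.
adaptive_k_sketch <- function(curvatures, k, levels = 10) {
  z <- log1p(abs(curvatures))                    # assumed 'log'-style transform
  rng <- range(z)
  # Rescale to [0, 1]; a constant vector maps to all zeros
  u <- if (diff(rng) > 0) (z - rng[1]) / diff(rng) else rep(0, length(z))
  level <- pmin(floor(u * levels), levels - 1)   # integer levels 0 .. levels - 1
  pmax(1, k - level)                             # assumed rule: prune by level
}

set.seed(1)
curv <- rexp(20)                # synthetic stand-in for curvature estimates
k_i <- adaptive_k_sketch(curv, k = 7)
k_i[which.min(curv)]            # flattest sample keeps the full k
k_i[which.max(curv)]            # most curved sample is reduced to 1
```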
References
Levada, A.L.M., Nielsen, F., Haddad, M.F.C. (2024). Adaptive k-nearest neighbor classifier based on the local estimation of the shape operator. arXiv:2409.05084.
Examples
# Load necessary libraries
library(caret)
# Load and prepare data (e.g., the Iris dataset)
data_iris <- iris
data <- as.matrix(data_iris[, 1:4])
target <- as.integer(data_iris$Species)
# Standardize the data
data <- scale(data)
# Split data into training and testing sets
set.seed(42)
train_index <- caret::createDataPartition(target, p = 0.5, list = FALSE)
train_data <- data[train_index, ]
test_data <- data[-train_index, ]
train_labels <- target[train_index]
# Determine the initial k as round(log2(n)), bumped to the next odd number if even
initial_k <- round(log2(nrow(train_data)))
if (initial_k %% 2 == 0) {
  initial_k <- initial_k + 1
}
# Run the kK-NN classifier using the default quantization method ('paper')
predictions_paper <- LCCkNN::kKNN(
  train = train_data,
  test = test_data,
  train_target = train_labels,
  k = initial_k
)
# Run the kK-NN classifier using the 'log2n' quantization method
predictions_log2n <- LCCkNN::kKNN(
  train = train_data,
  test = test_data,
  train_target = train_labels,
  k = initial_k,
  quantize_method = 'log2n'
)
# Evaluate the results (e.g., calculate balanced accuracy)
test_labels <- target[-train_index]
bal_acc_paper <- LCCkNN::balanced_accuracy_score(test_labels, predictions_paper)
bal_acc_log2n <- LCCkNN::balanced_accuracy_score(test_labels, predictions_log2n)
cat("Balanced Accuracy (paper Method):", bal_acc_paper, "\n")
cat("Balanced Accuracy (log2n Method):", bal_acc_log2n, "\n")
Computes the curvature of a single test sample's neighborhood.
Description
Computes the curvature of a single test sample's neighborhood.
Usage
point_curvature_estimation(data)
Arguments
data | A numeric matrix or data frame representing the neighborhood (test point + its neighbors).
Value
A single numeric value for the curvature.
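The input described above is a matrix holding the test point together with its neighbors. A minimal sketch of how such a matrix could be assembled in base R follows; the helper `neighborhood_sketch` is hypothetical, and both the Euclidean distance and the row order (test point first) are assumptions rather than documented requirements of the package.

```r
# Hypothetical helper (not part of LCCkNN): stack a test point on top of
# its k nearest training samples.
neighborhood_sketch <- function(train, test_point, k) {
  train <- as.matrix(train)
  # Squared Euclidean distance from the test point to every training row
  d2 <- rowSums(sweep(train, 2, test_point)^2)
  idx <- order(d2)[seq_len(k)]
  rbind(test_point, train[idx, , drop = FALSE])   # (k + 1) x p matrix
}

x <- scale(as.matrix(iris[, 1:4]))
patch <- neighborhood_sketch(x[-1, ], x[1, ], k = 7)
dim(patch)                                        # 8 rows, 4 columns
```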
Quantizes real values to integer levels.
Description
This function quantizes real values in the interval [a, b] to integer levels from 0 to k - 1.
Usage
quantize(arr, a, b, k = 10)
Arguments
arr | A numeric vector with values in the interval [a, b].
a | The lower bound of the interval.
b | The upper bound of the interval.
k | The number of quantization levels (default is 10).
Value
A vector of quantized integers in {0, ..., k - 1}.
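One common way to perform this kind of quantization is with equal-width bins, sketched below in base R. The helper `quantize_sketch` is hypothetical, and the binning and clamping rules are assumptions; the package's `quantize` may round differently at bin edges.

```r
# Hypothetical helper (not part of LCCkNN): equal-width quantization of
# values in [a, b] to integer levels 0 .. k - 1.
quantize_sketch <- function(arr, a, b, k = 10) {
  stopifnot(b > a, k >= 1)
  level <- floor(k * (arr - a) / (b - a))   # equal-width bins over [a, b]
  as.integer(pmin(pmax(level, 0), k - 1))   # clamp so that b maps to k - 1
}

quantize_sketch(c(0, 0.05, 0.5, 0.99, 1), a = 0, b = 1)   # 0 0 5 9 9
```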
A helper sigmoid function.
Description
A helper sigmoid function.
Usage
sigmoid(x, a = 1)
Arguments
x | A numeric value or vector.
a | A numeric scaling factor (default is 1).
Value
The sigmoid of x.
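For reference, a scaled logistic function is one natural reading of this signature. The sketch below assumes that `a` multiplies the input before the logistic is applied; the documentation does not state this, and the helper `sigmoid_sketch` is a hypothetical name.

```r
# Hypothetical helper (not part of LCCkNN): logistic function with an
# assumed input scaling factor 'a'.
sigmoid_sketch <- function(x, a = 1) {
  1 / (1 + exp(-a * x))
}

sigmoid_sketch(0)                     # 0.5
sigmoid_sketch(c(-2, 0, 2), a = 3)    # values squeezed toward 0 and 1
```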
Standard k-NN classifier.
Description
Standard k-NN classifier.
Usage
testa_KNN(train, test, train_target, nn)
Arguments
train | A numeric matrix or data frame of the training data.
test | A numeric matrix or data frame of the test data.
train_target | A numeric or factor vector of class labels for the training data.
nn | The number of neighbors.
Value
A numeric or factor vector of predicted class labels.
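As a baseline for comparison, a fixed-k classifier can be run directly with `class::knn` (the 'class' package appears in this package's Imports). This is a minimal sketch on the Iris data, not a description of how `testa_KNN` is implemented; the 50/50 split and k = 5 are arbitrary choices for illustration.

```r
library(class)

x <- scale(as.matrix(iris[, 1:4]))
y <- iris$Species

set.seed(42)
train_idx <- sample(nrow(x), size = 75)   # random half for training

# Standard k-NN with a single, fixed neighborhood size
pred <- class::knn(
  train = x[train_idx, ],
  test  = x[-train_idx, ],
  cl    = y[train_idx],
  k     = 5
)

mean(pred == y[-train_idx])               # plain accuracy on the held-out half
```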