Abstract
In unsupervised learning, traditional feature selection methods are not always efficient, and their performance can degrade severely in the presence of outliers and noise. To address this issue, we propose a novel robust unsupervised feature selection method, called Unsupervised Feature Selection with Robust Data Reconstruction (UFS-RDR), that minimizes a graph-regularized weighted data reconstruction error function. Outliers are detected using the well-known Mahalanobis distance, and these distances are then used to determine a Huber-type weight function. This weight function downweights observations that lie at a large distance. Our experimental results on both synthetic and real-world datasets indicate that the proposed UFS-RDR approach has good feature selection performance and also outperforms competing non-robust unsupervised feature selection methods when the unlabeled data are contaminated.
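The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which location and scatter estimates or which cutoff the method uses, so everything specific here is an assumption made for illustration: the classical sample mean and covariance, the cutoff parameter `c`, the common Huber-type form w(d) = 1 for d <= c and c/d otherwise, and the helper names `mahalanobis_distances` and `huber_weights`. It is not the authors' implementation and does not cover the graph-regularized reconstruction objective.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean.

    Assumption: classical mean and covariance are used here; the paper
    may rely on different (for example robust) estimates.
    """
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates singular covariance
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def huber_weights(d, c):
    """Huber-type weights: 1 where d <= c, c / d where d > c.

    Assumption: `c` is a hypothetical cutoff chosen by the user.
    """
    d = np.asarray(d, dtype=float)
    return np.where(d <= c, 1.0, c / np.maximum(d, 1e-12))


# Synthetic illustration: 200 Gaussian points, the first 5 shifted far away.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:5] += 10.0

d = mahalanobis_distances(X)
w = huber_weights(d, c=3.0)  # c=3.0 is an arbitrary illustrative cutoff
```

In this sketch the shifted rows receive weights below 1 while most of the remaining rows keep weight 1, which is the downweighting behavior the abstract describes. Because the classical mean and covariance are themselves pulled toward outliers, a robust estimate may be preferable in practice.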
Original language | English |
---|---|
Article number | 117008 |
Journal | Expert Systems with Applications |
Volume | 201 |
DOIs | |
Publication status | Published - Sept 1 2022 |
Externally published | Yes |
Keywords
- Data reconstruction
- Mahalanobis distance
- Outliers
- Robustness
- Unsupervised feature selection
ASJC Scopus subject areas
- General Engineering
- Computer Science Applications
- Artificial Intelligence