Evaluation of Modified Categorical Data Fuzzy Clustering Algorithm on the Wisconsin Breast Cancer Dataset

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

The early diagnosis of breast cancer is an important step in the fight against the disease. Machine learning techniques have shown promise in improving our understanding of the disease. Because medical datasets contain data points that cannot be precisely assigned to a class, fuzzy methods have been useful for studying these datasets. Breast cancer datasets are sometimes described by categorical features. Many fuzzy clustering algorithms have been developed for categorical datasets. However, most of these methods use the Hamming distance to define the distance between two categorical feature values. In this paper, we use a probabilistic distance measure to compute the distance between a pair of categorical feature values. Experiments demonstrate that this distance measure performs better than the Hamming distance on the Wisconsin breast cancer data.
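The contrast between the two kinds of distance can be illustrated with a small sketch. The abstract does not specify the probabilistic measure, so the co-occurrence-based distance below is only an assumption chosen for illustration: it compares how two values of one feature co-occur with the values of the other features. The function names and the toy records are hypothetical and are not taken from the paper or from the Wisconsin data.

```python
from collections import Counter


def hamming_value_distance(a, b):
    """Hamming distance between two categorical values: 0 if equal, else 1."""
    return 0 if a == b else 1


def conditional_distribution(rows, attr, value, other):
    """Estimate P(other-feature value | rows[attr] == value) from the data."""
    counts = Counter(r[other] for r in rows if r[attr] == value)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"value {value!r} never occurs in feature {attr}")
    return {v: c / total for v, c in counts.items()}


def cooccurrence_value_distance(rows, attr, a, b):
    """Illustrative probabilistic distance between values a and b of one feature.

    Assumption (not from the source): for every other feature, take the total
    variation distance between the conditional distributions given a and given
    b, then average over those features. The result lies in [0, 1].
    """
    if a == b:
        return 0.0
    others = [j for j in range(len(rows[0])) if j != attr]
    total = 0.0
    for j in others:
        p = conditional_distribution(rows, attr, a, j)
        q = conditional_distribution(rows, attr, b, j)
        support = set(p) | set(q)
        total += 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)
    return total / len(others)


# Hypothetical toy records with two categorical features (not real patient data).
toy_rows = [
    ("low", "small"),
    ("low", "small"),
    ("mid", "small"),
    ("mid", "large"),
    ("high", "large"),
    ("high", "large"),
]

d_hamming = hamming_value_distance("low", "mid")
d_low_mid = cooccurrence_value_distance(toy_rows, 0, "low", "mid")
d_low_high = cooccurrence_value_distance(toy_rows, 0, "low", "high")
print(d_hamming, d_low_mid, d_low_high)
```

In this toy example the Hamming distance treats every pair of distinct values as equally far apart, whereas the data-driven distance places "low" closer to "mid" than to "high" because of how the values co-occur with the second feature.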

Original language: English
Article number: 4273813
Journal: Scientifica
Volume: 2016
DOIs
Publication status: Published - 2016
Externally published: Yes

ASJC Scopus subject areas

  • Environmental Science(all)
  • Agricultural and Biological Sciences(all)
