Clustering mixed datasets by using similarity features

Amir Ahmad, Santosh Kumar Ray, Ch Aswani Kumar

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Citation (Scopus)


Clustering datasets that contain both numeric and nominal features is challenging because numeric and nominal features require different similarity measures. In the present paper, we propose a method to transform a mixed dataset into a numeric dataset. The method uses a similarity measure for mixed datasets and a randomly selected set of data objects from the given mixed dataset to generate numeric similarity features. A clustering algorithm for purely numeric datasets is then applied to the newly generated numeric dataset to produce clusters. A comparative study with other clustering algorithms demonstrated the superior performance of the proposed clustering approach.
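The transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not say which mixed-data similarity measure or which numeric clustering algorithm the authors use, so the code below makes its own assumptions: a Gower-style similarity (range-normalised difference for numeric features, exact match for nominal ones) and a small k-means routine. The function names `mixed_similarity`, `similarity_features`, and `kmeans`, the toy data, and all parameter values are hypothetical and are not taken from the chapter.

```python
import random

def mixed_similarity(a, b, numeric_idx, ranges):
    """Gower-style similarity in [0, 1]; an assumed stand-in, not the paper's measure."""
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j in numeric_idx:
            # Numeric feature: 1 minus the range-normalised absolute difference.
            total += (1.0 - abs(x - y) / ranges[j]) if ranges[j] > 0 else 1.0
        else:
            # Nominal feature: 1 on exact match, 0 otherwise.
            total += 1.0 if x == y else 0.0
    return total / len(a)

def similarity_features(data, numeric_idx, n_refs, seed=0):
    """Map each object to its similarities with n_refs randomly chosen reference objects."""
    rng = random.Random(seed)
    ranges = {j: max(r[j] for r in data) - min(r[j] for r in data) for j in numeric_idx}
    refs = rng.sample(data, n_refs)
    return [[mixed_similarity(row, ref, numeric_idx, ranges) for ref in refs]
            for row in data]

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means on numeric vectors (any numeric clusterer could be used)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
        # Update step: mean of assigned points; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy mixed dataset (hypothetical): two numeric features and one nominal feature.
data = [
    (25, 30.0, "red"), (27, 32.0, "red"), (24, 29.0, "blue"),
    (60, 90.0, "green"), (62, 95.0, "green"), (58, 88.0, "blue"),
]
features = similarity_features(data, numeric_idx={0, 1}, n_refs=3, seed=1)
labels = kmeans(features, k=2, seed=1)
```

Each row of `features` is a purely numeric vector with one entry per reference object, so any clustering algorithm for numeric data can consume it. The number of reference objects and the random seed are free choices in this sketch.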

Original language: English
Title of host publication: Lecture Notes on Data Engineering and Communications Technologies
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 8
Publication status: Published - 2020

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Computer Networks and Communications
  • Electrical and Electronic Engineering

