Clustering mixed datasets by using similarity features

Amir Ahmad, Santosh Kumar Ray, Ch Aswani Kumar

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Clustering datasets consisting of numeric and nominal features is a challenging task because numeric and nominal features require different similarity measures. In the present paper, we propose a method to transform a mixed dataset into a numeric dataset. The method uses a similarity measure for mixed datasets and a randomly selected set of data objects from the given mixed dataset to generate numeric similarity features. A clustering algorithm for pure numeric datasets is then applied to the newly generated numeric dataset to produce clusters. A comparative study with other clustering algorithms demonstrated the superior performance of the proposed clustering approach.
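
The transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not say which mixed-data similarity measure is used, so the Gower-style measure below is an assumption, and the names `mixed_similarity`, `similarity_features`, `numeric_idx`, and `n_refs` are hypothetical. The sketch only shows the general idea: each object is re-described by its similarity to a few randomly selected reference objects.

```python
import random


def mixed_similarity(a, b, numeric_idx, ranges):
    """Gower-style similarity in [0, 1] between two mixed objects.

    Assumption: this stands in for the paper's (unspecified) measure.
    Numeric features contribute 1 - |x - y| / range; nominal features
    contribute 1 on an exact match and 0 otherwise.
    """
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j in numeric_idx:
            r = ranges[j]
            total += 1.0 - (abs(x - y) / r if r > 0 else 0.0)
        else:
            total += 1.0 if x == y else 0.0
    return total / len(a)


def similarity_features(data, numeric_idx, n_refs, seed=0):
    """Map each mixed object to a vector of similarities to random references."""
    rng = random.Random(seed)
    refs = rng.sample(data, n_refs)  # randomly selected data objects
    ranges = {
        j: max(row[j] for row in data) - min(row[j] for row in data)
        for j in numeric_idx
    }
    return [
        [mixed_similarity(row, ref, numeric_idx, ranges) for ref in refs]
        for row in data
    ]


# Toy mixed dataset: (numeric feature, nominal feature)
data = [(1.0, "red"), (2.0, "red"), (9.0, "blue"), (10.0, "blue")]
features = similarity_features(data, numeric_idx={0}, n_refs=2)
```

The resulting `features` list is purely numeric, with one column per reference object, so it could then be passed to any clustering algorithm for numeric data, such as k-means.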

Original language: English
Title of host publication: Lecture Notes on Data Engineering and Communications Technologies
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 478-485
Number of pages: 8
Publication status: Published - 2020

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
Volume: 39
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
