TY - GEN
T1 - K-means clustering with infinite feature selection for classification tasks in gene expression data
AU - Remli, Muhammad Akmal
AU - Daud, Kauthar Mohd
AU - Nies, Hui Wen
AU - Mohamad, Mohd Saberi
AU - Deris, Safaai
AU - Omatu, Sigeru
AU - Kasim, Shahreen
AU - Sulong, Ghazali
N1 - Publisher Copyright: © Springer International Publishing AG 2017.
PY - 2017
Y1 - 2017
AB - In bioinformatics and clinical research, microarray technology has been widely used to distinguish between normal and tumour samples in cancer datasets. However, the high dimensionality of gene expression data affects classification accuracy. Thus, feature selection is needed to select informative genes and remove non-informative ones. Some feature selection methods, however, ignore the interactions between genes. Therefore, similar genes are clustered together and dissimilar genes are placed in separate groups. Hence, to provide higher classification accuracy, this research proposes k-means clustering combined with infinite feature selection to identify informative genes in the selected subset. The method was applied to colorectal cancer and small round blue cell tumors datasets, and it obtained higher classification accuracy than previous work.
KW - Artificial intelligence
KW - Cancer classification
KW - Gene expression data
KW - Infinite feature selection
KW - Informative genes
KW - K-means clustering
KW - Small round blue cell tumors
UR - http://www.scopus.com/inward/record.url?scp=85025136538&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85025136538&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-60816-7_7
DO - 10.1007/978-3-319-60816-7_7
M3 - Conference contribution
AN - SCOPUS:85025136538
SN - 9783319608150
T3 - Advances in Intelligent Systems and Computing
SP - 50
EP - 57
BT - 11th International Conference on Practical Applications of Computational Biology and Bioinformatics, 2017
A2 - Rocha, Miguel
A2 - De Paz, Juan F.
A2 - Pinto, Tiago
A2 - Fdez-Riverola, Florentino
A2 - Mohamad, Mohd Saberi
PB - Springer Verlag
T2 - 11th International Conference on Practical Applications of Computational Biology and Bioinformatics, PACBB 2017
Y2 - 21 June 2017 through 23 June 2017
ER -