Cancer diagnosis is one of the most prominent emerging clinical applications of microarray gene expression data. However, cancer classification on such data remains a difficult problem, mainly because the number of genes is very large relative to the number of available training samples. In this paper, we propose a hybrid feature selection approach that combines the correlation coefficient with particle swarm optimization. Feature selection and classification are performed on three multi-class datasets: Lymphoma, MLL and SRBCT. After feature selection, the selected genes are passed to an Extreme Learning Machine classifier. Experimental results show that the proposed hybrid approach reduces the number of effective gene expression levels, achieves higher classification accuracy, and uses fewer features than the same experiment performed with traditional tree-based classifiers such as J48, random forest, random trees and decision stump, as well as with a genetic algorithm.
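As a rough illustration of how a correlation filter and particle swarm optimization can be chained, the following Python sketch ranks genes by correlation and then runs a binary PSO over the top-ranked candidates. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the filter score is taken to be the largest absolute Pearson correlation between a gene and any one-vs-rest class indicator, the PSO fitness is validation accuracy minus a small penalty on subset size, and a nearest-centroid classifier stands in for the Extreme Learning Machine. All function names and parameters (`top_k`, `alpha`, `w`, `c1`, `c2`) are hypothetical.

```python
import numpy as np


def correlation_scores(X, y):
    """Assumed filter score: max |Pearson r| between each gene (column of X)
    and any one-vs-rest class indicator. Expects at least two classes."""
    Xc = X - X.mean(axis=0)
    x_norm = np.sqrt((Xc ** 2).sum(axis=0))
    x_norm[x_norm == 0] = 1.0  # constant genes end up with score 0
    scores = np.zeros(X.shape[1])
    for c in np.unique(y):
        ind = (y == c).astype(float)
        ic = ind - ind.mean()
        r = (Xc.T @ ic) / (x_norm * np.sqrt((ic ** 2).sum()))
        scores = np.maximum(scores, np.abs(r))
    return scores


def centroid_accuracy(X_tr, y_tr, X_va, y_va):
    """Nearest-centroid accuracy; a simple stand-in for the ELM classifier."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    dist = ((X_va[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dist.argmin(axis=1)] == y_va).mean())


def binary_pso(fitness, n_bits, n_particles=20, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO with a sigmoid transfer function; maximizes `fitness`."""
    rng = np.random.default_rng(seed)
    pos = (rng.random((n_particles, n_bits)) < 0.5).astype(float)
    vel = rng.uniform(-1.0, 1.0, (n_particles, n_bits))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1 = rng.random(vel.shape)
        r2 = rng.random(vel.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)  # keep sigmoid away from saturation
        # Each bit is set to 1 with probability sigmoid(velocity).
        pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)


def hybrid_select(X_tr, y_tr, X_va, y_va, top_k=50, alpha=0.01, seed=0):
    """Stage 1: keep the top_k genes by correlation score.
    Stage 2: binary PSO searches for a subset of those candidates."""
    scores = correlation_scores(X_tr, y_tr)
    candidates = np.argsort(scores)[::-1][:top_k]

    def fitness(bits):
        mask = bits.astype(bool)
        if not mask.any():
            return -1.0  # an empty subset is never acceptable
        cols = candidates[mask]
        acc = centroid_accuracy(X_tr[:, cols], y_tr, X_va[:, cols], y_va)
        return acc - alpha * mask.mean()  # assumed penalty on subset size

    best_mask, best_fit = binary_pso(fitness, len(candidates), seed=seed)
    return candidates[best_mask], best_fit
```

Calling `hybrid_select(X_tr, y_tr, X_va, y_va)` with samples in rows and genes in columns returns the indices of the selected genes and the best fitness found. Running the filter first keeps the PSO search space small, which matters when genes greatly outnumber samples.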
A. Chinnaswamy and R. Srinivasan, “Hybrid feature selection using correlation coefficient and particle swarm optimization on microarray gene expression data”, Proceedings of the 6th International Conference in Bioinspired Computing and Applications, Advances in Intelligent Systems and Computing. Springer, ToC H Institute of Science and Technology, Kochi, India, 2015.