Supervised Classification of Cancers Based on Copy Number Variation

Faculty: Engineering
Year: 2018
Type of Publication: ZU Hosted Pages:
Authors:
Journal: Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2018
Volume:
Keywords: Supervised Classification, Cancers Based, Copy Number
Abstract:
Genomic variation in DNA can cause many types of human cancer, so machine learning has an important role in genomic medicine: it can help to classify, predict, and analyze DNA sequence, which is the most important biological characteristic. DNA copy number variations (CNVs) are used to understand the differences between human cancers and to predict cancer causation from genetic sequence, but this is not easy because of the high dimensionality of the CNV features. This paper presents an approach to computationally classify a set of human cancer types. We use machine learning to train and test various models on a set of human cancers, using the CNV level values of 23,082 genes (features) for 2,916 instances to construct the classifier. The genes are then selected according to their importance by a filter feature selection method. We compare the performance of seven classifiers (Support Vector Machine, Random Forest, J48, Neural Network, Logistic Regression, Bagging, and Dagging) with other benchmarks using 10-fold cross-validation. The best performance achieved an accuracy of 0.859 and an ROC value of 0.965, which are promising results. The classification models developed in this research could provide a reasonable prediction of cancer patients' stage based on their CNV level values. The proposed model confirmed that genes from chromosome 3 have a role in developing human cancers. It also identified new genes, not studied so far, as important for the prediction of human cancers.
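The evaluation procedure described in the abstract (filter feature selection followed by 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code or data. It assumes a small synthetic matrix in place of the 2,916 x 23,082 CNV matrix, a simple between-class over within-class variance ratio as the filter criterion, and a nearest-centroid classifier standing in for the seven classifiers named above. All function names are hypothetical, and only accuracy is computed, not ROC.

```python
import random
from statistics import mean, pvariance


def make_data(n_per_class=30, n_features=40, n_informative=5, n_classes=3, seed=0):
    """Synthetic stand-in for a CNV matrix: rows are samples, columns are genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            # Only the first few "genes" carry class signal (an assumption).
            for j in range(n_informative):
                row[j] += 2.0 * label
            X.append(row)
            y.append(label)
    return X, y


def filter_scores(X, y):
    """Per-feature filter score: variance of class means / mean within-class variance."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, lab in zip(X, y) if lab == c] for c in classes]
        between = pvariance([mean(g) for g in groups])
        within = mean([pvariance(g) for g in groups])
        scores.append(between / (within + 1e-12))
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, cols):
    """Mean vector of each class over the selected columns."""
    centroids = {}
    for c in set(y):
        rows = [row for row, lab in zip(X, y) if lab == c]
        centroids[c] = [mean([r[j] for r in rows]) for j in cols]
    return centroids


def predict(centroids, row, cols):
    """Label of the centroid closest to the row (squared Euclidean distance)."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
    return min(centroids, key=dist)


def stratified_folds(y, k, seed=0):
    """Split sample indices into k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for c in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == c]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validated_accuracy(X, y, k_folds=10, k_features=5):
    """Accuracy pooled over all held-out folds."""
    correct = 0
    for test_idx in stratified_folds(y, k_folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Score and select features on the training fold only, so the
        # held-out labels never influence which genes are kept.
        cols = top_k(filter_scores(X_tr, y_tr), k_features)
        centroids = fit_centroids(X_tr, y_tr, cols)
        correct += sum(predict(centroids, X[i], cols) == y[i] for i in test_idx)
    return correct / len(y)


X, y = make_data()
accuracy = cross_validated_accuracy(X, y)
print(f"10-fold CV accuracy on synthetic data: {accuracy:.3f}")
```

Repeating the feature selection inside each training fold is a design choice of this sketch. The abstract does not say whether the authors selected genes before or within cross-validation.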
   
     
 
       