Research Article | Open Access
Volume 34 | Number 1 | Year 2016 | Article Id. IJCTT-V34P125 | DOI : https://doi.org/10.14445/22312803/IJCTT-V34P125
Harnessing Power of Decision Tree Approach for HPF Prediction using SIPINA and See5
Sunny Sharma, Amritpal Singh, Dr. Rajinder Singh
Citation:
Sunny Sharma, Amritpal Singh, Dr. Rajinder Singh, "Harnessing Power of Decision Tree Approach for HPF Prediction using SIPINA and See5," International Journal of Computer Trends and Technology (IJCTT), vol. 34, no. 1, pp. 139-143, 2016. Crossref, https://doi.org/10.14445/22312803/IJCTT-V34P125
Abstract
Drug discovery, disease detection, and prediction of molecular class are areas of great research significance. Over the past few decades, several approaches have been used to improve the accuracy of Human Protein Function (HPF) prediction. This study concentrates on one such approach: HPF prediction from sequence-derived features (SDF) using decision trees and their variants, implemented with the C5 and C4.5 algorithms in tools such as See5 and SIPINA. Additional sequence-derived features were identified and incorporated, which improved the training data. The sequence data, drawn from HPRD (Human Protein Reference Database), was extended in both the number of sequences and the features used to relate a sequence to a specific class, strengthening the training data. Multiple techniques were examined for prediction accuracy and compared extensively with one another and with previous research results; the overall accuracy reported is 64% for See5 and 88% for SIPINA.
Keywords
HPF, C5, C4.5, See5, Decision Tree, SDF, SIPINA.
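As a rough illustration of the decision-tree idea named in the abstract, the sketch below computes the gain ratio of a single binary split on a numeric feature, the criterion commonly associated with C4.5. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature values, the class labels ("enzyme", "receptor"), and the threshold are hypothetical, and real C4.5 and C5 implementations add further steps such as threshold search, pruning, and boosting.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting a numeric feature at `threshold`.

    Samples with value <= threshold go left, the rest go right.
    Returns 0.0 when the split leaves one side empty.
    """
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    weights = [len(left) / n, len(right) / n]
    # Information gain: entropy before the split minus weighted entropy after.
    info_gain = entropy(labels) - (weights[0] * entropy(left)
                                   + weights[1] * entropy(right))
    # Split information penalizes splits by how they partition the samples.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


# Hypothetical toy data: one sequence-derived feature and a protein class.
feature = [0.1, 0.2, 0.8, 0.9]
classes = ["enzyme", "enzyme", "receptor", "receptor"]
print(gain_ratio(feature, classes, threshold=0.5))
```

A tree learner in this style would evaluate such a score for each candidate feature and threshold, split on the best one, and recurse on the resulting subsets.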