Harnessing Power of Decision Tree Approach for HPF Prediction using SIPINA and See5

International Journal of Computer Trends and Technology (IJCTT)          
© 2016 by IJCTT Journal
Volume-34 Number-3
Year of Publication : 2016
Authors : Sunny Sharma, Amritpal Singh, Dr. Rajinder Singh


Sunny Sharma, Amritpal Singh, Dr. Rajinder Singh "Harnessing Power of Decision Tree Approach for HPF Prediction using SIPINA and See5". International Journal of Computer Trends and Technology (IJCTT) V34(3):139-143, April 2016. ISSN:2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract -
Drug discovery, disease detection, and prediction of molecular class are areas of great research significance. Over the past few decades, several approaches have been used to improve the accuracy of Human Protein Function (HPF) prediction. This study concentrates on one such approach: HPF prediction from sequence-derived features (SDF) using decision trees and their variants, as implemented by the C5 and C4.5 algorithms in tools such as See5 and SIPINA. Additional sequence-derived features were identified and incorporated, which improved the training data. The sequence data, drawn from the Human Protein Reference Database (HPRD), was expanded both in the number of sequences and in the features used to relate a sequence to a specific class, which strengthened the training data. Multiple techniques were examined for prediction accuracy and compared extensively with one another and with previous research results; the reported overall accuracy was 64% for See5 and 88% for SIPINA.
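As an illustration of the kind of computation involved, the following is a minimal Python sketch, not the authors' pipeline and not an invocation of See5 or SIPINA. It derives a few simple features from amino-acid sequences and scores candidate threshold splits with the gain ratio, the split criterion associated with C4.5. The toy sequences, the class labels, the hydrophobic residue grouping, and all function names are assumptions made for this example only.

```python
import math
from collections import Counter

# Assumption: one common grouping of hydrophobic residues; other groupings exist.
HYDROPHOBIC = set("AVILMFWC")
CHARGED = set("DEKR")


def sequence_features(seq):
    """Return a few simple sequence-derived features for one protein sequence."""
    n = len(seq)
    return {
        "length": n,
        "hydrophobic_frac": sum(aa in HYDROPHOBIC for aa in seq) / n,
        "charged_frac": sum(aa in CHARGED for aa in seq) / n,
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy(labels) - remainder
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


def best_split(rows, labels):
    """Return (score, feature_name, threshold) of the best single split."""
    best = (0.0, None, None)
    for name in rows[0]:
        values = [row[name] for row in rows]
        distinct = sorted(set(values))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(distinct, distinct[1:]):
            threshold = (lo + hi) / 2
            score = gain_ratio(values, labels, threshold)
            if score > best[0]:
                best = (score, name, threshold)
    return best


# Hypothetical toy data: made-up sequences and made-up class labels.
TOY = [
    ("AVILAVIL", "membrane"),
    ("MFWCAVLK", "membrane"),
    ("DEKRDEKA", "soluble"),
    ("KRDESTNQ", "soluble"),
]

rows = [sequence_features(seq) for seq, _ in TOY]
labels = [label for _, label in TOY]
score, feature, threshold = best_split(rows, labels)
print(feature, threshold, score)
```

A full decision tree learner would apply `best_split` recursively to each resulting subset and then prune the tree; the sketch stops at the root split to keep the criterion itself visible.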

Keywords -
HPF, C5, C4.5, See5, Decision Tree, SDF, SIPINA.