Estimation of Distance complexity in amino acids between Normal and Cancer Liver Cells using Data Mining Techniques

International Journal of Computer Trends and Technology (IJCTT)          
© 2015 by IJCTT Journal
Volume-27 Number-1
Year of Publication : 2015
Authors : M. Mayilvaganan, R. Rajamani
DOI :  10.14445/22312803/IJCTT-V27P103


M. Mayilvaganan, R. Rajamani, "Estimation of Distance complexity in amino acids between Normal and Cancer Liver Cells using Data Mining Techniques". International Journal of Computer Trends and Technology (IJCTT) V27(1):10-13, September 2015. ISSN: 2231-2803. Published by Seventh Sense Research Group.

Abstract -
Data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Clustering algorithms find groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups. This paper uses two databases: normal liver cells and cancer-affected liver cells. After the cancer cells are analyzed, the distance between normal and cancer-affected cells needs to be determined. Each amino acid is represented by a character and is also assigned a numeric code, and the corresponding pair combinations in the sequence are represented in a graph. The proposed HMM system is validated with two different nucleotide values to analyze its performance, and the simulated output is obtained using the Viterbi and forward algorithms implemented in MATLAB. The extracted rules and analyzed results are demonstrated graphically. Performance is analyzed for different numbers of instances and confidence levels on the DNA sequence data set.
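
As an illustration of the two HMM routines named in the abstract, the following is a minimal Python sketch (the paper itself reports a MATLAB implementation, which is not reproduced here). The two hidden states, the start, transition, and emission probabilities, and the function names `forward` and `viterbi` are all assumptions chosen for the example; none of them come from the paper's data set.

```python
# Toy two-state HMM over nucleotides.
# All states and probabilities below are illustrative assumptions,
# not values taken from the paper.
STATES = ("normal", "cancer")
START = {"normal": 0.5, "cancer": 0.5}
TRANS = {
    "normal": {"normal": 0.9, "cancer": 0.1},
    "cancer": {"normal": 0.1, "cancer": 0.9},
}
EMIT = {
    "normal": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "cancer": {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10},
}


def forward(seq):
    """Forward algorithm: P(seq), summed over all hidden state paths."""
    alpha = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for sym in seq[1:]:
        alpha = {
            s: EMIT[s][sym] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def viterbi(seq):
    """Viterbi algorithm: (most probable state path, its joint probability)."""
    delta = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    backptrs = []
    for sym in seq[1:]:
        new_delta, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this step.
            best_prev = max(STATES, key=lambda p: delta[p] * TRANS[p][s])
            ptr[s] = best_prev
            new_delta[s] = delta[best_prev] * TRANS[best_prev][s] * EMIT[s][sym]
        delta = new_delta
        backptrs.append(ptr)
    # Trace the best path backwards from the most probable final state.
    last = max(STATES, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    sequence = "ACGGCGT"  # hypothetical nucleotide sequence
    print("P(sequence) =", forward(sequence))
    print("best path   =", viterbi(sequence)[0])
```

For long sequences these products underflow, so a practical version would work with log probabilities or rescale the forward variables at each step.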


Keywords -
Hidden Markov Model; Viterbi algorithm; Forward algorithm; PubChem liver and cancer DNA dataset