FB2 Compression & Selective Encryption on DNA Sequence

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2016 by IJCTT Journal
Volume-36 Number-4
Year of Publication : 2016
Authors : Syed Mahamud Hossein, Pradeep Kumar Das Mohapatra, Debashis De
DOI : 10.14445/22312803/IJCTT-V36P136

MLA

Syed Mahamud Hossein, Pradeep Kumar Das Mohapatra, Debashis De "FB2 Compression & Selective Encryption on DNA Sequence". International Journal of Computer Trends and Technology (IJCTT) V36(4):204-212 June 2016. ISSN:2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract -
This paper presents a lossless compression algorithm for genetic sequences based on exact Four Bases Block (FB2) conversion. Each base is assigned a single decimal or binary value, and each block of four bases is converted into an equivalent decimal value in the range 0 to 255, which is then encoded as the corresponding ASCII code. The bases are defined as a=0, t=1, g=2 and c=3 in the decimal coding and as a=00, t=01, g=10 and c=11 in the binary coding; these values can be specified by the user for better security. Because decoding requires exactly the code values that were used at encoding time, the ASCII-coded output itself protects the data; this is the first tier of security. In the second tier, selective encryption techniques are applied for stronger security. Compression speed and security level are two important measures for evaluating any encryption-compression system. The method uses only particular available ASCII codes for compression and uses patterns for selective encryption. The running time of the algorithm is a few seconds and varies linearly with the size of the source file to be compressed, so the complexity is O(n). The algorithm can approach a compression rate of 2.00002 bits/base.
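
The block conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names fb2_encode and fb2_decode are hypothetical, placing the first base of a block in the most significant position is an assumption, and sequences whose length is not a multiple of four are rejected because the abstract does not say how they are handled. The selective encryption tier is not shown.

```python
# Illustrative sketch only; names and digit ordering are assumptions.
DEFAULT_CODE = {"a": 0, "t": 1, "g": 2, "c": 3}  # mapping given in the abstract


def fb2_encode(seq, code=DEFAULT_CODE):
    """Pack each block of four bases into one byte with a value in 0..255."""
    seq = seq.lower()
    if len(seq) % 4 != 0:
        # Assumption: padding of a trailing partial block is not specified.
        raise ValueError("this sketch only handles lengths divisible by 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            # Treat the block as a four-digit base-4 number.
            value = value * 4 + code[base]
        out.append(value)
    return bytes(out)


def fb2_decode(data, code=DEFAULT_CODE):
    """Recover the bases; the same code table used for encoding is required."""
    inverse = {v: k for k, v in code.items()}
    bases = []
    for value in data:
        block = []
        for _ in range(4):
            block.append(inverse[value % 4])  # least significant digit first
            value //= 4
        bases.extend(reversed(block))
    return "".join(bases)
```

With the default table, the block "atgc" becomes the single value 0·64 + 1·16 + 2·4 + 3 = 27, so four bases occupy 8 bits, or 2 bits per base before any overhead. Decoding the same byte with a different table yields a different sequence, which is the sense in which a user-specified table acts as the first security tier.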

Keywords
Biology, genetics, data compaction, compression, FB2, security.