Enhancing Dictionary Based Preprocessing for Better Text Compression

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2014 by IJCTT Journal
Volume-9 Number-1                          
Year of Publication : 2014
Authors : R. R. Baruah, V. Deka, M. P. Bhuyan
DOI : 10.14445/22312803/IJCTT-V9P102

MLA

R. R. Baruah, V. Deka, M. P. Bhuyan. "Enhancing Dictionary Based Preprocessing for Better Text Compression". International Journal of Computer Trends and Technology (IJCTT) V9(1):4-9, March 2014. ISSN: 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract -
With the rapid growth of data and of the number of applications that handle it, there is a pressing need for dictionary-based reversible transformation techniques that make compression algorithms more efficient and thereby improve the compression ratio. The performance of compression methods combined with various transformation techniques is analysed on text files of varying sizes. The popular block-sorting lossless Burrows-Wheeler Compression Algorithm (BWCA) is implemented along with one proposed method, and a dictionary-based transformation algorithm is developed for efficient compression. It is observed that the compression ratio improves considerably when a source file is first preprocessed with the dictionary and then passed to the BWCA and to the proposed method.
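
To make the idea of a reversible dictionary-based preprocessing step concrete, the following is a minimal sketch in Python. It is not the transformation algorithm developed in the paper, whose details are not given in this abstract. It assumes a simple scheme in which frequent words are replaced by a marker character followed by a short lowercase code, and it uses the standard-library bz2 module as a stand-in for a Burrows-Wheeler based compressor. The names MARK, build_dictionary, encode, decode and bz2_size are hypothetical and chosen for illustration only.

```python
import bz2
import re
from collections import Counter

MARK = "\x01"  # assumed marker character; the input text must not contain it
WORD = re.compile(r"[A-Za-z]+")                   # maximal runs of ASCII letters
CODE = re.compile(re.escape(MARK) + r"([a-z]+)")  # marker followed by a code


def to_code(i):
    """Map a dictionary index to a short lowercase code (0 -> 'a', 26 -> 'ba')."""
    s = ""
    while True:
        s = chr(97 + i % 26) + s
        i //= 26
        if i == 0:
            return s


def from_code(s):
    """Inverse of to_code."""
    i = 0
    for ch in s:
        i = i * 26 + ord(ch) - 97
    return i


def build_dictionary(text, size=1000, min_len=3):
    """Return the most frequent words; frequent words get the shortest codes."""
    counts = Counter(w for w in WORD.findall(text) if len(w) >= min_len)
    return [w for w, _ in counts.most_common(size)]


def encode(text, words):
    """Replace every dictionary word with MARK + code; leave other text as is."""
    if MARK in text:
        raise ValueError("input already contains the marker character")
    index = {w: i for i, w in enumerate(words)}

    def repl(m):
        w = m.group()
        return MARK + to_code(index[w]) if w in index else w

    return WORD.sub(repl, text)


def decode(encoded, words):
    """Restore the original text from the encoded text and the same dictionary."""
    return CODE.sub(lambda m: words[from_code(m.group(1))], encoded)


def bz2_size(text):
    """Compressed size in bytes, using bz2 as a BWT-based stand-in compressor."""
    return len(bz2.compress(text.encode("utf-8")))
```

The encoder and the decoder must share the same word list, as with any static dictionary scheme. To try the sketch on a file, one could compare bz2_size(text) with bz2_size(encode(text, words)); whether the preprocessed size is smaller depends on the text and on the dictionary, and no particular outcome is implied here.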

Keywords
BWCA, Dictionary, Preprocessing Techniques, Lossless, Reversible