Naïve Bayes Classifier with Various Smoothing Techniques for Text Documents

International Journal of Computer Trends and Technology (IJCTT)          
© - April Issue 2013 by IJCTT Journal
Volume-4 Issue-4                           
Year of Publication : 2013
Authors: Shruti Aggarwal, Devinder Kaur


Shruti Aggarwal, Devinder Kaur, "Naïve Bayes Classifier with Various Smoothing Techniques for Text Documents", International Journal of Computer Trends and Technology (IJCTT), V4(4):873-876, April Issue 2013. ISSN Published by Seventh Sense Research Group.

Abstract: With the rapid growth in the volume of text data, its classification has become an important problem. This paper discusses several classification techniques, each with its own assumptions, advantages and limitations. One of the most widely used classifiers is Naïve Bayes, which performs well on different data sets. Various smoothing techniques can be applied to Naïve Bayes; the idea behind them is to improve classification accuracy and performance.
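
As a rough illustration of what applying smoothing to Naïve Bayes means, the sketch below implements a small multinomial Naïve Bayes classifier with two of the smoothing methods named in the keywords, Jelinek-Mercer and Dirichlet, in their standard language-modeling forms. It is not the authors' implementation: the function names, the default values of `lam` and `mu`, the toy documents, and the choice to skip words never seen in training are all assumptions made for this example.

```python
import math
from collections import Counter


def train(docs):
    """docs: list of (tokens, label) pairs. Returns the counts the classifier needs."""
    class_counts = {}       # label -> Counter of word counts in that class
    collection = Counter()  # word counts over the whole training collection
    priors = Counter()      # label -> number of training documents
    for tokens, label in docs:
        class_counts.setdefault(label, Counter()).update(tokens)
        collection.update(tokens)
        priors[label] += 1
    return class_counts, collection, priors


def jelinek_mercer(count, class_len, p_coll, lam=0.5):
    # Linear interpolation of the class estimate with the collection model.
    # lam=0.5 is an arbitrary illustrative value, not a tuned one.
    return (1 - lam) * count / class_len + lam * p_coll


def dirichlet(count, class_len, p_coll, mu=10.0):
    # Adds mu pseudo-counts distributed according to the collection model.
    # mu=10.0 is an arbitrary illustrative value, not a tuned one.
    return (count + mu * p_coll) / (class_len + mu)


def classify(tokens, model, smooth):
    """Return the label with the highest smoothed log-posterior score."""
    class_counts, collection, priors = model
    total = sum(collection.values())
    n_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label, counts in class_counts.items():
        class_len = sum(counts.values())
        score = math.log(priors[label] / n_docs)
        for w in tokens:
            p = smooth(counts[w], class_len, collection[w] / total)
            if p > 0:  # assumption: words unseen in all training data are skipped
                score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
docs = [
    ("cheap pills buy now".split(), "spam"),
    ("meeting schedule project".split(), "ham"),
    ("buy cheap now".split(), "spam"),
    ("project meeting notes".split(), "ham"),
]
model = train(docs)
print(classify(["cheap", "buy"], model, jelinek_mercer))
print(classify(["meeting", "notes"], model, dirichlet))
```

Both functions give a nonzero probability to a word that is absent from a class but present elsewhere in the collection, so a single unseen word no longer drives the class score to zero.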




Keywords: Text classification, Naïve Bayes, Jelinek-Mercer, Smoothing, Dirichlet, Two-Stage, Absolute Discounting