Naïve Bayes Classifier with Various Smoothing Techniques for Text Documents

Shruti Aggarwal; Devinder Kaur

doi:10.14445/22312803/IJCTT-V4I4P185


Volume 4 | Issue 4 | Year 2013 | Article Id. IJCTT-V4I4P185 | DOI : https://doi.org/10.14445/22312803/IJCTT-V4I4P185


Citation:

Shruti Aggarwal, Devinder Kaur, "Naïve Bayes Classifier with Various Smoothing Techniques for Text Documents," International Journal of Computer Trends and Technology (IJCTT), vol. 4, no. 4, pp. 873-876, 2013. Crossref, https://doi.org/10.14445/22312803/IJCTT-V4I4P185

Abstract

With the rapid growth in the volume of text data, text classification has become an important problem. This paper discusses many good classification techniques, each with its own assumptions, advantages, and limitations. One of the most widely used classifiers is Naïve Bayes, which performs well on different data sets. Various smoothing techniques can be applied to Naïve Bayes; the idea behind them is to improve classification accuracy and performance.

Keywords

Text classification, Naïve Bayes, Jelinek-Mercer, Smoothing, Dirichlet, Two-Stage, Absolute Discounting
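
The abstract names the smoothing methods but does not give their formulas. As a rough illustration, the following Python sketch shows how the four methods listed in the keywords could be plugged into a multinomial Naïve Bayes classifier. It assumes the standard language-model forms of these estimators (as in Zhai and Lafferty's study of smoothing methods), uniform class priors, and a background model estimated from the whole training collection. All function names, parameter defaults (`lam`, `mu`, `delta`), and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def train(docs):
    """docs: list of (label, tokens). Returns per-class counts and a background model."""
    class_counts = {}
    collection = Counter()
    for label, tokens in docs:
        class_counts.setdefault(label, Counter()).update(tokens)
        collection.update(tokens)
    total = sum(collection.values())
    background = {w: n / total for w, n in collection.items()}
    return class_counts, background


# Each smoother returns p(w | class) from: count of w in the class, total
# class length, number of unique words in the class, and background p(w | C).
def jelinek_mercer(count, length, unique, p_bg, lam=0.5):
    # Linear interpolation of the class estimate with the background model.
    return (1 - lam) * count / length + lam * p_bg


def dirichlet(count, length, unique, p_bg, mu=10.0):
    # Background model acts as mu pseudo-counts added to the class counts.
    return (count + mu * p_bg) / (length + mu)


def absolute_discounting(count, length, unique, p_bg, delta=0.7):
    # Subtract delta from every seen count; give the freed mass to the background.
    return max(count - delta, 0) / length + delta * unique / length * p_bg


def two_stage(count, length, unique, p_bg, lam=0.5, mu=10.0):
    # Dirichlet smoothing followed by Jelinek-Mercer interpolation.
    # Assumption: the same background model is used in both stages.
    return (1 - lam) * dirichlet(count, length, unique, p_bg, mu) + lam * p_bg


def classify(tokens, class_counts, background, smooth):
    """Return the label with the highest log-likelihood (uniform priors assumed)."""
    best_label, best_score = None, -math.inf
    for label, counts in class_counts.items():
        length = sum(counts.values())
        unique = len(counts)
        score = 0.0
        for w in tokens:
            if w not in background:
                continue  # words never seen in training carry no evidence
            score += math.log(smooth(counts[w], length, unique, background[w]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only.
docs = [
    ("sport", "ball goal team ball".split()),
    ("tech", "code chip code data".split()),
]
class_counts, background = train(docs)
smoothers = [jelinek_mercer, dirichlet, absolute_discounting, two_stage]
predictions = [
    classify("ball team".split(), class_counts, background, s) for s in smoothers
]
```

On this toy data every smoother assigns the query to the class whose training text contains its words. On real collections the methods differ mainly in how much probability mass they move to unseen words, so the parameters would need tuning on held-out data.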
