Detection of Bold Italic and Underline Fonts for Hindi OCR

International Journal of Computer Trends and Technology (IJCTT)          
 
© - August Issue 2013 by IJCTT Journal
Volume-4 Issue-8                           
Year of Publication : 2013
Authors: Nidhi Sharma, Mohit Khandelwal

MLA

Nidhi Sharma, Mohit Khandelwal. "Detection of Bold Italic and Underline Fonts for Hindi OCR." International Journal of Computer Trends and Technology (IJCTT), V4(8):2423-2428, August Issue 2013. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract:- This paper presents a technique for improving the recognition accuracy of a Hindi OCR system by detecting bold, italic, and underlined words. Optical character recognition (OCR) is the process by which characters in a printed or scanned document are recognized and converted into machine-encoded text that a computer can read and edit. Detecting font style in Hindi-script documents can improve the performance of a Hindi OCR system.

 

Keywords: Hindi text image, gray-scale image, binary image, edge detection, image reconstruction, optical character recognition.