Hinglish Profanity Filter and Hate Speech Detection
|© 2023 by IJCTT Journal|
|Year of Publication : 2023|
|Authors : Nirali Arora, Aartem Singh, Laik Shaikh, Mawrah Khan, Yash Devadiga|
|DOI : 10.14445/22312803/IJCTT-V71I2P101|
How to Cite?
Nirali Arora, Aartem Singh, Laik Shaikh, Mawrah Khan, Yash Devadiga, "Hinglish Profanity Filter and Hate Speech Detection," International Journal of Computer Trends and Technology, vol. 71, no. 2, pp. 1-7, 2023. Crossref, https://doi.org/10.14445/22312803/IJCTT-V71I2P101
Freedom of speech is highly valued on the Internet, yet it is also frequently abused there. Social media applications have become a necessity rather than a luxury. Many children and young teenagers encounter this content at a tender age and are prone to verbal abuse or exposure to illegitimate content. There are no constraints or regulations to prevent the flow of hateful and violent content; this nature of the Internet inevitably gives rise to social stigmas such as cyberbullying and cybercrime, which can affect the minds of children and young teenagers in society. A profanity filter censors such content. The hate filter recognizes hate speech and blocks hateful material, making the application suitable for kids. The paper proposes a hate speech detector along with a profanity filter algorithm. One of the simulation findings shows that when profanity is treated as noise input in the sentiment classification of review data, accuracy decreases by roughly 2%.
Keywords: Censorship, Corpus, Filtering, Profanity filtering, Tokenization.
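The abstract names a profanity filter built on tokenization and a corpus but does not spell out the algorithm. As a rough illustration only, a word-list filter of that general kind might look like the sketch below. It is not the authors' method: the `BLOCKLIST` entries are mild, hypothetical stand-ins for a curated Hinglish corpus, and the function names `censor` and `contains_profanity` are assumptions made for this example.

```python
import re

# Hypothetical stand-in lexicon (assumption). A real filter would load a
# curated Hinglish profanity corpus instead of these mild placeholders.
BLOCKLIST = {"darn", "heck", "bakwas"}

# Tokenizer: runs of word characters, so punctuation is left untouched.
TOKEN_RE = re.compile(r"\w+")


def contains_profanity(text, blocklist=BLOCKLIST):
    """Return True if any token of `text` is in the blocklist (case-insensitive)."""
    return any(token.lower() in blocklist for token in TOKEN_RE.findall(text))


def censor(text, blocklist=BLOCKLIST, mask_char="*"):
    """Replace every blocklisted token with mask characters of equal length."""

    def mask(match):
        word = match.group(0)
        if word.lower() in blocklist:
            return mask_char * len(word)
        return word

    # re.sub calls `mask` once per token and splices the result back in place.
    return TOKEN_RE.sub(mask, text)


if __name__ == "__main__":
    print(censor("Yeh movie bakwas thi, darn it!"))
```

Exact token matching of this kind misses obfuscated spellings and transliteration variants, which are common in Hinglish text, so a practical filter would likely need approximate string matching or a learned model on top of it.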