Detection of Cyberbullying in Twitter Data Using Machine Learning Techniques

Volume-67 Issue-10
Year of Publication : 2019
Authors : Shahina K M
DOI :  10.14445/22312803/IJCTT-V67I10P110


Analyzing comments in online interactions plays an important role in today's technological world. Although social media plays a significant role in communication, it also spreads cyberbullying among the younger generation. The use of aggressive and distorting words on social media has become a trend, and it fosters a culture of dishonor and adverse communication in the cyber world. Intelligent systems based on different algorithms have therefore emerged to classify social media content. This paper focuses on analyzing and experimenting with feature extraction and the detection of cyberbullying in Twitter messages, with the help of Natural Language Processing tools and different machine learning algorithms. Four feature extraction methods (Bag of Words, TF-IDF, doc2vec, and word2vec) are applied to the data set to create the feature sets, and different classification methods are then applied to these features. The classification methods are Logistic Regression, Support Vector Machine, Random Forest, and XGBoost. The results show that the XGBoost model on word2vec features outperformed all the other methods. The machine learning algorithms for classification are implemented using the Anaconda Python distribution.
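To make the first two feature extraction methods concrete, the following is a minimal sketch of Bag of Words and TF-IDF in plain Python. It is not the paper's implementation: the toy tweets, the whitespace tokenizer, the function names, and the unsmoothed weighting tf × log(N / df) are all assumptions made for illustration, and library implementations typically use smoothed variants of this formula.

```python
import math
from collections import Counter

# Hypothetical toy tweets, invented for illustration (not from the paper's data set).
tweets = [
    "you are so stupid",
    "have a great day",
    "you are a great friend",
]

def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()

def build_vocabulary(docs):
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({token for doc in docs for token in tokenize(doc)})

def bag_of_words(doc, vocab):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]

def tfidf(docs, vocab):
    """Weight each term by relative term frequency times log(N / document frequency)."""
    n_docs = len(docs)
    bow = [bag_of_words(doc, vocab) for doc in docs]
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df_j) for df_j in df]
    weighted = []
    for row in bow:
        total = sum(row)  # number of tokens in this document
        weighted.append([(count / total) * w for count, w in zip(row, idf)])
    return weighted

vocab = build_vocabulary(tweets)
counts = bag_of_words(tweets[0], vocab)
weights = tfidf(tweets, vocab)
```

In this sketch, a term that appears in only one tweet (such as "stupid") receives a higher TF-IDF weight than a term shared across tweets (such as "you"), which is the property that makes TF-IDF features more discriminative than raw counts. Either feature matrix could then be passed to a classifier such as Logistic Regression.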

Bag of Words, TF-IDF, doc2vec, word2vec, Random Forest, Logistic Regression