Vector Space Models to Classify Arabic Text

Authors: Jafar Ababneh, Omar Almomani, Wael Hadi, Nidhal Kamel Taha El-Omari, Ali Al-Ibrahim
Text classification is one of the most important tasks in data mining. This paper investigates different variations of vector space models (VSMs) using the k-nearest neighbor (KNN) algorithm. The comparison is based on the most popular text evaluation measures. Experimental results on the Saudi data sets reveal that the Cosine coefficient outperformed the Dice and Jaccard coefficients.

Keywords: Arabic data sets, data mining, text categorization, term weighting, VSM.
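
The three coefficients compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes documents are sparse term-weight vectors stored as dictionaries, uses the common weighted (extended) forms of the Dice and Jaccard coefficients, and classifies by a simple majority vote among the k most similar training documents. The function names, the toy tokens, and the raw-count weights are hypothetical, and the paper's exact term-weighting scheme is not stated in the abstract.

```python
import math
from collections import Counter


def _dot(a, b):
    # Inner product of two sparse vectors stored as {term: weight}.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def _sq_norm(a):
    # Squared Euclidean norm of a sparse vector.
    return sum(w * w for w in a.values())


def cosine(a, b):
    # Cosine coefficient: dot / (|a| * |b|).
    denom = math.sqrt(_sq_norm(a)) * math.sqrt(_sq_norm(b))
    return _dot(a, b) / denom if denom else 0.0


def dice(a, b):
    # Weighted Dice coefficient: 2 * dot / (|a|^2 + |b|^2).
    denom = _sq_norm(a) + _sq_norm(b)
    return 2.0 * _dot(a, b) / denom if denom else 0.0


def jaccard(a, b):
    # Weighted Jaccard coefficient: dot / (|a|^2 + |b|^2 - dot).
    d = _dot(a, b)
    denom = _sq_norm(a) + _sq_norm(b) - d
    return d / denom if denom else 0.0


def knn_classify(query, training, k=3, sim=cosine):
    # Rank training documents by similarity to the query, then take a
    # majority vote over the labels of the k most similar ones.
    ranked = sorted(training, key=lambda pair: sim(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): raw term counts stand in for term weights.
training = [
    ({"sport": 2, "goal": 1}, "sports"),
    ({"sport": 1, "match": 1}, "sports"),
    ({"bank": 2, "loan": 1}, "economy"),
]
query = {"sport": 1, "goal": 1}
predicted = knn_classify(query, training, k=3, sim=cosine)
```

Passing `sim=dice` or `sim=jaccard` to `knn_classify` swaps the similarity measure while leaving the rest of the classifier unchanged, which is the kind of comparison the abstract describes.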