Information Retrieval using Jaccard Similarity Coefficient

International Journal of Computer Trends and Technology (IJCTT)          
© 2016 by IJCTT Journal
Volume-36 Number-3
Year of Publication : 2016
Authors : Manoj Chahal
DOI :  10.14445/22312803/IJCTT-V36P124


Manoj Chahal "Information Retrieval using Jaccard Similarity Coefficient". International Journal of Computer Trends and Technology (IJCTT) V36(3):140-143, June 2016. ISSN:2231-2803. Published by Seventh Sense Research Group.

Abstract -
Similarity measures define the similarity between two or more documents. Retrieved documents are ranked by the similarity of their content to the user query. The Jaccard similarity coefficient measures the degree of similarity between the retrieved documents. In this paper, we retrieve information with the help of the Jaccard similarity coefficient and analyze that information. All of this is performed with the help of a Genetic Algorithm. Owing to its exploring and exploiting nature, the Genetic Algorithm gives an optimal result for our search. The Genetic Algorithm uses the Jaccard similarity coefficient to calculate the similarity between documents. The value of the Jaccard similarity function lies between 0 and 1; it indicates the probability of similarity between the documents.
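
As an illustration of the coefficient described in the abstract, the following is a minimal sketch of Jaccard-based ranking, not the paper's implementation. It assumes documents and queries are reduced to sets of lowercase, whitespace-separated terms, and computes J(A, B) = |A ∩ B| / |A ∪ B|. The function names `tokenize`, `jaccard`, and `rank` are hypothetical, and the Genetic Algorithm step is not shown.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two term sets; always in [0, 1]."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets (an assumption)
    return len(a & b) / len(a | b)


def rank(query, documents):
    """Return (score, document) pairs sorted by descending similarity to the query."""
    q = tokenize(query)
    scored = [(jaccard(q, tokenize(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

For example, `rank("genetic algorithm", ["genetic algorithm for retrieval", "vector space model"])` places the first document on top, since it shares two of the four distinct terms with the query while the second shares none.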

Keywords -
Genetic Algorithm, Information Retrieval, Vector Space Model, Database, Jaccard Similarity Measure.