Document Clustering in Web Search Engine

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2012 by IJCTT Journal
Volume-3 Issue-2
Year of Publication : 2012
Authors: A.S.N.Chakravarthy, Deepthi.S, K.Satyatej, Sk.Nizmi, S.Sindhura

MLA

A.S.N.Chakravarthy, Deepthi.S, K.Satyatej, Sk.Nizmi, S.Sindhura. "Document Clustering in Web Search Engine". International Journal of Computer Trends and Technology (IJCTT), V3(2):286-289, Issue 2012. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract: As the number of web pages grows, it becomes more difficult to find relevant documents through information retrieval engines. Clustering addresses this by grouping related documents so that relevant ones can be found together. The main purpose of clustering techniques is to partition a set of entities into different groups, called clusters. Each group should be consistent in terms of the similarity of its members. As the name suggests, representative-based clustering techniques use some form of representation for each cluster, so every group has a member that represents it. Using representatives reduces the cost of the algorithm and also makes the process easier to understand.
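
To make the idea of representative-based clustering concrete, the following is a minimal sketch of k-means (named in the keywords) applied to a handful of toy documents. It is not the authors' implementation. The term-count vectors, the whitespace tokenizer, the choice of the first k documents as starting representatives, and all function names are assumptions made for illustration. Note also that in k-means the representative is the mean vector of the cluster, which is not necessarily one of its members.

```python
from collections import Counter

def tf_vector(text, vocab):
    """Term-count vector of a document over a fixed vocabulary (assumed representation)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, max_iter=100):
    """Plain k-means; returns (labels, centroids)."""
    # Assumption: seed with the first k vectors so the run is deterministic.
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its nearest representative.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each representative becomes the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids

# Hypothetical toy corpus: two documents about fruit, two about cars.
docs = [
    "apple banana fruit",
    "car engine wheel",
    "banana fruit juice",
    "engine wheel road",
]
vocab = sorted({w for d in docs for w in d.lower().split()})
vectors = [tf_vector(d, vocab) for d in docs]

labels, centroids = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

Each document is compared only against the k representatives rather than against every other document, which is where the cost reduction mentioned in the abstract comes from. A production system would more likely use TF-IDF weighting and a better seeding strategy than this sketch does.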

Keywords: Document clustering, k-means.