Introduction to KEA-Means Algorithm for Web Document Clustering.

International Journal of Computer Trends and Technology (IJCTT)          
 
© - Issue 2012 by IJCTT Journal
Volume-3 Issue-4                           
Year of Publication : 2012
Authors: Swapnali Ware, N.A. Dhawas

MLA

Swapnali Ware, N.A. Dhawas. "Introduction to KEA-Means Algorithm for Web Document Clustering." International Journal of Computer Trends and Technology (IJCTT), V3(4):495-498, Issue 2012. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract: In most traditional document clustering techniques, the total number of clusters is not known in advance, and the cluster that contains the target information, or the precise information associated with a cluster, cannot be determined. The K-means algorithm addresses this problem by taking the number of clusters, k, as an input. However, if the value of k is changed, the precision of each result also changes. To solve this problem, this paper introduces a new clustering algorithm, the KEA-Means algorithm, which combines K-means with KEA, the key phrase extraction algorithm. KEA returns several key phrases from the source documents by using machine learning: it builds a model containing rules for generating the number of clusters of web documents from the dataset. The algorithm generates the number of clusters automatically at run time, so the user need not specify it. The KEA-Means clustering algorithm thus provides the value of k and will be beneficial for extracting test documents from massive quantities of resources.
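
The idea in the abstract, deriving k from extracted key phrases and then running K-means, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the rule that maps key phrases to a cluster count, so the sketch assumes k is the number of distinct top key terms across the documents. The function `top_key_term` is a hypothetical frequency-based stand-in for KEA (the real KEA uses a trained model), and the names `kea_means_sketch`, `tokenize`, `cosine`, and `mean_vector` are likewise made up for this example.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_key_term(counts):
    """Hypothetical stand-in for KEA: the most frequent term in the document,
    ties broken alphabetically. Real KEA scores candidates with a trained model."""
    if not counts:
        return ""
    return min(counts, key=lambda t: (-counts[t], t))


def cosine(u, v):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: c / len(vectors) for t, c in total.items()}


def kea_means_sketch(docs, max_iter=20):
    """Return (k, labels), where k is derived from key terms, not supplied."""
    vectors = [Counter(tokenize(d)) for d in docs]
    key_terms = [top_key_term(v) for v in vectors]

    # Assumed rule: one cluster per distinct key term, seeded with the
    # first document whose top key term is that term.
    seeds = {}
    for i, term in enumerate(key_terms):
        seeds.setdefault(term, i)
    centroids = [dict(vectors[i]) for i in seeds.values()]
    k = len(centroids)

    labels = None
    for _ in range(max_iter):
        # Assignment step: most similar centroid; ties go to the lowest index.
        new_labels = [
            max(range(k), key=lambda j: cosine(vec, centroids[j]))
            for vec in vectors
        ]
        if new_labels == labels:
            break  # converged: no document changed cluster
        labels = new_labels
        # Update step: recompute each centroid; keep the old one if empty.
        for j in range(k):
            members = [vectors[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return k, labels
```

Calling `kea_means_sketch(docs)` on a list of document strings returns the derived cluster count together with one cluster label per document, so the caller never passes k. Seeding the centroids from key-term documents also keeps the run deterministic, which plain K-means with random initialization is not.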

Keywords: K-means clustering, KEA key phrase extraction algorithm, KEA-Means algorithm, F-measure