Efficient Data Clustering with Link Approach

International Journal of Computer Trends and Technology (IJCTT)
© - October Issue 2013 by IJCTT Journal
Volume-4 Issue-10
Year of Publication: 2013
Authors: Y. Sireesha, CH. Srinivas, K.C. Ravi Kumar


Y. Sireesha, CH. Srinivas, K.C. Ravi Kumar, "Efficient Data Clustering with Link Approach," International Journal of Computer Trends and Technology (IJCTT), V4(10):3648-3655, October Issue 2013. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract: Data clustering has been studied extensively, and cluster ensemble techniques now produce results that are competitive with conventional algorithms. Even so, these techniques build the final partition from incomplete information. The existing partition-information matrix records only the relations between clusters and data points, and leaves many entries unknown. This paper presents an analysis suggesting that this problem degrades the quality of the clustering result, and it proposes a new link-based approach that improves the conventional matrix by estimating the unknown entries from the similarity among clusters in the ensemble. A link-based algorithm is developed and used for the underlying similarity assessment. To obtain the final clustering, a graph partitioning technique is then applied to a weighted bipartite graph formulated from the refined matrix. Results on several real data sets suggest that the proposed link-based method usually outperforms both conventional clustering algorithms for categorical data and well-known cluster ensemble techniques.
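
The refinement step described in the abstract can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the Jaccard edge weight, the connected-triple style similarity, and the `decay` parameter are all hypothetical choices made here. The sketch builds the binary point-by-cluster matrix from an ensemble of base clusterings and replaces each unknown entry with the similarity between that cluster and the cluster the point actually belongs to in the same base clustering. The graph partitioning step is not shown.

```python
def jaccard(a, b):
    """Overlap of two member sets, used here as an assumed edge weight."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_similarity(members, decay=0.9):
    """Assumed link-based similarity: two clusters are similar when they
    are both linked (by shared members) to the same third cluster."""
    k = len(members)
    w = [[jaccard(members[a], members[b]) if a != b else 0.0
          for b in range(k)] for a in range(k)]
    triple = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            if a != b:
                triple[a][b] = sum(min(w[a][c], w[b][c])
                                   for c in range(k) if c != a and c != b)
    peak = max(max(row) for row in triple) or 1.0  # avoid division by zero
    return [[1.0 if a == b else decay * triple[a][b] / peak
             for b in range(k)] for a in range(k)]


def refine(ensemble, decay=0.9):
    """Return (refined matrix, cluster ids) for a list of label lists."""
    n = len(ensemble[0])
    clusters = [(m, lab) for m, labels in enumerate(ensemble)
                for lab in sorted(set(labels))]
    index = {cid: j for j, cid in enumerate(clusters)}
    members = [{i for i in range(n) if ensemble[m][i] == lab}
               for (m, lab) in clusters]
    sim = link_similarity(members, decay)
    refined = []
    for i in range(n):
        row = []
        for j, (m, lab) in enumerate(clusters):
            own = index[(m, ensemble[m][i])]  # cluster holding point i
            row.append(1.0 if own == j else sim[own][j])
        refined.append(row)
    return refined, clusters


# Two base clusterings of six points (toy data, not from the paper).
ensemble = [[0, 0, 1, 1, 2, 2],
            [0, 0, 0, 0, 1, 1]]
refined, clusters = refine(ensemble)
for row in refined:
    print([round(v, 2) for v in row])
```

In this toy ensemble, the first two clusters of the first clustering both overlap the same cluster of the second clustering, so the entries that were unknown (zero) between them receive a nonzero value, while clusters with no common neighbour stay at zero.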



Keywords: cloud computing, cloud GIS, Amazon EC2, Google Maps API.