Efficient Data Clustering with Link Approach

Y. Sireesha; CH. Srinivas; K.C. Ravi Kumar

doi:https://doi.org/10.14445/22312803/IJCTT-V4I10P150

Research Article | Open Access

Volume 4 | Issue 10 | Year 2013 | Article Id. IJCTT-V4I10P150 | DOI : https://doi.org/10.14445/22312803/IJCTT-V4I10P150

Citation:

Y. Sireesha, CH. Srinivas, K.C. Ravi Kumar, "Efficient Data Clustering with Link Approach," International Journal of Computer Trends and Technology (IJCTT), vol. 4, no. 10, pp. 3648-3655, 2013. Crossref, https://doi.org/10.14445/22312803/IJCTT-V4I10P150

Abstract

Data clustering has been studied extensively, and recent techniques produce results that are competitive with conventional algorithms. Even so, these techniques ultimately work from incomplete information. The existing partition-information matrix records only the relations between clusters and data points, and leaves many entries unknown. This paper presents an analysis suggesting that this problem degrades the quality of the clustering result, and it proposes a new link-based approach that improves the conventional matrix by recovering the unknown entries from the similarity between clusters in an ensemble. In particular, an efficient link-based algorithm is proposed for the underlying similarity assessment. Afterwards, to obtain the final clustering result, a graph partitioning technique is applied to a weighted bipartite graph formulated from the refined matrix. Results on various real data sets suggest that the proposed link-based method usually outperforms both conventional clustering algorithms for categorical data and the most common cluster ensemble techniques.
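The matrix refinement step described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the similarity used here is a simplified common-neighbour measure chosen for brevity, and the function names, the `decay` parameter, and the toy ensemble are all assumptions made for illustration. The final graph partitioning step is not covered.

```python
from itertools import combinations


def build_clusters(ensemble):
    """List every cluster in the ensemble as (clustering index, label, members)."""
    clusters = []
    for m, labels in enumerate(ensemble):
        for label in sorted(set(labels)):
            members = {i for i, l in enumerate(labels) if l == label}
            clusters.append((m, label, members))
    return clusters


def jaccard(a, b):
    """Overlap of two member sets, used as the edge weight between clusters."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(clusters):
    """Assumed measure: two clusters are similar when they share strongly
    weighted neighbours. Scores are scaled so the largest equals 1."""
    k = len(clusters)
    w = [[jaccard(clusters[i][2], clusters[j][2]) if i != j else 0.0
          for j in range(k)] for i in range(k)]
    shared = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        s = sum(min(w[i][t], w[j][t]) for t in range(k) if t not in (i, j))
        shared[i][j] = shared[j][i] = s
    top = max(max(row) for row in shared)
    if top == 0:
        return shared
    return [[v / top for v in row] for row in shared]


def refined_matrix(ensemble, decay=0.9):
    """Keep known entries at 1 and replace each unknown (zero) entry with the
    similarity between that cluster and the cluster the point belongs to in
    the same base clustering. `decay` is a hypothetical damping factor."""
    clusters = build_clusters(ensemble)
    sim = cluster_similarity(clusters)
    index = {(m, label): c for c, (m, label, _) in enumerate(clusters)}
    n = len(ensemble[0])
    matrix = [[0.0] * len(clusters) for _ in range(n)]
    for x in range(n):
        for c, (m, _, members) in enumerate(clusters):
            if x in members:
                matrix[x][c] = 1.0
            else:
                own = index[(m, ensemble[m][x])]
                matrix[x][c] = decay * sim[own][c]
    return matrix


if __name__ == "__main__":
    # Three base clusterings of four data points (toy data, not from the paper).
    ensemble = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
    for row in refined_matrix(ensemble):
        print([round(v, 2) for v in row])
```

Each row of the result corresponds to a data point and each column to a cluster from one of the base clusterings. A refined matrix of this kind could then be treated as the edge weights of a bipartite graph between data points and clusters, which is the structure the abstract says is partitioned to obtain the final clustering.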

Keywords

cloud computing, cloud GIS, Amazon EC2, Google maps API.
