Performance Comparison of Two Streaming Data Clustering Algorithms
International Journal of Computer Trends and Technology (IJCTT)
© 2014 by IJCTT Journal
Year of Publication : 2014
Authors : Chandrakant Mahobiya, Dr. M. Kumar
DOI : 10.14445/22312803/IJCTT-V12P111
Chandrakant Mahobiya, Dr. M. Kumar. "Performance Comparison of Two Streaming Data Clustering Algorithms". International Journal of Computer Trends and Technology (IJCTT) V12(2):56-59, June 2014. ISSN: 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.
The weighted fuzzy c-means clustering algorithm (WFCM) and the weighted fuzzy c-means-adaptive cluster number algorithm (WFCM-AC) are extensions of the traditional fuzzy c-means algorithm to stream data clustering. WFCM generates clusters by iteratively renewing the weighted cluster centers. WFCM-AC, on the other hand, generates clusters by applying WFCM to the data and selecting the best K± initial centers. In this paper we compare the two methods on the KDD-CUP’99 data set with respect to the number of valid clusters, computational time, and mean standard error.
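The iterative renewal of weighted cluster centers described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_fcm`, its parameters (`m`, `max_iter`, `tol`, `seed`, `init_centers`), and the random initialization are assumptions made for illustration, and it uses the standard fuzzy c-means update rules with a per-point weight added to the center update.

```python
import math
import random


def weighted_fcm(points, weights, c, m=2.0, max_iter=100, tol=1e-6,
                 seed=0, init_centers=None):
    """Illustrative weighted fuzzy c-means; returns (centers, memberships)."""
    if init_centers is None:
        # Assumption: start from c input points picked at random.
        rng = random.Random(seed)
        init_centers = rng.sample(points, c)
    centers = [list(v) for v in init_centers]
    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u[i][j] = d_ij^(-e) / sum_k d_ik^(-e).
        memberships = []
        for p in points:
            dists = [math.dist(p, v) for v in centers]
            if any(d == 0.0 for d in dists):
                # The point sits on a center: share membership among those centers.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                memberships.append([h / total for h in hits])
            else:
                inv = [d ** -exponent for d in dists]
                total = sum(inv)
                memberships.append([x / total for x in inv])
        # Center update: mean of the points weighted by w_i * u_ij^m.
        new_centers = []
        for j in range(c):
            coef = [w * (u[j] ** m) for w, u in zip(weights, memberships)]
            denom = sum(coef)
            if denom == 0.0:
                # No point supports this center; leave it where it is.
                new_centers.append(centers[j])
                continue
            new_centers.append(
                [sum(k * p[d] for k, p in zip(coef, points)) / denom
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

In a streaming setting, one plausible use of the weights is to let a single summary point stand for many earlier points, so that a new chunk can be clustered together with the weighted centers carried over from previous chunks. With all weights equal to 1, the sketch reduces to ordinary fuzzy c-means.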
Keywords : Streaming data, weighted fuzzy c-means, weighted fuzzy c-means-adaptive clustering.