A Study of Data Mining with Big Data

International Journal of Computer Trends and Technology (IJCTT)          
© 2016 by IJCTT Journal
Volume-38 Number-2
Year of Publication : 2016
Authors : Dr. V.Harsha Shastri, V.Sreeprada


Dr. V.Harsha Shastri, V.Sreeprada "A Study of Data Mining with Big Data". International Journal of Computer Trends and Technology (IJCTT) V38(2):99-103, August 2016. ISSN:2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract -
Data has become an important part of every economy, industry, organization, business, function and individual. Big Data is a term for data sets whose size typically exceeds that of a typical database. Big Data introduces unique computational and statistical challenges, and it is currently expanding in most domains of engineering and science. Data mining helps to extract useful information from these huge data sets, which are difficult to handle because of their volume, variability and velocity. This article presents the HACE theorem, which characterizes the features of the Big Data revolution, and proposes a Big Data processing model from the data mining perspective.

Keywords -
Big Data, Data Mining, HACE theorem, structured and unstructured data.