Data Science: Bigtable, MapReduce and Google File System

International Journal of Computer Trends and Technology (IJCTT)
© 2014 by IJCTT Journal
Volume-16 Number-3
Year of Publication : 2014
Authors : Karan B. Maniar, Chintan B. Khatri
DOI : 10.14445/22312803/IJCTT-V16P128


Karan B. Maniar, Chintan B. Khatri. "Data Science: Bigtable, MapReduce and Google File System". International Journal of Computer Trends and Technology (IJCTT) V16(3):115-118, Oct 2014. ISSN:2231-2803. Published by Seventh Sense Research Group.

Abstract -
Data science is concerned with extending research findings and drawing conclusions from data[1]. Bigtable is built on a few of Google's technologies[2]. MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster[3]. The Google File System is designed to provide efficient, reliable access to data using large clusters of commodity hardware[4]. This paper discusses Bigtable, MapReduce and the Google File System, and briefly covers the top 10 algorithms in data mining.

References -
5. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber. “Bigtable: A Distributed Storage System for Structured Data”, 2006.
6. Xiao Chen. “Google Big Table”, 2010.
7. Jeffrey Dean and Sanjay Ghemawat. “MapReduce: Simplified Data Processing on Large Clusters”, 2004.
8. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. “The Google File System”, 2003.
11. Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu, Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand, Dan Steinberg. “Top 10 algorithms in data mining”, 2008.

Keywords -
Data Science, Bigtable, MapReduce, Google File System, Top 10 algorithms in data mining.