Abstract -
Data science is concerned with extracting findings and drawing conclusions from data[1]. Bigtable is a distributed storage system built on several other Google technologies[2]. MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster[3]. The Google File System is designed to provide efficient, reliable access to data using large clusters of commodity hardware[4]. This paper discusses Bigtable, MapReduce, and the Google File System, and briefly surveys the top 10 algorithms in data mining.

