A Study on Big Data Integration with Data Warehouse

International Journal of Computer Trends and Technology (IJCTT)
© 2014 by IJCTT Journal
Volume-9 Number-4
Year of Publication: 2014
Authors: T. K. Das, Arati Mohapatro
DOI: 10.14445/22312803/IJCTT-V9P137


T. K. Das, Arati Mohapatro. "A Study on Big Data Integration with Data Warehouse". International Journal of Computer Trends and Technology (IJCTT) V9(4):188-192, March 2014. ISSN: 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract -
The amount of data in the world is exploding, and data is being collected and stored at unprecedented rates. The challenge is not only to store and manage this vast volume of data, but also to analyze it and extract meaningful value from it. Over the last decade, data warehousing technology has evolved to efficiently store data from different sources for business intelligence purposes. In the age of Big Data, it is important to remodel the existing warehouse system so that an organization can make the most of unstructured data alongside its existing data warehouse. As Big Data continues to revolutionize how we use data, this paper addresses how to leverage Big Data by effectively integrating it with the data warehouse.


Keywords -
Big Data, Data Warehouse, Hadoop