Concepts and Technologies of Big Data Management and Hadoop File System

Balu Srinivasulu; Andemariam Mebrahtu

doi:10.14445/22312803/IJCTT-V44P114

Volume 44 | Number 1 | Year 2017 | Article Id. IJCTT-V44P114 | DOI : https://doi.org/10.14445/22312803/IJCTT-V44P114

Citation :

Balu Srinivasulu, Andemariam Mebrahtu, "Concepts and Technologies of Big Data Management and Hadoop File System," International Journal of Computer Trends and Technology (IJCTT), vol. 44, no. 1, pp. 80-88, 2017. Crossref, https://doi.org/10.14445/22312803/IJCTT-V44P114

Abstract

In the digital era, uncontrolled data growth is a major problem. This paper surveys the data storage media and backup patterns that end users adopt for their personal data. For an individual user, storage space problems grow in direct proportion to the rate at which personal data increases. We therefore focus on an implementation of file-level deduplication, which eliminates duplicate files and thereby frees storage capacity for new data. The paper also compares compression, deduplication, and deduplication combined with compression. We conclude that data will continue to grow and that users should seek intelligent methods to reduce the storage space they consume.
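
The abstract does not describe how the file-level deduplication is implemented. As a minimal sketch of the general idea, and not the authors' implementation, files can be grouped by a hash of their contents so that files with identical contents land in the same group. The use of SHA-256 and the function names `file_digest`, `find_duplicates`, and `reclaimable_bytes` are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under `root` by content digest.

    Returns a list of groups; each group holds two or more paths
    whose contents hash to the same digest.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and symlinks; only real files are hashed.
        if path.is_file() and not path.is_symlink():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def reclaimable_bytes(groups):
    """Bytes that would be freed by keeping one copy per group."""
    return sum(p.stat().st_size for paths in groups for p in paths[1:])
```

Calling `find_duplicates` on a personal data directory would report which files are redundant, and `reclaimable_bytes` estimates the space that removing the extra copies could free. The sketch only reports duplicates and deletes nothing; whether to delete or hard-link the extra copies is left as a separate decision.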

Keywords

Big Data, Hadoop, MapReduce
