Securing User’s Data in HDFS

Abstract: With advances in technology and the rapid growth in data volume, businesses are finding the cloud a suitable option for hosting their data. However, placing sensitive data on third-party infrastructure exposes it to several security risks, even as it gains the advantages of the cloud. Although cloud computing involves many risks and concerns, surveys conducted by different organizations indicate that the prime concern of clients opting for a cloud solution is the security of their data. The key issue is to protect important data from unauthorized access by adversaries in the event that internal or external attacks on the cloud hosting the data compromise its confidentiality. HDFS is a file system suited to storing and processing large volumes of data using the MapReduce model. When a public cloud is based on Hadoop, which uses HDFS to store data, the data are stored in plain text, and by default the transport of data is also insecure when a client submits data to the storage servers in the cloud. The requirement is therefore to design and implement a prototype that equips HDFS with security features so that it can be deployed in a public cloud to provide storage and computing services. We propose and implement a secure HDFS that incorporates the Elliptic Curve Integrated Encryption Scheme (ECIES), which provides data confidentiality as well as integrity in Hadoop. Experiments were carried out to analyze its performance relative to other hybrid encryption schemes.


Keywords — HDFS, Cloud, Security, ECIES