The Impact of Automated Data Engineering on Cost and Time Savings

© 2023 by IJCTT Journal
Volume-71 Issue-6
Year of Publication : 2023
Authors : Deexith Reddy
DOI : 10.14445/22312803/IJCTT-V71I6P110

How to Cite?

Deexith Reddy, "The Impact of Automated Data Engineering on Cost and Time Savings," International Journal of Computer Trends and Technology, vol. 71, no. 6, pp. 57-62, 2023. Crossref, https://doi.org/10.14445/22312803/IJCTT-V71I6P110

Abstract
As the digital age continues to evolve, data engineering has become increasingly significant to business operations. Automated data engineering has further changed this landscape, promising enhanced business efficiency, cost reductions, and time savings. This paper examines the transformative potential of automated data engineering and its impact on various business processes. Through a series of case studies, we examine real-world implementations of automated data engineering and analyze their outcomes. We further discuss the challenges businesses may encounter during this digital transition and propose strategies to mitigate them. The paper's findings underscore the pivotal role of automated data engineering in driving business efficiency and competitiveness in the modern digital era.

Keywords
Data engineering, Automation, Cloud, Efficiency, Machine learning.
