Big Data Architectural Pattern to Ingest Multiple Sources and Standardization to Immune Downstream Applications
© 2020 by IJCTT Journal
Year of Publication: 2020
Authors: Imran Quadri Syed
DOI: 10.14445/22312803/IJCTT-V68I1P102
How to Cite?
Imran Quadri Syed, "Big Data Architectural Pattern to Ingest Multiple Sources and Standardization to Immune Downstream Applications," International Journal of Computer Trends and Technology, vol. 68, no. 1, pp. 5-10, 2020. Crossref, https://doi.org/10.14445/22312803/IJCTT-V68I1P102
Organizations today handle large volumes of varied data to meet their business needs, and they often receive data for the same data domain from numerous sources in different layouts and formats. This article describes a big data architectural pattern that shields traditional downstream systems from changes to source systems. A Datahub (big data platform) achieves this by ingesting data from the different sources, standardizing it to a denormalized canonical form, integrating it with reference data, reprocessing rejected records, and publishing extracts to traditional downstream systems using big data technologies such as Hive and Impala. The article also discusses how a key management service (KMS) is used to identify the latest iteration of a record, which makes querying easier and supports generating standard publications for downstream systems.
Big Data, Data Ingestion, Data Integration, Standardization, Reject Reprocessing, Architecture, Key Management Service (KMS), Hive, Impala, Datahub, Publisher-Subscriber Pattern.
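The flow summarized in the abstract (ingest, standardize, integrate with reference data, reject and reprocess, publish the latest iteration) can be illustrated with a small in-memory sketch. This is not the article's implementation: the source names `crm` and `billing`, their field layouts, the `Canonical` record, and the `KeyService` and `DataHub` classes are all hypothetical, and plain Python structures stand in for the Hive and Impala tables a real Datahub would use.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Canonical:
    """Denormalized canonical record shared by all sources (hypothetical layout)."""
    source: str
    customer_id: str
    name: str
    country_code: str
    country_name: Optional[str] = None  # filled in from reference data


# One mapper per source layout; both layouts are invented for illustration.
def from_crm(row: dict) -> Canonical:
    return Canonical("crm", str(row["CustID"]), row["FullName"].strip(), row["Ctry"].upper())


def from_billing(row: dict) -> Canonical:
    cust = row["customer"]
    return Canonical("billing", str(cust["id"]), cust["name"].strip(), row["country"].upper())


MAPPERS = {"crm": from_crm, "billing": from_billing}


class KeyService:
    """Assumed stand-in for the KMS: one surrogate key per business key, plus a version counter."""

    def __init__(self) -> None:
        self._keys: Dict[str, int] = {}
        self._versions: Dict[int, int] = {}

    def register(self, business_key: str) -> Tuple[int, int]:
        key = self._keys.setdefault(business_key, len(self._keys) + 1)
        self._versions[key] = self._versions.get(key, 0) + 1
        return key, self._versions[key]

    def latest_version(self, key: int) -> int:
        return self._versions[key]


class DataHub:
    def __init__(self, reference: Dict[str, str]) -> None:
        self.reference = reference            # reference data: country code -> name
        self.keys = KeyService()
        self.store: List[Tuple[int, int, Canonical]] = []
        self.rejects: List[Canonical] = []

    def ingest(self, source: str, row: dict) -> None:
        # Standardize first, so everything after this point sees one layout.
        self._integrate(MAPPERS[source](row))

    def _integrate(self, rec: Canonical) -> None:
        country = self.reference.get(rec.country_code)
        if country is None:
            self.rejects.append(rec)          # park the record until reference data arrives
            return
        key, version = self.keys.register(rec.customer_id)
        self.store.append((key, version, replace(rec, country_name=country)))

    def reprocess_rejects(self) -> None:
        pending, self.rejects = self.rejects, []
        for rec in pending:
            self._integrate(rec)

    def publish(self) -> List[dict]:
        # Downstream systems receive only the latest iteration of each record.
        return [
            {"key": k, "customer_id": r.customer_id, "name": r.name, "country": r.country_name}
            for k, v, r in self.store
            if v == self.keys.latest_version(k)
        ]
```

In this sketch, adding a new source only requires a new entry in `MAPPERS`; the shape of the rows returned by `publish()` stays the same, which is the sense in which downstream consumers are shielded from source changes. Calling `reprocess_rejects()` after the reference data has been updated moves previously rejected records into the store.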