Building Resilient Data Ingestion Pipelines for Third-Party Vendor Data Integration
DOI:
https://doi.org/10.55544/jrasb.1.1.14
Keywords:
Resilient data ingestion architecture, third-party data integration, data reliability, error handling, scalability, adaptability, data quality assurance
Abstract
This paper reviews the design and operation of resilient data ingestion architectures, with particular emphasis on the challenges of integrating data from third-party vendors. As more companies make data a principal factor in their business strategies, obtaining external data in a consistent and reliable manner has become equally important. The paper describes ways to improve the reliability, efficiency, and adaptability of data ingestion, and addresses related challenges such as data quality, error handling, and scalability.
License
Copyright (c) 2022 Balachandar Paulraj
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.