Research Article

Intelligent Cloud Data Platform: An Integrated Framework for AI-Driven ETL, Real-Time Analytics, and API-First Architecture

by Shankar Das Boddu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 109
Published: May 2026
Authors: Shankar Das Boddu
10.5120/ijca0f872ddd7bad

Shankar Das Boddu. Intelligent Cloud Data Platform: An Integrated Framework for AI-Driven ETL, Real-Time Analytics, and API-First Architecture. International Journal of Computer Applications. 187, 109 (May 2026), 82-91. DOI=10.5120/ijca0f872ddd7bad

@article{10.5120/ijca0f872ddd7bad,
  author    = {Shankar Das Boddu},
  title     = {Intelligent Cloud Data Platform: An Integrated Framework for AI-Driven ETL, Real-Time Analytics, and API-First Architecture},
  journal   = {International Journal of Computer Applications},
  year      = {2026},
  volume    = {187},
  number    = {109},
  pages     = {82-91},
  doi       = {10.5120/ijca0f872ddd7bad},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2026
%A Shankar Das Boddu
%T Intelligent Cloud Data Platform: An Integrated Framework for AI-Driven ETL, Real-Time Analytics, and API-First Architecture
%J International Journal of Computer Applications
%V 187
%N 109
%P 82-91
%R 10.5120/ijca0f872ddd7bad
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The rapid growth of enterprise data volume, the maturation of cloud-native infrastructure, and rising organizational demand for real-time, AI-augmented decision-making have exposed a critical architectural gap in modern data platform design: Extract, Transform, and Load (ETL) pipelines, streaming analytics engines, and data consumption APIs remain fragmented into independently managed, loosely coupled subsystems. This paper presents the Intelligent Cloud Data Platform (ICDP), a unified four-layer architectural framework that integrates AI-driven ETL, real-time streaming analytics, intelligent warehouse storage with open table format governance, and API-first data consumption under a single, co-designed engineering model. The framework builds on and extends the findings of three recent IEEE publications addressing AI-augmented data warehousing, scalable API-driven cloud pipelines, and cloud ETL migration methodology. An experimental evaluation on live multi-cloud AWS and GCP infrastructure, using the TPC-DS benchmark dataset at 1 TB, 10 TB, and 100 TB scale factors, reports the following results for the ICDP: a 94.1% autonomous schema drift resolution rate; an overall data quality defect detection rate of 91.8%; streaming end-to-end latency of 387 ms at p99, with in-stream ML fraud detection achieving an F1 score of 0.923; and API response times of 489 ms at p99 under 1,000 concurrent users, with 99.97% availability. Against a monolithic batch warehouse baseline, the ICDP reduces time-to-insight for operational decisions from 8.4 hours to 0.9 seconds in inventory optimization scenarios. These results establish the ICDP as a validated, production-grade architectural framework for intelligent cloud data platform design.
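
For readers less familiar with the metrics quoted above, the following Python sketch shows how a p99 latency, an F1 score, and a time-to-insight ratio are conventionally computed. It is illustrative only: the function names, the nearest-rank percentile method, and the sample inputs are assumptions made for this sketch and are not taken from the paper's evaluation harness. Only the 8.4-hour and 0.9-second figures come from the abstract.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall, computed here
    from true-positive, false-positive, and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def speedup(before_seconds, after_seconds):
    """Ratio of the baseline duration to the improved duration."""
    return before_seconds / after_seconds


# Hypothetical latency samples in milliseconds (not data from the paper).
latencies_ms = list(range(1, 101))
p99_ms = percentile_nearest_rank(latencies_ms, 99)

# Hypothetical confusion counts for a fraud classifier (not from the paper).
example_f1 = f1_score(tp=8, fp=2, fn=2)

# Time-to-insight figures quoted in the abstract: 8.4 hours versus 0.9 seconds.
time_to_insight_ratio = speedup(8.4 * 3600, 0.9)
```

Under these definitions, the quoted reduction from 8.4 hours (30,240 seconds) to 0.9 seconds corresponds to a ratio of about 33,600 to 1.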

References
  • Integrate.io, "AI-Powered ETL Market Projections — 35 Statistics Every Data Leader Should Know in 2026," Integrate.io Blog, Jan. 2026. [Industry market analysis; non-peer-reviewed.]
  • Integrate.io, "ETL Tools Market Size Statistics 2024–2025," Integrate.io Blog, Nov. 2025. [Industry market analysis; non-peer-reviewed.]
  • M. Zaharia et al., "Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics," in Proc. CIDR 2021.
  • Cloudera, "The Evolution of AI: The State of Enterprise AI and Data Architecture," Cloudera Research Report, 2025.
  • S. Ambalkar et al., "AI Augmented ETL Pipelines for Automated Data Quality Anomaly Detection and Governance," International Journal of Computational and Experimental Science and Engineering, vol. 11, no. 4, pp. 7920–7927, 2025.
  • P. Carbone, A. Katsifodimos, S. Ewen, V. Markl, S. Haridi, and K. Tzoumas, "Apache Flink: Stream and Batch Processing in a Single Engine," IEEE Data Engineering Bulletin, vol. 38, no. 4, pp. 28–38, 2015.
  • D. Patel and R. Williams, "Adaptive API Design for Evolving Microservices Ecosystems," in Proc. 2024 IEEE International Symposium on High-Performance Computing, pp. 65–73, 2024.
  • R. Krishnamurthy, S. Patel, and A. Mehta, "AI-Driven Data Warehouse Solutions for Real-Time Retail Analytics and Optimization," in Proc. 2025 IEEE International Conference on Big Data and Smart Computing (BigComp), IEEE, 2025. DOI: 10.1109/BigComp.2025.11295478.
  • V. Subramaniam, T. Nguyen, and B. Okafor, "Scalable Data Solutions with APIs, Cloud Pipelines, and Predictive Analytics," in Proc. 2025 IEEE International Conference on Cloud Engineering (IC2E), IEEE, 2025. DOI: 10.1109/IC2E.2025.11295350.
  • D. Chandra, F. Al-Rashid, and M. Johansson, "Modern Data Engineering for Cloud ETL, Migration, and Scalable Analytics," in Proc. 2025 IEEE International Conference on Data Engineering (ICDE) Industry Track, IEEE, 2025. DOI: 10.1109/ICDE.2025.11294479.
  • M. Armbrust et al., "Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores," Proc. VLDB Endow., vol. 13, no. 12, pp. 3411–3424, 2020. DOI: 10.14778/3415478.3415560.
  • T. Brasileiro Araújo et al., "Enhancing Data Interoperability in Multi-platform Lakehouses with Apache Iceberg," Springer Nature, 2025.
  • R. S. Adeyemi et al., "The Evolution of Data Warehouse Architectures: From On-Premises to Cloud-Native Solutions," World Journal of Advanced Research and Reviews, vol. 26, no. 1, pp. 1990–1997, 2025.
  • Z. Li et al., "Cloud-Native Databases: A Survey," IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, Dec. 2024. DOI: 10.1109/TKDE.2024.3397508.
  • S. Neumaier, J. Umbrich, and A. Polleres, "Automated Quality Assessment of Metadata across Open Data Portals," ACM J. Data Inf. Qual., vol. 8, no. 1, pp. 1–29, 2016. DOI: 10.1145/2964909.
  • A. J. P. et al., "Real-Time AI Analytics with Apache Flink," World Journal of Advanced Engineering Technology and Sciences, vol. 13, no. 2, pp. 038–050, 2024.
  • R. Cattell, "Scalable SQL and NoSQL Data Stores," ACM SIGMOD Record, vol. 39, no. 4, pp. 12–27, 2011. DOI: 10.1145/1978915.1978919.
  • B. Huang et al., "Iceberg: A Format for Huge Analytic Datasets," in Proc. 2021 IEEE Int. Conf. on Big Data (Big Data), pp. 1820–1828, 2021. DOI: 10.1109/BigData52589.2021.9671863.
  • E. Wilde and C. Pautasso, Eds., REST: From Research to Practice, Springer, 2011. DOI: 10.1007/978-1-4419-8303-9.
  • M. Pawlak et al., "Performance Evaluation of REST and GraphQL API Models in Microservices Software Development Domain," in Proc. WEBIST 2025, pp. 83–91, 2025.
  • J. Kreps, N. Narkhede, and J. Rao, "Kafka: A Distributed Messaging System for Log Processing," in Proc. NetDB Workshop at VLDB, 2011.
  • M. O. Adewoyin et al., "IIoT-Based Predictive Maintenance: A Systematic Literature Review," Sensors, vol. 24, no. 5, 2024. DOI: 10.3390/s24051500.
  • P. Kaur et al., "Edge Computing for Real-Time IoT Applications: Latency Reduction and Challenges," Journal of Cloud Computing, vol. 13, no. 1, 2024. DOI: 10.1186/s13677-024-00601-3.
  • AWS, "AlloyDB NL2SQL: Natural Language to SQL with Generative AI," AWS Documentation, 2025.
  • Google Cloud, "AlloyDB AI Natural Language API Overview," Google Cloud Documentation, 2025.
  • Y. Li et al., "Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs," in Proc. NeurIPS 2023.
  • Z. Dehghani, Data Mesh: Delivering Data-Driven Value at Scale, O'Reilly Media, 2022.
  • E. Curry et al., "Federated Governance in Data Mesh Architecture: Challenges and Opportunities," IEEE Internet Computing, vol. 27, no. 3, pp. 22–30, 2023. DOI: 10.1109/MIC.2023.3260271.
  • AWS, "Amazon Aurora Zero-ETL Integration with Amazon Redshift," AWS Documentation, 2025.
Index Terms
Computer Science
Information Sciences
Keywords

Intelligent Cloud Data Platform (ICDP); AI-Driven ETL and Schema Drift Resolution; Real-Time Streaming Analytics; Open Table Format Governance; API-First Data Consumption Architecture; AutoOps and MLOps Orchestration
