
Hi, I am Raj Kumar. I am a Cloud AI & Data Engineer with 6+ years of experience. I architect secure cloud foundations and build intelligent data solutions, bridging the gap between robust data engineering and generative AI.
About Me
I am a Cloud AI & Data Engineer with 6+ years of experience. My goal is simple: I build secure cloud systems that turn raw data into intelligent action within the Microsoft Azure ecosystem.
My technical foundation is rooted in Data Engineering. I believe that AI is only as good as the data feeding it. That’s why I specialize in using tools like Azure Data Factory, Databricks, and SQL to build clean, high-performance pipelines that enterprises can actually trust.
I build upon this stability with advanced AI engineering. Moving beyond simple prototypes, I architect production-grade GenAI solutions. Using Azure AI Foundry and RAG (Retrieval-Augmented Generation), I create custom "Copilots" and intelligent search systems that understand your internal business data, driving real operational efficiency.
Having recently graduated with my Master's in Computer Science (Dec 2025) from Concordia University Chicago, I am open to relocation and eager to bring this blend of data stability and AI innovation to a new team. Let's connect!
Outside of work, I enjoy playing soccer, the gym, watching movies, and spending time with my friends.
My projects
DITA
Data is ingested from an on-premises source, transformed using data engineering tools, and analyzed through visualization tools.
- MS SQL Server
- Azure Data Lake
- Data Factory
- Databricks
- Synapse Analytics
- Power BI
Product Sales Analytics
An interactive Power BI report built on the AdventureWorks database, analyzing sales performance through data visualization.
- Power Query
- Power BI
- M language
- DAX
Supply Chain Analytics
An end-to-end supply chain analytics solution built on Databricks, using PySpark and Delta Lake with a multi-hop (medallion) architecture governed by Unity Catalog.
- Databricks
- PySpark
- SQL
- Delta Lake
- Time Travel
- Multi-Hop
- Unity Catalog
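The multi-hop (medallion) flow behind this project can be sketched in plain Python. In the actual build these hops are Delta Lake tables in Databricks; the records and field names below are hypothetical stand-ins:

```python
# Minimal sketch of the Bronze -> Silver -> Gold ("multi-hop") pattern.
# Plain Python lists stand in for the Delta Lake tables at each hop.

# Bronze: raw shipment events, landed as-is (duplicates, bad records included).
bronze = [
    {"order_id": 1, "product": "widget", "qty": 10},
    {"order_id": 1, "product": "widget", "qty": 10},   # duplicate ingest
    {"order_id": 2, "product": "gadget", "qty": None},  # invalid record
    {"order_id": 3, "product": "widget", "qty": 5},
]

# Silver: deduplicate on the business key and drop invalid rows.
seen = set()
silver = []
for row in bronze:
    if row["qty"] is not None and row["order_id"] not in seen:
        seen.add(row["order_id"])
        silver.append(row)

# Gold: business-level aggregate, ready for BI consumption.
gold = {}
for row in silver:
    gold[row["product"]] = gold.get(row["product"], 0) + row["qty"]

print(gold)  # {'widget': 15}
```

Each hop only reads from the previous one, which is what makes the layers independently testable and replayable (and, with Delta Lake, versioned for time travel).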
My skills
- Azure
- AWS
- Databricks
- MS SQL Server
- Power BI
- Tableau
- MS Excel
- VS Code
- Anaconda
- Hadoop
- Kafka
- Spark
- Airflow
- REST APIs
- SQL
- NoSQL
- Python
- R
- PostgreSQL
- Oracle
- Agile Methodologies
- CI/CD
- ETL
- Data Modelling
- Data Cleansing
- Data Transformation
- Data Visualization
- Git
- GitHub
My experience
Discover Financial Services
Senior Cloud Data Engineer
Chicago, IL, USA
- Led modernization of enterprise analytics platform by implementing a Microsoft Fabric Lakehouse with Medallion architecture, unifying siloed financial datasets into a governed single source of truth for BI and ML.
- Designed fault-tolerant, metadata-driven ingestion pipelines using Azure Data Factory and Fabric Data Pipelines with incremental loads and watermarking, maintaining 99.7%+ SLA across 15+ source systems.
- Developed PySpark and Spark SQL transformations in Azure Databricks and Fabric Notebooks, reducing processing time by ~40% through Delta Lake optimization, Z-order indexing, and partition pruning.
- Delivered a production-grade RAG solution using Azure OpenAI (GPT-4) + Azure AI Search, reducing manual document lookup effort by ~60% for compliance and support teams.
- Built Power BI dashboards on Fabric Warehouse semantic models to monitor dispute volumes, fraud detection rates, and credit portfolio performance with row-level security and drill-through capabilities.
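The watermark pattern from the ingestion bullet above can be sketched in plain Python. In Azure Data Factory this logic lives in pipeline expressions and a watermark control table; the table name, column names, and in-memory "control table" here are illustrative:

```python
from datetime import datetime

# Illustrative watermark-based incremental load: a control table stores the
# high-water mark (last successfully loaded timestamp) per source table, each
# run pulls only rows newer than that mark, then advances the mark.

watermarks = {"sales": datetime(2024, 1, 1)}  # stand-in for the control table

source_rows = [
    {"id": 1, "modified": datetime(2023, 12, 31)},  # already loaded
    {"id": 2, "modified": datetime(2024, 1, 2)},
    {"id": 3, "modified": datetime(2024, 1, 3)},
]

def incremental_load(table: str, rows: list[dict]) -> list[dict]:
    mark = watermarks[table]
    new_rows = [r for r in rows if r["modified"] > mark]
    if new_rows:  # advance the watermark only after a successful load
        watermarks[table] = max(r["modified"] for r in new_rows)
    return new_rows

loaded = incremental_load("sales", source_rows)
print([r["id"] for r in loaded])  # [2, 3]
```

Advancing the mark only after a successful load is what makes the pipeline restartable: a failed run leaves the watermark untouched, so the next run picks up the same rows again.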
Concordia University Chicago
Cloud AI & Data Engineer
Chicago, IL, USA
- Architected a production-grade Service Desk Copilot using Azure AI Foundry and RAG (Retrieval-Augmented Generation), reducing ticket volume by delivering citation-backed answers from internal runbooks.
- Engineered automated document processing workflows using Azure AI and JSON parsers to extract key data fields from unstructured finance documents for downstream reporting.
- Developed comprehensive Power BI dashboards to visualize operational KPIs, utilizing DAX and Power Query to identify trends in system usage and support efficiency.
- Secured cloud infrastructure by implementing Role-Based Access Control (RBAC) and policy governance within Microsoft Entra ID for faculty and staff systems.
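The retrieve-then-generate loop at the core of such a copilot can be sketched in plain Python. The toy keyword retriever below stands in for Azure AI Search's vector/semantic retrieval, and the assembled prompt would go to an Azure OpenAI chat model; the runbook IDs and text are made up:

```python
# Toy RAG loop: retrieve relevant runbook snippets, then build a grounded,
# citation-tagged prompt for the LLM.

runbooks = {
    "RB-101": "To reset a student password, open Entra ID and select Reset.",
    "RB-205": "VPN outages: restart the gateway service, then verify DNS.",
}

def retrieve(question: str, k: int = 1) -> list[tuple[str, str]]:
    # Score each document by word overlap with the question (a stand-in
    # for embedding similarity) and keep the top k.
    q_words = set(question.lower().split())
    scored = [
        (len(q_words & set(text.lower().split())), doc_id, text)
        for doc_id, text in runbooks.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored[:k]]

def build_prompt(question: str) -> str:
    # Inline each retrieved snippet with its ID so the model's answer
    # can cite its sources.
    context = "\n".join(f"[{d}] {t}" for d, t in retrieve(question))
    return (
        "Answer using ONLY the sources below and cite their IDs.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

prompt = build_prompt("How do I reset a password?")
```

The citation IDs carried through the prompt are what make answers auditable: the model is instructed to ground every claim in a retrievable runbook.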
Concordia University Chicago
Data Operations & Cloud Analyst
Chicago, IL, USA
- Optimized university IT workflows by analyzing system log data using SQL and Power BI, identifying bottlenecks in the ticketing lifecycle.
- Managed Azure Active Directory (Entra ID) user identities and access policies, ensuring 99.9% uptime for student and faculty portal access.
- Collaborated with cross-functional teams to migrate on-premises data to cloud storage, validating data integrity through SQL scripting and automated quality checks.
- Created automated reporting scripts using PowerShell and Python to track license usage and cloud resource consumption, reducing operational waste.
LTIMindtree LTD. (Microsoft Vendor)
Senior Data Engineer
Hyderabad, India
- Architected metadata-driven ingestion frameworks using Azure Data Factory, orchestrating data movement across ADLS Gen2, Synapse Analytics, and Snowflake for insurance and Xbox sales domains.
- Designed dimensional data models (star/snowflake schemas) with SCD Type 1/2 in Azure Synapse and Snowflake, enabling tracking of claims efficiency, sales velocity, and regional revenue.
- Developed event-driven processing solutions using Azure Event Hubs and Stream Analytics, reducing reporting latency from hours to under 15 minutes for time-sensitive business decisions.
- Implemented comprehensive data quality frameworks including source-to-target validation, schema conformance checks, and duplicate detection, reducing data-related production incidents by ~40%.
- Managed platform security using Azure Key Vault, implemented row-level security in Synapse Analytics, and configured RBAC across ADLS Gen2 to comply with enterprise governance standards.
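The SCD Type 2 handling mentioned above can be sketched in plain Python. In the real pipelines this is a MERGE in Synapse or Snowflake; the dimension, keys, and tracked column here are hypothetical:

```python
from datetime import date

# Illustrative SCD Type 2 upsert: when a tracked attribute changes, expire
# the current dimension row and insert a new version, preserving history.

dim_customer = [
    {"key": 1, "id": "C1", "region": "East",
     "valid_from": date(2023, 1, 1), "valid_to": None, "current": True},
]

def scd2_apply(change: dict, as_of: date) -> None:
    for row in dim_customer:
        if row["current"] and row["id"] == change["id"]:
            if row["region"] == change["region"]:
                return  # attribute unchanged: nothing to do
            # Expire the current version as of the change date.
            row["valid_to"], row["current"] = as_of, False
    # Insert the new current version.
    dim_customer.append({
        "key": len(dim_customer) + 1, "id": change["id"],
        "region": change["region"], "valid_from": as_of,
        "valid_to": None, "current": True,
    })

scd2_apply({"id": "C1", "region": "West"}, date(2024, 6, 1))
```

Because old versions are expired rather than overwritten, facts can join to the dimension row that was current at transaction time, which is what enables point-in-time reporting on claims and sales.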
Mindtree (Microsoft Vendor)
Data Engineer
Mumbai, India
- Engineered 30+ scalable ETL/ELT pipelines using Azure Data Factory, processing over 5 TB of transactional data daily with a 99.5% pipeline success rate across insurance and Xbox sales domains.
- Built PySpark and Spark SQL transformations on Azure HDInsight and Synapse Spark pools, improving data processing throughput by ~35% through partition pruning, broadcast joins, and caching.
- Built enterprise-grade data ingestion from SQL Server, MySQL, APIs, JSON, and Kafka into Bronze/Silver/Gold zones within ADLS Gen2 following medallion architecture with Delta Lake.
- Created business-facing datasets and reporting feeds consumed by Power BI and Tableau dashboards, collaborating with analysts to translate business requirements into technical designs.
- Managed CI/CD deployment practices using Azure DevOps and Jenkins across dev, QA, staging, and production environments with ARM template parameterization and release gate approvals.
Bosch
Data Engineer Intern
Bangalore, India
- Engineered end-to-end IoT telemetry ingestion pipelines using Kafka producers/consumers in Python and Scala on AWS, enabling real-time streaming of high-frequency industrial sensor data.
- Developed Spark Streaming applications on AWS EMR to process raw telemetry events, persisting to HBase for operational lookups and S3 data lake zones for batch analytics.
- Implemented AWS Kinesis with Lambda functions for real-time anomaly detection, triggering SNS alerts when sensor thresholds were breached — reducing incident response time to near real-time.
- Built PySpark batch jobs on EMR to process terabytes of historical IoT sensor datasets, performing time-series aggregations to support predictive maintenance analytics.
- Designed dimensional data models in AWS Redshift for machine performance metrics, enabling stakeholders to track equipment efficiency and downtime patterns through BI dashboards.
My Education
Swami Vivekananda Institute of Technology
Hyderabad, India
Bachelor of Technology in Electronics and Communication Engineering. After graduating, I immediately began my career as a Data Engineer.
2020
Concordia University Chicago
River Forest, IL
Graduated with Master's Degree, Computer Science.
Aug 2023 - Dec 2025
My Certifications

Contact me
Please contact me directly at manalarajkumar.rm@gmail.com or through this form.


