Description
With over four years of experience in data engineering, I specialize in designing and implementing scalable data solutions that streamline data workflows, enhance analytics, and optimize cloud infrastructure. My expertise spans AWS services including Lambda, Glue, Kinesis, and Redshift, and I have a proven track record of delivering high-quality, efficient data pipelines for complex, real-time data integration projects.
Skills:
AWS, Python, PySpark, SQL, DBT, Airflow, Snowflake, Git, Data Modelling, Data Warehousing, Effective Communication, Organisational Skills
Some of My Notable Work:
API-Driven Data Integration:
I have led the development of API-driven data integration pipelines that load data from centralized data lakes into Amazon OpenSearch and Amazon Redshift, using AWS Glue and PySpark to optimize data ingestion and query performance. My experience includes configuring VPC peering to enable secure cross-account data integration and ensure seamless data flow between systems.
Automated Data Workflows:
I build automated workflows with AWS Lambda, AWS Batch, and EventBridge to manage periodic data loads and automate tasks such as uploading CSV files to external portals. My solutions integrate SNS notifications for job monitoring and real-time alerts, ensuring high visibility and reliability.
Real-Time Data Pipelines:
In my role as a Senior Data Engineer, I architected and implemented real-time streaming pipelines using Amazon API Gateway, Kinesis, and Glue, reducing processing time to under one minute. This enabled efficient data ingestion from Amazon S3 to Snowflake, enhancing operational capabilities and supporting critical business decisions.