Azure Data Engineer




  • Overall, 10+ years of experience as an Azure Data Engineer using Microsoft Azure with Databricks: Databricks workspaces for business analytics, managing clusters in Databricks, and managing the machine learning lifecycle.
  • Hands-on experience with data extraction (extracts, schemas, corrupt-record handling, and parallelizing code), transformations and loads (user-defined functions, join optimizations), and production (optimizing and automating Extract, Transform and Load).
  • Extensive experience in IT data analytics projects; hands-on experience migrating on-premises ETLs to Google Cloud Platform (GCP) using cloud-native tools such as BigQuery, Cloud Dataproc, Google Cloud Storage, and Composer.
  • Hands-on experience with unified data analytics on Databricks: the Databricks workspace user interface, managing Databricks notebooks, Delta Lake with Python, and Delta Lake with Spark SQL.
  • Designed and developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
  • Worked on projects using both waterfall and agile methodologies.
  • Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, and Azure SQL Database, and working with Presto from Databricks.
  • Experience configuring connections to Presto.
  • Experience developing Spark applications using Spark SQL in Databricks.
  • Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL Database.
  • Created Azure SQL databases and performed monitoring and restores of Azure SQL databases; migrated Microsoft SQL Server to Azure SQL Database.
  • Machine learning: Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, Random Forest, Support Vector Machines (SVM), K-Means Clustering, K-Nearest Neighbors (KNN), Gradient Boosting Trees, AdaBoost, PCA, LDA, Natural Language Processing.
  • Good understanding of big data Hadoop and YARN architecture, along with the various Hadoop daemons such as JobTracker, TaskTracker, NameNode, DataNode, and the Resource/Cluster Manager, as well as Kafka.
  • Expertise in understanding and solving big data problems using Hadoop ecosystem components such as HDFS, MapReduce, Hive, Oozie, and Autosys.
  • Expertise in all phases of the project life cycle (design, analysis, implementation, and testing).
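
  The corrupt-record handling mentioned in the extraction bullet above can be sketched in plain Python. This is a minimal stand-in for Spark's PERMISSIVE JSON read mode, which keeps malformed rows in a `_corrupt_record` column rather than dropping them; the function name and sample data here are illustrative, not from any specific project.

  ```python
  import json

  def parse_json_lines(lines):
      """Plain-Python sketch of Spark's PERMISSIVE read mode:
      rows that fail to parse are kept, with the raw text stored
      in a _corrupt_record field instead of being silently dropped."""
      rows = []
      for line in lines:
          try:
              record = json.loads(line)
              record["_corrupt_record"] = None
              rows.append(record)
          except json.JSONDecodeError:
              # Malformed row: preserve the raw text for later inspection.
              rows.append({"_corrupt_record": line})
      return rows

  raw = ['{"id": 1, "amount": 9.5}',
         '{"id": 2, "amount": }',      # malformed JSON
         '{"id": 3, "amount": 4.0}']
  rows = parse_json_lines(raw)
  good = [r for r in rows if r["_corrupt_record"] is None]
  bad = [r for r in rows if r["_corrupt_record"] is not None]
  ```

  Keeping corrupt rows alongside clean ones lets a downstream step quarantine and audit bad input instead of failing the whole load.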
  Technical Skills:
  • Hadoop – Hive
  • Spark Data Frame API
  • Spark Programming
  • Python, PySpark
  • Azure Databricks, Azure SQL Database, Azure SQL Data Warehouse
  • PySpark, Spark SQL
  • Azure Data Lake, Azure Data Factory
  • Azure Synapse
  • Data analysis with Python
  • Data Extraction and Transformation and Load (Databricks & Hadoop)
  • Partitioning and MapReduce programming
  • Snowflake
  • Data Pipeline
  • Linux commands, Unix shell scripting
  • Java/J2EE
  • GCP: BigQuery, Bigtable, and Kubernetes
