Description
- Professional Highlights
- Support transformation of SQL and PL/SQL code from various data sources to target platforms: Spark, Hive, Azure, Redshift, and Snowflake.
- Automated query conversion from source to target in one click.
- Replicating mainframe data processing in the data lake by migrating mainframe applications to it.
- Ingestion: CDC Enabler (an IBM tool) pulls data from the mainframe to a landing location in the data lake as .data, .meta, and .ctl files; a scheduler then processes these files and loads the data into Hive tables.
- Extraction: Process and transform the loaded data and deliver it to the reporting team.
- Automated data pipelines through Oozie
- Databridge pulls data from different metastores (Oracle, Db2, MySQL) and dumps it in the required format on HDFS. End users then process this large volume of data by running SQL queries, which are internally parsed into Pig and Scala scripts through the SQL Parser.
- TECHNICAL SKILLSET:
- Technologies and Languages: Big Data, Spark, Hadoop, Hive, Oracle, SQL, Java, Scala, Python, C Programming, Node.js
- Basic knowledge of: Teradata, Greenplum, Vertica, Redshift, Azure, Snowflake, Shell scripting, Python, Spark Streaming
- Operating Systems: Windows, Linux