Enrique Sampaio

Transforming data into innovation with expertise in scalable solutions.

$30.00 per hour

Description

I'm Enrique, a Data Engineer and Software Architect with over a decade of experience. I've worked with leading companies in the data and analytics industry, including global pioneers in big data and cloud-based data platforms, specializing in data engineering and custom solutions. From web development to LLM-powered chatbots, I'm ready to bring your project to life.

Experience

Senior Solutions Architect

  • Databricks
  • September 1, 2024 - Present

As part of my role, I was responsible for designing and implementing architectures and pipelines for data ingestion, processing, storage, and distribution on the Databricks platform. I developed and deployed MLOps pipelines to streamline the lifecycle of machine learning models, ensuring efficient transitions from development to production environments. I crafted and optimized machine learning models tailored to address specific business challenges, leveraging cutting-edge techniques to deliver impactful solutions. Additionally, I worked on developing Retrieval-Augmented Generation (RAG) systems and compound agents using Large Language Models (LLMs), enabling enhanced data-driven insights and decision-making. A key aspect of my responsibilities included optimizing Spark code to improve the performance and efficiency of data processing tasks. I also facilitated training sessions and workshops, covering critical topics such as data ingestion pipelines, MLOps, machine learning model development, RAG systems, compound agents, and Spark optimization on the Databricks platform. Collaboration was central to my role, working closely with cross-functional teams to identify and resolve technical challenges while providing innovative and scalable solutions that aligned with organizational objectives.

Senior Solutions Architect

  • Cloudera
  • August 1, 2021 - April 30, 2024

In this role, I was responsible for assessing, planning, and executing migrations from CDH and HDP environments to the new CDP platform, ensuring the preservation of security requirements and functional use cases throughout the process. I also designed architectures and pipelines for data ingestion, processing, storage, and distribution within a big data environment using the CDP platform. My responsibilities included sizing, architecting, configuring, and installing new CDP clusters tailored to client use cases, data volumes, and compliance requirements. I conducted health checks and troubleshooting for environments that were not functioning as expected, collaborating closely with the support team to identify and resolve potential platform issues, ensuring optimal performance and reliability.

Hadoop Developer

  • Brainboss Company
  • February 1, 2020 - October 30, 2022

In this role, I was responsible for installing CDH 6.x clusters, which included verifying node prerequisites, designing the cluster architecture, and configuring resources for optimal performance and management. Additionally, I developed a web application framework featuring multitenancy, token-based authentication, and Role-Based Access Control (RBAC) for secure and flexible user permissions. I also designed and implemented a real-time notifications module that connected Apache Spark jobs to web application dashboards. This solution used Redpanda and WebSockets with token authentication, ensuring secure and efficient delivery of real-time updates to end users.

Hadoop Developer

  • Mindlabs
  • April 1, 2019 - February 29, 2020

In this role, I managed the installation of CDH 5.x and 6.x clusters, covering node prerequisites, architecture design, and resource configuration for effective management. I developed IoT data streaming pipelines using stateful operations on Databricks integrated with Azure Event Hubs, enabling real-time data processing. I created Cloudera Envelope pipelines for data processing, incorporating custom inputs, derivers, and data quality enhancements tailored to specific use cases, achieving notable performance improvements. Additionally, I developed Spark applications that read from and wrote to various streaming platforms, storage systems, and file systems, including Kafka, HDFS, HBase, Kudu, and local structured files. I implemented Flafka architectures (Apache Flume + Apache Kafka) for near real-time data ingestion and processing, designing custom Apache Flume sources and interceptors to handle specific data ingestion, processing, and resilience needs. Furthermore, I developed a Twitter/X tracking system to analyze brand engagement against competitors. This solution used Apache Flume, Morphlines, and Apache Solr to process data and provided actionable insights through a web-based user interface.

Hadoop Developer

  • Brainboss Company
  • January 1, 2017 - April 30, 2019

In this role, I developed an application for time-series data analysis and monitoring, leveraging Kafka, Spark, HBase, and OpenTSDB. The application enabled users to perform real-time analytics and replay historical data through a web-based interface. I also created a solution for massive audio transcription and indexing using Apache Spark and Apache Solr, allowing efficient handling and retrieval of large-scale audio data. Additionally, I developed a web application for NLP exploration, focused on keyword spotting to enhance text analysis and search capabilities. My responsibilities included the installation and configuration of CDH 5.x and 6.x clusters, ensuring node prerequisites, designing cluster architectures, and managing resources for optimal performance and scalability.

Educational Details

Master's Degree in Computer Science

  • Universidade Federal de São Carlos
  • March 1, 2018 - December 31, 2020

Concentration: Distributed Algorithms and Fault Tolerance

Bachelor of Computer Science

  • Universidade Federal de São Carlos
  • March 1, 2014 - December 31, 2017
