Profile

I design and operate production data platforms and pipelines, focusing on reliable ETL, SQL performance tuning, and observability. I automate infrastructure, enforce data quality and testing, and enable analytics at scale with Spark, query engines, and orchestration.

Professional Experience
Oct 2024 – Present
Continental, Data Platform Engineer
  • Build and operate the data platform's AWS infrastructure with Terraform and Terragrunt, including scalable S3 storage, secure Linux appliances, and network configuration.
  • Build and maintain PySpark pipelines on Amazon EMR to ingest, transform, and curate large-scale datasets from S3.
  • Deploy pre-configured EC2 appliances (JupyterLab, RStudio, KNIME, Deep Learning environments) to enable self-service analytics with direct access to governed S3 datasets.
  • Maintain and optimize Amazon Redshift for scalable data warehousing and analytics.
  • Containerize platform services with Docker and publish to Amazon ECR for reproducible and consistent deployments.
  • Implement monitoring and observability with Amazon CloudWatch metrics integrated into Grafana dashboards, tracking system health, resource utilization, and performance.
  • Develop automation scripts in Python and Bash to streamline infrastructure and data operations.
Nov 2023 – Nov 2024, Porto, Portugal
Critical Techworks | BMW Group, Data Engineer
  • Architected and managed a scalable data lakehouse on AWS with S3 and Apache Iceberg, enabling efficient storage, querying, and processing.
  • Built event-driven data pipelines using Python, PySpark, and AWS services (Glue, Lambda, SQS, EventBridge ...), ensuring reliable ingestion, transformation, and SLA-aligned
  • Tuned Spark performance and optimized Iceberg write strategies (partitioning, file sizing) to reduce latency and cost.
  • Enforced data quality with PyDeequ, validating schema, distributions, and completeness of datasets.
  • Engineered a Python-based wrapper for Terraform, simplifying cloud resource provisioning.
  • Built CI/CD pipelines with GitHub Actions to automate integration and deployment of data infrastructure and pipelines.
Nov 2021 – Nov 2023, Porto, Portugal
Data Analytics Developer
  • Automated complex reporting workflows using Python, pandas and SQL, increasing reporting efficiency and accuracy.
  • Developed and maintained Tableau dashboards to provide real-time business insights for stakeholders.
  • Designed and implemented KPIs and business metrics to monitor operations and drive strategic decisions.
  • Delivered custom reporting solutions with a focus on data governance, accessibility, and stakeholder alignment.
Mar 2021 – Aug 2021, Porto, Portugal
Hotel Black Tulip, Business Analyst Intern
  • Implemented a Power BI solution for tracking financial and operational performance.
Projects
    Real Time Data Pipeline

    Designed and deployed the Docker Compose infrastructure for a real-time data pipeline using Kafka, Flink, Iceberg, Trino, MinIO, and Superset.

    MarketPipe

    Dockerized, configurable Airflow data pipeline for collecting and storing stock and cryptocurrency market data.

    DataFlow

    ETL pipeline using Pulumi for infrastructure as code, integrating AWS services and Snowflake for automated data flow.

    FinStockDash

    Python web application built with Streamlit for analyzing historical stock data.

    Healthcare NPS Analysis

    Analysis and monitoring of Net Promoter Score (NPS) performance for healthcare companies using SQL and Power BI.

Certificates
Open Source and Community
    pandas, Contributor

    Contributed improvements to the library’s data manipulation and reporting functionality, supporting efficient data pipelines and scalable analytics and modeling workflows.

Education
Skills
    Programming & Development: Python, JavaScript, Go, Bash, OOP, Unit Testing, CI/CD, Git, Linux, Agile
    Big Data: Spark, Kafka, Flink, Hadoop, Hive, Iceberg, Delta Lake, Trino
    Databases: SQL, OLTP, OLAP, Postgres, Oracle, DynamoDB
    Cloud & Infrastructure: AWS, Azure, Terraform, Terragrunt, Packer, Prometheus, Grafana
    Data Quality: PyDeequ, Great Expectations, Soda
    Data Analysis & Reporting: Tableau, Power BI, Superset
    Soft Skills: Communication, Leadership, Teamwork, Organization, Project and Time Management
Languages
    Portuguese
    English