Profile

Data Engineer with more than four years of experience implementing Big Data Solutions. Proficient in Python and SQL.

Skills
Languages

Python, SQL

Cloud

AWS (S3, EC2, EMR)

Frameworks & Libraries

Airflow

Miscellaneous

OOP, Big Data, APIs

Work Experience
Epsilon, Software Engineer 2
12/2023
  • Developed an end-to-end data pipeline using Hive, SQL, and Spark jobs for user impression and user conversion data, orchestrated and scheduled with Apache Airflow.
  • Developed and maintained scalable data pipelines for processing large datasets from diverse sources, ensuring seamless data flow and compliance with industry regulations.
  • Collaborated cross-functionally to gather requirements and define data schemas tailored to specific advertising campaigns, aligning with business objectives and optimizing targeting strategies.
  • Implemented automated data validation and monitoring procedures to detect anomalies in real time, ensuring data accuracy and integrity for ad campaign performance analysis.
  • Utilized machine learning algorithms to analyze historical ad campaign data, identifying trends and customer behaviors to enhance targeting strategies and improve ROI.
  • Conducted performance tuning and optimization of data processing workflows, reducing latency and enhancing scalability for efficient ad personalization and real-time bidding optimizations.

Tata Consultancy Services, Developer
10/2020 – 12/2023
Bengaluru, India
  • Implemented a multi-source data migration project, converting data from PostgreSQL into JSON files using Python and posting them to a downstream API for consumption.
  • Implemented a data migration project that ingested data from multiple schemas into an S3 data lake using a three-layer approach with SCD Type 1 and SCD Type 2 transformations, generating extracts for actuarial analysis of policy and claims data that informed stakeholder decision-making.
  • Implemented a SQL-based data reconciliation process in the pipeline to verify the accuracy and consistency of the data in every layer, identifying and resolving discrepancies and improving data quality.
  • Developed a proof of concept (POC) for a no-code ETL web application in Python that ingests data from CSV, JSON, and XLSX formats, provides data manipulation capabilities, and offers database storage or direct export.
  • Automated usage report generation for the BDM Static Code Analysis Tool using Python to parse CSV data, eliminating manual work and saving significant time and effort.
  • Added new features to an existing Static Code Analysis Tool using shell scripting, improving efficiency by 50% to 80%.

Projects
Spotify ETL Pipeline
03/2023 – present

Developed an ETL pipeline in Python that extracts a user's Spotify listening history through the Spotify API, validates the data, and loads it into a PostgreSQL database. The pipeline is automated with Apache Airflow, and the data is visualized in Power BI.

Education
Bachelor of Technology in Electronics & Communication Engineering

Maulana Abul Kalam Azad University of Technology

DGPA: 8.60

08/2016 – 07/2020
Kolkata, India

Certificates
Google IT Automation with Python

Python, Git, Shell Scripting

Machine Learning

Supervised Learning, Linear Regression, Logistic Regression, Machine Learning, Neural Networks