Results-driven Data Engineer with over 5 years of experience designing and optimizing scalable ETL/ELT pipelines and event-driven architectures. Proven expertise in Apache Spark, Kafka, and Azure Databricks, with a track record of cutting data processing time by 35% and reducing transfer costs by 90%. Strong background in building real-time analytics systems and high-throughput APIs for the telecom, healthcare, and mobility sectors.
Python, Java, Scala, Shell Scripting, J2EE, HTML5, JavaScript, Bootstrap
Performance Tuning, Query Optimization, Caching Strategies
MongoDB, MySQL, Google BigQuery, Apache Druid, Data Warehousing, OLAP/OLTP, SQL, NoSQL, Data Governance, Data Quality
Apache Spark, PySpark, Apache Kafka, Apache Hadoop, Hive
Azure Databricks, Delta Lake, Azure Data Factory, Azure Blob Storage
Docker, Kubernetes (basics), Git, GitHub
ETL/ELT Pipelines, Data Modeling (Dimensional, Relational), Snowflake Schema, ER Diagrams, Data Wrangling, Data Transformation, Data Aggregation, Data Annotation, Data Retention, Data Backup
Pandas, NumPy, SciPy, Matplotlib, Seaborn, Tableau
Apache Airflow, CI/CD (basics)
Specialized in ML/AI
Major in Computer Science