Big Data Engineer with 5+ years of experience across all phases of the software development life cycle. Passionate about Big Data and Machine Learning technologies and about delivering effective solutions through creative problem-solving. Track record of building large-scale systems with these technologies.
Python | SQL | Spark
Docker | Docker Compose | GitHub Actions | MLflow
MySQL | MongoDB | Cassandra | HBase
S3 | EC2 | EMR | RDS | Redshift | Glue | CloudWatch | ECS
Spark | Hadoop | Hive | Kafka | Sqoop
Data Factory | Databricks | Functions | Blob | Synapse | Delta Lake
Git | DVC | Airflow
Pandas | NumPy | scikit-learn | PySpark | PyTorch | Matplotlib | Seaborn | TFX
Categorization of financial product and service complaints registered by consumers.
Tech: Python, PySpark, Grafana, Prometheus, AWS, Azure
Designed and developed an ETL pipeline to export data from a MySQL transaction database to Amazon Redshift for data analysis
Tech: Apache Airflow, PySpark, Amazon Redshift, Amazon S3, Apache Kafka