Results-driven Data Platform Engineer with 6+ years of experience building, optimizing, and operating large-scale distributed data systems. Strong expertise in the Hadoop ecosystem (HDFS, YARN), cluster management, performance tuning, and system-level troubleshooting. Proven track record of managing high-throughput data platforms (1.6B+ records/day), improving cluster efficiency, and driving platform reliability across the telecom and healthcare domains.
Python, Java, Scala, Shell Scripting, J2EE, HTML5, JavaScript, Bootstrap
Apache Spark, PySpark, Apache Kafka, Apache Hadoop, HDFS, YARN, Hadoop Cluster Management, Resource Allocation, Capacity Planning, Cluster Optimization, Platform Engineering
ETL/ELT Pipelines, Data Modeling (Dimensional, Relational), Snowflake Schema, ER Diagrams, Data Processing (Wrangling, Transformation, Aggregation), Data Annotation, Data Retention, Data Backup
Azure Databricks, Azure Delta Lake, Azure Data Factory, Azure Blob Storage
Pandas, NumPy, SciPy, Matplotlib, Seaborn, Tableau
MongoDB, MySQL, Google BigQuery, Apache Druid, SQL, NoSQL, Data Warehousing, OLAP/OLTP, Data Governance, Data Quality
Docker, Kubernetes (Basics), Git, GitHub
Apache Airflow, CI/CD (Basics)
Linux, Distributed Systems, System-Level Troubleshooting
Performance & Query Optimization, Caching Strategies, Monitoring, Alerting & Observability (Logs, Metrics, APIs, Cluster Monitoring)
Reliance Jio Infocomm Pvt Ltd
Big Data Engineer
MpHrx
Software Engineer
SAR GROUP (Lectrix E-Vehicle)
Software Developer
Fintree Global Research
Software Developer (Co-Founder)
Woolf University
M.S. in Computer Science, Specialized in ML/AI
Maharishi Markandeshwar (Deemed to be University)
B.Tech, Major in Computer Science