Akhilesh Pratap Shahi – Online Resume

Akhilesh Pratap Shahi
Data Engineer
Email: [email protected]
Phone: +91-844-7020-911
akhileshshahi
GitHub: shahiakhilesh1304
Location: Bangalore, India
Visa Status: Open For Relocation (US, Europe, Asia)

Profile

Results-driven Data Platform Engineer with 6+ years of experience building, optimizing, and operating large-scale distributed data systems. Strong expertise in the Hadoop ecosystem (HDFS, YARN), cluster management, performance tuning, and system-level troubleshooting. Proven track record of managing high-throughput data platforms (1.6B+ records/day), improving cluster efficiency, and driving platform reliability across the telecom and healthcare domains.

Skills

Programming & Scripting: Python, Java, Scala, Shell Scripting, J2EE, HTML5, JavaScript, Bootstrap
Big Data & Distributed Systems: Apache Spark, PySpark, Apache Kafka, Apache Hadoop, HDFS, YARN, Hadoop Cluster Management, Resource Allocation, Capacity Planning, Cluster Optimization, Platform Engineering
Data Engineering & ETL: ETL/ELT Pipelines, Data Modeling (Dimensional, Relational), Snowflake Schema, ER Diagrams, Data Processing (Wrangling, Transformation, Aggregation), Data Annotation, Data Retention, Data Backup
Cloud & Tools: Azure Databricks, Azure Delta Lake, Azure Data Factory, Azure Blob Storage
Data Analysis & Visualization: Pandas, NumPy, SciPy, Matplotlib, Seaborn, Tableau
Databases & Warehousing: MongoDB, MySQL, Google BigQuery, Apache Druid, SQL, NoSQL, Data Warehousing, OLAP/OLTP, Data Governance, Data Quality
DevOps & Containers: Docker, Kubernetes (Basics), Git, GitHub
Orchestration & Workflow Tools: Apache Airflow, CI/CD Basics
System & Infrastructure: Linux, Distributed Systems, System-Level Troubleshooting
Performance & Observability: Performance & Query Optimization, Caching Strategies, Monitoring, Alerting & Observability (Logs, Metrics, APIs, Cluster Monitoring)

Professional Experience

03/2023 – present | Bangalore, India

Reliance Jio Infocomm Pvt Ltd

Big Data Engineer

• Managed and optimized large-scale Hadoop/Spark clusters processing 1.6B+ records/day, ensuring efficient YARN resource allocation and HDFS utilization

• Performed cluster-level performance tuning, reducing execution latency and improving workload distribution across nodes

• Led system-level troubleshooting for production failures, including memory bottlenecks, executor failures, and data skew issues

• Built internal frameworks to monitor cluster health, resource contention, and job-level performance metrics

• Designed APIs that interact with distributed systems, reducing data retrieval time by 90% while maintaining cluster stability

• Improved platform reliability and fault tolerance by optimizing job retries, partitioning strategies, and storage access patterns

• Performed Linux-based debugging across nodes, including log tracing, disk usage investigation, and process-level analysis

• Collaborated with infrastructure teams to enhance cluster scalability and workload isolation strategies

10/2022 – 02/2023 | Gurgaon, India

MpHrx

Software Engineer

• Built and optimized Apache Spark (PySpark) pipelines on Azure Databricks for high-volume healthcare data processing

• Improved Spark job performance through partition tuning, caching strategies, and execution plan optimization

• Contributed to data platform stability by identifying bottlenecks in distributed processing and improving job reliability

• Enhanced data ingestion workflows to ensure efficient resource usage and reduce cluster strain

• Optimized queries and system performance, reducing latency by 50%

03/2022 – 07/2022 | Gurgaon, India

SAR GROUP (Lectrix E-Vehicle)

Software Developer

• Designed and maintained backend services supporting high-volume telemetry data ingestion, ensuring stable data flow into downstream systems

• Optimized system performance, reducing API response latency by 40% under production load

• Implemented efficient data models in MongoDB to handle large-scale, semi-structured data with high write throughput

• Performed system-level debugging and performance tuning to resolve bottlenecks in API and database layers

• Helped improve service reliability and fault tolerance through better exception handling and logging mechanisms

• Worked closely with infrastructure teams to ensure scalability and efficient resource utilization

01/2020 – 03/2022 | Lucknow, India

Fintree Global Research

Software Developer (Co-Founder)

• Built and managed scalable backend systems handling financial data ingestion, processing, and analytics workflows

• Designed system architecture focusing on performance, reliability, and extensibility for growing user demand

• Optimized database queries and backend services, improving overall system performance by 28%

• Led development of data-driven features requiring efficient handling of large datasets and real-time processing needs

• Implemented structured logging and monitoring practices to support system observability and debugging

• Took ownership of end-to-end platform development, including deployment, performance tuning, and system stability

Education

08/2024 – present | Valletta, Malta

Woolf University

M.S. in Computer Science

Specialized in ML/AI

08/2015 – 04/2019 | Ambala, India

Maharishi Markandeshwar (Deemed to be University)

B.Tech

Major in Computer Science

Certificates

Python – HackerRank
Introduction to Scala – LinkedIn
SQL – HackerRank
Maestro in SDDM using Java

Recognition and Achievements

• Guest Lecturer – Python, BSA Engineering College

• Former Member – Computer Society of India

• Winner – Hackathons, Mono Acts, and Inter-college Theater Events

• Vice President – Trojan Society | President – Pratibimb Theatre Club