Profile

Data Scientist with nearly 11 years of IT experience, including over 8 years of hands-on data science project work. Currently leading a team of 4 data scientists at Tokopedia, Hyderabad.

Skills & Abilities

Data Science, Machine Learning, Deep Learning, Python, R, SQL, AWS (EC2, Lambda, SageMaker), GCP (Vertex AI), MLOps (MLflow, W&B), NLP, Gen AI (Prompt Engineering, RAG, SFT), LLMs (PaLM 2, Gemma, Llama 3), ML Frameworks (Keras, TensorFlow, LangChain, LlamaIndex), Vector Databases (Milvus, ChromaDB), Technical Leadership, Team Management

Professional Experience
Tokopedia
Data Science Senior Lead
Jan 2023 – present | Hyderabad, India
  • Search Ads - TopAds
  • Leading 2 data science projects (Ads Ranking and Keyword Recommendation), managing 4 data scientists, and contributing to DS initiatives that have benefited the Ads DS team and GoTo DS.
  • Delivering data science solutions for TopAds business problems and contributing to Tokopedia's revenue by using Gen AI (Vertex AI - PaLM 2 LLM) for augmented ad title generation for products with low or no impressions, and for product keyword creation for products with low keyword supply.
  • Overseeing research and development projects at every stage, from initial exploration to deployment into production systems. Collaborating with engineering teams to integrate ML models that succeeded in A/B tests into large-scale, highly complex production services.
Principal Data Scientist
Dec 2021 – Dec 2022 | Hyderabad, India
  • Real-time Ads Ranking - TopAds
  • Worked on building a Real-time Ads Ranking (RTR) system that is aware of product, user, and query context.
  • Developed a tree-based Learning-to-Rank model to increase the relevance of ads displayed on the search results page (SRP). This enabled the Real-time Ranking system to achieve a 3.25% increase in CTR and a 1.57% increase in Ads Revenue (~1.6 billion IDR) while maintaining system-level ROAS >= 7.
  • Deployed a highly scalable, low-latency, high-throughput model inference pipeline for the Real-time Ranking system, achieving sub-80 ms latency at 1K requests served per second.
  • Optimised inference of an existing deep learning solution using post-training quantization techniques, which increased revenue by 1.43% (~1.4 billion IDR monthly) and reduced inference costs by up to 166 million IDR monthly.
ADP, Data Scientist
Nov 2019 – Dec 2021 | Hyderabad, India

    Worked as a Data Scientist in the ADP DataCloud team building machine learning models for Job Classification and PayCode Classification.

  • Job Classification - ADP DataCloud
  • Technologies: Python 3, Machine Learning – NLP, Classification, AWS, Amazon SageMaker
  • The Job Classification model normalises client-entered jobs to standard ADP Job Taxonomies, which enables benchmarking of an organization's workforce metrics against industry benchmarks in ADP DataCloud.
  • Built multi-lingual Job Classification for French and Spanish using Language-agnostic BERT Sentence Embedding (LaBSE) and multi-lingual Universal Sentence Encoder (mUSE) in a TensorFlow Keras model, reducing time to market from 6 to 2 months.
  • Developed a multi-class Keras classifier with FastText word embeddings on Amazon SageMaker and productionised it using Step Functions, ECR, SageMaker Endpoint and API Gateway.
  • Deployed the model both as a batch transform job and as a real-time endpoint.
  • PayCode Classification - ADP DataCloud
  • Technologies: Python 3, Machine Learning – NLP, Classification, AWS, Databricks
  • The PayCode Classification model assigns a paycode group and paycode subgroup to the text an end user enters when submitting a timesheet, which powers workforce trends analytics and reporting in ADP DataCloud.
  • Developed a multi-class classifier using FastText on Amazon SageMaker and productionised it using AWS Batch, Step Functions, ECR, SageMaker Endpoint and API Gateway.
  • Deployed the model both as a batch transform job and as a real-time endpoint.
DBS, Data Scientist
Oct 2017 – Oct 2019 | Hyderabad, India

    Worked as a Data Scientist in the Core Banking division and built machine learning models for Self-Service Banking – Cash In Transit and Cheque Analytics.

  • Cheque Analytics - Self-Service Banking
  • Technologies: Python 3, PySpark 2.3, Machine Learning – Classification
  • Developed a binary classifier using LightGBM for the Cheque Risk Scoring model, which provides a risk score for every inward cheque using historical cheque attribute data and customer information.
  • Reduced false negatives by 18% by handling class imbalance with SMOTE and Tomek links, and tuned the model's hyperparameters using Bayesian Hyperparameter Optimisation.
  • Cash In Transit (CIT) - Self-Service Banking
  • Technologies: Python 3, PySpark 2.3, Machine Learning – Time Series Forecasting
  • Forecasted no-cash situations for ATM, BTM, and CRS cash withdrawals using FB Prophet, resulting in a 15% reduction in cash shortage incidents.
Tech Mahindra, Software Engineer
Nov 2013 – Oct 2017 | Hyderabad, India

    Worked as a Software Engineer for the client General Electric (GE) on 4 projects spanning data ingestion to data science.

  • Technologies: Python, R, SQL, Machine Learning – Time Series Forecasting, Clustering
  • Engineered a Cash Collection ML model that used clustering to segment customers based on payment behavior and decision trees to predict the likelihood of timely payments, and built a GE US Payroll model using ARIMA.
  • Integrated JSON data from APIs into R data frames and CSV files, and developed dynamic Talend jobs for data integration.
  • Created stored procedures and completed code stubs received from team members.
Education
Electronics and Communications Engineering, GITAM University
2009 – 2013 | Hyderabad, India

    Bachelor of Technology (B.Tech) in Electronics and Communications Engineering (ECE) with a CGPA of 8.4

Courses
Summer School on NLP, IIIT Hyderabad
Jul 2024 | Hyderabad, India

    Attended the Summer School on NLP, with a focus on Large Language Models (LLM) and Natural Language Generation (NLG), conducted by LTRC, IIIT Hyderabad

Summer School on CV and DL, IIIT Hyderabad
Jul 2018 | Hyderabad, India

    Attended the Summer School on Computer Vision (CV) and Deep Learning (DL) conducted by CVIT, IIIT Hyderabad

Awards
Best Summer School Project, IIIT Hyderabad
Jul 2024

    Awarded 2nd place for the project "Continued Pre-training of LLM on Legal Contracts - LLM Domain Adaptation" at the Summer School on NLP, IIIT Hyderabad.

Super Techie, DBS
Dec 2018

    Awarded the Pride Connect Award for being the Super Techie at C2E, Consumer Banking Analytics

Workplace Catalyst, DBS
Jul 2018

    Awarded as part of the team recognised as Workplace Catalyst across the whole DBS Tech Organisation
