Professional Summary

AI and LLM Engineer with 5+ years of experience designing production-ready AI systems across NLP and Computer Vision. Skilled in fine-tuning and evaluating LLMs (GPT-4, LLaMA, BERT) for NER, summarization, and sentiment analysis. Advanced proficiency in LangChain-based RAG architectures, vector databases (FAISS, Neo4j), and deploying AI agents on GCP Cloud Run. Strong background in MLOps, model performance optimization, and scalability. Experienced Team Lead capable of managing cross-functional workflows, reducing project rework by 40%, and integrating complex AI solutions into existing software ecosystems.

WORK EXPERIENCE
A.Team, Software Engineer
03/2026 – Present
  • Contributed to Terminal Bench 2.0, a leading agentic coding benchmark referenced by frontier AI labs including Anthropic (Claude 4.5).
  • Designed realistic DevOps incident scenarios in air-gapped Kubernetes (k3s) environments involving Gitea CI/CD, ingress misconfiguration, Vault authentication, and webhook debugging.
  • Implemented Docker-based evaluation environments with Python graders using structured Kubernetes API parsing and end-to-end functional testing.
  • Calibrated task difficulty through iterative rollout analysis, achieving meaningful agent score variance across 10 evaluation runs.
  • Collaborated with ML engineers to align task design with reinforcement learning training objectives.
Turing, LLM - Backend Engineer
08/2024 – 03/2026

    QA — Computer Usage Annotation

  • Reviewed and validated 850+ long-horizon agent trajectories (100–200+ actions each) against a 10-dimension QA rubric, evaluating prompt quality, recording accuracy, event log integrity, subtask design, and task-pair alignment across Windows, macOS, and Ubuntu.
  • Enforced S.T.R.U.C.T. subtask standards and the Parent-Child task-pair framework, classifying 2,500+ evaluation scenarios into Critical Mistakes, Bad Side Effects, and Misunderstandings with zero-ambiguity feedback.
  • Delivered structured annotator feedback using the Where/Why/What model, reducing rework by 40% and improving weekly throughput by 15%.
  • Audited coordinate metadata, screenshot accuracy, event chronology, and PII compliance across every submission, maintaining 100% dataset delivery quality for client-ready AI training data.
  • Identified and escalated re-recording triggers vs. minor rework cases, balancing review efficiency with dataset accuracy across multi-OS environments.

    Turing Intelligence — AI Systems (Portfolio)

  • Built Creds Creator & Proposal Producer, an AI automation suite using GPT-4 + Google Slides API to generate compliant consulting proposals with a Validator agent for tone, formatting, and quality control.
  • Developed Sales Knowledge Center, creating real-time data pipelines for company intelligence and news, enriched with live web search for executive insights.
  • Engineered Advisor Insight at Scale, a RAG platform built on LangChain + FAISS that supports multi-format ingestion (PDF/DOCX) and context-aware financial Q&A.

    AGI / LLM Research

  • Fine-tuned models with supervised fine-tuning (SFT) for code translation and domain reasoning, improving explainability and structured output quality.
  • Created taxonomy-aligned training datasets to enhance LLM performance in classification, generation, and reasoning tasks.
  • Designed agentic reasoning workflows using CoT and ToT, improving multi-step problem solving under uncertainty.
  • Implemented Think-Act-Observe and Plan-Execute frameworks to simulate real-world decision making and evaluate autonomous agent reasoning.
Fehmida AI Startup, Data Scientist
10/2024 – 08/2025
  • Built RAG‑based document analysis systems using LangChain, LlamaIndex, and Neo4j Vector, improving retrieval and contextual understanding of financial, sustainability, and research reports.
  • Combined OpenAI GPT‑4 with Neo4j Vector for real‑time Q&A, boosting insight extraction accuracy by 30% and enabling precise financial analysis workflows.
  • Automated data scraping and processing with Selenium and Finnhub API, extracting structured datasets from multiple websites to support AI model training and analytics.
  • Designed thematic modeling workflows using embeddings and clustering, automating document categorization, enhancing search accuracy, and improving user experience.
UOG, ML Engineer (Computer Vision & NLP)
01/2023 – 08/2024
  • Developed real-time web apps by integrating Hugging Face vision models with a React UI, enabling interactive virtual try-on experiences.
  • Automated document extraction and compliance workflows using YOLOv8 + OCR (Tesseract) + NER, achieving 95% OCR accuracy and 90% NER precision.
  • Delivered construction analytics solutions using YOLOv5 object detection and OpenCV depth estimation, improving site monitoring accuracy and reporting.
  • Optimized Flask microservices, reducing API response time by 40% and improving reliability in production.
  • Deployed scalable systems using Python, Flask/FastAPI, React, PostgreSQL, Docker, AWS, and GitHub Actions (CI/CD).
Freelancer, Client Engagement Lead
01/2021 – 12/2022
  • Designed and developed multi‑tenant Learning Management Systems (LMS) and food ordering platforms with secure access controls, optimized backend logic, and scalable APIs.
  • Built UI/​UX prototypes in Figma, translating client requirements into intuitive, user‑friendly interfaces that improved adoption and engagement.
  • Implemented backend services using Python, FastAPI, Flask, and PostgreSQL, ensuring performance, reliability, and modular scalability.
  • Deployed solutions with Docker and AWS, integrating CI/CD pipelines for streamlined delivery and updates.
  • Applied Agile/XP practices to deliver high‑quality, user‑centric platforms, emphasizing rapid iteration, continuous feedback, and client satisfaction.

Skills

    Programming & Languages: Python, JavaScript, TypeScript, Data Structures & Algorithms

    NLP & LLMs: GPT‑4, BERT, LangChain, LlamaIndex, PyTorch

    MLOps & Pipelines: Airflow, Docker, GCP Cloud Run, AWS

    Data & Databases: PostgreSQL, MySQL, MongoDB, Neo4j, FAISS, Redis, Vector Databases, SQL, NoSQL

    Machine Learning: scikit‑learn, XGBoost, WEKA

    Visualization & Dashboards: Matplotlib, Plotly, Streamlit, Dash, Tableau, Power BI

    APIs & Web Development: FastAPI, Flask, Django, RESTful API design, GraphQL, Postman

    Frontend: React.js, Next.js, Tailwind CSS

    Collaboration & Practices: Git, GitHub, Agile/XP

    AI Tools: Replit, Cursor, Bolt, Windsurf, Claude, CodeAider


Education
Bachelor in Computer Science, University of Gujrat
08/2019 – 09/2023