Experience

AI Engineer (Full-Time)

Sawt-SA

TAGS: vLLM, LLMs, AI Agents, GCP, AWS, Speech to Text, Text to Speech, Vertex AI, FastAPI, Nvidia NeMo, Nvidia RIVA, LangChain, LangGraph, LiveKit

2025/02 – Present (Remote) Riyadh, Saudi Arabia

  • STT: Developed and trained state-of-the-art Arabic (Saudi) ASR models that stay robust and accurate across noisy, real-world telephony and conversational scenarios, contributing to the full end-to-end lifecycle from dataset collection and curation to optimized production deployment.
  • LLMs: Delivered a substantial improvement in the function-calling capabilities of our LLMs while preserving their conversational skills, contributing to the end-to-end project lifecycle, including dataset creation, model fine-tuning, and deployment.
  • TTS: Developed and trained human-level, natural-sounding Arabic (Saudi) TTS models, contributing to the complete pipeline from data acquisition and voice design to optimized production deployment.
  • Evaluation & Benchmarking: Designed and implemented in-house evaluation pipelines and custom Arabic (Saudi) benchmarks for ASR, TTS, LLMs, and other models, including building task-specific test sets, defining quality metrics, and running systematic offline evaluations to guide model iteration and deployment decisions.
  • LLM Deployment & Integration: Deployed and optimized high-performance LLMs using vLLM for low-latency inference, and integrated them into production AI agent workflows to improve accuracy and user experience.

AI Engineer (Part-Time)

Hams-AI

TAGS: LLMs, Agentic AI, RAG, AWS, Speech-To-Text (ASR), Text-To-Speech (TTS), REST APIs

2024/12 – 2025/03 (Remote) Riyadh, Saudi Arabia

  • Improved Hams.AI’s speech recognition system to be over 30% more accurate, about 90% smaller, and nearly 12× faster.
  • Helped build and improve Hams.AI’s Voice AI agents by adding new features, boosting performance, and handling everything from gathering training data to deploying on AWS.

AI Engineer (Intern)

Spiro Carbon Group

Improved an LLM application's response accuracy by replacing simple RAG with Graph RAG built on a knowledge graph, which incorporated general semantic knowledge, addressed workflow needs, and enabled retrieval across interconnected data.

2024/09 – 2024/12 (Remote) Utah, USA

NLP Engineer (Intern)

Omoto-AI

Built specialized Arabic AI applications, including AMUN-RAG (RAG system), Rosseta (dialect translator), and HieroScribes (ASR model).

2024/03 – 2024/06 Cairo, Egypt

Education

Master’s degree in Artificial Intelligence

Ain Shams University
2026/03 – Present Cairo, Egypt

Bachelor's degree in Computer Science, Artificial Intelligence Department

Helwan University - Grade: 3.12 / 4
2021/05 – 2025/05 Cairo, Egypt

Publications

JusticeLeague at FIGNEWS 2024 Shared Task: Innovations in Bias Annotation

Authors: Amr Khaled Saleh, Huda Maher Mohamed, Hager Sayed Rashed

Published at the 2nd Arabic NLP Conference, held at ACL 2024. Links: paper, LinkedIn post describing it.

2024/08/16

Projects

Automatic Speech Recognition and Speaker Diarization System for Egyptian Arabic Using NeMo

Tags: ASR, Speaker Diarization, NeMo, Conformer-CTC, CTC Loss, Mixed Precision, Novograd Optimizer, SpecAugment
  • GitHub repositories: QuartzNet, Conformer_CTC, Speaker Diarization
  • Developed an end-to-end Automatic Speech Recognition (ASR) system for Egyptian-Arabic dialects using the QuartzNet and Conformer-CTC architectures.
  • Achieved a Character Error Rate (CER) of 8.739 with Conformer-CTC, outperforming the QuartzNet model by 25.51% and the top competitors in the MTC-AIC2 (Phase-1) competition by 18.97%.
  • Implemented a comprehensive speaker diarization system incorporating models such as vad_multilingual_marblenet, TitaNet-L, and diar_msdd_telephonic to accurately segment and identify speakers in Egyptian Arabic audio.
  • Conducted extensive experimentation with various neural architectures, including QuartzNet, Conformer-HAT, Squeezeformer, Citrinet, and FastConformer, to determine the optimal model for ASR tasks.

Arabic Retrieval-Augmented Generation (RAG) System

Tags: Advanced RAG, Pinecone, LangChain, LLaMa-3.3, Streamlit, OCR, Docker, Docker-Compose, FastAPI
  • Designed an Arabic-focused RAG system supporting .txt, .pdf (text and images), and image files (.png, .jpg, .jpeg) with advanced text extraction capabilities.
  • Enhanced retrieval speed and accuracy by leveraging Pinecone’s namespace features and integrating embedding (intfloat/multilingual-e5-small) and reranking (NAMAA-Space/GATE-Reranker-V1) models.
  • Integrated Groq's llama-3.3-70b-specdec for contextual, accurate answer generation and implemented a user-friendly Streamlit interface with customizable parameters.
  • Deployed the system using modular Docker containers, ensuring scalability, portability, and efficient vector storage with Pinecone.

Question Answering Fine-Tuning with ArabianGPT, AraBERT v2, and DistilBERT - GitHub

Tags: QA, PyTorch, Fine-tuning, Hugging Face, ArabianGPT, Docker, FastAPI
  • Fine-tuned language models for question answering on the ARCD dataset.
  • ROUGE-1 scores: ArabianGPT 0.8464, AraBERT 0.8415, DistilBERT 0.26. Results for further evaluation metrics are available in the GitHub repository.
  • Developed a user-friendly API using FastAPI to easily integrate the fine-tuned model into future projects. Containerized it in a Docker image for efficient deployment with CUDA support for GPU acceleration.

Competitions & Awards

MTC-AIC2 (AI Competition-2 organized by the Military Technical College)

Team Name: 7asebatiya 7elwan

2024/06

  • Results: Ranked 3rd out of 102 with our ASR and Speaker Diarization systems.

Competitive Programming

  • ECPC Qualifications: solved 7 problems in 2023 and 3 in 2022.

Skills

Knowledge Topics

LLMs • NLP • RAG • Agentic AI • Text to Speech • Speech to Text • ML/Deep Learning • Algorithms & Data Structures • Version Control • REST API • Cloud Computing (AWS)

Tools / Frameworks

Nvidia NeMo & RIVA • LiveKit • LangChain • AWS • GCP • FastAPI • C++ • Python • PyTorch • Git / GitHub • NLTK • ONNX • scikit-learn • OpenCV • Hugging Face • Docker

Courses

BSc Related Courses

AI department (3rd and final years): AI • Evolutionary Algorithms • ML • Advanced ML • NLP • Convex Optimization Theory • Robotics • Big Data • Parallel Processing • Signals & Systems • Cloud Computing • Image Processing

General department (1st and 2nd years): Data Structures • Algorithms • Networks • Operating Systems • SWE • System Analysis • Database Systems • Probability & Statistics • Mathematics (1, 2) • Linear Algebra • Logic Design

Volunteering

Teaching assistant: Back-end (Laravel) • Data Structures (C++) • OOP Course (C++) • Problem Solving Course (C++)