Phan Quốc Long
AI Engineer, Data Scientist, Software Engineer
Email
[email protected]
Location
No. 51, Street 81, District 7, Ho Chi Minh City
Vietnam
GitHub
https://github.com/phqlong
Phone
+84829998922
2000/10/28
LinkedIn
https://www.linkedin.com/in/phqlong/
ORCID
https://orcid.org/0000-0003-2205-8812
Profile

AI Engineer / Data Scientist / Software Engineer with a strong background in mathematics, machine learning, deep learning, and software development. Passionate about building innovative solutions and delivering tangible value to organizations.

Education
Work Experience
2024/05 – present, Ho Chi Minh City, Vietnam

AI Engineer

Spartan
  • Build character-based AI systems using LangChain, Transformers, and RAG pipelines to enable personalized, multi-turn conversations for wellness use cases.
  • Set up document loaders, embeddings (e.g. SentenceTransformers, OpenAI), and vector databases (e.g. Elasticsearch) to support semantic search and context injection.
  • Design and orchestrate modular LangChain chains with memory strategies (e.g. buffer, summary, entity memory) to ensure continuity across sessions.
  • Craft prompt templates tailored to health-specific tone, personalization, and accuracy, and implement guardrails or fallback logic to handle edge cases safely.
  • Analyze and improve LLM outputs by measuring hallucination frequency, grounding quality, and persona consistency across different interaction flows.
  • Tech stack: AWS, Python, FastAPI, LangChain, LangGraph, LlamaIndex, Elasticsearch, Qdrant, Redis, Transformers, Langfuse.
2023/01 – 2024/04, Taiwan (remote)

AI Data Engineer

GLSoft - A.S Watsons
  • Developed Recommendation System pipelines for global e-commerce platforms.
  • Researched and implemented state-of-the-art NLP solutions (Text classification, ABSA, Keyword Extraction, etc.).
  • Worked with LLMs (OpenAI models, BERT, DeBERTa, Llama, etc.) and RAG pipelines for chatbot development.
  • Optimized ML workflows and ETL pipelines for big data in an Azure Databricks environment.
  • Tech stack: Azure, Databricks, PySpark, Pandas, PyTorch, Transformers, LangChain, MLflow, MongoDB, REST API.
2022/05 – 2022/12, Ho Chi Minh City, Vietnam

Associate Data Scientist

ZingMP3 - Zalo Group (VNG Corporation)
  • Improved recommendation systems using collaborative and content-based filtering.
  • Performed social network analysis on artist graphs using Graph ML.
  • Reduced skip rate through model improvements validated via A/B testing.
  • Gained industrial exposure to ML/AI techniques and large-scale data environments.
2021/06 – 2021/09, Ho Chi Minh City, Vietnam

Odoo Developer - Intern

NOVOBI Vietnam
  • Gained an understanding of business procedures and the RMA (return merchandise authorization) process.
  • Developed an Odoo project to create and automate the business RMA process: built custom modules and extensions in Odoo and integrated them into existing systems.
Certificates
IELTS Academic

2024/12

Overall Band Score 7.0

ETS TOEIC Listening and Reading

2022/04

IIG Vietnam

Score: 900/990

Natural Language Processing Specialization

2021/10 - 2022/03

Coursera & DeepLearning.AI

Comprises four course certificates: NLP with Classification and Vector Spaces, NLP with Probabilistic Models, NLP with Sequence Models, and NLP with Attention Models.

Awards
2022/11

2nd Place - VLSP 2022 - Multilingual Visual Question Answering Challenge

Association for Vietnamese Language and Speech Processing (VLSP)
  • Developed a solution for multilingual Visual Question Answering (mVQA) using a joint vision-language model that combines mT5/XLM-R with ViT/Swin Transformers.
Publications
2023

"OhYeah at VLSP2022 Shared Task: A Jointly Language-Image Model for Multilingual Visual Question Answering"

International Workshop on Vietnamese Language and Speech Processing
Personal Projects
2023/02 – 2023/03

Deep SITS: Rice Yield Prediction on Satellite Imagery Time Series Data using Transformer

Top 10 finalist team in the EY Open Data Science Challenge 2023
  • Goal: predict rice yield in the Mekong Delta, Vietnam, from satellite image time series (SITS) of rice fields for each phenological season.
  • Researched and analyzed vegetation index data and imagery from Sentinel and Landsat.
  • Developed a model, ViTs for SITS, and proposed a new relative positional encoding scheme and a channel-independence method for multi-resolution time series.
2023/01

Meta Learning: Few-Shot Classification on Omniglot Dataset

Research project
  • Dataset: Omniglot, a large collection of handwritten characters from various alphabets.
  • Implemented MAML (Model-Agnostic Meta-Learning), a meta-learning algorithm, for few-shot image classification on Omniglot: given a small number of examples (k shots) from n classes (n ways), the model must generalize from them to classify new instances. MAML does this by learning an initialization of the model's parameters that can be adapted quickly to new tasks.
  • Used PyTorch Lightning to simplify the training code, Hydra for configuration management, WandB for logging, Higher for higher-order optimization, and Torchmeta for meta-learning datasets and benchmarks.
  • GitHub: https://github.com/phqlong/Few-Shot-Classification-on-Omniglot-Dataset
2022/10 – 2022/11

En-Vi-Ja Visual Question Answering - VLSP-2022

Second prize in the EVJVQA task at the VLSP 2022 competition
  • Team size: 3
  • Built a multilingual Visual Question Answering (mVQA) system that, given an image and a question about it, predicts the answer in the same language as the question, across three languages (English, Vietnamese, Japanese).
  • Experimented with multilingual models such as mT5, mBERT, and XLM-RoBERTa, using ViT as the main image feature extractor. Fine-tuned on the training set and evaluated with BLEU and F1 scores; also tried image augmentation and warm-starting the encoder-decoder.
2022/07 – 2022/11

ZingMP3 Music Artist Network

Role: Data science team member
  • Collected, processed, and explored (EDA) data from Google, Spotify, and ZingMP3 to gain insights for building co-listened artist data.
  • Built an artist network and researched graph neural networks (GNNs) for predicting related artists; evaluated the models with multiple metrics and tested them in production.
2021/08 – 2022/05

KeepItReal - Plagiarism Detection System

Bachelor Thesis & Applied Research
  • Developed a web-based system for detecting plagiarism in educational settings.
  • Designed the architecture and implemented NLP-based semantic matching using Sentence-BERT.
  • Built and deployed a full-stack app with semantic search, REST APIs, and a scalable backend.
  • Tech stack: PyTorch, Transformers, NLTK, Django, ReactJS, MongoDB, PostgreSQL, Docker, FAISS.
2021/01 – 2021/04

Speak your mind

  • Translated Vietnamese sign language into spoken words using vision and speech recognition.
  • Finalist in the 2021 Intellectual Property Rights Competition for students.