Caghan Köksal
Machine Learning Engineer

Upcoming MSc Informatics graduate from TUM, specializing in Computer Vision and NLP. I am passionate about bridging the gap between cutting-edge research and practical applications, consistently integrating state-of-the-art findings into real-world projects.

Professional Experience

Student Assistant @ GRIS-MEC Lab

Technische Universität Darmstadt, "One of the best German universities"

2024/04 – present, Remote, Germany
  • Working on surgical scene generation with diffusion models and scene graph conditioning.

    Machine Learning Thesis Student @ Research & Technology

    ZEISS, "World leading medical technology company"

    2023/07 – 2024/02, Munich, Germany
  • Developed an annotation-efficient method for surgical video workflow analysis.
  • Proposed an unsupervised video object segmentation method for surgical videos.
  • Created scene graph representations for surgical videos.

    Machine Learning Working Student @ Research & Technology

    ZEISS, "World leading medical technology company"

    2022/09 – 2023/07, Munich, Germany
  • Researched unsupervised object discovery with foundation models on surgical scenes.
  • Developed a weakly supervised instance segmentation model for surgical videos.
  • Improved surgical tool segmentation performance by ~2%.

    Research Assistant @ KUIS AI Lab

    Koç University, "Top research university of Turkey"

    2020/09 – 2021/10, Istanbul, Turkey
  • Researched applications of object detection, domain adaptation, and self-supervised learning on comic scenes.
  • Experimented with Swin Transformers, DINO, MoCo v2, and the Jigsaw Puzzle task in self-supervision research.
  • Explored image generation with Generative Adversarial Networks (GANs) in domain adaptation research.
  • Leveraged the MMDetection and Detectron2 frameworks for object detection pipelines.
  • Developed a self-labeling tool on top of LabelImg to ease the annotation process.

    NLP Working Student @ AI Team

    Akbank, "Most valuable banking brand of Turkey in 2018"

    2019/10 – 2020/06, Istanbul, Turkey
  • Developed data pipelines for efficient data crawling, text preprocessing, and tokenization, enhancing data processing speed and accuracy.
  • Improved performance of spellcheck and autocomplete services.
  • Constructed a knowledge graph with Neo4j to model relationships in the data and surface insights.

Education

    Master of Science, Informatics

    Technical University of Munich

    2.1/5.0 (German GPA)

    2021/04 – 2024/06, Munich, Germany

    Bachelor, Computer Science

    Sabancı University

    3.62/4.00 (USA GPA)

    2015/09 – 2020/06, Istanbul, Turkey

Skills

    Natural Language Processing

    Information Retrieval, Language Modeling, NER

    Python

    MERN

    Node, React, Express, MongoDB

    Computer Vision

    Video Segmentation, Scene Graph based Image Generation, Visual Question Answering

    C++

    Machine Learning

Languages

    English

    C1

    German

    B1

Publications

    SU-NLP at TREC NEWS 2020

    Ali Eren Ak, Çağhan Köksal, Kenan Fayoumi, and Reyyan Yeniterzi

    As a research group, we examined BERT summarization, Universal Sentence Encoder, and BERT fine-tuning approaches for the Background Linking and Wikification tasks.

    My contributions are:

  • Setting up the baseline result with Elasticsearch.
  • Experimenting with Universal Sentence Encoder embeddings in the Background Linking task.

Projects

    Image Generation and Manipulation with Diffusion Models and Scene Graph Representations.

  • Ongoing project.
  • Fine-tuned Stable Diffusion models with scene graph conditioning.

    Visual Question Answering on Medical Data

  • Worked on large-scale multimodal datasets, MIMIC-CXR and ROCO.
  • Used the state-of-the-art multimodal Flamingo architecture.
  • My contributions:
  • Implemented data pipelines for the VQA-RAD, ROCO, MIMIC-CXR, and ImageCLEF datasets.
  • Experimented with Vision Transformers and EfficientNet.
  • Integrated in-domain language models such as PubMedBERT.
  • Created training and evaluation pipelines of VQA and text generation tasks.
  • Modeled VQA as a multi-task learning problem, improving VQA performance by 2%.

    Graph Neural Networks on Abdominal Data Meshes

    Explored graph neural networks on abdominal data meshes.

  • Implemented and experimented with GCN, GraphSAGE, and FeaStNet models.
  • Created data, training, and evaluation pipelines.
  • Worked on gender prediction and on age, BMI, height, and weight regression tasks.

    Self-Supervised Learning on COMICS

    Explored vision-based self-supervision methods to assess their capabilities on highly stylized comic domains. The following methods were used in my experiments:

  • DINO (Vision Transformer and Swin Transformer), MoCo v2, and the Jigsaw task.
  • Attention map visualization and an input gradient-based method were used to understand the learned features.

    SSuperGAN: Self-Supervised Face Generation Using Panel Context

  • We proposed a face reconstruction task.
  • Given a sequence of comic panels, one character's face is reconstructed in the last frame using temporal information.
  • Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) were explored on the Golden Comic Dataset.
  • My contributions are:
  • DCGAN implementation and experiments.
  • MSG-GAN implementation and experiments.

    Self Labeling Tool based on LabelImg

  • Developed a self-labeling tool for object detection.
  • Powered by the Faster R-CNN model from the MMDetection framework.
  • Automatically generates bounding boxes and lets users update them.

    Developing a Vehicle Price Prediction Model Using Existing Features and Available Text

    Bachelor Thesis
  • A second-hand car price prediction system that uses the text of car ads and structured data such as mileage and brand.
  • The dataset was created by crawling the website of one of Turkey's biggest second-hand car sellers with Selenium and UiPath.
  • Word2Vec embeddings were used to examine the relationships between brands and features in the text and to correct typos.
  • Doc2Vec and CNN-based approaches were used to vectorize the text.
  • Linear Regression, Decision Tree, and Random Forest methods were used for the final regression task.
  • A Flask-based web application was created for the demo.