Education

Degree anticipated March 2025

Selected Courses (Grades out of 20):

  • Artificial Intelligence: 17.0
  • Machine Learning: 19.8
  • Data Science: 17.2
  • Deep Learning: 17.5
  • Generative Models: 16.9
Research Experience

    Time Series Forecasting with a Backward Technique: Past, Present, Future (PaPrFu)

    Dr. Hossein Peyvandi, Aren Golazizian, Pooriya Safaei, Maryam Asgarinejad

    Abstract:

    This study explores the application of a backward technique in time series forecasting, focusing on models such as LSTM and ARIMA as well as more recent approaches including Informer (a transformer-based model for time series forecasting) and TSMixer (a model based on mixing MLPs). Our goal is to improve the prediction accuracy of these models. The findings are scheduled for publication in early 2025.

    My role:

  • Focused on implementing the transformer-based Informer model, integrating the B&F technique into its architecture.
  • Conducted experiments to evaluate the Informer's performance, comparing it with classical and hybrid models under the B&F framework.

    Hybrid Sequence Modeling: Integrating Transformers and RNNs for Enhanced Prediction Accuracy

    Hamidreza Hosseinkhani, Aren Golazizian, Amirreza Mehrzadian, Aida Khaleghi, Nima Atashin

    Abstract:

    This research focuses on developing an innovative hybrid model that combines the sequential processing capabilities of RNNs with the parallelization and global attention mechanisms of Transformers for advanced sequence modeling tasks. By integrating RNNs to handle short-term dependencies and Transformers to process long-term correlations, the model aims to improve prediction accuracy and computational efficiency in complex sequence datasets.

    My role:

  • Designed and implemented the architecture for integrating Transformers into the hybrid framework, ensuring compatibility with the RNN module for seamless short- and long-term dependency modeling.
  • Collaborated with the team to analyze results and iteratively improve the Transformer’s integration into the hybrid architecture.
Technical Experience

    Named Entity Recognition with DistilBERT

    Deep Learning Course, Dr. Beigy

    Fine-tuned DistilBERT, a lightweight transformer-based language model, for the Named Entity Recognition (NER) task on the CoNLL-2003 dataset. Implemented a custom data preprocessing pipeline to tokenize text while aligning entity labels, and trained the model to classify tokens into entity categories like PER, ORG, LOC, and MISC. The DistilBERT model's contextual embeddings were leveraged to capture intricate semantic relationships in text, enhancing NER performance.

    Training combined PyTorch with Hugging Face's Trainer API to streamline optimization and model evaluation. Achieved a validation accuracy of 94.94% after 3 epochs, and further evaluated the model on the test dataset to confirm that it identifies named entities accurately. The fine-tuned model was saved and shared on Hugging Face for reproducibility and future use in real-world NLP applications.
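
    The label-alignment step of such a preprocessing pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it assumes each subword token carries the index of the word it came from (as the `word_ids()` method of Hugging Face fast tokenizers provides), labels only the first subword of each word, and masks everything else with -100, the index that PyTorch's cross-entropy loss ignores by default. The function name `align_labels` and the sample data are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def align_labels(word_ids, word_labels):
    """Map word-level NER labels onto subword tokens.

    word_ids: one entry per token; None for special tokens such as [CLS],
              otherwise the index of the word the token belongs to.
    word_labels: one integer label per original word.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special token: excluded from the loss.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a word: carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Continuation subword: excluded from the loss (an assumption;
            # repeating the label is another common choice).
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical two-word phrase whose first word splits into two subwords.
word_ids = [None, 0, 0, 1, None]
word_labels = [3, 4]  # integer ids from some label map, e.g. B-ORG, I-ORG
print(align_labels(word_ids, word_labels))  # [-100, 3, -100, 4, -100]
```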

    Conditional and Standard Generative Models: DDPMs vs. GANs

    Deep Learning Course, Dr. Beigy

    Implemented and analyzed Denoising Diffusion Probabilistic Models (DDPMs) and Generative Adversarial Networks (GANs), along with their conditional counterparts, on the FashionMNIST dataset to explore both general and class-specific image generation. The DDPM used a UNet-based architecture with positional encodings to iteratively denoise random noise, achieving stable training and producing smooth, coherent images. The GAN model relied on a Generator-Discriminator pair trained adversarially, demonstrating occasional detail but struggling with artifacts and training instability. Extensions to Conditional GANs (cGANs) and Conditional DDPMs enabled class-specific generation, with Conditional DDPMs exhibiting superior stability and structural quality, while cGANs showed moderate improvements in diversity and output structure.

    Transformer from Scratch: Implementing "Attention Is All You Need"

    Deep Learning Course, Dr. Beigy

    Implemented a Transformer model from scratch in PyTorch based on the paper "Attention Is All You Need". The project involved building the key components (input embeddings, positional encodings, multi-head attention, feedforward layers, and encoder-decoder blocks) and integrating them into a complete Transformer architecture. The model was trained on the WMT14 English-German translation dataset, using causal masks, layer normalization, and residual connections to enable effective sequence-to-sequence learning. Training and validation losses were tracked over epochs and showed that the model was learning the translation task. This implementation provided hands-on experience in constructing a sequence-to-sequence model and deepened my understanding of the Transformer architecture.
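
    As an illustration of the causal masking mentioned above, here is a minimal sketch in plain Python rather than the PyTorch code used in the project. It assumes the usual convention that position i may attend only to positions 0..i; the helper names `causal_mask` and `masked_softmax` are hypothetical.

```python
import math


def causal_mask(size):
    """True where query position `row` may attend to key position `col`."""
    return [[col <= row for col in range(size)] for row in range(size)]


def masked_softmax(scores, mask):
    """Row-wise softmax in which masked-out entries receive zero weight."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        # Hidden positions get -inf so that exp() maps them to exactly 0.
        visible = [s if keep else -math.inf
                   for s, keep in zip(score_row, mask_row)]
        peak = max(visible)  # finite: the diagonal is always visible
        exps = [math.exp(v - peak) for v in visible]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# With uniform scores, each position spreads its attention evenly over
# itself and the positions before it, and gives none to later positions.
scores = [[0.0] * 3 for _ in range(3)]
for row in masked_softmax(scores, causal_mask(3)):
    print(row)
```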

    Persian Text Generation with RNN and Transformer Models

    Generative Models Course, Dr. Fatemeh Seyyedsalehi

    Built and trained RNN and Transformer models from scratch for Persian text generation using a preprocessed Persian Wikipedia dataset. Implemented a Byte-Pair Encoding (BPE) tokenizer to handle Persian text morphology. The RNN achieved a low perplexity of 2.07, demonstrating strong predictive performance, while the Transformer, despite its higher capacity, struggled with generalization, yielding a perplexity of 444.68. Evaluated model performance through text generation, with the RNN producing repetitive but coherent outputs and the Transformer generating more diverse but less semantically consistent text.
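
    For reference, perplexity figures like those above are conventionally computed as the exponential of the average per-token negative log-likelihood. The sketch below assumes natural-log token probabilities; the function name `perplexity` is hypothetical and the input values are illustrative, not taken from the project.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustrative only: a model that assigns probability 0.5 to every token
# is as uncertain as a fair coin flip per token, i.e. perplexity 2.
print(perplexity([math.log(0.5)] * 4))
```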

    Text Classification with RNN and LSTM

    Deep Learning Course, Dr. Beigy

    For this project, I implemented and compared RNN and LSTM models for sentiment analysis on the IMDB movie reviews dataset. The models were trained to classify reviews as positive or negative using preprocessed data with tokenization and padding. The LSTM models significantly outperformed the RNNs (90% and 92% accuracy compared to 58% and 64%), and I analyzed the reasons behind this disparity: LSTMs handle long-term dependencies better and mitigate the vanishing gradient problem. I also explored the impact of increasing sequence lengths, noting that LSTMs made effective use of the extended context while RNNs struggled with longer sequences. This project emphasized the importance of model architecture in handling sequential data.

    Ultrasound Image Segmentation with U-Net

    Deep Learning Course, Dr. Beigy

    In this project, a U-Net model was implemented for ultrasound image segmentation, focusing on identifying specific regions within medical imagery. The dataset of ultrasound images and their corresponding annotations was preprocessed to ensure compatibility with the model, including padding and resizing operations. The U-Net architecture was constructed with an encoder-decoder structure and skip connections to capture multi-scale features effectively. The model was trained using Binary Cross-Entropy Loss over 10 epochs, achieving consistent reductions in training and validation losses. The final validation loss of 0.2008 highlighted the model’s capability to segment medical regions accurately, with predictions visualized to demonstrate its performance on test data.

    Traffic Prediction Using GRU

    Data Science and Regression Analysis Course, Dr. Alishahi

    Developed a Gated Recurrent Unit (GRU) model to predict traffic conditions on Tehran's top four busiest roadways. The project aimed to reduce congestion and optimize transportation systems by leveraging machine learning to analyze complex traffic data patterns.

    PixelCNN: Autoregressive Image Generation

    Generative Models Course, Dr. Fatemeh Seyyedsalehi

    Built a PixelCNN model from scratch to generate images pixel-by-pixel using masked convolutions. Trained the model on MNIST, achieving low training and testing loss, and generated high-quality samples by modeling pixel dependencies. Designed masked convolutional layers and optimized the model using binary cross-entropy loss.
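
    The masking idea behind such layers can be sketched as follows. This assumes the standard type "A"/"B" scheme from the original PixelCNN formulation for a single channel, which the project may or may not have followed exactly; the function name `pixelcnn_mask` is hypothetical. In practice the mask is multiplied element-wise with the convolution kernel before each forward pass.

```python
def pixelcnn_mask(kernel_size, mask_type):
    """Build a kernel_size x kernel_size 0/1 mask for a masked convolution.

    A pixel may only see pixels above it and pixels to its left in the
    same row. Type "A" (first layer) also hides the center pixel itself;
    type "B" (later layers) keeps it.
    """
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    center = kernel_size // 2
    mask = [[0] * kernel_size for _ in range(kernel_size)]
    for row in range(kernel_size):
        for col in range(kernel_size):
            above = row < center
            left_in_row = row == center and col < center
            is_center = row == center and col == center
            if above or left_in_row or (is_center and mask_type == "B"):
                mask[row][col] = 1
    return mask


for mask_row in pixelcnn_mask(3, "A"):
    print(mask_row)
# [1, 1, 1]
# [1, 0, 0]
# [0, 0, 0]
```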

    Deep Q-Network (DQN) on CartPole-v1

    Deep Learning Course, Dr. Beigy

    Implemented a Deep Q-Network (DQN) to solve the CartPole-v1 environment using the PyTorch framework, incorporating key components like a policy network, target network, and experience replay buffer to enhance training stability. The policy network was designed to approximate Q-values, while the target network provided stable Q-value targets by being periodically updated. The implementation featured an ε-greedy policy for exploration-exploitation trade-off and the Bellman equation for Q-value updates. The agent was trained over multiple episodes, successfully balancing the pole by optimizing cumulative rewards. Additionally, training progress was visualized through score trends and moving averages, demonstrating the effectiveness of the DQN approach for reinforcement learning tasks.
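
    Two of the components described above, ε-greedy action selection and the Bellman target, can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the project's PyTorch code: the function names, the discount factor of 0.99, and the sample Q-values are all hypothetical, and `next_q_values` stands for the target network's output for the next state.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda action: q_values[action])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Bellman target r + gamma * max_a' Q_target(s', a').

    Terminal transitions have no future return, so the target is just r.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical Q-values for CartPole's two actions (push left, push right).
print(epsilon_greedy([0.1, 0.9], epsilon=0.0))       # greedy choice: 1
print(td_target(1.0, [0.5, 2.0], done=False, gamma=0.9))
print(td_target(1.0, [0.5, 2.0], done=True))         # 1.0
```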

Technical Strengths

    Programming: Python, R, C++, MATLAB, SQL, Neo4j, Linux, Git

    Machine Learning: Pandas, NumPy, scikit-learn, PyTorch, TensorFlow, Matplotlib, Plotly, Adam optimizer

    Languages: English, Armenian, Persian

    Additional skills: LaTeX, Microsoft Office applications (Word, Excel, PowerPoint)

References
    Hossein Peyvandi, Professor, Dept. of Computer Engineering, Sharif University of Technology
    [email protected], +98 912 139 7361

Teaching Assistant

    Machine Learning

    Instructor: Dr. Ali Sharifi-Zarchi

    Machine Learning Theory

    Instructor: Dr. Amir Najafi

    Stochastic Processes

    Instructor: Dr. Hossein Peyvandi

    Deep Learning

    Instructor: Dr. Fatemeh Seyyedsalehi