Degree anticipated March 2025
Selected Courses (Grades out of 20):
Time Series Forecasting with a Backward Technique: Past, Present, Future (PaPrFu)
Dr. Hossein Peyvandi, Aren Golazizian, Pooriya Safaei, Maryam Asgarinejad
Abstract:
This study explores the application of a backward technique in time series forecasting, focusing on models such as LSTM and ARIMA as well as more recent approaches including Informer (a Transformer-based model for time series forecasting) and TSMixer (an all-MLP mixing architecture). Our goal is to improve the prediction accuracy of these models on time series forecasting tasks. The findings are scheduled for publication in early 2025.
My role:
Hybrid Sequence Modeling: Integrating Transformers and RNNs for Enhanced Prediction Accuracy
Hamidreza Hosseinkhani, Aren Golazizian, Amirreza Mehrzadian, Aida Khaleghi, Nima Atashin
Abstract:
This research focuses on developing an innovative hybrid model that combines the sequential processing capabilities of RNNs with the parallelization and global attention mechanisms of Transformers for advanced sequence modeling tasks. By integrating RNNs to handle short-term dependencies and Transformers to process long-term correlations, the model aims to improve prediction accuracy and computational efficiency in complex sequence datasets.
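A rough PyTorch sketch of the general idea (illustrative only, not the paper's actual architecture; the class name, layer sizes, and the choice of a GRU are assumptions):

import torch.nn as nn

class HybridSeqModel(nn.Module):
    # Hypothetical sketch: an RNN pass captures short-term, local structure,
    # then a Transformer encoder attends globally over the RNN outputs.
    def __init__(self, vocab_size=10000, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, x):                    # x: (batch, seq_len) token ids
        h, _ = self.rnn(self.embed(x))       # sequential pass, short-term dependencies
        z = self.encoder(h)                  # global attention, long-range correlations
        return self.head(z)                  # per-token predictions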
My role:
Named Entity Recognition with DistilBERT
Deep Learning Course, Dr. Beigy
Fine-tuned DistilBERT, a lightweight Transformer-based language model, for Named Entity Recognition (NER) on the CoNLL-2003 dataset. Implemented a custom preprocessing pipeline to tokenize text while aligning entity labels, and trained the model to classify tokens into entity categories such as PER, ORG, LOC, and MISC. DistilBERT's contextual embeddings were leveraged to capture semantic relationships in text, improving NER performance.
A combination of PyTorch-based training and Hugging Face's Trainer API was used to optimize performance and streamline evaluation. Achieved a validation accuracy of 94.94% after 3 epochs, with evaluation on the test set confirming the model's ability to identify named entities accurately. The fine-tuned model was saved and shared on the Hugging Face Hub for reproducibility and future use in real-world NLP applications.
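A minimal sketch of the label-alignment step described above, using the Hugging Face fast tokenizer (function and variable names are illustrative, not the project's exact code):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_and_align(words, word_labels):
    # Tokenize pre-split words; give each sub-word token the label of its word,
    # and mark special tokens and continuation pieces with -100 so the loss ignores them.
    enc = tokenizer(words, is_split_into_words=True, truncation=True)
    labels, prev = [], None
    for wid in enc.word_ids():
        if wid is None or wid == prev:
            labels.append(-100)
        else:
            labels.append(word_labels[wid])
        prev = wid
    enc["labels"] = labels
    return enc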
Conditional and Standard Generative Models: DDPMs vs. GANs
Deep Learning Course, Dr. Beigy
Implemented and analyzed Denoising Diffusion Probabilistic Models (DDPMs) and Generative Adversarial Networks (GANs), along with their conditional counterparts, on the FashionMNIST dataset to explore both general and class-specific image generation. The DDPM used a U-Net-based architecture with positional encodings to iteratively denoise random noise, achieving stable training and producing smooth, coherent images. The GAN relied on a Generator-Discriminator pair trained adversarially, occasionally capturing fine detail but struggling with artifacts and training instability. Extensions to Conditional GANs (cGANs) and Conditional DDPMs enabled class-specific generation, with Conditional DDPMs exhibiting superior stability and structural quality, while cGANs showed moderate improvements in diversity and output structure.
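A sketch of the DDPM noise-prediction objective used in this kind of setup (assuming a U-Net model(x_t, t) that predicts the added noise; the linear beta schedule and constants are illustrative):

import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def ddpm_loss(model, x0):
    # Sample a timestep, corrupt x0 with the closed-form forward process,
    # and train the network to predict the injected Gaussian noise.
    t = torch.randint(0, T, (x0.size(0),), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_bar.to(x0.device)[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
    return F.mse_loss(model(x_t, t), noise)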
Transformer from Scratch: Implementing "Attention Is All You Need"
Deep Learning Course, Dr. Beigy
Implemented a Transformer model from scratch in PyTorch based on the seminal paper "Attention Is All You Need". The project involved building key components, including input embeddings, positional encodings, multi-head attention, feedforward layers, and encoder-decoder blocks, fully integrated into a complete Transformer architecture. The model was trained on the WMT14 English-German translation dataset, utilizing techniques like causal masks, layer normalization, and residual connections to enable effective sequence-to-sequence learning. Training and validation losses were tracked over epochs, demonstrating the model's ability to learn complex language translation tasks. This implementation provided hands-on experience in constructing a state-of-the-art sequence processing model while deepening the understanding of the Transformer architecture.
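The core of such an implementation is scaled dot-product attention; a minimal standalone sketch (not the project's exact code):

import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # softmax(Q K^T / sqrt(d_k)) V, with an optional causal/padding mask
    # applied before the softmax, as in "Attention Is All You Need".
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v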
Persian Text Generation with RNN and Transformer Models
Generative Models Course, Dr. Fatemeh Seyyedsalehi
Built and trained RNN and Transformer models from scratch for Persian text generation using a preprocessed Persian Wikipedia dataset. Implemented a Byte-Pair Encoding (BPE) tokenizer to handle Persian text morphology. The RNN achieved a low perplexity of 2.07, demonstrating strong predictive performance, while the Transformer, despite its higher capacity, struggled with generalization, yielding a perplexity of 444.68. Evaluated model performance through text generation, with the RNN producing repetitive but coherent outputs and the Transformer generating more diverse but less semantically consistent text.
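One convenient way to train such a BPE tokenizer is the Hugging Face tokenizers library; a minimal sketch (the corpus file name, vocabulary size, and special tokens are placeholders, not the project's settings or implementation):

from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.BpeTrainer(vocab_size=30000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(files=["fa_wiki.txt"], trainer=trainer)    # hypothetical corpus file
print(tokenizer.encode("این یک نمونه است").tokens)          # sub-word tokens for a sample sentence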
Text Classification with RNN and LSTM
Deep Learning Course, Dr. Beigy
For this project, I implemented and compared RNN and LSTM models for sentiment analysis on the IMDB movie reviews dataset. The models were trained to classify reviews as positive or negative using preprocessed data with tokenization and padding. The LSTM models significantly outperformed the RNNs (90% and 92% accuracy versus 58% and 64%), and I analyzed the reasons behind this disparity: LSTMs excel at handling long-term dependencies and mitigating the vanishing gradient problem. Additionally, I explored the impact of increasing sequence lengths, noting that while LSTMs leveraged the extended context effectively, RNNs struggled with longer sequences. This project emphasized the importance of model architecture in handling sequential data efficiently.
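An illustrative PyTorch sketch of the LSTM classifier variant (layer and vocabulary sizes are placeholders, not the project's exact configuration):

import torch.nn as nn

class LSTMClassifier(nn.Module):
    # Embed tokens, run an LSTM over the padded sequence, and classify the
    # review from the final hidden state (positive vs. negative).
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 2)

    def forward(self, x):                    # x: (batch, seq_len) token ids
        _, (h_n, _) = self.lstm(self.embed(x))
        return self.fc(h_n[-1])              # logits over {negative, positive}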
Ultrasound Image Segmentation with U-Net
Deep Learning Course, Dr. Beigy
In this project, a U-Net model was implemented for ultrasound image segmentation, focusing on identifying specific regions within medical imagery. The dataset of ultrasound images and their corresponding annotations was preprocessed to ensure compatibility with the model, including padding and resizing operations. The U-Net architecture was constructed with an encoder-decoder structure and skip connections to capture multi-scale features effectively. The model was trained using Binary Cross-Entropy Loss over 10 epochs, achieving consistent reductions in training and validation losses. The final validation loss of 0.2008 highlighted the model's capability to segment medical regions accurately, with predictions visualized to demonstrate its performance on test data.
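A compact sketch of the encoder-decoder-with-skip-connection pattern (a single-level toy U-Net with placeholder channel sizes, not the project's full architecture):

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU, the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = conv_block(1, 32)
        self.pool = nn.MaxPool2d(2)
        self.mid = conv_block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = conv_block(64, 32)        # 64 = 32 (skip) + 32 (upsampled)
        self.out = nn.Conv2d(32, 1, 1)       # per-pixel logit (use BCEWithLogitsLoss)

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.pool(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))   # skip connection
        return self.out(d)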
Traffic Prediction Using GRU
Data Science and Regression Analysis Course, Dr. Alishahi
Developed a Gated Recurrent Unit (GRU) model to predict traffic conditions on Tehran's four busiest roadways. The project aimed to reduce congestion and optimize transportation systems by leveraging machine learning to analyze complex traffic data patterns.
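An illustrative sketch of such a GRU forecaster (the feature count, windowing, and sizes are assumptions, not the project's actual setup):

import torch.nn as nn

class TrafficGRU(nn.Module):
    # Read a window of past traffic readings for several roadways and
    # predict the next reading from the final hidden state.
    def __init__(self, n_features=4, hidden_dim=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_features)

    def forward(self, x):                    # x: (batch, window, n_features)
        _, h_n = self.gru(x)
        return self.head(h_n[-1])            # next-step prediction per roadway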
PixelCNN: Autoregressive Image Generation
Generative Models Course, Dr. Fatemeh Seyyedsalehi
Built a PixelCNN model from scratch to generate images pixel-by-pixel using masked convolutions. Trained the model on MNIST, achieving low training and testing loss, and generated high-quality samples by modeling pixel dependencies. Designed masked convolutional layers and optimized the model using binary cross-entropy loss.
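A minimal sketch of the masked convolution at the heart of PixelCNN (the standard type-A/type-B masking; exact layer settings are placeholders):

import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    # Type 'A' masks also hide the centre pixel (first layer); type 'B' masks
    # allow it (later layers). Each output pixel then depends only on pixels
    # above it and to its left, enforcing the autoregressive ordering.
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ("A", "B")
        mask = torch.ones_like(self.weight)
        _, _, kh, kw = self.weight.shape
        mask[:, :, kh // 2, kw // 2 + (mask_type == "B"):] = 0
        mask[:, :, kh // 2 + 1:, :] = 0
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask
        return super().forward(x)

# e.g. MaskedConv2d("A", 1, 64, kernel_size=7, padding=3) as a first layer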
Deep Q-Network (DQN) on CartPole-v1
Deep Learning Course, Dr. Beigy
Implemented a Deep Q-Network (DQN) to solve the CartPole-v1 environment using the PyTorch framework, incorporating key components like a policy network, target network, and experience replay buffer to enhance training stability. The policy network was designed to approximate Q-values, while the target network provided stable Q-value targets by being periodically updated. The implementation featured an ε-greedy policy for the exploration-exploitation trade-off and the Bellman equation for Q-value updates. The agent was trained over multiple episodes, successfully balancing the pole by optimizing cumulative rewards. Additionally, training progress was visualized through score trends and moving averages, demonstrating the effectiveness of the DQN approach for reinforcement learning tasks.
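A minimal sketch of the Bellman update on a replay-buffer batch (the function signature and the Huber loss are assumptions; the project's details may differ):

import torch
import torch.nn.functional as F

def dqn_update(policy_net, target_net, optimizer, batch, gamma=0.99):
    # Regress the policy network's Q-value for the taken action toward
    # r + gamma * max_a' Q_target(s', a'), using the frozen target network.
    states, actions, rewards, next_states, dones = batch
    q = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        target = rewards + gamma * q_next * (1 - dones)
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()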
Programming: Python, R, C++, MATLAB, SQL, Neo4j, Linux, Git
Machine Learning: pandas, NumPy, scikit-learn, PyTorch, TensorFlow, Matplotlib, Plotly, Adam optimizer
Languages: English, Armenian, Persian
Additional skills: LaTeX, Microsoft Office (Word, Excel, PowerPoint)
Machine Learning
Instructor: Dr. Ali Sharifi-Zarchi
Machine Learning Theory
Instructor: Dr. Amir Najafi
Stochastic Processes
Instructor: Dr. Hossein Peyvandi
Deep Learning
Instructor: Dr. Fatemeh Seyyedsalehi