
Daniel Plotkin
Welcome to my page! This is a collection of some of my personal open-source Data Science projects.
Projects

Advanced Image Captioning Transformer with Reinforcement Learning Optimization in PyTorch
In this project, I implement an advanced image captioning model using the CPTR architecture and optimize it with Self-Critical Sequence Training (SCST) in PyTorch. The CPTR architecture serves as my baseline model, and SCST, an improved variant of the popular REINFORCE algorithm, is applied to further optimize it. The model is trained on the Flickr8K dataset, which includes predefined train, validation, and test splits.
The project demonstrates improved captioning performance through SCST, with visual examples of generated captions on unseen images. The generated captions capture the main content of the images reasonably well.
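To make the SCST idea concrete, here is a minimal sketch of the self-critical loss for a single sampled caption. It is not taken from the project's code: the function name `scst_loss`, its arguments, and the example numbers are hypothetical, and the reward is assumed to be some caption-quality score (for example a metric such as CIDEr). The key point is that the reward of the model's own greedy-decoded caption acts as the baseline.

```python
import math


def scst_loss(sample_log_probs, sample_reward, greedy_reward):
    """Self-critical policy-gradient loss for one sampled caption (illustrative).

    sample_log_probs: log-probability of each token in the sampled caption.
    sample_reward:    score of the sampled caption (assumed caption metric).
    greedy_reward:    score of the greedy-decoded caption, used as the baseline.
    """
    # Advantage is positive when the sampled caption beats the greedy one.
    # In a PyTorch training loop this value would be treated as a constant.
    advantage = sample_reward - greedy_reward
    # Minimizing the loss increases the likelihood of captions with positive advantage.
    return -advantage * sum(sample_log_probs)


# Hypothetical numbers: a two-token sampled caption that scores above the baseline.
log_probs = [math.log(0.5), math.log(0.25)]
loss = scst_loss(log_probs, sample_reward=0.8, greedy_reward=0.6)
```

When the sampled caption scores the same as the greedy caption, the loss is zero and that sample contributes no gradient.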

Amino Acid Sequence Function Prediction
Predicting protein function from a sequence of amino acids is very important for understanding biology, enabling drug discovery, advancing biotechnology, personalizing medicine, analyzing genomic data, and elucidating protein structure-function relationships. In this project, using the dataset from the "CAFA 5 Protein Function Prediction" Kaggle competition, I will use TensorFlow and ProtT5 embeddings to predict the biological functions of a protein given its amino acid sequence.

Chat with your Documents Web App with LangChain & Streamlit
This open-source repository lets users create and deploy a Streamlit web app that uses OpenAI models not only to chat with users, but also to interact with very large uploaded documents, something ChatGPT lacks the capability to do. The project uses the LangChain framework to integrate OpenAI chat models and embeddings into a powerful chat-with-your-documents application. To get started or learn more about the project, navigate to the GitHub repo.

Implementing a Deep Convolutional GAN in PyTorch
This project implements a Deep Convolutional Generative Adversarial Network (DCGAN) based on the paper "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks" by Radford et al. The DCGAN incorporates deep convolutional neural networks (CNNs) in both the generator and discriminator. The goal is to generate realistic images from random noise vectors.

Building and Training Neural Networks with NumPy from Scratch
From data preprocessing to forward and backward propagation, I show the key components of neural networks, shedding light on activation functions, weight initialization, batch processing, and gradient descent optimization. I will be using the MNIST dataset that comes with Google Colab to implement a neural network that identifies handwritten digits from image inputs.

Identifying Diseases from Symptom Descriptions
In this project, I will be fine-tuning DistilBERT to accurately classify diseases based on a patient's symptoms. By creating a virtual doctor that anyone can access, regardless of location or financial status, we can increase access to healthcare and provide quicker medical advice.

Classifying Brain Cancer Tumors with PyTorch
In this project, I demonstrate how to classify brain tumors into 4 classes: gliomas, meningiomas, pituitary adenomas, and no tumor. Using the PyTorch framework, the model achieves 96.26% accuracy on new data.

Time Series Forecasting with LSTMs
In this project, I will be using Microsoft's stock data from Yahoo to create a time series forecast that predicts adjusted closing prices with a Long Short-Term Memory (LSTM) network. I will be implementing this with PyTorch.

Twitter Sentiment Analysis with Transformers
In this project, I will be fine-tuning a DistilBERT transformer model to predict the sentiment of tweets. Sentiment analysis helps extract insights from natural language, allowing businesses, researchers, and marketers to understand public sentiment towards products, brands, events, or social issues.
Educational Notebooks

Mastering Reinforcement Learning: Theory, Math, and Python
"Mastering Reinforcement Learning: Theory, Math, and Python" is a comprehensive self-designed set of Google Colab notebooks providing a deep understanding of reinforcement learning principles. Split into two parts, the first section gives learners the theoretical foundations, exploring concepts such as the mathematical framework behind reinforcement learning, Markov Decision Processes (MDP), Q-learning, and Deep-Q Learning. The second part aims to bridge theory with practical implementation by guiding readers Python implementation that bring these concepts to life in real-world applications. By the course's conclusion, students will possess a robust comprehension of reinforcement learning and the skills to implement these theories effectively in Python.

Fundamentals of Neural Networks: A Practical Introduction to Deep Learning
In this notebook, we discuss the difference between Deep Learning and Machine Learning, the architecture of an Artificial Neural Network (ANN), the algorithms used to train one, and some of the challenges in training deep learning models.
About
Hi there! My name is Daniel Plotkin, and I am currently pursuing an M.S. in Artificial Intelligence at Northwestern University.
My fascination lies in Deep Learning and AI research. I’m deeply passionate about contributing to the development and implementation of cutting-edge AI solutions to address some of humanity’s most complex challenges. Despite the controversies surrounding AI, I remain committed to leveraging this technology for the greater good. I firmly believe in its transformative potential and am eager to play a role in shaping its positive impact on the world.

Contact