Byte Sized ML

Fine-Tuning ChemBERTa for Chemical Interaction Prediction

Machine Learning

ChemBERTa

NLP

Drug Discovery

In a recent hackathon, the Virtual Cell Program Initiative (VCPI) prediction contest hosted by Ginkgo Bioworks, I worked on predicting gene expression directly from chemical…

Counting Number of Parameters in GPT2

Attention

Transformers

GPT2

Math

Building Blocks of Transformers: 3. Extending single-head attention to multi-head attention

Multihead-Attention

The attention module is typically extended into multiple attention modules sometimes referred to as attention heads. Multiple attention heads operate independently allowing…

Building Blocks of Transformers: 2. Position Representation

Attention

Transformers

Machine Learning

1. Positional Encoding Given the sequence He can always be seen working hard. This sequence is distinctly different from He can hardly be seen working. As can be seen slight…

Building Blocks of Transformers: 1. Self Attention

Attention

Transformers

Machine Learning

I had a garbled understanding of Transformer architecture after consulting blogs, videos and coursera course. What made it finally clicked for me is Stanford CS 224N…