Course website for BT5153
Slides
Notebook
RNN vs Autoregressive Models: Transformer
Awesome BERT & Transfer Learning in NLP
Why BERT Fails In Commercial Environments
When Recurrent Models Don’t Need to be Recurrent
The Time Series Transformer
Applying massive language models in the real world with Cohere
Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch