Natural Language Processing with Deep Learning
What is this course about?
Natural language processing (NLP), or computational linguistics, is one of the most important technologies of the information age. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertising, emails, customer service, language translation, virtual agents, medical reports, politics, etc. In the 2010s, deep learning (or neural network) approaches achieved very high performance across many different NLP tasks, using single end-to-end neural models that did not require traditional, task-specific feature engineering. In the 2020s, further dramatic progress came from scaling up Large Language Models (LLMs) such as ChatGPT. In this course, students will gain a thorough introduction to both the basics of Deep Learning for NLP and cutting-edge research on LLMs. Through lectures, assignments, and a final project, students will learn the necessary skills to design, implement, and understand their own neural network models, using the PyTorch framework.
“Take it. CS221 taught me algorithms. CS229 taught me math. CS224N taught me how to write machine learning models.” – A CS224N student on Carta
Previous offerings
Below you can find archived websites and student project reports from previous years. Disclaimer: assignments change from year to year; please do not do assignments from previous years!
CS224n Websites: Winter 2024 / Winter 2023 / Winter 2022 / Winter 2021 / Winter 2020 / Winter 2019 / Winter 2018 / Winter 2017 / Autumn 2015 / Autumn 2014 / Autumn 2013 / Autumn 2012 / Autumn 2011 / Winter 2011 / Spring 2010 / Spring 2009 / Spring 2008 / Spring 2007 / Spring 2006 / Spring 2005 / Spring 2004 / Spring 2003 / Spring 2002 / Spring 2000
CS224n Lecture Videos: Winter 2023 / Winter 2021 / Winter 2019 / Winter 2017
CS224n Reports: Winter 2024 / Winter 2023 / Winter 2022 / Winter 2021 / Winter 2020 / Winter 2019 / Winter 2018 / Winter 2017 / Autumn 2015 and earlier
CS224d Reports: Spring 2016 / Spring 2015 |
Prerequisites
- Proficiency in Python
All class assignments will be in Python (using NumPy and PyTorch). If you need to remind yourself of Python, or you're not very familiar with NumPy, you can come to the Python review session in week 1 (listed in the schedule). If you have a lot of programming experience, but in a different language (e.g. C/C++/MATLAB/Java/JavaScript), you will probably be fine.
- College Calculus, Linear Algebra (e.g. MATH 51, CME 100)
You should be comfortable taking (multivariable) derivatives and understanding matrix/vector notation and operations.
- Basic Probability and Statistics (e.g. CS 109 or equivalent)
You should know the basics of probability, Gaussian distributions, means, standard deviations, etc.
- Foundations of Machine Learning (e.g. CS221, CS229, CS230, or CS124)
We will be formulating cost functions, taking derivatives, and performing optimization with gradient descent (a small illustrative sketch appears after this list). If you already have basic machine learning and/or deep learning knowledge, the course will be easier; however, it is possible to take CS224n without it. There are many introductions to ML, in webpage, book, and video form. One approachable introduction is Hal Daumé's in-progress A Course in Machine Learning. Reading the first 5 chapters of that book would be good background. Knowing the first 7 chapters would be even better!
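To give a rough sense of the expected background, here is a minimal, illustrative sketch (not course material) of gradient descent on a mean-squared-error cost in NumPy. The toy data, parameters, and learning rate are invented for this example.

```python
import numpy as np

# Toy data for illustration only: y is roughly 3x + 2 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3 * x + 2 + 0.1 * rng.normal(size=100)

w, b = 0.0, 0.0   # parameters to learn
lr = 0.1          # learning rate (chosen arbitrarily)

for step in range(500):
    y_hat = w * x + b                      # model prediction
    grad_w = np.mean(2 * (y_hat - y) * x)  # d(cost)/dw for mean squared error
    grad_b = np.mean(2 * (y_hat - y))      # d(cost)/db
    w -= lr * grad_w                       # gradient descent update
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")     # should be close to 3 and 2
```

In PyTorch, automatic differentiation computes such gradients for you, so hand-derived updates like these mainly serve to build intuition about cost functions and optimization.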
Reference Texts
The following texts are useful, but none are required. All of them can be read for free online.
- Dan Jurafsky and James H. Martin. Speech and Language Processing (2024 pre-release)
- Jacob Eisenstein. Natural Language Processing
- Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning
- Delip Rao and Brian McMahan. Natural Language Processing with PyTorch (requires Stanford login).
- Lewis Tunstall, Leandro von Werra, and Thomas Wolf. Natural Language Processing with Transformers
If you have no background in neural networks but would like to take the course anyway, you may find one of these books helpful for additional background:
- Michael A. Nielsen. Neural Networks and Deep Learning
- Eugene Charniak. Introduction to Deep Learning