Generative Modelling for Natural Language Processing
Table of Contents
Syllabus
Session 1: Word Embeddings
Session 2: Language Models
Session 3: Attention in RNN and Transformer Language Models
Session 4: Chatbot Fine-Tuning with RLHF and DPO
Lecture and lab.
References:
- Reinforcement Learning from Human Feedback (Nathan Lambert, 2024)
- Reinforcement Learning: An Introduction (Richard S. Sutton and Andrew G. Barto, 2018)
- Learning to summarize from human feedback (Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F. Christiano, 2020)
- Simple statistical gradient-following algorithms for connectionist reinforcement learning (Ronald J. Williams, 1992)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn, 2023)
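
As a pointer to what the lab covers, here is a minimal sketch of the DPO objective from the Rafailov et al. (2023) paper listed above. The tensor names and the `beta` default are illustrative assumptions, not the lab's actual code; the loss operates on per-example sequence log-probabilities under the trained policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss (Rafailov et al., 2023), sketched for illustration.

    Each tensor holds per-example log-probabilities (summed over tokens)
    of the preferred (chosen) or dispreferred (rejected) response, under
    the trained policy or the frozen reference model. `beta` scales the
    implicit KL penalty toward the reference model.
    """
    # Implicit rewards: beta-scaled log-ratio between policy and reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry preference likelihood: minimizing -logsigmoid of the
    # margin pushes the chosen reward above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```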
Session 5: Deep Latent Variable Models for Texts
Exam
References
- Advances in Pre-Training Distributed Word Representations (Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, and Armand Joulin, 2018) pdf
- Neural Word Embedding as Implicit Matrix Factorization (Omer Levy and Yoav Goldberg, 2014) pdf
- Reinforcement Learning from Human Feedback (Nathan Lambert, 2024)
- Reinforcement Learning: An Introduction (Richard S. Sutton and Andrew G. Barto, 2018)
- Learning to summarize from human feedback (Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F. Christiano, 2020)
- Simple statistical gradient-following algorithms for connectionist reinforcement learning (Ronald J. Williams, 1992)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn, 2023)