In this seminar, students will become familiar with the large language models (LLMs) that have recently revolutionized the field of natural language processing (NLP) and serve as the foundation for cutting-edge systems in a variety of tasks, including text generation, classification, question answering, translation, summarization, few-shot learning, code generation, and chatbots. Through intensive literature review, students will gain an understanding of the theory, modeling, evaluation, and systems aspects of transformer-based LLMs such as BERT, T5, and GPT. Students will also learn about the ethical and societal implications of large-scale language models, including their impact on privacy and bias. More generally, students will learn to research and evaluate scientific literature.

Basic knowledge of machine learning and NLP is required, although both topics will be touched upon during the introductory session.