Machine learning for NLP (2018)

Any questions? Contact me at aurelie DOT herbelot AT unitn DOT it.


The course introduces core Machine Learning algorithms for computational linguistics (CL). Its goals are to (1) provide students with an overview of core Machine Learning techniques widely used in CL; (2) explain in which contexts and for which applications each technique is suitable; (3) present the experimental pipeline needed to apply a technique to a particular problem, including data collection and the choice of evaluation method; (4) give students some practice in running Machine Learning software and interpreting its output. The syllabus covers Machine Learning methods from both a theoretical and a practical point of view, and aims to give students the tools to read the relevant scientific literature with a critical mind.

At the end of the course, students will: (1) demonstrate knowledge of the principles of core Machine Learning techniques; (2) be able to read and understand CL literature using the introduced techniques, and critically assess their use in research and applications; (3) have the fundamental computational skills needed to run existing Machine Learning software and interpret its output.


There are no prerequisites for this course. Students with no computational background will acquire good intuitions for a range of Machine Learning techniques, as well as basic practical skills to interact with existing software. Students with a good mathematical and computational background (including programming and familiarity with the Unix command line) will be invited to gain a deeper understanding of each introduced algorithm, and to try out their own modifications of the course software.


The course will introduce ten topics, including a general introduction in the first week. Each topic will be taught over three sessions: 1) a lecture explaining the theory behind a technique/algorithm; 2) a lecture discussing a scientific paper where the technique is put to work, demonstrating the experimental pipeline around the algorithm; 3) a hands-on session where students will have a chance to run some software and familiarise themselves with the practical implementation of the method.

Course schedule:

March 15/19/20: Introduction

Lecture 1: General introduction. What is ML? What is it for? What has it got to do with AI?
Lecture 2: Basic principles of statistical NLP: Naive Bayes, Maximum Likelihood, Precision/Recall.
Practical: Set up computers and/or access to the development server for experimentation. Run a simple authorship attribution algorithm using Naive Bayes.
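
For a preview of the practical, here is a minimal sketch of Naive Bayes authorship attribution using scikit-learn (an assumption on my part: the course software may be set up differently, and the texts below are invented). The classifier combines a class prior with per-class word likelihoods estimated from counts:

    # Toy Naive Bayes authorship attribution with scikit-learn.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    texts = ["the sea was calm that morning",
             "waves crashed on the dark shore",
             "the model maximises the likelihood",
             "we estimate parameters from counts"]
    authors = ["A", "A", "B", "B"]

    vectoriser = CountVectorizer()            # bag-of-words counts
    X = vectoriser.fit_transform(texts)
    clf = MultinomialNB().fit(X, authors)     # P(author) * prod_w P(w|author)

    print(clf.predict(vectoriser.transform(["the waves were calm"])))  # -> ['A']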

March 22/26/27: Data preparation techniques

Lecture 1: How to choose your data. Annotation, with a focus on inter-annotator agreement metrics.
Lecture 2: a) Shekhar et al (2017) and b) Herbelot & Vecchi (2016). The former shows what good/bad data is and how it affects a task; the latter illustrates issues in annotation.
Practical: Hands-on intro to crowdsourcing. Perform some annotation and calculate your inter-annotator agreement.
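
As a taster for the agreement metrics from Lecture 1, here is a small from-scratch sketch of Cohen's kappa for two annotators (the label sequences are invented for illustration):

    from collections import Counter

    def cohens_kappa(ann1, ann2):
        # Kappa = (observed agreement - chance agreement) / (1 - chance agreement).
        n = len(ann1)
        observed = sum(a == b for a, b in zip(ann1, ann2)) / n
        c1, c2 = Counter(ann1), Counter(ann2)
        # Chance agreement: both annotators independently pick the same label.
        expected = sum((c1[l] / n) * (c2[l] / n) for l in set(ann1) | set(ann2))
        return (observed - expected) / (1 - expected)

    print(cohens_kappa(["y", "y", "n", "y", "n"],
                       ["y", "n", "n", "y", "n"]))   # ~0.615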

March 29, April 3/5: Supervised learning

Lecture 1: Introduction to regression techniques.
Lecture 2: Padó et al (2016) on compositional distributional semantics for morphology. A big regression analysis of morphological composition for different patterns!
Practical: Playing with linear regression in Python: http://www.dataschool.io/linear-regression-in-python/.
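
Before the tutorial, a minimal linear regression example with scikit-learn may help fix the idea: fit a line to (x, y) pairs, then predict an unseen x (the data below is invented):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one feature per sample
    y = np.array([2.1, 3.9, 6.2, 7.8])           # roughly y = 2x

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)         # learned slope and intercept
    print(model.predict(np.array([[5.0]])))      # extrapolate to x = 5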

April 9/10/12: Unsupervised learning

Lecture 1: Clustering and dimensionality reduction.
Lecture 2: Latent Semantic Analysis, Landauer & Dumais (1997).
Practical: Document clustering for information retrieval, playing with the PeARS search engine.
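
To connect the two lectures with the practical, here is a sketch of LSA-style dimensionality reduction followed by k-means document clustering, using scikit-learn on an invented toy corpus (PeARS itself works differently):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.cluster import KMeans

    docs = ["cats chase mice", "dogs chase cats",
            "stocks fell sharply", "markets rallied today"]

    X = TfidfVectorizer().fit_transform(docs)               # term-document matrix
    X_lsa = TruncatedSVD(n_components=2).fit_transform(X)   # latent dimensions
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_lsa)
    print(labels)   # e.g. [0 0 1 1]: animal documents vs. finance documents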

April 16/17/19: Support Vector Machines

Lecture 1: SVM principles.
Lecture 2: Herbelot & Kochmar (2016) on semantic error detection.
Practical: Run a simple LibSVM script.
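
If you cannot run the course script, scikit-learn's SVC wraps LibSVM and gives the same flavour (the dataset and hyperparameters below are chosen purely for illustration):

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = datasets.load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)   # RBF kernel, soft margin C
    print(clf.score(X_te, y_te))                     # held-out accuracy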

April 23/24/26: Neural Networks: introduction

Lecture 1: Basics of NNs and general AI concepts. Short intro to various NN frameworks (Theano, TensorFlow, Torch…).
Lecture 2: Marblestone et al (2016) - what NNs really have to do with neuroscience.
Practical: Follow tutorial on implementing an NN from scratch: http://www.wildml.com/2015/09/implementing-a-neural-network-from-scratch/.
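
In the spirit of that tutorial, here is a toy feed-forward network trained from scratch with numpy on XOR (the layer sizes, learning rate and iteration count are illustrative, not the tutorial's exact values):

    import numpy as np

    rng = np.random.RandomState(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)    # XOR targets

    W1, b1 = rng.randn(2, 4), np.zeros(4)              # input -> hidden
    W2, b2 = rng.randn(4, 1), np.zeros(1)              # hidden -> output
    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    for _ in range(10000):
        h = sigmoid(X @ W1 + b1)                       # forward pass
        out = sigmoid(h @ W2 + b2)
        # Backpropagate the squared error through both sigmoid layers.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= 0.5 * h.T @ d_out                        # gradient descent steps
        b2 -= 0.5 * d_out.sum(axis=0)
        W1 -= 0.5 * X.T @ d_h
        b1 -= 0.5 * d_h.sum(axis=0)

    print(out.round(2))   # typically approaches [0, 1, 1, 0], depending on the seed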

May 7/8/10: RNNs and LSTMs

Lecture 1: Sequence learning with RNNs and LSTMs.
Lecture 2: Sutskever et al (2011) on generating text with RNNs.
Practical: Follow tutorial on implementing RNNs in Theano: http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/.
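
As a preview of the tutorial, here is a bare-bones RNN forward pass in plain numpy, showing how the hidden state carries information across time steps (sizes and inputs are invented; training by backpropagation through time is the tutorial's subject):

    import numpy as np

    vocab, hidden = 8, 16                        # illustrative sizes
    rng = np.random.RandomState(0)
    U = rng.randn(hidden, vocab) * 0.01          # input -> hidden
    W = rng.randn(hidden, hidden) * 0.01         # hidden -> hidden (recurrence)
    V = rng.randn(vocab, hidden) * 0.01          # hidden -> output

    def forward(token_ids):
        h = np.zeros(hidden)
        outputs = []
        for t in token_ids:
            x = np.zeros(vocab)
            x[t] = 1.0                           # one-hot input token
            h = np.tanh(U @ x + W @ h)           # new state depends on old state
            logits = V @ h
            outputs.append(np.exp(logits) / np.exp(logits).sum())  # softmax
        return outputs

    probs = forward([0, 3, 5, 1])                # next-token distribution per step
    print(probs[-1].round(3))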

May 14/15/17: Reinforcement learning

Lecture 1: Principles of RL.
Lecture 2: Lazaridou et al (2017) on multi-agent emergence of natural language.
Practical: Solving the OpenAI Gym ‘frozen lake’ puzzle in TensorFlow: https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0.
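
The core of that tutorial is tabular Q-learning, which can be sketched without TensorFlow at all. The version below uses OpenAI Gym's 2018-era API (hyperparameters are illustrative; newer Gym versions return different tuples from step()):

    import numpy as np
    import gym

    env = gym.make("FrozenLake-v0")
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    lr, gamma = 0.8, 0.95

    for episode in range(2000):
        s = env.reset()
        done = False
        while not done:
            # Greedy action with decaying noise for exploration.
            a = np.argmax(Q[s] + np.random.randn(env.action_space.n) / (episode + 1))
            s2, reward, done, _ = env.step(a)
            # Bellman update: move Q(s,a) toward reward + discounted best next value.
            Q[s, a] += lr * (reward + gamma * Q[s2].max() - Q[s, a])
            s = s2

    print(Q.round(2))   # learned state-action values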

May 21/22/24: ML and small data

Lecture 1: Dealing with little or no data: issues and solutions, with examples from low-resource languages.
Lecture 2: Herbelot & Baroni (2017) on building vectors from tiny data. (Opportunity to review Word2Vec.)
Practical: Play with the Herbelot & Baroni implementation. Use it on new ‘tiny’ data (one sentence only!).
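
For intuition, a simple additive approach can be sketched in a few lines: estimate a vector for an unseen word by summing the background vectors of its context words (the vectors and the example sentence below are invented; the actual system refines an initial estimate with modified Word2Vec training):

    import numpy as np

    # Hypothetical pre-trained background vectors (random, for illustration only).
    rng = np.random.RandomState(0)
    background = {w: rng.randn(5) for w in
                  ["a", "is", "small", "furry", "animal", "that", "climbs", "trees"]}

    def nonce_vector(context_words):
        # Additive estimate: sum known context vectors, skipping unknown words.
        vecs = [background[w] for w in context_words if w in background]
        return np.sum(vecs, axis=0)

    sentence = "a wampimuk is a small furry animal that climbs trees".split()
    v = nonce_vector([w for w in sentence if w != "wampimuk"])
    print(v.round(2))   # the new word's vector, built from one sentence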

May 28/29/30: The ethics of machine learning

Lecture 1: Ethical issues with ML. Bias in distributional vectors. Reflections on data choices.
Lecture 2: Herbelot et al (2012) on race and gender; Bolukbasi et al (2016) on debiasing vectors.
Practical: Experiment with visualisations of word embeddings to find biases.
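
One simple probe, in the spirit of Bolukbasi et al, is to project words onto a gender direction built from a pair like ‘he’/‘she’ (the 3-d vectors below are hand-made so the effect is visible; real probes use pre-trained embeddings):

    import numpy as np

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    vec = {"he":       np.array([ 1.0, 0.1, 0.0]),
           "she":      np.array([-1.0, 0.1, 0.0]),
           "engineer": np.array([ 0.8, 0.9, 0.1]),
           "nurse":    np.array([-0.8, 0.9, 0.1])}

    gender_axis = vec["he"] - vec["she"]     # crude gender direction
    for w in ["engineer", "nurse"]:
        # Positive projection leans 'he', negative leans 'she'.
        print(w, round(cosine(vec[w], gender_axis), 2))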

May 31: Revision session