Course syllabus sheet
HUMAN LANGUAGE TECHNOLOGIES
GIUSEPPE ATTARDI
Academic year: 2022/23
Degree programme: INFORMATICA (Computer Science)
Code: 649AA
Credits (CFU): 9
Period: Second semester
Language: English

Modules                      Sector(s)  Type      Hours  Lecturer(s)
HUMAN LANGUAGE TECHNOLOGIES  INF/01     Lectures  72     GIUSEPPE ATTARDI
Learning outcomes
Knowledge

Learning fundamental techniques, algorithms and models used in natural language processing. Understanding of the architectures of typical text analytics applications and of libraries for building them. Expertise in design, implementation and evaluation of applications that exploit analysis, interpretation and transformation of texts.

Knowledge assessment criteria

Homework assignments and a final project.

Skills

Ability to design, implement and evaluate applications that exploit analysis, interpretation and transformation of texts.

Prerequisites
  • programming skills, including proficiency in Python
  • elementary Calculus and Linear Algebra (e.g. the course “Computational Mathematics for learning and data analysis” (646AA))
  • elements of probability and statistics (e.g. the course “Calcolo delle Probabilità e Statistica” (269AA))
  • machine learning (e.g. the course “Machine Learning” (654AA))
Programme (course contents)

The course presents principles, models and state-of-the-art techniques for the analysis of natural language, focusing mainly on statistical machine learning approaches and Deep Learning in particular. Students will learn how to apply these techniques in a wide range of applications using modern programming libraries.

  • Formal and statistical approaches to NLP.
  • Statistical methods: Language Model, Hidden Markov Model, Viterbi Algorithm, Generative vs Discriminative Models
  • Linguistic essentials: words, lemmas, morphology, PoS, phrases.
  • Parsing: constituency and dependency parsing.
  • Processing Pipelines
  • Lexical semantics: collocations, corpora, thesauri, gazetteers.
  • Distributional Semantics: Word embeddings, Character embeddings.
  • Deep Learning for natural language.
  • Transformer Models
  • Applications: Entity recognition, Entity linking, Classification, Summarization.
  • Opinion mining, Sentiment Analysis.
  • Question answering, Language inference, Dialogic interfaces (chatbots)
  • Statistical Machine Translation.
  • NLP libraries: NLTK, TensorFlow, PyTorch
Syllabus

The course presents principles, models and state-of-the-art techniques for the analysis of natural language, focusing mainly on statistical machine learning approaches and Deep Learning in particular. Students will learn how to apply these techniques in a wide range of applications using modern programming libraries.

  • Formal and statistical approaches to NLP.
  • Statistical methods: Language Model, Hidden Markov Model, Viterbi Algorithm, Generative vs Discriminative Models
  • Linguistic essentials: words, lemmas, morphology, PoS, phrases.
  • Parsing: constituency and dependency parsing.
  • Processing Pipelines: UIMA, Tanl
  • Lexical semantics: collocations, corpora, thesauri, gazetteers.
  • Distributional Semantics: Word embeddings, Character embeddings.
  • Deep Learning for natural language.
  • Applications: Entity recognition, Entity linking, Classification, Summarization.
  • Opinion mining, Sentiment Analysis.
  • Question answering, Language inference, Dialogic interfaces (chatbots)
  • Statistical Machine Translation.
  • NLP libraries: NLTK, Theano, TensorFlow, Keras
Bibliography and teaching material
  1. C. Manning, H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 2000.
  2. D. Jurafsky, J.H. Martin. Speech and Language Processing. 2nd edition, Prentice-Hall, 2008.
  3. S. Kübler, R. McDonald, J. Nivre. Dependency Parsing. 2010.
  4. P. Koehn. Statistical Machine Translation. Cambridge University Press, 2010.
  5. S. Bird, E. Klein, E. Loper. Natural Language Processing with Python.
Bibliography
  1. D. Jurafsky, J.H. Martin. Speech and Language Processing. 3rd edition, Prentice-Hall, 2021.
  2. S. Bird, E. Klein, E. Loper. Natural Language Processing with Python.
  3. I. Goodfellow, Y. Bengio, A. Courville. Deep Learning. MIT Press, 2016.

Additional Material

  1. C. Manning, H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 2000.
  2. S. Kübler, R. McDonald, J. Nivre. Dependency Parsing. 2010.
  3. P. Koehn. Statistical Machine Translation. Cambridge University Press, 2010.
Assessment methods

Final project and discussion.

Last updated 21/02/2023 21:06