F20AA - Applied Text Analytics

Course leader(s):

John See Su Yang
Radu-Casian Mihailescu
Neamat Elgayar

Aims

This course aims to provide students with knowledge and skills in applied text analytics, focusing on Machine Learning and Natural Language Processing tools.

In particular the course:

- Presents the area of text analytics and provides fundamental tools to extract, represent and analyse information from text sources using machine learning models

- Provides a fundamental understanding of concepts and tools to build effective language-aware systems and applications

- Provides a basic understanding of deep learning models for Natural Language Processing applications and related research

- Discusses current research advances, business cases and future directions of the field

Syllabus

1. Text Processing and Representation (1.1 Data sources and data cleaning, 1.2 Text processing: tokenization, stemming, lemmatization, stopword removal, POS, ..., 1.3 Vector space model, 1.4 Language models and N-grams)

2. Text Classification (2.1 Similarity measures, 2.2 Text Classifiers, 2.3 Evaluation measures, 2.4 Text analytics pipeline)

3. Topic Modelling (3.1 Text clustering vs. topic modelling, 3.2 LSA / LDA, 3.3 Visualization of topic models)

4. Word Embeddings (4.1 Word encoding techniques, 4.2 Word2vec, 4.3 Learning skip-gram embeddings, 4.4 Pre-trained embeddings: GloVe, fastText, and visualization)

5. Deep Learning Models for Text Analytics (5.1 Introduction to deep learning for NLP, 5.2 Sequence Models, 5.3 Transformers and pre-trained models)

Learning outcomes

By the end of the course, students should be able to do the following:

Further details

SCQF Level: 10

Credits: 15