Schedule

Day | Hour | Room |
---|---|---|
Monday | 11-13 | X1, Polo Fibonacci |
Tuesday | 9-11 | X1, Polo Fibonacci |
Forum on Piazza
The course covers text analytics systems and applications that address business problems by discovering and presenting knowledge otherwise locked in textual form. The objective is to learn to recognize situations in which text analytics techniques can meet information processing needs, to identify the analytic task or process that best models the business problem, to select the most appropriate resources, methods, and tools, and to collect text data and apply those methods to it. Several application contexts will be presented: information extraction, sentiment analysis (what is the nature of commentary on an issue), spam and fake post detection, quantification problems, summarization, etc.
A server has been set up for running Jupyter Notebooks. To log into the server, you must get credentials for a Google Suite account: go to this page and register with your University credentials to activate your free account.
Date | Lecture | Notes |
---|---|---|
17/9/2018 | Introduction | Text Analytics |
18/9/2018 | Introduction to Probability | Probability |
24/9/2018 | Language Modeling | Language Modeling |
25/9/2018 | Introduction to Python | See notebook “Introduction to Python” in folder “Text Analytics” on http://attardi-4.di.unipi.it:8000/ |
1/10/2018 | Introduction to Python | See notebooks “Introduction to Python 2” and “RegEx” in folder “Text Analytics” on http://attardi-4.di.unipi.it:8000/ |
2/10/2018 | Introduction to NLTK | See notebook “Introduction to NLTK” in folder “Text Analytics” on http://attardi-4.di.unipi.it:8000/ |
8/10/2018 | Preprocessing and tokenization | Tokenization |
9/10/2018 | Word Similarity | Tokenization; Homework 1 (deadline 15/10) |
15/10/2018 | Correction of Homework 1, Text Classification | Text Classification |
16/10/2018 | Classifiers | Classifiers |
22/10/2018 | Hidden Markov Models | HMM |
23/10/2018 | POS Tagging | HMM |
29/10/2018 | Homework 2 | |
5/11/2018 | Named Entity Tagging | NER |
6/11/2018 | Universal Dependencies | |
12/11/2018 | Dependency Parsing | |
13/11/2018 | Neural Language Models: PCA, Word2Vec | LM. See notebook “LanguageModels.ipynb” on http://attardi-4.di.unipi.it:8000/ |
20/11/2018 | NLM: FastText, Doc2Vec | LM. See notebook “docEmbeddings.ipynb” on http://attardi-4.di.unipi.it:8000/ |
26/11/2018 | NLM: Text Generation, ELMo, BERT | LM. See notebook “TextGeneration.ipynb” on http://attardi-4.di.unipi.it:8000/ |
27/11/2018 | Sentiment Analysis | SA |
3/12/2018 | Lexical Resources | LR. See notebooks “pmi-lex.ipynb” and “pmi-lex-IMDB.ipynb” on http://attardi-4.di.unipi.it:8000/ |
4/12/2018 | Sentiment Classification | SC. See notebooks “VADER.ipynb”, “sklearn.ipynb”, “lstmNet.ipynb” and “cnnNet.ipynb” on http://attardi-4.di.unipi.it:8000/ |
10/12/2018 | Quantification | |
11/12/2018 | Spam, Scam, Phishing, Fake reviews, Clickbait, Fake News | |