
Big Data Analytics A.A. 2020/21


WARNING: All lectures of the first semester of the academic year 2020/21, until 31/12/2020, will be held exclusively online, through the Teams team named “599AA 20/21 - BIG DATA ANALYTICS [WDS-LM]”.

Instructors:

Timetable:

  • Monday 16:15 - 18:00, Room WDS/1
  • Tuesday 16:15 - 18:00, Room WDS/1

Fill in the Doodle with your preferred day and time during the week (ignore the dates; only the day of the week and the time of day matter):

Pre-registration for the course: fill in the form with your name and surname, email, skills and languages (the answers will help in building teams), by Wednesday, September 16th:

Team registration: form teams of 3 or 4 students and register your team here by September 23rd:

Learning goals

In our digital society, every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” for understanding the complexity of our societies. Analyzing big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:

  • to introduce the emerging field of big data analytics and social mining;
  • to introduce the technological landscape of big data, such as programming tools to analyze big data, query NoSQL databases, and perform predictive modeling;
  • to guide students in the development of an open-source and reproducible big data analytics project, based on the analysis of real-world datasets.

Module 1: Big Data Analytics and Social Mining

In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:

  • The Big Data Scenario and the new questions to be answered
  • Sport Analytics
    1. Soccer data landscape and injury prediction
    2. Analysis and evolution of sports performance
  • Mobility Analytics
    1. Mobility data landscape and mobility data mining methods
    2. Understanding Human Mobility with vehicular sensors (GPS)
    3. Mobility Analytics: Novel Demography with mobile-phone data
  • Social Media Mining
    1. The social media data landscape: Facebook, LinkedIn, Twitter, Last.fm
    2. Sentiment analysis: an example from human migration studies
    3. Discussion on ethical issues of Big Data Analytics
  • Well-being and Nowcasting
    1. Nowcasting influenza with retail market data
    2. Predicting well-being from human mobility patterns
  • Paper presentations by students

Module 2: Big Data Analytics Technologies

This module introduces students to the technologies used to collect, manipulate and process big data. In particular, the following tools will be presented:

  • Python for Data Science
  • The Jupyter Notebook: developing open-source and reproducible data science
  • MongoDB: fast querying and aggregation in NoSQL databases
  • GeoPandas: analyze geo-spatial data with Python
  • Scikit-learn: machine learning in Python
  • Keras: deep learning in Python

Module 3: Laboratory for Interactive Project Development

During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Teams will discuss and present their work in class at different stages of the project:

  • 1st Mid Term: Data Understanding and Project Formulation
  • 2nd Mid Term: Model(s) construction and evaluation
  • 3rd Mid Term: Model interpretation/explanation
  • Exam: Final Project results


14/09 (Mod. 1) Introduction to the course, The Big Data scenario lesson1_introduction_to_the_course_bda2021.pdf

15/09 (Mod. 2) Python for Data Science and the Jupyter Notebook: developing open-source and reproducible data science

21/09 No Lesson (Election Day in Italy)


28/09 (Mod. 2) GeoPandas and scikit-mobility: analyze trajectory data in Python

29/09 (Mod. 2) PyMongo and MongoDB: fast querying and aggregation in NoSQL databases

05/10 (Mod. 1) Soccer data landscape and injury prediction

06/10 No Lesson (SocInfo2020 conference)

12/10 (Mod. 1) Performance evaluation: from human evaluations to data-driven algorithms

13/10 (Mod. 1) Nowcasting well-being with Big Data

19/10 (Mod. 3) 1st Mid Term - first group of teams

20/10 (Mod. 3) 1st Mid Term - second group of teams

26/10 (Mod. 3) Discussion and group working on projects

27/10 (Mod. 3) Discussion and group working on projects

02/11 (Mod. 1) Forecasting influenza with retail market data

03/11 (Mod. 1) Trustworthy data mining

16/11 (Mod. 3) 2nd Mid Term - first group of teams

17/11 (Mod. 3) 2nd Mid Term - second group of teams

23/11 (Mod. 3) Discussion and group working on projects

24/11 (Mod. 3) Discussion and group working on projects

30/11 (Mod. 3) Paper presentations

01/12 (Mod. 3) Paper presentations

07/12 (Mod. 3) 3rd Mid Term - first and second group of teams



Previous Big Data Analytics websites

bigdataanalytics/bda/start.txt · Last modified: 16/09/2020 at 14:54 (4 days ago) by Luca Pappalardo