Big Data Analytics A.A. 2017/18

Instructors:

Learning goals

Objective: In our digital society, every human activity is mediated by information technologies. Every activity therefore leaves digital traces behind that can be stored in some repository: phone call records, transaction records, web search logs, movement trajectories, social media texts and tweets. Every minute, humans produce, consciously or not, an avalanche of “big data” that represents a novel, accurate digital proxy of social activities at global scale. Big data provide an unprecedented “social microscope”, a novel opportunity to understand the complexity of our societies, and a paradigm shift for the social sciences. The objective of the course is twofold: an introduction to the emergent field of big data analytics and social mining, aimed at acquiring and analyzing big data from multiple sources in order to discover the patterns and models of human behavior that explain social phenomena; and an introduction to the technological scenario of scalable analytics.

Intro lectures

Lecture 1: Course Presentation, Course organization, Big Data Landscape: Opportunities, risks, big data sources, challenges.

Slides: https://goo.gl/WztPDg

Technologies lectures:

Lecture 1: Overview/Recall parallel computing. Slides: https://goo.gl/eCwz7G

Lecture 2: Introduction to Hadoop and Map-Reduce Patterns. Slides: https://goo.gl/kukSQx https://goo.gl/efVLKD

Lecture 3: HDFS and Spark (LAB). Slides: https://goo.gl/eD5p6c

Lecture 4-5-6: Data Analytics with Spark (LAB) (Last slides of Lecture 3 with exercises) https://goo.gl/AQJXhD

Lecture 7-8-9: Data Mining with Spark and MLlib (LAB). Slides: https://goo.gl/HJEQwT, Materials: https://goo.gl/VxAEhi

Lecture 10: Other Big Data tools

Lecture 11: Case study implementation

Methodological scenarios lectures:

Lecture 1-2: What is possible to observe with Mobile Phone Data? Formulation of novel questions to be answered: estimating population, understanding city dynamics, estimating unemployment, gender distribution, or wellbeing; the complexity of feature construction; model construction; new mining algorithms; validation strategies.

Slides: https://goo.gl/fULiAu, https://goo.gl/UZEPdu

Lecture 3-4: What is possible to observe with GPS data? Formulation of novel questions to be answered: understanding human mobility; the complexity of feature construction; new model construction; new mining algorithms; validation strategies.

Lecture 5-6: What is possible to observe with Social Media Data? Formulation of novel questions to be answered: understanding sentiment, wellbeing, happiness; the complexity of feature construction; new model construction; new mining algorithms; validation strategies.

Lecture 7: What is possible to observe with IoT Data? Formulation of novel questions to be answered: understanding performance in sport; the complexity of feature construction; new model construction; new mining algorithms; validation strategies.

Datasets

The datasets overview: https://goo.gl/fyAjth The datasets folder: https://goo.gl/nPd6HT

Calendar

18/09 - (Intro) Course Presentation, Big Data Landscape

22/09 - (Tech) Overview/Recall parallel computing

25/09 - (Method) What is possible to observe with Mobile Phone Data? (i)

29/09 - (Method) What is possible to observe with Mobile Phone Data? (ii)

02/10 - (Tech) Introduction to Hadoop and Design Patterns (Lab)

06/10 - Cancelled!

09/10 - (Tech) Managing HDFS and Introduction to Spark (Lab) and Datasets Presentation

13/10 - (Tech) Data Analytics with Spark (Lab)

16/10 - (Tech) Data Analytics with Spark (Lab)

20-23/10 - No Class (Time to practice!)

27/10 - (Tech) Data Analytics with Spark (Lab)

30/10 - Mid-term Tech I: starts at 16:30; you will have 1 hour and 30 minutes

06/11 - (Tech) Data Mining with Spark and MLlib (Lab) (i)

10/11 - (Method) What is possible to observe with GPS data? (i)

13/11 - (Tech) Data Mining with Spark and MLlib (Lab) (ii)

17/11 - (Method) What is possible to observe with GPS data? (ii)

20/11 - Discussing the final project proposal - Collective discussion (not evaluated)

24/11 - (Tech) Data Mining with Spark and MLlib (Lab) (iii)

27/11 - (Method) What is possible to observe with Social Media Data? (i)

01/12 - (Method) What is possible to observe with Social Media Data? (ii)

04/12 - (Method) What is possible to observe with IoT data? Examples from sport

11/12 - Discussing the final project proposal - Collective discussion (not evaluated)

15/12 - (Tech) Other Big Data tools: Spark SQL, Hive, Pig, Sqoop, Flume, etc.

18/12 - Mid-term Tech II

15/01 - 16/02 Final Project and Discussion

Exam

The two mid-terms count for 40% of the final grade; the remaining 60% is the evaluation of the Project and the Discussion. Students whose mid-term results are not sufficient can take a final test on the technologies instead.

The following table describes the expected content of a project:
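The weighting can be written as a small function. This is only a sketch under stated assumptions: the name `final_grade` is hypothetical, and it assumes that the two mid-terms count equally and that all marks are on the same scale, neither of which is stated on this page.

```python
def final_grade(midterm_1: float, midterm_2: float, project: float) -> float:
    """Combine the marks using the 40% / 60% split described above.

    Assumptions (not stated on the course page): the two mid-terms are
    averaged with equal weight, and all three marks share one scale.
    """
    midterm_average = (midterm_1 + midterm_2) / 2
    return 0.4 * midterm_average + 0.6 * project


if __name__ == "__main__":
    # Illustrative marks only, not real results.
    print(final_grade(24, 28, 27))
```

Under these assumptions the project mark moves the final grade one and a half times as much as the mid-term average does.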

Laboratories

Students should bring their own laptops (especially for technology lectures).

Software & Links

Virtual Machines:

Previous Big Data Analytics websites

bigdataanalytics/bda/start.txt · Last modified: 24/11/2017 at 10:19 by Roberto Trasarti