
This is an old revision of the document!


====== Big Data Analytics A.A. 2018/19 ======

Instructors:

  * Fosca Giannotti, Luca Pappalardo
  * KDD Laboratory, Università di Pisa and ISTI-CNR, Pisa
  * http://www-kdd.isti.cnr.it
  * fosca [dot] giannotti [at] isti [dot] cnr [dot] it
  * luca [dot] pappalardo [at] isti [dot] cnr [dot] it

Notice: the list of papers to read is available at http://bit.ly/bda_papers. Send an email to Luca Pappalardo with your choice of three papers; we will then assign you one of them.

====== Learning goals ======

In our digital society, every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” for understanding the complexity of our societies. The analysis of big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:

  * to introduce the emerging field of big data analytics and social mining;
  * to introduce the technological landscape of big data, such as programming tools to analyze big data, query NoSQL databases, and perform predictive modeling;
  * to guide students in developing an open-source, reproducible big data analytics project based on the analysis of real-world datasets.
====== Module 1: Big Data Analytics and Social Mining ======

In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:

  * The Big Data scenario and the new questions to be answered
  * Sport Analytics
    - Soccer data landscape and injury prediction
    - Analysis and evolution of sports performance
  * Mobility Analytics
    - Mobility data landscape and mobility data mining methods
    - Understanding Human Mobility with vehicular sensors (GPS)
    - Mobility Analytics: Novel Demography with mobile-phone data
  * Social Media Mining
    - The social media data landscape: Facebook, LinkedIn, Twitter, Last.fm
    - Sentiment analysis: examples from human migration studies
    - Discussion on ethical issues of Big Data Analytics
  * Well-being & Nowcasting
    - Nowcasting influenza with retail market data
    - Predicting well-being from human mobility patterns
  * Paper presentations by students

====== Module 2: Big Data Analytics Technologies ======

This module provides students with the technologies to collect, manipulate, and process big data. In particular, the following tools will be presented:

  * Python for Data Science
  * The Jupyter Notebook: developing open-source and reproducible data science
  * MongoDB: fast querying and aggregation in NoSQL databases
  * GeoPandas: analyze geo-spatial data with Python
  * Scikit-learn: programming tools for data mining and analysis
  * M-Atlas: a toolkit for mobility data mining

====== Module 3: Laboratory for Interactive Project Development ======

During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Class discussions and presentations will take place at different stages of the project:

  * Data Understanding and Project Formulation
  * Mid Term Project Results
  * Final Project Results

====== Calendar ======

  * 17/09 (Mod. 1) Introduction to the course; the Big Data scenario
    * mod1.introduction_bigdatalandscape_newquestions_.pdf
  * 21/09 (Mod. 1) Big Data Analytics: new questions to be solved + Presentation of datasets
    * list of datasets: http://bit.ly/bda_list_datasets
    * slides: http://bit.ly/bda18_datasets_slides
  * 24/09 (Mod. 2) Python for Data Science. The Jupyter Notebook: developing open-source and reproducible data science
    * How to install Jupyter Notebook: https://jupyter.readthedocs.io/en/latest/install.html
    * Python notebooks: http://bit.ly/bda_notebooks_1
  * 28/09 (Mod. 1) Soccer data landscape and players’ injury prediction
    * slides: http://bit.ly/bda_sports_data_injury
    * paper: http://bit.ly/plos_injury
  * 01/10 (Mod. 2) Scikit-learn: programming tools for data mining and analysis
    * Python notebooks: http://bit.ly/bda_notebooks_2
  * 05/10 (Mod. 1) Analysis and evolution of sports performance
  * 08/10 (Mod. 1) The mobility data landscape
  * 12/10 (Mod. 1) Suspended
  * 15/10 (Mod. 1) Mobility data mining methods (patterns & models)
  * 19/10 (Mod. 1) Understanding Human Mobility with GPS: case studies
  * 22/10 (Mod. 1) Urban Dynamics with GSM
  * 26/10 (Mod. 3) Data Understanding and Project Formulation
  * 05/11 (Mod. 2) GeoPandas: analyze geo-spatial data with Python
  * 09/11 (Mod. 1) Predicting well-being from human mobility patterns
  * 12/11 (Mod. 1) Nowcasting influenza with retail market data
  * 16/11 (Mod. 3) Paper presentations by students
  * 19/11 (Mod. 2) MongoDB: fast querying and aggregation in NoSQL databases
  * 23/11 (Mod. 3) Paper presentations by students
  * 26/11 (Mod. 3) Mid Term Project Results
  * 30/11 No lessons
  * 03/12 (Mod. 1) The social media data landscape and social media mining methods
  * 07/12 (Mod. 1) Sentiment analysis: examples from human migration studies
  * 10/12 (Mod. 3) Discussion on ethical issues in Big Data Analytics and Final Project Results
  * 14/12 (Mod. 3) Final Project Results
  * 12/01, 14:00 @ CNR (Entrance 20 - Room C36b): Exam

===== Exam =====

The two mid-terms account for 40% of the final grade; the remaining 60% comes from the evaluation of the project and the discussion (prepare some slides to present your project). If the mid-term results are not sufficient, you may take a final test on the technologies. The following table describes the expected content of a project:

====== Previous Big Data Analytics websites ======

  * http://didawiki.di.unipi.it/doku.php/bigdataanalytics/bda/bda2017
  * http://didawiki.di.unipi.it/doku.php/bigdataanalytics/bda/bda2016
  * http://didawiki.di.unipi.it/doku.php/bigdataanalytics/bda/bda2015

bigdataanalytics/bda/start.1539683671.txt.gz · Last modified: 16/10/2018 at 09:54 (6 years ago) by Fosca Giannotti
