bigdataanalytics:bda:start

Differences

These are the differences between the selected revision and the current version of the page.

bigdataanalytics:bda:start [05/12/2018 at 09:24 (6 years ago)]
Luca Pappalardo [Big Data Analytics A.A. 2018/19]
bigdataanalytics:bda:start [04/11/2022 at 12:21 (18 months ago)] (current version)
Salvatore Ruggieri
Line 1: Line 1:
-<html> +====== Big Data Analytics A.A. 2022/23 ======
-<!-- Google Analytics --> +
-<script type="text/javascript" charset="utf-8"> +
-(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ +
-(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), +
-m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) +
-})(window,document,'script','//www.google-analytics.com/analytics.js','ga');+
  
-ga('create', 'UA-34685760-1', 'auto', 'personalTracker', {'allowLinker': true}); +This year, the course 599AA Big Data Analytics (BDA) is replaced by [[http://didawiki.di.unipi.it/doku.php/geospatialanalytics/gsa/start|Geospatial Analytics]]. For any questions, please contact Luca Pappalardo (luca [dot] pappalardo [at] isti [dot] cnr [dot] it).
-ga('personalTracker.require', 'linker'); +
-ga('personalTracker.linker:autoLink', ['pages.di.unipi.it', 'enforce.di.unipi.it', 'didawiki.di.unipi.it'] ); +
-   +
-ga('personalTracker.require', 'displayfeatures'); +
-ga('personalTracker.send', 'pageview', 'ruggieri/teaching/bda/'); +
-setTimeout("ga('send','event','adjusted bounce rate','30 seconds')",30000); +
  
-</script> +====== Previous Big Data Analytics websites ======
-<!-- End Google Analytics --> +
-<!-- Capture clicks --> +
-<script> +
-jQuery(document).ready(function(){ +
-  jQuery('a[href$=".pdf"]').click(function() { +
-    var fname = this.href.split('/').pop(); +
-    ga('personalTracker.send', 'event',  'BDA', 'PDFs', fname); +
-  }); +
-  jQuery('a[href$=".r"]').click(function() { +
-    var fname = this.href.split('/').pop(); +
-    ga('personalTracker.send', 'event',  'BDA', 'Rs', fname); +
-  }); +
-  jQuery('a[href$=".zip"]').click(function() { +
-    var fname = this.href.split('/').pop(); +
-    ga('personalTracker.send', 'event',  'BDA', 'ZIPs', fname); +
-  }); +
-  jQuery('a[href$=".mp4"]').click(function() { +
-    var fname = this.href.split('/').pop(); +
-    ga('personalTracker.send', 'event',  'BDA', 'Videos', fname); +
-  }); +
-  jQuery('a[href$=".flv"]').click(function() { +
-    var fname = this.href.split('/').pop(); +
-    ga('personalTracker.send', 'event',  'BDA', 'Videos', fname); +
-  }); +
-}); +
-</script> +
-</html> +
-====== Big Data Analytics A.A. 2018/19 ======+
  
-Instructors - Docenti: +[[bigdataanalytics:bda:bda2021|]]
-  * **Fosca Giannotti, Luca Pappalardo** +
-    * KDD Laboratory, Università di Pisa ed ISTI - CNR, Pisa +
-    * [[http://www-kdd.isti.cnr.it]] +
-    * [[fosca.giannotti@isti.cnr.it]]    +
-    * [[luca.pappalardo@isti.cnr.it]]   +
  
 +[[bigdataanalytics:bda:bda2020|]]
  
-**Notice**: you can find a list of the papers to read at this link: http://bit.ly/bda_papers. Send an email to Luca Pappalardo __**by** Thursday, October 26th__ with your choice of three or four papers. We will then assign you one of the papers, taking your preferences into account.+[[bigdataanalytics:bda:bda2019|]]
  
-**Instructions for project proposal** (October 26th):  +[[bigdataanalytics:bda:bda2018|]]
-  * **presentation**: 10 minutes (+ 5 minutes questions), send the pdf of the presentation to Luca Pappalardo by Thursday 25th.  +
-  * **report**: 5 pages at most, summarize the data understanding and show your project proposal. Send the pdf of the report to Luca Pappalardo by Thursday 25th. In the report put the name of the dataset you are working on and the names of the members of the team.  +
- +
-**Instructions for paper presentation** (November 16th and 23rd): +
-  * **presentation**: 7 minutes (+ 3 minutes questions), send the pdf of the presentation to Luca Pappalardo by __the day before__ the presentation of your paper. +
-  * **scheduling**: date of presentation for each student: http://bit.ly/papers_scheduling +
- +
-**Instructions for project advancements report and presentation** (November 26th): +
-  * **presentation**: 10 minutes (+ 3 minutes questions), send the __pdf__ of the presentation to Luca Pappalardo by November 25th. +
-  * **report**: 10 pages at most, __extend the previous report__ by adding details about the implementation of the solution to your analytical problem and its validation. Send me the extended report and the Python notebooks used to develop the solution by November 25th. +
- +
-**Instructions for final report and presentation** (December 10th and 14th): +
-  * **presentation**: 20 minutes (+ 10 minutes questions), send the __pdf__ of the presentation to Luca Pappalardo by December 10th, 2pm. +
-  * **report**: 20 pages at most, __extend the previous report__. Send me the extended report and the Python notebooks used to develop the solution by December 10th, 2pm. +
-  * check the date of your presentation here: http://bit.ly/2rm3P6Y +
-====== Learning goals ====== +
- +
-In our digital society, every human activity is mediated by information technologies, hence leaving digital traces behind. These massive traces are stored in some, public or private, repository: phone call records, movement trajectories, soccer-logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” to understand the complexity of our societies. The analysis of big data sources is a complex task, involving the knowledge of several technological and methodological tools. +
-This course has three objectives:  +
- +
-  * introducing the emerging field of big data analytics and social mining;  +
-  * introducing the technological landscape of big data, such as programming tools to analyze big data, query NoSQL databases, and perform predictive modeling; +
-  * guiding students through the development of an open-source and reproducible big data analytics project, based on the analysis of real-world datasets.  +
- +
-====== Module 1: Big Data Analytics and Social Mining ====== +
-In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:  +
- +
-  * The Big Data Scenario and the new questions to be answered +
-  * Sport Analytics:    +
-    - Soccer data landscape and injury prediction +
-    - Analysis and evolution of sports performance +
-  * Mobility Analytics +
-    - Mobility data landscape and mobility data mining methods +
-    - Understanding Human Mobility with vehicular sensors (GPS) +
-    - Mobility Analytics: Novel Demography with mobile-phone data  +
-  * Social Media Mining +
-    - The social media data landscape: Facebook, LinkedIn, Twitter, Last.fm +
-    - Sentiment analysis: examples from human migration studies +
-    - Discussion on ethical issues of Big Data Analytics +
-  * Well-being&Now-casting +
-    - Nowcasting influenza with retail market data +
-    - Predicting well-being from human mobility patterns +
-  * Paper presentations by students +
- +
- +
-====== Module 2: Big Data Analytics Technologies ====== +
-This module will provide students with the technologies to collect, manipulate and process big data. In particular, the following tools will be presented:  +
- +
-  * Python for Data Science +
-  * The Jupyter Notebook: developing open-source and reproducible data science  +
-  * MongoDB: fast querying and aggregation in NoSQL databases +
-  * GeoPandas: analyze geo-spatial data with Python +
-  * Scikit-learn: programming tools for data mining and analysis +
-  * M-Atlas: a toolkit for mobility data mining  +
- +
- +
-====== Module 3: Laboratory for Interactive Project Development  ====== +
-During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Discussions and presentations will take place in class at different stages of the project.  +
- +
-  * Data Understanding and Project Formulation +
-  * Mid Term Project Results  +
-  * Final Project results +
- +
-====== Calendar ====== +
- +
-17/09 (Mod. 1) Introduction to the course, The Big Data scenario {{ :bigdataanalytics:bda:mod1.introduction_bigdatalandscape_newquestions_.pdf |}} +
- +
- +
-21/09 (Mod. 1) Big Data Analytics: new questions to be solved + Presentation of datasets  +
-  * list of datasets: http://bit.ly/bda_list_datasets +
-  * slides: http://bit.ly/bda18_datasets_slides +
- +
- +
-24/09 (Mod. 2) Python for Data Science: The Jupyter Notebook: developing open-source and reproducible data science +
-  * How to install Jupyter notebook: https://jupyter.readthedocs.io/en/latest/install.html +
-  * Python notebooks: http://bit.ly/bda_notebooks_1 +
- +
- +
-28/09 (Mod. 1) Soccer data landscape and players’ injury prediction +
-  * slides: http://bit.ly/bda_sports_data_injury +
-  * paper: http://bit.ly/plos_injury +
- +
-01/10 (Mod. 2) Scikit-learn: programming tools for data mining and analysis. +
-  * Python notebooks: http://bit.ly/bda_notebooks_2 +
- +
-05/10 (Mod. 1) Analysis and evolution of sports performance +
-  * Slides: http://bit.ly/bda_soccer_evaluation +
- +
-08/10 (Mod. 1) The mobility data landscape +
-  * Slides:  {{ :bigdataanalytics:bda:part1.mobilitydataanaysis1-foundations_.pdf |}} +
- +
-12/10 (Mod. 1) //Suspended// +
- +
-15/10 (Mod. 1) Mobility data mining methods (Patterns&Models) +
-  * Slides: {{ :bigdataanalytics:bda:part1.mobilitydataanaysis2-patterns_models.pdf |}} +
- +
-19/10 (Mod. 1) Understanding Human Mobility with GPS - Case Studies +
-  * Slides: {{ :bigdataanalytics:bda:part1.mobilitydataanaysis3-humanmobility-gps.pdf |}} +
-  * Slides: {{ :bigdataanalytics:bda:part1.mobilitydataanaysis5-casesudies.pdf |}} +
- +
-22/10 (Mod. 1) Urban Dynamics with mobile phone data +
-  * Slides:  {{ :bigdataanalytics:bda:part1.mobilitydataanaysis4-citydinamics-gsm.pdf |}} +
- +
-26/10 (Mod. 3) **Data Understanding and Project Formulation** +
- +
-05/11 (Mod. 2) GeoPandas: analyse geo-spatial data with Python +
-  * Python notebook: {{ :bigdataanalytics:bda:bda_geopandas.zip |}} +
- +
-09/11 (Mod. 1) Predicting well-being from human mobility patterns +
-  * Slides: http://bit.ly/2B1ohzt +
-   +
-12/11 (Mod. 2) MongoDB: fast querying and aggregation in NoSQL databases +
- +
-16/11 (Mod. 3) **Papers presentations from students** +
- +
-19/11 (Mod. 1) Nowcasting influenza with retail market data +
- +
-23/11 (Mod. 3) **Papers presentations from students** +
- +
-26/11 (Mod. 3) **Mid Term Project Results** +
- +
-30/11 No lessons +
- +
-03/12 (Mod. 1) The social media data landscape and social media mining methods +
- +
-07/12 (Mod. 1) Sentiment analysis: examples from Human Migration studies +
-  +
- +
-10/12 (Mod. 3) Discussion on Ethical issues in Big Data Analytics and **Final Project results** +
- +
- +
-14/12 (Mod. 3) **Final Project results** +
- +
- +
-18/01 EXAM: 09:00 @ aula L1 +
- +
-08/02 EXAM: 09:00 @ aula L1 +
- +
-===== Exam ===== +
-The two mid-terms count for 40% of the final grade; the remaining 60% is the evaluation of the Project and the Discussion (prepare some slides to present your project). +
-It is possible to take a final test on technologies if the mid-terms are not sufficient. +
- +
-The following table describes the expected content of a project: +
-{{:bigdataanalytics:bda:project.png?800|}} +
- +
-====== Previous Big Data Analytics websites ======+
  
-http://didawiki.di.unipi.it/doku.php/bigdataanalytics/bda/bda2017+[[bigdataanalytics:bda:bda2017|]]
  
-http://didawiki.di.unipi.it/doku.php/bigdataanalytics/bda/bda2016+[[bigdataanalytics:bda:bda2016|]]
  
-http://didawiki.di.unipi.it/doku.php/bigdataanalytics/bda/bda2015+[[bigdataanalytics:bda:bda2015|]]
bigdataanalytics/bda/start.1544001863.txt.gz · Last modified: 05/12/2018 at 09:24 (6 years ago) by Luca Pappalardo