| Both sides previous revision / Previous revision / Next revision | Previous revision |
| digitalhealth:0001a [29/11/2024 at 17:36 (13 months ago)] – [First Semester] Anna Monreale | digitalhealth:0001a [08/12/2025 at 10:44 (5 days ago)] (current version) – [First Semester] Anna Monreale |
|---|
| ====== Data Analytics for Digital Health (DAD) ====== | ====== Data Analytics for Digital Health (DAD) - 9 CFU A.Y. 2025/2026 ====== |
| **Instructors:** | **Instructors:** |
| * **Anna Monreale** | * **Anna Monreale** |
| |
| ====== News ====== | ====== News ====== |
| | |
| | * [08.09.2025] - Lectures in the first week are canceled; the course will start on 22 September 2025 |
| | * [30.09.2025] - All students must fill in [[https://docs.google.com/spreadsheets/d/1HFdrGb7PjjKglAFYHR1T0RJHjg5C2ufyPuysCSsfuho/edit?pli=1&gid=0#gid=0| this document ]] for exam information |
| ====== Learning Goals ====== | ====== Learning Goals ====== |
| * Fundamental concepts of data knowledge and discovery. | * Fundamental concepts of data knowledge and discovery. |
| ^ Day of Week ^ Hour ^ Room ^ | ^ Day of Week ^ Hour ^ Room ^ |
| | Monday | 09:00 - 11:00 | Room FIB PS4 | | | Monday | 09:00 - 11:00 | Room FIB PS4 | |
| | Wednesday| 14:00 - 16:00 | Room C | | | Tuesday | 14:00 - 16:00 | Room C1 | |
| | Friday | 11:00 - 13:00 | Room FIB PS4 | | | Friday | 11:00 - 13:00 | Room FIB PS4 | |
| |
| |
| |
| **Office hours - Ricevimento:** | **Office hours - Ricevimento:** |
| Anna Monreale: Thu 09:00-11:00 - Online using Teams or in my Office (Appointment by email). | Anna Monreale: TBD - Online using Teams or in my Office (Appointment by email). |
| Francesca Naretto: Mon 11:00-13:00 - Online using Teams or in my Office (Appointment by email). | Francesca Naretto: TBD - Online using Teams or in my Office (Appointment by email). |
| |
| |
| A [[https://teams.microsoft.com/l/team/19%3AYicRl7qo_TVGdu-QzXkPsV78YMyBSz-DUvdz3AJMoUI1%40thread.tacv2/conversations?groupId=d4217229-2988-44de-bbd8-6f4be6224ffa&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams Channel]] will be used ONLY to post news, Q&A, and other course-related material. Lectures will be held in person only and will **NOT** be live-streamed. | A [[https://teams.microsoft.com/l/team/19%3AaixkwjuGSoUvrBNsO88NiDZsr8C2yIucNEonmj8ssSY1%40thread.tacv2/conversations?groupId=bfaf6e19-deca-4d53-921c-65b44db73608&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams Channel]] will be used ONLY to post news, Q&A, and other course-related material. Lectures will be held in person only and will **NOT** be live-streamed. |
| ====== Learning Material -- Materiale didattico ====== | ====== Learning Material -- Materiale didattico ====== |
| |
| |
| | |
| ====== Class Calendar (2024/2025) ====== | ====== Class Calendar (2025/2026) ====== |
| |
| ===== First Semester ===== | ===== First Semester ===== |
| |
| ^ ^ Day ^ Topic ^ Learning material ^ References ^ Video Lectures ^ Teacher ^ | ^ ^ Day ^ Topic ^ Learning material ^ References ^ Teacher ^ |
| |1. | 16.09 | Overview. Introduction to KDD + Data Types | {{ :digitalhealth:0-overview.pdf | Overview}} {{ :digitalhealth:1-intro-da-dm-tecs.pdf |Introduction to DADH}} {{ :digitalhealth:2-data_understanding.pdf |Data Understanding}}| Chap. 1 Kumar Book | |Monreale | | | | 22.09 | Strike | | | | |
| |2. | 18.09 | Data Understanding for tabular data | Slides of DU of the previous lecture | Chap.2 Kumar Book and additional resource of Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Data Exploration Chap.]] If you have the first ed. of KUMAR this is the Chap 3 | |Monreale | | | | 23.09 | CANCELED due to teacher's health issues | | | | |
| |3. | 20.09 | Data Preparation for tabular data | {{ :digitalhealth:3-data_preparation_dad.pdf |}} | Chap.2 Kumar Book and additional resource of Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Data Exploration Chap.]] If you have the first ed. of KUMAR this is the Chap 3 | |Monreale | | |1. | 26.09 | Overview. Introduction to Data Analytics for DH + Data Types | {{ :digitalhealth:0-overview-2025.pdf | Overview}} {{ :digitalhealth:1-intro-da-dm-tecs.pdf |}}|Chap. 1 Kumar Book |Monreale | |
| |4. | 23.09 | Data Understanding and Preparation for images | {{ :digitalhealth:4-data-understanding_images.pdf |}}|Digital Image processing, 3rd edition, Rafael Gonzalez, Richard Woods | | Naretto | | |2. | 29.09 | Data Understanding TD | {{ :digitalhealth:2-Data_Understanding.pdf | Data Understanding}}|Chap.2 Kumar Book and additional resource of Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Data Exploration Chap.]] If you have the first ed. of KUMAR this is the Chap 3 |Naretto | |
| |5. | 25.09 | Data Understanding and Preparation for images and Time Series | {{ :digitalhealth:5-data-understanding_ts.pdf |}}| | | Naretto | | |3. | 30.09 | Data Preparation TD | {{ :digitalhealth:3-data_preparation_dad.pdf | Data Preparation}} | Chap.2 Kumar Book and additional resource of Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Data Exploration Chap.]] If you have the first ed. of KUMAR this is the Chap 3 | Monreale| |
| |6. | 27.09 | Data Understanding and Preparation for Time Series + Python Lab.| {{ :digitalhealth:integrazione.zip | Intro to Python}}| | | Naretto | | |4. | 01.10 - Room I | Python Lab: Data Understanding & Preparation TD | | | Naretto| |
| |7. | 30.09 | Data Understanding and Preparation for Tabular Python Lab. | {{:digitalhealth:Data_Und.zip}} | | | Naretto | | | | 03.10 | Strike | | | Naretto| |
| |8. | 02.10 | Data Understanding and Preparation for Images and Time Series Python Lab. | {{:digitalhealth:Data_Und.zip}} | | | Naretto | | |5. | 06.10 | Project Presentation + Data Understanding and Preparation for TD |{{ :digitalhealth:PIMA-DU.zip | Zip file for DU and DP for TD}} {{ :digitalhealth:Integrazione.zip | Zip file for Python}}| | Naretto| |
| |9. | 04.10 | Data Management and Data Warehousing | {{ :digitalhealth:6-dw.pdf |}}| | | Monreale | | |6. | 07.10 | Clustering: intro and k-means |{{ :digitalhealth:6-basic_cluster_analysis-intro.pdf | Intro clustering}} {{ :digitalhealth:6-basic_cluster_analysis-kmeans.pdf | kmeans}}| Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | Naretto| |
| |9. | 07.10 | Data Management and Data Warehousing | {{ :digitalhealth:6-dw.pdf |}}| | | Monreale | | |7. | 08.10 Room I | Clustering: hierarchical and db-scan | {{ :digitalhealth:7.basic_cluster_analysis-hierarchical.pdf | hierarchical}} {{:digitalhealth:10-basic_cluster_analysis-dbscan.pdf | DB-scan}}| Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | Naretto| |
| |9. | 09.10 | Data Reporting - Project presentation | {{ :digitalhealth:6-dw.pdf |}}| | | Monreale | | | | 10.10 | suspension of teaching activities | | | | |
| |10. | 14.10 | Clustering: intro and k-means | {{ :digitalhealth:6-basic_cluster_analysis-intro.pdf |}} {{ :digitalhealth:6-basic_cluster_analysis-kmeans.pdf |}}| Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar| | Naretto | | |8. | 13.10 | Density-based clustering + Clustering Validity | {{ :digitalhealth:12-basic_cluster_analysis-validity.pdf |}}| Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar |Naretto | |
| |11. | 16.10 | Clustering: k-means and hierarchical| {{ :digitalhealth:7.basic_cluster_analysis-Hierarchical.pdf |}}|Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Naretto | | | | 14.10 | Canceled: No Lecture| | | | |
| |12. | 21.10 | Clustering: k-means variants and density Based approaches|{{ :digitalhealth:11-basic_cluster_analysis-kmeans-variants.pdf |}} {{ :digitalhealth:10-basic_cluster_analysis-dbscan.pdf |}} |Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Monreale | | |9. | 17.10 | Clustering Validity + Data Warehouse | {{ :digitalhealth:6-dw.pdf |}} | | Monreale | |
| |13. | 23.10 | Clustering: Validity|{{ :digitalhealth:12-basic_cluster_analysis-validity.pdf |}} |Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Monreale | | |10. | 20.10 | Data Warehouse| {{ :digitalhealth:6-dw.pdf |}}| | Monreale| |
| |14. | 25.10 | Clustering and similarity for Images| {{ :digitalhealth:3.2-Clustering_images.pdf |}}| | | Naretto | | |11. | 21.10 | Data Warehouse + PowerBI Demo|Same Slides of the previous lecture | | Monreale| |
| |15. | 28.10 | Clustering and similarity for Time Series| {{ :digitalhealth:8_time_series_similarity_2024.pdf |}}| Time Series Analysis and Its Applications. Robert H. Shumway and David S. Stoffer. 4th edition| | Naretto | | |12. | 22.10 Room Lab I| Pre-processing for Image | {{ :digitalhealth:2.1-data-understanding_images.pdf |}}| | Naretto| |
| |16. | 30.10 | Python Lab: Clustering| {{ :digitalhealth:clustering_diabetes.zip|}} {{ :digitalhealth:images_similarity.zip|}} {{ :digitalhealth:timeseries_similarity_clustering.zip|}} {{ :digitalhealth:clustering_tabular_tips.zip|}}| | | Naretto | | |13. | 24.10 |Python Lab: Clustering | {{ :digitalhealth:clustering_diabetes.zip}} | |Naretto | |
| |17. | 04.11 | Python Lab: Clustering + Frequent Pattern Mining| {{ :digitalhealth:17_association_analysis.pdf |}}| | | Naretto, Monreale | | |14. | 27.10 | Pre-processing for Image | {{ :digitalhealth:2.1-data-understanding_images.pdf |}} {{ :digitalhealth:kernel1.pdf |}} |Digital Image processing (Gonzalez, Woods) | Naretto| |
| |18. | 06.11 | Frequent Pattern Mining| same slides as previous lecture| | |Monreale | | |15. | 28.10 |Time series pre-processing | {{ :digitalhealth:5-data-understanding_ts.pdf|}}| |Monreale | |
| |19. | 08.11 | Sequential Pattern Mining| {{ :digitalhealth:18_sequential_patterns_2024.pdf |}}| | |Monreale | | |16. | 31.10 |Time series pre-processing, similarities and project presentation (task n.2) | {{ :digitalhealth:8_time_series_similarity_2024.pdf |}} {{digitalhealth:timeseries_similarity_clustering.zip}}| |Naretto | |
| |20. | 11.11 | Python lab: FPM + SPM|{{ :digitalhealth:AR_SPM.zip |}} | | |Naretto | | |17. | 03.11 |Image clustering and presentation of the project (task 3)| {{ :digitalhealth:3.2-clustering_images.pdf |}} {{ :digitalhealth:ecg-first-analysis.ipynb.zip}}| |Naretto | |
| |21. | 13.11 | Classification for tabular| | | |Naretto | | |18. | 04.11 |Image clustering |{{ :digitalhealth:3.2-clustering_images.pdf |}} {{digitalhealth:images_similarity.zip}}| |Naretto | |
| |22. | 15.11 | Classification for tabular|{{ :digitalhealth:10-KNN.pdf |}} | | |Naretto | | |19. | 07.11 |Project work | | |Monreale, Naretto | |
| |23. | 18.11 | Classification for tabular|{{ :digitalhealth:10-lg.pdf |}} | | |Naretto | | |20. | 10.11 |KNN and Logistic regression | {{ :digitalhealth:10-KNN.pdf |}} {{ :digitalhealth:10-lg2025.pdf |}} | |Naretto | |
| |24. | 20.11 | Project| | | |Monreale, Naretto | | |21. | 11.11 |LG |{{ :digitalhealth:10-lg2025.pdf |}} | |Naretto | |
| |25. | 22.11 | Classification for tabular|{{ :digitalhealth:10-Rule-Based-Classifiers.pdf |}} {{ :digitalhealth:11_2021-naive_bayes.pdf|}}| | |Naretto | | |22. | 14.11 |Rule-based classifier |{{ :digitalhealth:10-Rule-Based-Classifiers.pdf |}} | |Naretto | |
| |26. | 25.11 | Classification for tabular|{{ :digitalhealth:13_ensemble_2023.pdf |}}| | |Naretto | | |23. | 17.11 | Naive Bayes | {{:digitalhealth:11_2021-naive_bayes.pdf}}| |Naretto | |
| |27. | 27.11 | Python Lab: Classification & Project Work| | |Naretto | | |24. | 18.11 | Naive Bayes and Ensemble methods| {{:digitalhealth:11_2021-naive_bayes.pdf}} {{:digitalhealth:13_ensemble_2023.pdf}}| |Naretto | |
| |28. | 29.11 | Time series classification via shapelets and Motif Discovery| {{ :digitalhealth:23_time_series_motif-2024.pdf |Shapelets & Motifs}}| |Monreale | | |25. | 21.11 | Ensemble methods | {{:digitalhealth:13_ensemble_2023.pdf}}| |Naretto | |
| | |26. | 24.11 |Imbalanced learning | {{:digitalhealth:imbalanced-learning.pdf}}| | | |
| | |27. | 25.11 | Python Lab on classification and presentation of the project, task 4|{{:digitalhealth:classification-diabetes.ipynb.zip}}{{:digitalhealth:imbalanced-classification.zip}} | |Naretto | |
| | |28. | 28.11 |GSP and Apriori |{{:digitalhealth:18_sequential_patterns_2024.pdf}} | |Monreale | |
| | |29. | 01.12 |GSP and Apriori |{{:digitalhealth:17_association_analysis.pdf}} | |Monreale | |
| | |30. | 02.12 |Time series |{{:digitalhealth:23_time_series_motif-2024.pdf}}{{:digitalhealth:matrixprofile.pdf}} {{:digitalhealth:shapelets.pdf}} | |Monreale | |
| | |31. | 05.12 |Time series lab |{{:digitalhealth:23_time_series_motif-2024.pdf}} | |Naretto | |
| | |32. | 09.12 | | | | | |
| | |33. | 12.12 | | | | | |
| | |34. | 15.12 | | | | | |
| | |35. | 16.12 | | | | | |
| | |36. | 19.12 | Project CHECK - mandatory | | | | |
| |
| ====== Exams ====== | |
| **Project ** | |
| |
| A project consists of data analyses based on the use of data mining tools. | |
| The project has to be carried out by a team of 2 students using Python. The guidelines require specific tasks to be addressed. Results must be reported in a single paper of at most 25 pages of text, including figures. Students must deliver both the paper (single column) and well-commented Python notebooks. | |
| |
| * First part of the project consists in the **assignments** described here: {{ :digitalhealth:data_analytics_for_digital_health_project_du_cl.pdf |Project Description: DU and Clustering}} | ====== Exams ====== |
| - **Dataset**:[[https://unipiit.sharepoint.com/:f:/s/a__td_65366/EmBuMOhFMZVMiWh9cZT_VrMB5DJ8Xf6s5w4m0bPmwx_5jA?e=PvGLMs|Dataset Material]] | The exam consists of: a **group project** (in teams of two or three) and an **oral exam** that includes a discussion of the project and an assessment of the theoretical knowledge acquired, for those who complete the project during the course and meet all intermediate and final deadlines set by the instructors. |
| - **Deadline**: the first part has to be delivered by ** December 2nd, 2024 **. The delivery will be through a Teams assignment | |
| | |
| |
| | Alternatively, students who do not complete or submit the project within the established deadlines will be required to take a **written exam** and an **oral exam** covering all course topics. |
| |
| **Students who did not deliver the above project by Dec 31, 2024 need to request a new project from the teachers by email. The assigned project will require about 20 days of work and, after delivery, will be discussed during the oral exam.** | |
| |
| | **PROJECT** |
| |
| **Oral Exam** | A project consists of data analyses based on the use of data mining tools. |
| * **Project presentation** (with slides) – 15 minutes: mandatory for all students, with questions to check understanding of the details of any part of the project. | The project has to be carried out by a team of 2 or at most 3 students using Python. The guidelines require specific tasks to be addressed. Results must be reported in a single paper of at most 25 pages of text, including figures. Students must deliver both the paper (single column) and well-commented Python notebooks. |
| * ** Open questions on the entire program ** | |
| | |
| **How to book for the exam colloquium? ** | |
| | |
| In https://esami.unipi.it/ you can find the dates for the exam: one for January and one for February. Each student must register for one of the 2 dates. These are not the dates of the colloquium or project delivery, but we will use the list of registered students to organize the exam dates. After that deadline we will share a calendar for the oral exam. | |
| |
| | ====== Previous years ====== |
| | [[DAD 2024-2025]] |
| |
| |