dm:start [08/08/2024 at 12:38 (11 months ago)] – Salvatore Ruggieri | dm:start [19/05/2025 at 09:04 (7 weeks ago)] (current version) – [Second Semester (DM2 - Data Mining: Advanced Topics and Applications)] Riccardo Guidotti |
---|
====== Data Mining A.A. 2023/24 ====== | ====== Data Mining A.A. 2024/25 ====== |
| |
===== DM1 - Data Mining: Foundations (6 CFU) ===== | ===== DM1 - Data Mining: Foundations (6 CFU) ===== |
* Meeting: https://calendly.com/andreafedele/ | * Meeting: https://calendly.com/andreafedele/ |
====== News ====== | ====== News ====== |
| * **[11.03.2025]** The lecture of DM2 planned for 14/03/2025 will be held in Room C instead of Room E. |
* **[24.05.2024]** When registering for the oral exam, please write "DM1" in the notes if you do not want to take DM2 (taking DM2 as well is assumed by default). After booking, please contact Prof. Pedreschi to agree on the exam date (put Prof. Guidotti and Andrea Fedele in cc). There will be no agenda for DM1. | * *[04.03.2025]* The sixth lecture of DM2 planned for 04/03/2025 will be in Room C instead of Room E. |
* [03.05.2024] The next lecture of DM2 will be as usual on Monday 06/05 from 9 to 11 in room C. | * [27.01.2025] The first lecture of DM2 will be held on 18.02.2025 in Room A1, swapping slots with S4DS, which will be held on 17.02.2025 in Room E. |
* [19.01.2024] DM2 Lectures will start on Mon 19/02; only for that lecture the time will be 14-16 instead of 9-11. | * [07.01.2025] Exams Registration Instructions for DM1 (second term): |
* [13.10.2023] To schedule a meeting with the Teaching Assistant you can use: https://calendly.com/andreafedele/ | - Use the Google registration form: [[https://forms.gle/NuAvCa3YK2h8MgrX7|here]] before 23/01/2025. |
* [20.09.2023] Recordings of the lectures can be found on the web pages of the course for the years 2020/2021 and 2021/2022 (see links at the bottom of this page) | - When the registration closes you will receive a link to the Agenda |
* [20.09.2023] There will be no lecture on Thursday 21 September. | - Register on the Agenda selecting day and time (do not change your choice or cancel: if you book, you are expected to take the exam) |
* [11.09.2023] Lectures will start on Monday 18 September 2023 at 11.00 room C1. | - Submit the project at least 1 week before the day you selected in the Agenda. |
* [11.09.2023] Lectures will be held in person only. Recordings of past years' lectures can be found at the bottom of this web page. | * [03.12.2024] This year's lectures available at [[https://unipiit-my.sharepoint.com/:f:/g/personal/a_fedele7_studenti_unipi_it/Er7vET5iUWtGhScjXe7XzHUBDd3aYv8j87VYil6moFVyzw|link]] |
* [11.09.2023] Project Groups [[https://docs.google.com/spreadsheets/d/10R5AcqdlXsqTAxSys6zyqArvdytq4HH6Ik8Uy-NHkQ4/edit?usp=sharing|link]] | * [07.09.2024] Past years' lectures available at [[https://unipiit-my.sharepoint.com/:f:/g/personal/a_fedele7_studenti_unipi_it/EkecHQpnojVLqX0OqTlfrbMBBRMFbIJfNCw_RdFPN2276g?e=Y2uIcu|link]] |
* [11.09.2023] MS Teams [[https://teams.microsoft.com/l/team/19%3a7uEgK_aekrBFuOsbREccAa-tfqeSwvfBemfK_lG6HA01%40thread.tacv2/conversations?groupId=84cc4fec-41fc-4208-a9d4-a02675216d22&tenantId=c7456b31-a220-47f5-be52-473828670aa1|link]] | * [02.09.2024] Lectures will start on Monday 30 September 2024 at 11.00 room C1. |
| * [02.09.2024] Lectures will be held in person only. Recordings of past years' lectures can be found at the bottom of this web page. |
| * [02.09.2024] Project Groups [[https://docs.google.com/spreadsheets/d/1RFWIwKM5Myaehh4tHceaf3olMYm_CktGvoNOFX2Oovc/edit?usp=sharing|link]] |
| * [11.09.2023] MS Teams [[https://teams.microsoft.com/l/team/19%3AMMVIsw09XAOGOcd8-D8dKmNUO2hKXsFKpgkOoiFnwJM1%40thread.tacv2/conversations?groupId=3f7fd5a7-5c84-4930-92e4-0704013877f2&tenantId=c7456b31-a220-47f5-be52-473828670aa1|link]] |
====== Learning Goals ====== | ====== Learning Goals ====== |
* DM1 | * DM1 |
^ Day of Week ^ Hour ^ Room ^ | ^ Day of Week ^ Hour ^ Room ^ |
| Monday | 11:00 - 13:00 | C1 | | | Monday | 11:00 - 13:00 | C1 | |
| Wednesday | 11:00 - 13:00 | C1 | | | Tuesday | 14:00 - 16:00 | C1 | |
| |
**Office hours - Ricevimento:** | **Office hours - Ricevimento:** |
| |
* Prof. Pedreschi | * Prof. Pedreschi |
* Monday 16:00 - 18:00 | * TBD |
* Online | * Online |
* Prof. Guidotti | * Prof. Guidotti |
* Tuesday 16:00 - 18:00 or Appointment by email | * Thursday 16:00 - 18:00 or Appointment by email |
* Room 363 Dept. of Computer Science or MS Teams | * Room 363 Dept. of Computer Science or MS Teams |
| |
| |
^ Day of Week ^ Hour ^ Room ^ | ^ Day of Week ^ Hour ^ Room ^ |
| Monday | 09:00 - 11:00 | C | | | Monday | 11:00 - 13:00 | E | |
| Wednesday | 11:00 - 13:00 | C | | | Wednesday | 09:00 - 11:00 | E | |
| |
**Office Hours - Ricevimento:** | **Office Hours - Ricevimento:** |
* Didactic Data Mining [[http://matlaspisa.isti.cnr.it:5055/Help| DDMv1]], [[https://kdd.isti.cnr.it/ddm/#/| DDMv2]] | * Didactic Data Mining [[http://matlaspisa.isti.cnr.it:5055/Help| DDMv1]], [[https://kdd.isti.cnr.it/ddm/#/| DDMv2]] |
| |
====== Class Calendar (2023/2024) ====== | ====== Class Calendar (2024/2025) ====== |
| |
===== First Semester (DM1 - Data Mining: Foundations) ===== | ===== First Semester (DM1 - Data Mining: Foundations) ===== |
| |
^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ |
|01.| 18.09.2023 | 11-13 |C1| Overview, Introduction | {{ :dm:00_dm1_introduction_2023_24.pdf | Intro}} | Pedreschi| | | | 16.09.2024 | | | No Lecture | | | |
| | 20.09.2023 | 11-13 | | No Lecture | | | | | | 17.09.2024 | | | No Lecture | | | |
|02.| 25.09.2023 | 11-13 |C1| Lab. Introduction to Python | {{ :dm:dm1_lab01_python_basics.zip | Python Basic}} | Guidotti| | | | 23.09.2024 | | | No Lecture | | | |
|03.| 27.09.2023 | 11-13 |C1| Lab. Data Understanding | {{ :dm:dm1_lab02_data_understanding.zip | Data Understanding}} | Guidotti| | | | 24.09.2024 | | | No Lecture | | | |
|04.| 02.10.2023 | 11-13 |C1| Data Understanding | {{ :dm:01_dm1_data_understanding_2023_24.pdf | Data Understanding}} | Guidotti| | |01.| 30.09.2024 | 11-13 |C1| Overview, Introduction | {{ :dm:00_dm1_introduction_2024_25.pdf | Intro}} | Pedreschi| |
|05.| 04.10.2023 | 11-13 |C1| Data Understanding & Preparation | {{ :dm:01_dm1_data_understanding_2023_24.pdf | Data Understanding}}, {{ :dm:02_dm1_data_preparation_2023_24.pdf | Data Preparation}} | Pedreschi| | |02.| 01.10.2024 | 14-16 |C1| Lab. Introduction to Python | {{ :dm:dm1_lab01_python_basics_2024_25.zip | Python Basics}} | Pedreschi| |
|06.| 09.10.2023 | 11-13 |C1| Data Preparation & Data Similarity | {{ :dm:02_dm1_data_preparation_2023_24.pdf | Data Preparation}}, {{ :dm:03_dm1_data_similarity_2023_24.pdf | Data Similarity}} | Pedreschi| | |03.| 07.10.2024 | 11-13 |C1| Data Understanding | {{ :dm:01_dm1_data_understanding_2024_25.pdf | Data Understanding}} | Pedreschi| |
|07.| 11.10.2023 | 11-13 |C1| Data Similarity & Lab. Data Understanding | {{ :dm:03_dm1_data_similarity_2023_24.pdf | Data Similarity}}, {{ :dm:dm1_lab02_data_understanding.zip | Data Understanding}} | Pedreschi| | |04.| 08.10.2024 | 14-16 |C1| Data Understanding & Preparation | {{ :dm:01_dm1_data_understanding_2024_25.pdf | Data Understanding}}, {{ :dm:02_dm1_data_preparation_2024_25.pdf | Data Preparation}} | Pedreschi| |
|08.| 16.10.2023 | 11-13 |C1| Introduction to Clustering, K-Means | {{ :dm:04_dm1_clustering_intro_2023_24.pdf | Intro_Clustering}}, {{:dm:05_dm1_kmeans_2023_24.pdf | K-Means }} | Pedreschi| | |05.| 14.10.2024 | 11-13 |C1| Data Preparation & Similarity | {{ :dm:02_dm1_data_preparation_2024_25.pdf | Data Preparation}}, {{ :dm:03_dm1_data_similarity_2024_25.pdf | Data Similarity}} | Pedreschi| |
|09.| 18.10.2023 | 11-13 |C1| Clustering Validation, Hierarchical Clustering | {{ :dm:04_dm1_clustering_intro_2023_24.pdf | Intro_Clustering}}, {{ :dm:06_dm1_hierarchical_clustering_2023_24.pdf | Hierarchical}} | Pedreschi| | |06.| 15.10.2024 | 14-16 |C1| Lab. Data Understanding | {{ :dm:dm1_lab02_data_understanding.zip | Data Understanding}}| Pedreschi| |
|10.| 23.10.2023 | 11-13 |C1| Density-based Clustering | {{ :dm:07_dm1_density_based_2023_24.pdf | Density-based Clustering}} | Pedreschi| | |07.| 21.10.2024 | 11-13 |C1| Introduction to Clustering, K-Means | {{ :dm:04_dm1_clustering_intro_2024_25.pdf | Intro Clustering}}, {{:dm:05_dm1_kmeans_2024_25.pdf | K-Means }} | Pedreschi| |
|11.| 25.10.2023 | 11-13 |C1| Lab. Clustering | {{ :dm:dm1_lab03_clustering.zip | Clustering}}| Guidotti| | |08.| 22.10.2024 | 14-16 |C1| Centroid-based Clustering | {{:dm:05_dm1_kmeans_2024_25.pdf | K-Means }} | Pedreschi| |
|12.| 30.10.2023 | 11-13 |C1| Ex. Clustering | {{ :dm:ex1_dm1_clustering_2023_24.pdf | ExClustering}}| Guidotti| | |09.| 28.10.2024 | 11-13 |C1| Hierarchical Clustering & Density-based Clustering | {{ :dm:06_dm1_hierarchical_clustering_2024_25.pdf | Hierarchical Clustering}}, {{ :dm:07_dm1_density_based_2024_25.pdf | Density-based Clustering}} | Pedreschi| |
| | 01.11.2023 | 11-13 | | No Lecture | | | | |10.| 29.10.2024 | 14-16 |C1| Lab. Clustering | {{ :dm:dm1_lab03_clustering.zip | Clustering}}| Pedreschi| |
|13.| 06.11.2023 | 11-13 |C1| Intro Classification, kNN[[https://unipiit.sharepoint.com/sites/a__td_61280/Shared%20Documents/General/Recordings/Lecture%2006_11_2023-20231106_110052-Registrazione%20della%20riunione.mp4?web=1|(video)]] | {{ :dm:08_dm1_classification_intro_2023_24.pdf | Intro_Classification}}, {{ :dm:09_dm1_knn_2023_24.pdf | kNN}}| Guidotti| | |11.| 04.11.2024 | 11-13 |C1| Ex. Clustering | {{ :dm:ex1_dm1_clustering_2023_24.pdf | ExClustering}}| Guidotti| |
|14.| 08.11.2023 | 11-13 |C1| Naive Bayes, Exercises | {{ :dm:10_dm1_naive_bayes_2023_24.pdf | Naive Bayes}} | Guidotti| | |12.| 05.11.2024 | 14-16 |C1| Intro Classification & kNN | {{ :dm:08_dm1_classification_intro_2024_25.pdf | Intro Classification}}, {{ :dm:09_dm1_knn_2024_25.pdf | kNN}} | Guidotti| |
|15.| 13.11.2023 | 11-13 |C1| Model Evaluation | {{ :dm:11_dm1_classification_eval_2023_24.pdf | Model Evaluation}} | Guidotti| | |13.| 11.11.2024 | 11-13 |C1| Naive Bayes, Exercises | {{ :dm:10_dm1_naive_bayes_2024_25.pdf | Naive Bayes}} | Guidotti| |
|16.| 15.11.2023 | 11-13 |C1| Model Evaluation Exercises & Lab | {{ :dm:dm1_lab04_classification_regression.zip | Classification}} | Guidotti| | |14.| 12.11.2024 | 14-16 |C1| Model Evaluation, Lab. Classification (kNN,NB) | {{ :dm:11_dm1_classification_eval_2024_25.pdf | Model Evaluation}}, {{ :dm:dm1_lab04_classification.zip | Classification}} | Guidotti| |
| | 20.11.2023 | 11-13 | | No Lecture | | | | |15.| 14.11.2024 | 9-11 |C1| Decision Tree Classifier | {{ :dm:12_dm1_decision_trees_2024_25.pdf | Decision Tree}} | Guidotti| |
|17.| 22.11.2023 | 11-13 |C1| Decision Tree Classifier | {{ :dm:12_dm1_decision_trees_2023_24.pdf | Decision Tree}} | Pedreschi| | |16.| 18.11.2024 | 11-13 |C1| Decision Tree Classifier | {{ :dm:12_dm1_decision_trees_2024_25.pdf | Decision Tree}} | Guidotti| |
|18.| 27.11.2023 | 11-13 |C1| Decision Tree Classifier | {{ :dm:12_dm1_decision_trees_2023_24.pdf | Decision Tree}} | Pedreschi| | |17.| 19.11.2024 | 14-16 |C1| Decision Tree Classifier | {{ :dm:12_dm1_decision_trees_2024_25.pdf | Decision Tree}} | Guidotti| |
|19.| 29.11.2023 | 11-13 |C1| Exercises and Lab. Decision Tree Classifier | {{ :dm:dm1_lab04_classification.zip | Decision Tree}} | Guidotti| | |18.| 21.11.2024 | 9-11 |C1| Decision Tree Classifier Exercises and Lab | {{ :dm:12_dm1_decision_trees_2024_25.pdf | Decision Tree}}, {{ :dm:dm1_lab04_classification.zip | Classification}} | Guidotti| |
|20.| 04.12.2023 | 11-13 |C1| Decision Tree Classifier, Exercises and Lab | {{ :dm:12_dm1_decision_trees_2023_24.pdf | Decision Tree}} | Pedreschi| | |19.| 25.11.2024 | 11-13 |C1| Regression & Lab. Regression | {{ :dm:13_dm1_linear_regression_2024_25.pdf | Regression}}, {{ :dm:dm1_lab05_regression.zip | Regression}}, {{ :dm:dm1_2425_imdb_rating.zip | IMDb Rating}} | Guidotti| |
|21.| 06.12.2023 | 11-13 |C1| Intro Regression & Lab. Regression | {{ :dm:12_dm1_linear_regression_2023_24.pdf | Regression}}, {{ :dm:dm1_lab05_regression.zip | Regression}} | Guidotti| | |20.| 26.11.2024 | 14-16 |C1| Intro Pattern Mining and Apriori | {{ :dm:14_dm1_pattern_mining_2024_25.pdf | Pattern Mining}} | Pedreschi| |
|22.| 11.12.2023 | 11-13 |C1| Intro Pattern Mining and Apriori | {{ :dm:13_dm1_pattern_mining_2023_24.pdf | Pattern Mining}} | Pedreschi| | |21.| 28.11.2024 | 9-11 |C1| Apriori & FP-Growth | {{ :dm:14_dm1_pattern_mining_2024_25.pdf | Pattern Mining}} | Guidotti| |
|23.| 13.12.2023 | 16-18 |C1| Apriori & Lab. Pattern Mining | {{ :dm:13_dm1_pattern_mining_2023_24.pdf | Pattern Mining}}, {{ :dm:dm1_lab06_pattern_mining.zip | Pattern Mining}} | Pedreschi| | |22.| 02.12.2024 | 11-13 |C1| Lab. Pattern Mining & Exercises | {{ :dm:14_dm1_pattern_mining_2024_25.pdf | Pattern Mining}}, {{ :dm:dm1_lab06_pattern_mining.zip | Pattern Mining}} | Guidotti| |
|24.| 18.12.2023 | 11-13 |C| FP-Growth and Exercises | {{ :dm:13_dm1_pattern_mining_2023_24.pdf | Pattern Mining}} | Guidotti| | |23.| 03.12.2024 | 14-16 |C1| Rule-based Classifiers | {{ :dm:15_dm1_rule_based_classifier_2024_25.pdf | Rule-based Classifiers}} | Guidotti| |
| |24.| 05.12.2024 | 9-11 |C1| FP-Growth Exercises & Project Discussion | | Guidotti| |
===== Second Semester (DM2 - Data Mining: Advanced Topics and Applications) ===== | ===== Second Semester (DM2 - Data Mining: Advanced Topics and Applications) ===== |
| |
^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ |
|01.| 19.02.2024 | 14-16 |C| Overview, Rule-based Models | {{ :dm:14_dm2_intro_2023_24.pdf | Introduction}}, {{ :dm:dm2_project_guidelines_23_24.pdf | Guidelines}}, {{ :dm:15_dm2_rule_based_classifier_2023_24.pdf | Rule-based Models }} | Guidotti| | |01.| 18.02.2025 | 14-16 |A1| Overview, Imbalanced Learning | {{ :dm:16_dm2_intro_2024_25.pdf | Introduction}}, {{ :dm:dm2_project_guidelines_24_25.pdf | Guidelines}}, {{ :dm:17_dm2_imbalanced_learning_2024_25.pdf | Imbalanced Learning}}, [[https://unipiit.sharepoint.com/:v:/s/a__td_64992/EWrX2F6xAS9JtNXh1l5JIgMByAU0eMWBFr5sbGIYL3jakA|Link]] | Guidotti| |
| | 21.02.2024 | | | No Lecture | | | | |02.| 19.02.2025 | 09-11 |E| Dimensionality Reduction (Overview, Random, PCA) | {{ :dm:18_dm2_dimred_2024_25.pdf | Dimensionality Reduction}}, {{ :dm:dm2_lab01_imbalance.zip | LabImbLearn}}, {{ :dm:dm2_lab02_dimred.zip | LabDimRed}}, [[https://unipiit.sharepoint.com/:v:/s/a__td_64992/EXtiUFDI075FsJ5JZxc1AjkBZ06S7MJhVKhVUD8plrEMrg?e=RbH3To|Link]] | Guidotti| |
| | 26.02.2024 | | | No Lecture | | | | |03.| 24.02.2025 | 14-16 |E| Dimensionality Reduction (MDS, tSNE), Outlier Detection (Overview) | {{ :dm:19_dm2_anomaly_detection_2024_25.pdf | Outlier Detection}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EWeMgbhOR0lGjrIRnidGv_QBltzRtyriUeXqMHlOZXq1bA?e=8GN0eC |Link]] | Guidotti| |
|02.| 28.02.2024 | 11-13 |C| Sequential Pattern Mining | {{ :dm:16_dm2_sequential_pattern_mining_2023_24.pdf | Sequential Pattern Mining}}, {{ :dm:GSP.zip | GSP}} | Guidotti| | |04.| 26.02.2025 | 09-11 |E| Outlier Detection (Methods) | {{ :dm:19_dm2_anomaly_detection_2024_25.pdf | Outlier Detection}}, {{ :dm:dm2_lab03_outlier_det.zip |LabOutDet}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EWrBINmEU_9En7HBfFl9RV8Bsfh0gv2_1scH5P0BqDkdXQ |Link]], [[https://unipiit.sharepoint.com/:v:/s/a__td_64992/EW-ssQNGV8JGvCMhhUD-xNUBFrJFU5MKNwBMy8i6u5g5IA?e=Mc3HHN | Link2]] | Guidotti| |
|03.| 04.03.2024 | 9-11 |C| Sequential Pattern Mining | {{ :dm:16_dm2_sequential_pattern_mining_2023_24.pdf | Sequential Pattern Mining}}, {{ :dm:GSP.zip | GSP}} | Guidotti| | |05.| 04.03.2025 | 11-13 |D3| Outlier Detection (Methods) | {{ :dm:19_dm2_anomaly_detection_2024_25.pdf | Outlier Detection}}, {{ :dm:dm2_lab03_outlier_det.zip |LabOutDet}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EY6wbzFDXkRHndtIx90Ayb0B77VTX8mcXQXI_Z3nIOYo9g?e=BJ3CNf |Link]], [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/Ect99g3RSAlDrdTYJgZ2Fm0B2BbVTtr-WqUE3hYjkJuQEA?e=2LnXZG |Link2]] | Guidotti| |
|04.| 06.03.2024 | 11-13 |C| Transactional Clustering | {{ :dm:17_dm2_transactional_clustering_2023_24.pdf | Transactional Clustering}} | Guidotti| | |06.| 05.03.2025 | 09-11 |C| Outlier Detection (Methods), Gradient Descent | {{ :dm:19_dm2_anomaly_detection_2024_25.pdf | Outlier Detection}}, {{ :dm:dm2_lab03_outlier_det.zip |LabOutDet}}, {{ :dm:20_dm2_gradient_descent_2024_25.pdf | GD}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ETgyScDQOW5Dgaoiu_RmQKsBvu7i5AIXn-hItudrFzvg4g?e=yzD6WN |Link]] | Guidotti| |
|05.| 11.03.2024 | 9-11 |C| Time Series Similarity | {{ :dm:18_dm2_time_series_similarity_2023_24.pdf | Time Series Similarity}}, {{ :dm:dm2_lab00_spotify.zip | TS_Load}}, {{ :dm:dm2_lab01_dist_transf.zip | TS_Similarity}} | Guidotti| | |07.| 10.03.2025 | 11-13 |E| Maximum Likelihood Estimation, Odds, Log Odds, Logistic Regression | {{ :dm:21_dm2_maximum_likelihood_estimation_2024_25.pdf | MLE}}, {{ :dm:22_dm2_odds_2024_25.pdf | Odds}}, {{ :dm:23_dm2_logistic_regression_2024_25.pdf | LogReg}}, {{ :dm:dm2_lab04_logistic_reg.zip | LabLogReg}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ETgyScDQOW5Dgaoiu_RmQKsBvu7i5AIXn-hItudrFzvg4g?e=yzD6WN |Link]], [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ETgyScDQOW5Dgaoiu_RmQKsBvu7i5AIXn-hItudrFzvg4g?e=yzD6WN |Link2]] | Guidotti| |
|06.| 13.03.2024 | 11-13 |C| Time Series Approximation | {{ :dm:19_dm2_time_series_clustering_approximation_2023_24.pdf | Time Series Clustering}}, {{ :dm:dm2_lab02_approx_clust.zip | TS_Approx_Clustering}} | Guidotti| | |08.| 12.03.2025 | 09-11 |E| Support Vector Machines | {{ :dm:24_dm2_svm_2024_25.pdf | SVM}}, {{ :dm:dm2_lab05_svm.zip | LabSVM}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EU10T1Ogn7pHkUMRB0t3iGgBAJE_TX_FnXtmH2_w3w95Pw?e=wBg4Kn |Link]], [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EVn8waCZketMmS5jE3m-u5sBZHZYTbvAx87DGPgynFMv0g?e=0ohaLq |Link2]] | Guidotti| |
|07.| 18.03.2024 | 9-11 |C| Time Series Clustering & Motifs| {{ :dm:20_dm2_time_series_matrix_profile_2023_24.pdf | Time Series Motifs}}, {{ :dm:dm2_lab03_motifs.zip | TS_Motifs}} | Guidotti| | |09.| 17.03.2025 | 11-13 |E| Neural Networks, Linear Perceptron | {{ :dm:25_dm2_perceptron_2024_25.pdf | Neural Network}}, {{ :dm:dm2_lab06_neural_networks.zip | LabNN}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EYpWxyzOb2BOriv-qzdF1LgBULySoNUU5qS18_Kj3D1pTg?e=lpdcyq |Link]] | Guidotti| |
|08.| 20.03.2024 | 11-13 |C| Time Series Classification | {{ :dm:21_dm2_time_series_classification_2023_24.pdf | Time Series Classification}}, {{ :dm:dm2_lab04_classification.zip | TS_Classification}} | Guidotti| | |10.| 19.03.2025 | 09-11 |E| Deep Neural Networks | {{ :dm:26_dm2_neural_network_2024_25.pdf | Deep Neural Network}}, {{ :dm:dm2_lab06_neural_networks.zip | LabNN}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EXn8ZmDM0q5FuqqGSe8VabYBS2jmjjhlS_Z5J8gqNaW-QQ?e=vg5QWG |Link]] | Guidotti| |
|09.| 25.03.2024 | 9-11 |C| Imbalanced Learning | {{ :dm:22_dm2_imbalanced_learning_2023_24.pdf | Imbalanced Learning}}, {{ :dm:dm2_lab05_imbalance.zip |ImbLearn}} | Guidotti| | |11.| 24.03.2025 | 11-13 |E| Ensemble Methods | {{ :dm:27_dm2_ensemble_2024_25.pdf | Ensemble Methods}}, {{ :dm:dm2_lab07_ensemble.zip |LabEnsemble}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/Efwq4uqQptVGtUFMhzy1_-UBhGUu6dii-4ZextmLvHscug?e=me0tex |Link]] | Guidotti| |
|10.| 27.03.2024 | 11-13 |C| Dimensionality Reduction | {{ :dm:23_dm2_dimred_2023_24.pdf | Dimensionality Reduction}}, {{ :dm:dm2_lab06_dimred.zip |DimRed}} | Guidotti| | |12.| 26.03.2025 | 09-11 |E| Ensemble Methods | {{ :dm:27_dm2_ensemble_2024_25.pdf | Ensemble Methods}}, {{ :dm:28_dm2_gradient_boost_2024_25.pdf | Gradient Boosting}}, {{ :dm:dm2_lab07_ensemble.zip |LabEnsemble}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EXvE6wTpN2BInKpz7FjWjXQBnBGDO64a6yv73rQ7QOCwmg?e=bQYffK |Link]] | Guidotti| |
|11.| 03.04.2024 | 11-13 |C| Outlier Detection | {{ :dm:24_dm2_anomaly_detection_2023_24.pdf | Outlier Detection}} | Guidotti| | |13.| 31.03.2025 | 11-13 |E| Ensemble Methods | {{ :dm:27_dm2_ensemble_2024_25.pdf | Ensemble Methods}}, {{ :dm:28_dm2_gradient_boost_2024_25.pdf | Gradient Boosting}}, {{ :dm:dm2_lab07_ensemble.zip |LabEnsemble}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EQkGUk8URIdDjeaCv-1f4CYBtRSZkUqtdGAKu8C3Lz8QAg?e=oNkk1u |Link]], [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ETNVlrEbWQFLtcXIh6yPTRoBzjfiWxA_4uNuUr3K7rglhA?e=7wkTSL |Link]] | Guidotti| |
|12.| 08.04.2024 | 9-11 |C| Outlier Detection | {{ :dm:24_dm2_anomaly_detection_2023_24.pdf | Outlier Detection}}, {{ :dm:dm2_lab07_outlier_det.zip | OutlierDetection}} | Guidotti| | |14.| 02.04.2025 | 09-11 |E| Explainable Artificial Intelligence | {{ :dm:29_dm2_explainability_2024_25.pdf | XAI}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EZm67TidbZtIs5iscelNLNkBoL5Ou1PZ7O-rO14H0GATfw?e=R97gfd |Link]] | Guidotti| |
|13.| 10.04.2024 | 11-13 |C| Outlier Detection | {{ :dm:24_dm2_anomaly_detection_2023_24.pdf | Outlier Detection}}, {{ :dm:dm2_lab07_outlier_det.zip | OutlierDetection}} | Guidotti| | |15.| 07.04.2025 | 11-13 |E| Explainable Artificial Intelligence | {{ :dm:29_dm2_explainability_2024_25.pdf | XAI}}, {{ :dm:dm2_lab08_xai.zip | LabXAI}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ERnvQmqb59NCkgrLpyd85w4BZYFyxX0XnsOLk0GwqTEgyQ?e=qrmnjs |Link]] | Guidotti| |
|14.| 15.04.2024 | 14-16 |C| Gradient Descent, MLE | {{ :dm:25_dm2_gradient_descent_2023_24.pdf | GD}}, {{ :dm:26_dm2_maximum_likelihood_estimation_2023_24.pdf | MLE}} | Guidotti| | |16.| 09.04.2025 | 09-11 |E| Transactional Clustering | {{ :dm:30_dm2_transactional_clustering_2024_25.pdf | Transactional Clustering}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EeRyorJxnYtLh83YVPZ66xwBkMrr14kmDGhrscX5NuTXTw?e=hOb1zx |Link]] | Guidotti| |
|15.| 17.04.2024 | 11-13 |C| Odds, LogOdds, Logistic Regression| {{ :dm:27_dm2_odds_2023_24.pdf | Odds}}, {{ :dm:28_dm2_logistic_regression_2023_24.pdf | LogReg}}, {{ :dm:dm2_lab08_logistic_reg.zip | LogReg}} | Guidotti| | |17.| 14.04.2025 | 11-13 |C| Sequential Pattern Mining | {{ :dm:31_dm2_sequential_pattern_mining_2024_25.pdf | GSP}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/Efl8bH67zQ1GnSoOhGLLKgcBzF14ZdF3opb8mi-WV0KXJg?e=akHpP4 |Link]] | Guidotti| |
|16.| 22.04.2024 | 9-11 |C| Support Vector Machine | {{ :dm:29_dm2_svm_2023_24.pdf | SVM}}, {{ :dm:dm2_lab09_svm.zip | SVM}} | Guidotti| | |18.| 16.04.2025 | 9-11 |C| Sequential Pattern Mining | {{ :dm:31_dm2_sequential_pattern_mining_2024_25.pdf | GSP}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ERycsmADkv5Nhq8xbRadgSwBa3EWHdtBIZJhYlj1grEKCQ?e=hKYcni |Link]] | Guidotti| |
|17.| 24.04.2024 | 11-13 |C| Perceptron, Neural Networks| {{ :dm:30_dm2_perceptron_2023_24.pdf | Perceptron}} | Guidotti| | |19.| 28.04.2025 | 11-13 |E| Time Series - Intro & Preprocessing |{{ :dm:32_dm2_time_series_preprocessing_2024_25.pdf | TS_Preprocessing}}, {{ :dm:dm2_lab09_ts_preprocessing.zip | LabTS_Prep}}, Video Missing | Guidotti| |
|18.| 29.04.2024 | 9-11 |C| Deep Neural Networks | {{ :dm:31_dm2_neural_network_2023_24.pdf | Deep Neural Networks}}, {{ :dm:dm2_lab10_neural_networks.zip | NN}} | Guidotti| | |20.| 30.04.2025 | 09-11 |E| Time Series - Similarities & Distances | {{ :dm:33_dm2_time_series_similarity_2024_25.pdf | TS_Similarity}}, {{ :dm:dm2_lab10_ts_dist.zip | LabTS_Sim}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ER3DhpHdMqBArNpEA1NKPgcBZysjQTGU3W9HTc6bzaaCCQ?e=Aeh7CD |Link]] | Guidotti| |
|19.| 06.05.2024 | 9-11 |C| CNN, RNN, DL-TS, Ensemble Intro | {{ :dm:31_dm2_neural_network_2023_24.pdf |DNN}}, {{ :dm:21_dm2_time_series_classification_2023_24.pdf | TSC-DNN}}, {{ :dm:32_dm2_ensemble_2023_24.pdf | Ensemble}} | Guidotti| | |21.| 05.05.2025 | 09-11 |E| Time Series - Approximation & Clustering | {{ :dm:34_dm2_time_series_approximation_clustering_2024_25.pdf | TS_ApproxClustering}}, {{ :dm:dm2_lab11_ts_approx_clustering.zip | LabTS_ApproxClustering}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/ER3DhpHdMqBArNpEA1NKPgcBZysjQTGU3W9HTc6bzaaCCQ?e=Aeh7CD |Link]] | Guidotti| |
|20.| 08.05.2024 | 11-13 |C| Ensemble, Boosting, Adaboost | {{ :dm:32_dm2_ensemble_2023_24.pdf | Ensemble}}, {{ :dm:dm2_lab11_ensamble.zip | LabEnsemble}} | Guidotti| | |22.| 07.05.2025 | 11-13 |E| Time Series - Matrix Profile |{{ :dm:35_dm2_time_series_matrix_profile_2024_25.pdf | TS_MatrixProfile}}, {{ :dm:dm2_lab12_ts_matrixprofile.zip | LabTS_MatrixProfile}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EdtlOrd_oFRPjiOmAx-eKd0BvpEh1j9xriGhEgN8Vv0bkw?e=FpcIGt |Link]] | Guidotti| |
|21.| 13.05.2024 | 9-11 |C| Ensemble-TS, Gradient Boosting | {{ :dm:33_dm2_gradient_boost_2023_24.pdf | Gradient Boosting Machines}}, {{ :dm:dm2_lab11_ensamble.zip | LabEnsemble}} | Guidotti| | |23.| 12.05.2025 | 09-11 |E| Time Series - Classification | {{ :dm:36_dm2_time_series_classification_2024_25.pdf | TS_Classification}}, {{ :dm:dm2_lab13_ts_classification.zip | LabTS_Classification}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EV_0MjXUb1NEkCijo7f18fgB2NUoSzKUSO6hZZn6sxKJ0w?e=WMGGL3 |Link]] | Guidotti| |
|22.| 15.05.2024 | 11-13 |C| Extreme Gradient Boosting | {{ :dm:33_dm2_gradient_boost_2023_24.pdf | Gradient Boosting Machines}}, {{ :dm:dm2_lab11_ensamble.zip | LabEnsemble}} | Guidotti| | |24.| 14.05.2025 | 11-13 |E| Time Series - Classification | {{ :dm:36_dm2_time_series_classification_2024_25.pdf | TS_Classification}}, {{ :dm:dm2_lab13_ts_classification.zip | LabTS_Classification}}, [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EQ6U-VR5E-5AvIxRMtT5jo0B9pLAc6khNdAQIeXr40xByg?e=jvGVph |Link]], [[ https://unipiit.sharepoint.com/:v:/s/a__td_64992/EeumwM-NkM5Jtr3Xt8BeCP4BDWGlbVJSNsAnKS4ogh_Uxg?e=naKzjZ |Link]] | Guidotti| |
|23.| 20.05.2024 | 9-11 |C1| eXplainable Artificial Intelligence | {{ :dm:34_dm2_explainability_2023_24.pdf | XAI}}, {{ :dm:dm2_lab12_xai.zip | LabXAI}} | Guidotti| | |
|24.| 22.05.2024 | 11-13 |C1| eXplainable Artificial Intelligence | {{ :dm:34_dm2_explainability_2023_24.pdf | XAI}}, {{ :dm:dm2_lab12_xai.zip | LabXAI}} | Guidotti| | |
====== Exams ====== | ====== Exams ====== |
| |
** Final Mark: ** for the 12-credit exam, the final mark will be obtained as the | ** Final Mark: ** for the 12-credit exam, the final mark will be obtained as the |
average mark of DM1 and DM2. | average mark of DM1 and DM2. |
| |
| *** Exams Registration Instructions for DM1*** |
| - Use the Google registration form: [[https://forms.gle/JFULK3nNsHBU6Tqa8|here]] if you cannot register for Data Mining 2024/2025 on the Esami portal. |
| - When the registration closes you will receive a link to the Agenda |
| - Register on the Agenda selecting day and time (do not change your choice or cancel: if you book, you are expected to take the exam) |
| - Submit the project at least 1 week before the day you selected (or by 31/12 to get a +0.5 bonus mark) |
| |
===== Exam Booking Periods ===== | ===== Exam Booking Periods ===== |
* Exam portal link: [[https://esami.unipi.it/|here]] | * Exam portal link: [[https://esami.unipi.it/|here]] |
* 1st Appello: from 09/01/2024 to 31/12/2024 | * Registration Form: [[https://forms.gle/NuAvCa3YK2h8MgrX7|here]] |
* 2nd Appello: from 01/02/2024 to 17/02/2024 | * 1st Appello: from 08/01/2025 to 16/01/2025 |
* 3rd Appello: from 05/05/2024 to 30/05/2024 | * 2nd Appello: from 30/01/2025 to 05/02/2025 |
* 4th Appello: from 02/06/2024 to 27/06/2024 | * 3rd Appello: from TBD to TBD |
* 5th Appello: from 19/06/2024 to 14/07/2024 | * 4th Appello: from TBD to TBD |
* 6th Appello: | * 5th Appello: from TBD to TBD |
| * 6th Appello: from TBD to TBD |
| |
===== Exam Booking Agenda ===== | |
When registering for the oral exam, please write "DM1" in the notes if you do not want to take DM2 (taking DM2 as well is assumed by default). After booking for DM1, please contact Prof. Pedreschi to agree on the exam date (put Prof. Guidotti and Andrea Fedele in cc). There will be no agenda for DM1. | |
| |
* 1st Appello - DM1: https://agende.unipi.it/yra-ief-dmo, DM2: https://agende.unipi.it/rnm-urj-wsu | |
* 2nd Appello - DM1: https://agende.unipi.it/yra-ief-dmo, DM2: https://agende.unipi.it/rnm-urj-wsu | |
* 3rd Appello: - DM1 & DM2: from 04/06/2024 to 13/06/2024 (deliver project by 29/05/2024) | |
* 4th Appello: - DM1 & DM2: from 02/07/2024 to 11/07/2024 (deliver project by 25/06/2024) | |
* 5th Appello: - DM1 & DM2: from 22/07/2024 to 25/07/2024 (deliver project by 15/07/2024) | |
* 6th Appello: | |
| |
**Do not forget to fill in the course evaluation!** | |
===== Exam DM1 ===== | ===== Exam DM1 ===== |
| |
* An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. |
| |
* A **project** consisting of exercises that require the use of data mining tools for data analysis. Exercises include: data understanding, clustering analysis, pattern mining, and classification (guidelines will be provided for more details). The project must be carried out by groups of min 2, max 3 people, using Python or any other data mining software. The results of the different tasks must be reported in a single paper of at most 20 pages of text, including figures. The paper must be emailed to [[andrea.fedele@phd.unipi.it]] and [[riccardo.guidotti@unipi.it]]. Please use “[DM1 2023-2024] Project” in the subject. | * A **project** consisting of exercises that require the use of data mining tools for data analysis. Exercises include: data understanding, clustering analysis, pattern mining, and classification (guidelines will be provided for more details). The project must be carried out by groups of min 2, max 3 people, using Python or any other data mining software. The results of the different tasks must be reported in a single paper of at most 20 pages of text, including figures. The paper must be emailed to [[andrea.fedele@phd.unipi.it]] and [[riccardo.guidotti@unipi.it]]. Please use “[DM1 2024-2025] Project” in the subject. |
| |
* **Dataset** | * **Dataset** |
- Assigned: 25/09/2023 | - Assigned: 15/10/2024 |
- MidTerm Submission: 15/11/2023 (+0.5) (half project required, i.e., Data Understanding & Preparation and Clustering) | - MidTerm Submission: <del>15/11/2024</del> **22/11/2024** (+0.5) (half project required, i.e., Data Understanding & Preparation and Clustering) |
- Final Submission: by 31/12/2023 (+0.5), or at least one week before the oral exam (complete project required). | - Final Submission: by 31/12/2024 (+0.5), or at least one week before the oral exam (complete project required). |
- Dataset: {{ :dm:std.zip | STD}} | - Dataset: {{ :dm:dm1_dataset_2425_imdb.zip | IMDb}} |
| |
** DM1 Project Guidelines ** | ** DM1 Project Guidelines ** |
See {{ :dm:dm1_project_guidelines_23_24.pdf | Project Guidelines}}. | See {{ :dm:dm1_project_guidelines_24_25.pdf | Project Guidelines}}. |
| |
| |
| |
| |
| |
===== Exam DM2 ===== | ===== Exam DM2 ===== |
| |
* An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. |
| |
* A **project** consisting of exercises that require the use of data mining tools for data analysis. Exercises include: imbalanced learning, dimensionality reduction, outlier detection, advanced classification/regression methods, time series analysis/clustering/classification (guidelines will be provided for more details). The project must be carried out by min 1, max 3 people, using Python or any other data mining software. The results of the different tasks must be reported in a single paper of at most 30 pages of text, including figures. The paper must be emailed to [[andrea.fedele@phd.unipi.it]] and [[riccardo.guidotti@unipi.it]]. Please use “[DM2 2023-2024] Project” in the subject. | * A **project** consisting of exercises that require the use of data mining tools for data analysis. Exercises include: imbalanced learning, dimensionality reduction, outlier detection, advanced classification/regression methods, time series analysis/clustering/classification (guidelines will be provided for more details). The project must be carried out by min 1, max 3 people, using Python or any other data mining software. The results of the different tasks must be reported in a single paper of at most 30 pages of text, including figures. The paper must be emailed to [[andrea.fedele@phd.unipi.it]] and [[riccardo.guidotti@unipi.it]]. Please use “[DM2 2024-2025] Project” in the subject. |
| |
* **Dataset** | * **Dataset** |
- Assigned: 19/02/2024 | - Assigned: 18/02/2025 |
- MidTerm Submission: 07/05/2024 (Modules 1 and 2; for TS classification, non-DL-based models) | - MidTerm Submission: 07/05/2025 |
- Final Submission: one week before the oral exam (complete project required, also with DL-based models for TS classification). | - Final Submission: one week before the oral exam (complete project required). |
- Dataset: [[https://unipiit-my.sharepoint.com/:u:/g/personal/a_fedele7_studenti_unipi_it/EUSyNv8ahD9FrBZ6fiF3gvABcYVLpbo1biIyOGy8AmcO5g?e=ziQtEc|STD]] | - Dataset: {{ :dm:dm2_dataset_2425_imdb.zip | IMDb Extended & IMDb Time Series}} |
| |
** DM2 Project Guidelines ** | ** DM2 Project Guidelines ** |
See {{ :dm:dm2_project_guidelines_23_24.pdf | Project Guidelines}}. | See {{ :dm:dm2_project_guidelines_24_25.pdf | Project Guidelines}}. |
| |
| |
| |
====== Previous years ====== | ====== Previous years ====== |
| * [[dm_ds2023-24]] |
* [[dm.2022-23ds]] | * [[dm.2022-23ds]] |
* [[dm.2021-22ds]] | * [[dm.2021-22ds]] |