dm:start
Differences
These are the differences between the selected revision and the current version of the page.
| dm:start [28/04/2025 at 06:49 (8 months ago)] – [Second Semester (DM2 - Data Mining: Advanced Topics and Applications)] Riccardo Guidotti | dm:start [01/12/2025 at 10:55 (12 days ago)] (current version) – [News] Riccardo Guidotti | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Data Mining A.A. 2024/25 ====== | + | ====== Data Mining A.A. 2025/26 ====== |
| ===== DM1 - Data Mining: Foundations (6 CFU) ===== | ===== DM1 - Data Mining: Foundations (6 CFU) ===== | ||
| Line 15: | Line 15: | ||
| Teaching Assistant | Teaching Assistant | ||
| - | * **Andrea Fedele** | + | * **Alessio Cascione** |
| * KDDLab, Università di Pisa | * KDDLab, Università di Pisa | ||
| - | * [[https:// | + | * [[https:// |
| - | * [[andrea.fedele@phd.unipi.it]] | + | * [[alessio.cascione@phd.unipi.it]] |
| ===== DM2 - Data Mining: Advanced Topics and Applications (6 CFU) ===== | ===== DM2 - Data Mining: Advanced Topics and Applications (6 CFU) ===== | ||
| Line 28: | Line 29: | ||
| Teaching Assistant | Teaching Assistant | ||
| - | * **Andrea Fedele** | + | * **Alessio Cascione** |
| * KDDLab, Università di Pisa | * KDDLab, Università di Pisa | ||
| - | * [[https:// | + | * [[https:// |
| - | * [[andrea.fedele@phd.unipi.it]] | + | * [[alessio.cascione@phd.unipi.it]] |
| - | * Meeting: https:// | + | |
| ====== News ====== | ====== News ====== | ||
| - | * **[11.03.2025]** The lecture of DM2 planned for the 14/03/2025 will be held in Room C instead | + | * **[01.12.2025] The lecture of Thursday 04/12/2025 is moved to Friday 05/12/2025 9-11 in room C (project presentation of Prof.ssa Pierotti will start at 11 after DM lecture). The last lecture |
| - | * *[04.03.2025]* The sixth lecture of DM2 planned for the 04/03/2025 will be in Room C instead | + | * [19.11.2025] The lecture of Thursday 20/11/2025 will be held in room N1 due to not usability |
| - | * [27.01.2025] The first lecture of DM2 will be held on 18.02.2025 in Room A1, swapping with S4DS, which will be held on 17.02.2025 in Room E. | + | * [07.10.2025] The lecture of Thursday 09/10/2025 is canceled due to the UniPi Orienta event. The make-up lecture is on Tuesday 14/10/2025, 9-11, in room M1. |
| - | * [07.01.2025] | + | * [06.10.2025] |
| - | - Use the Google registration form: [[https:// | + | * [28.07.2025] Lectures will start on Monday |
| - | | + | |
| - | - Register on the Agenda selecting day and time (do not change your choice or cancel: book only if you intend to take the exam) | + | |
| - | - Submit the project at least 1 week before the day you selected in the Agenda. | + | |
| - | * [03.12.2024] This year's lectures available at [[https://unipiit-my.sharepoint.com/ | + | ---- |
| - | * [07.09.2024] Past years' lectures available at [[https://unipiit-my.sharepoint.com/:f:/g/personal/a_fedele7_studenti_unipi_it/ | + | |
| - | * [02.09.2024] Lectures will start on Monday | + | |
| - | * [02.09.2024] | + | |
| - | * [02.09.2024] Project Groups [[https:// | + | |
| - | * [11.09.2023] MS Teams [[https:// | + | |
| ====== Learning Goals ====== | ====== Learning Goals ====== | ||
| * DM1 | * DM1 | ||
| Line 74: | Line 71: | ||
| ^ Day of Week ^ Hour ^ Room ^ | ^ Day of Week ^ Hour ^ Room ^ | ||
| - | | Monday | + | | Monday |
| - | | | + | | |
| **Office hours - Ricevimento: | **Office hours - Ricevimento: | ||
| * Prof. Pedreschi | * Prof. Pedreschi | ||
| - | * TBD | + | * Monday 15:00-17:00 or Appointment by email |
| - | * Online | + | * Room 318 Dept. of Computer Science or MS Teams |
| * Prof. Guidotti | * Prof. Guidotti | ||
| * Thursday 16:00 - 18:00 or Appointment by email | * Thursday 16:00 - 18:00 or Appointment by email | ||
| * Room 363 Dept. of Computer Science or MS Teams | * Room 363 Dept. of Computer Science or MS Teams | ||
| + | |||
| + | |||
| + | * Alessio Cascione | ||
| + | * Google Meet slot - https:// | ||
| + | * Alternative appointment by email | ||
| + | * I will be out of office from 05/12/2025 to 15/12/2025, checking emails and answering | ||
| | | ||
| Line 117: | Line 121: | ||
| * The slides used in the course will be inserted in the calendar after each class. Most of them are part of the slides provided by the textbook' | * The slides used in the course will be inserted in the calendar after each class. Most of them are part of the slides provided by the textbook' | ||
| + | |||
| + | ===== FAQ ===== | ||
| - | | + | For the academic year 2025/2026, we make available a document containing **frequently asked questions (FAQs)** about the project at the end of the lecture. |
| + | Please consult this document first, as your question may already be answered there. | ||
| + | The FAQ will be updated regularly after each lecture with new relevant questions from students. | ||
| + | |||
| + | Check the document: | ||
| + | https:// | ||
| + | |||
| + | |||
| + | |||
| + | ===== Recordings of past years ===== |
| + | |||
| + | Link to past years' recordings (incrementally updated with respect to the current lectures of the course) |
| + | |||
| + | https:// | ||
| ===== Software ===== | ===== Software ===== | ||
| Line 130: | Line 149: | ||
| * Didactic Data Mining [[http:// | * Didactic Data Mining [[http:// | ||
| - | ====== Class Calendar (2024/2025) ====== | + | ====== Class Calendar (2025/2026) ====== |
| ===== First Semester (DM1 - Data Mining: Foundations) ===== | ===== First Semester (DM1 - Data Mining: Foundations) ===== | ||
| ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ||
| - | | | + | | |
| - | | | + | | |
| - | | | + | | |
| - | | | + | | |
| - | |01.| 30.09.2024 | 11-13 |C1| Overview, Introduction | {{ :dm:00_dm1_introduction_2024_25.pdf | Intro}} | Pedreschi| | + | |01.| 29.09.2025 | 09-11 | E | Overview, Introduction | {{ :dm:00_dm1_introduction_2025_26.pptx.pdf | Intro}} | Pedreschi | |
| - | |02.| 01.10.2024 | 14-16 |C1| Lab. Introduction to Python | + | |02.| 02.10.2025 | 09-11 | E | The KDD process |
| - | |03.| 07.10.2024 | 11-13 |C1| Data Understanding | + | |03.| 06.10.2025 | 09-11 | E | Introduction to Python |
| - | |04.| 08.10.2023 | 14-16 |C1| Data Understanding | + | | | 09.10.2025 | | | No Lecture (UNIPI Orienta) | | |
| - | |05.| 14.10.2023 | 11-13 |C1| Data Preparation | + | |04.| 13.10.2025 | 09-11 | E | Data Understanding | {{ :dm:01_dm1_data_understanding_2025_26.pdf | Data Understanding }} | Pedreschi | |
| - | |06.| 15.10.2024 | 14-16 |C1| Lab. Data Understanding | {{ :dm:dm1_lab02_data_understanding.zip | Data Understanding}}| Pedreschi| | + | |05.| 14.10.2025 | 09-11 | C1 | Data Preparation | {{ :dm:02_dm1_data_preparation_2025_26.pdf | Data Preparation}}, |
| - | |07.| 21.10.2024 | 11-13 |C1| Introduction to Clustering, K-Means | + | |04.| 16.10.2025 | 09-11 | E | Data Understanding |
| - | |08.| 22.10.2024 | 14-16 |C1| Centroid-based Clustering | {{:dm:05_dm1_kmeans_2024_25.pdf | K-Means }} | Pedreschi| | + | |06.| 20.10.2025 | 09-11 | E | Data Similarity and Introduction to Clustering | {{ :dm:03_dm1_data_similarity_2025_26.pdf | Data Similarity}}, {{ :dm:04_dm1_clustering_intro_2025_26.pdf | Introduction to Clustering}} | Guidotti |
| - | |09.| 28.10.2023 | 11-13 |C1| Hierarchical Clustering | + | |07.| 23.10.2025 | 09-11 | E | Centroid-based Clustering |
| - | |10.| 29.10.2024 | 14-16 |C1| Lab. Clustering | {{ :dm:dm1_lab03_clustering.zip | Clustering}}| | + | |08.| 27.10.2025 | 09-11 | E | Hierarchical Clustering |
| - | |11.| 04.11.2024 | 11-13 |C1| Ex. Clustering | {{ :dm:ex1_dm1_clustering_2023_24.pdf | ExClustering}}| Guidotti| | + | |09.| 27.10.2025 | 09-11 | E | Density-based |
| - | |12.| 05.11.2024 | 14-16 |C1| Intro Classification | + | |10.|03.11.2025 | 09-11 | E | Clustering |
| - | |13.| 11.11.2024 | 11-13 |C1| Naive Bayes, Exercises | {{ : | + | |11.|04.11.2025 | 09-11 | C1 | Classification: Overview and K-Nearest Neighbours |
| - | |14.| 12.11.2024 | 14-16 |C1| Model Evaluation, Lab. Classification (kNN, | + | |12.|06.11.2025 | 09-11 | E | Classification: |
| - | |15.| 14.11.2024 | 9-11 |C1| Decision | + | |13.|10.11.2025 | 09-11 | E | Classification: |
| - | |16.| 18.11.2024 | 11-13 |C1| Decision | + | |14.|13.11.2025 | 09-11 | E | Classification: |
| - | |17.| 19.11.2024 | 14-16 |C1| Decision | + | |15.|17.11.2025 | 09-11 | D5 | Classification: |
| - | |18.| 21.11.2024 | 9-11 |C1| Decision Tree Classifier Exercises and Lab | {{ :dm:12_dm1_decision_trees_2024_25.pdf | Decision Tree}}, {{ : | + | |16.|18.11.2025 | 09-11 | C1 | Classification: |
| - | |19.| 25.11.2024 | 11-13 |C1| Regression & Lab. Regression | {{ :dm: | + | |17.|20.11.2025 | 09-11 | N1 | Classification |
| - | |20.| 26.11.2024 | 14-16 |C1| Intro Pattern Mining | + | |18.|24.11.2025 | 09-11 | E | Pattern Mining: Apriori |
| - | |21.| 28.11.2024 | 9-11 |C1| Apriori & FP-Growth | + | |19.|25.11.2025 | 09-11 | C | Pattern Mining: |
| - | |22.| 02.12.2024 | 11-13 |C1| Lab. Pattern Mining | + | |20.|27.11.2025 | 09-11 | E | Regression: Problem, Linear, KNN, Decision Tree | {{ :dm:13_dm1_linear_regression_2024_25.pptx.pdf | Regression |
| - | |23.| 03.12.2024 | 14-16 |C1| Rule-based Classifiers | {{ :dm:15_dm1_rule_based_classifier_2024_25.pdf | Rule-based Classifiers}} | Guidotti| | + | |21.|01.12.2025 | 09-11 | E | Lab on Regression and Pattern Mining; FP-Growth | {{ :dm:01.12.25_regression_2025_lecture_in_class.zip |
| - | |24.| 05.12.2024 | 9-11 |C1| FP-Growth Exercises & Project Discussion | + | |
| ===== Second Semester (DM2 - Data Mining: Advanced Topics and Applications) ===== | ===== Second Semester (DM2 - Data Mining: Advanced Topics and Applications) ===== | ||
| ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ||
| - | |01.| 18.02.2025 | 14-16 |A1| Overview, Imbalanced Learning | {{ : | + | |01.| 18.02.2025 | 14-16 |A1| Overview, Imbalanced Learning | {{ : |
| - | |02.| 19.02.2025 | 09-11 |E| Dimensionality Reduction (Overview, Random, PCA) | {{ :dm: | + | |
| - | |03.| 24.02.2025 | 14-16 |E| Dimensionality Reduction (MDS, tSNE), Outlier Detection (Overview) | {{ : | + | |
| - | |04.| 26.02.2025 | 09-11 |E| Outlier Detection (Methods) | {{ : | + | |
| - | |05.| 04.03.2025 | 11-13 |D3| Outlier Detection (Methods) | {{ : | + | |
| - | |06.| 05.03.2025 | 09-11 |C| Outlier Detection (Methods), Gradient Descent | {{ : | + | |
| - | |07.| 10.03.2025 | 11-13 |E| Maximum Likelihood Estimation, Odds, Log Odds, Logistic Regression | {{ : | + | |
| - | |08.| 12.03.2025 | 09-11 |E| Support Vector Machines | {{ : | + | |
| - | |09.| 17.03.2025 | 11-13 |E| Neural Networks, Linear Perceptron | {{ : | + | |
| - | |10.| 19.03.2025 | 09-11 |E| Deep Neural Networks | {{ : | + | |
| - | |11.| 24.03.2025 | 11-13 |E| Ensemble Methods | {{ : | + | |
| - | |12.| 26.03.2025 | 09-11 |E| Ensemble Methods | {{ : | + | |
| - | |13.| 31.03.2025 | 11-13 |E| Ensemble Methods | {{ : | + | |
| - | |14.| 02.04.2025 | 09-11 |E| Explainable Artificial Intelligence | {{ : | + | |
| - | |15.| 07.04.2025 | 11-13 |E| Explainable Artificial Intelligence | {{ : | + | |
| - | |16.| 09.04.2025 | 09-11 |E| Transactional Clustering | {{ : | + | |
| - | |17.| 14.04.2025 | 11-13 |C| Sequential Pattern Mining | {{ : | + | |
| - | |18.| 28.04.2025 | 11-13 |E| Time Series - Intro & Preprocessing |{{ : | + | |
| - | |19.| 30.04.2025 | 09-11 |E| Transactional Clustering | {{ : | + | |
| ====== Exams ====== | ====== Exams ====== | ||
| Line 203: | Line 206: | ||
| - Understanding of the theoretical aspects of the topics addressed during the course. The student may be required to write formulas or pseudocode. During the explanations, | - Understanding of the theoretical aspects of the topics addressed during the course. The student may be required to write formulas or pseudocode. During the explanations, | ||
| - Understanding of the algorithms illustrated during the course and their practical implementation. You will be asked to perform one or more simple exercises. The text will be shown on the teacher' | - Understanding of the algorithms illustrated during the course and their practical implementation. You will be asked to perform one or more simple exercises. The text will be shown on the teacher' | ||
| - | - Discussion of the project with questions from the teacher regarding unclear aspects, | + | - Discussion of the project with questions from the teacher regarding unclear aspects, questionable steps or choices. |
| - | questionable steps or choices. | + | |
| ** Final Mark: ** for the 12-credit exam, the final mark will be obtained as the | ** Final Mark: ** for the 12-credit exam, the final mark will be obtained as the | ||
| Line 210: | Line 212: | ||
| *** Exams Registration Instructions for DM1*** | *** Exams Registration Instructions for DM1*** | ||
| - | - Use the Google registration form: [[https:// | + | - If you cannot register on Esami for Data Mining for the year 2025/2026, use the Google registration form: TBD. |
| - When the registration closes you will receive a link to the Agenda | - When the registration closes you will receive a link to the Agenda | ||
| - Register on the Agenda selecting day and time (do not change your choice or cancel: book only if you intend to take the exam) | - Register on the Agenda selecting day and time (do not change your choice or cancel: book only if you intend to take the exam) | ||
| Line 217: | Line 219: | ||
| ===== Exam Booking Periods ===== | ===== Exam Booking Periods ===== | ||
| * Exam portal link: [[https:// | * Exam portal link: [[https:// | ||
| - | * Registration Form: [[https:// | + | * Registration Form: TBD |
| - | * 1st Appello: from 08/ | + | * 1st Appello: from TBD to TBD |
| - | * 2nd Appello: from 30/ | + | * 2nd Appello: from TBD to TBD |
| * 3rd Appello: from TBD to TBD | * 3rd Appello: from TBD to TBD | ||
| * 4th Appello: from TBD to TBD | * 4th Appello: from TBD to TBD | ||
| Line 232: | Line 234: | ||
| * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | ||
| - | * A **project**, | + | * A **project**, |
| * **Dataset** | * **Dataset** | ||
| - | - Assigned: 15/10/2024 | + | - Assigned: 15/10/2025 |
| - | - MidTerm Submission: | + | - MidTerm Submission: 15/11/2025 (+0.5) (half project required, i.e., Data Understanding & Preparation and Clustering) |
| - | - Final Submission: 31/12/2024 (+0.5) one week before the oral exam (complete project required). | + | - Final Submission: 31/12/2025 (+0.5) one week before the oral exam (complete project required). |
| - | - Dataset: {{ :dm:dm1_dataset_2425_imdb.zip | IMDb}} | + | - Dataset: |
| ** DM1 Project Guidelines ** | ** DM1 Project Guidelines ** | ||
| - | See {{ :dm:dm1_project_guidelines_24_25.pdf | Project Guidelines}}. | + | See {{ :dm:dm1_project_guidelines_25_26.pdf |}} |
| Line 253: | Line 255: | ||
| * **Dataset** | * **Dataset** | ||
| - | - Assigned: 18/02/2025 | + | - Assigned: 18/02/2026 |
| - | - MidTerm Submission: 07/05/2025 | + | - MidTerm Submission: 07/05/2026 |
| - Final Submission: one week before the oral exam (complete project required). | - Final Submission: one week before the oral exam (complete project required). | ||
| - | - Dataset: | + | - Dataset: |
| ** DM2 Project Guidelines ** | ** DM2 Project Guidelines ** | ||
| - | See {{ : | + | See TBD. |
| Line 284: | Line 286: | ||
| ====== Previous years ===== | ====== Previous years ===== | ||
| + | * [[dm_ds2024-25]] | ||
| * [[dm_ds2023-24]] | * [[dm_ds2023-24]] | ||
| * [[dm.2022-23ds]] | * [[dm.2022-23ds]] | ||
dm/start.1745822995.txt.gz · Last modified: 28/04/2025 at 06:49 (8 months ago) by Riccardo Guidotti
