dm:start
Differences
These are the differences between the selected revision and the current version of the page.
| dm:start [08/08/2024 at 12:38 (15 months ago)] – Salvatore Ruggieri | dm:start [29/10/2025 at 16:25 (4 hours ago)] (current version) – [First Semester (DM1 - Data Mining: Foundations)] Riccardo Guidotti | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Data Mining A.A. 2023/24 ====== | + | ====== Data Mining A.A. 2025/26 ====== |
| ===== DM1 - Data Mining: Foundations (6 CFU) ===== | ===== DM1 - Data Mining: Foundations (6 CFU) ===== | ||
| Line 15: | Line 15: | ||
| Teaching Assistant | Teaching Assistant | ||
| - | * **Andrea Fedele** | + | * **Alessio Cascione** |
| * KDDLab, Università di Pisa | * KDDLab, Università di Pisa | ||
| - | * [[https:// | + | * [[https:// |
| - | * [[andrea.fedele@phd.unipi.it]] | + | * [[alessio.cascione@phd.unipi.it]] |
| ===== DM2 - Data Mining: Advanced Topics and Applications (6 CFU) ===== | ===== DM2 - Data Mining: Advanced Topics and Applications (6 CFU) ===== | ||
| Line 28: | Line 29: | ||
| Teaching Assistant | Teaching Assistant | ||
| - | * **Andrea Fedele** | + | * **Alessio Cascione** |
| * KDDLab, Università di Pisa | * KDDLab, Università di Pisa | ||
| - | * [[https:// | + | * [[https:// |
| - | * [[andrea.fedele@phd.unipi.it]] | + | * [[alessio.cascione@phd.unipi.it]] |
| - | * Meeting: https:// | + | |
| ====== News ====== | ====== News ====== | ||
| + | * [07.10.2025] The lecture of Thursday 09/10/2025 is canceled due to the UniPi Orienta event. The make-up lecture is on Tuesday 14/10/2025, 9-11, room M1. | ||
| + | * [06.10.2025] Link to Project Groups Registration DM1 [25/26] (max 3 students for each group - access with your University of Pisa account, deadline 17/ | ||
| + | * [28.07.2025] Lectures will start on Monday 29 September 2025 at 09:00 in room E. Lectures will be in person only. Recordings of the lectures of past years can be found at the bottom of this web page. | ||
| + | |||
| - | * **[24.05.2024]** When registering for the oral exam please specify in the notes DM1 if you do not want to do DM2 (that is assumed by default). After having booked it please contact Prof. Pedreschi to agree on the exam date (put Prof. Guidotti and Andrea Fedele in cc). There will be no agenda for DM1. | ||
| - | * [03.05.2024] Next lecture of DM2 will be as usual on Monday 06/05 from 9 to 11 in room C. | ||
| - | * [19.01.2024] DM2 lectures will start on Mon 19/02; only for that lecture the time will be 14-16 instead of 9-11. | ||
| - | * [13.10.2023] To schedule meeting with the Teaching Assistant you can use: https:// | ||
| - | * [20.09.2023] Recordings of the lectures can be found on the web pages of the course for the years 2020/2021 and 2021/2022 (see links at the bottom of this page) | ||
| - | * [20.09.2023] Thursday 21 September there will be no lecture. | ||
| - | * [11.09.2023] Lectures will start on Monday 18 September 2023 at 11.00 room C1. | ||
| - | * [11.09.2023] Lectures will be in person only. Recordings of the lectures of past years can be found at the bottom of this web page. | ||
| - | * [11.09.2023] Project Groups [[https:// | ||
| - | * [11.09.2023] MS Teams [[https:// | ||
| ====== Learning Goals ====== | ====== Learning Goals ====== | ||
| * DM1 | * DM1 | ||
| Line 71: | Line 66: | ||
| ^ Day of Week ^ Hour ^ Room ^ | ^ Day of Week ^ Hour ^ Room ^ | ||
| - | | Monday | + | | Monday |
| - | | | + | | |
| **Office hours - Ricevimento: | **Office hours - Ricevimento: | ||
| * Prof. Pedreschi | * Prof. Pedreschi | ||
| - | * Monday | + | * Monday |
| - | * Online | + | * Room 318 Dept. of Computer Science or MS Teams |
| * Prof. Guidotti | * Prof. Guidotti | ||
| - | * Tuesday | + | * Thursday |
| * Room 363 Dept. of Computer Science or MS Teams | * Room 363 Dept. of Computer Science or MS Teams | ||
| + | |||
| + | |||
| + | * Alessio Cascione | ||
| + | * Google Meet slot - https:// | ||
| + | * Alternative appointment by email | ||
| | | ||
| Line 90: | Line 91: | ||
| ^ Day of Week ^ Hour ^ Room ^ | ^ Day of Week ^ Hour ^ Room ^ | ||
| - | | Monday | + | | Monday |
| - | | Wednesday | + | | Wednesday |
| **Office Hours - Ricevimento: | **Office Hours - Ricevimento: | ||
| Line 114: | Line 115: | ||
| * The slides used in the course will be inserted in the calendar after each class. Most of them are part of the slides provided by the textbook' | * The slides used in the course will be inserted in the calendar after each class. Most of them are part of the slides provided by the textbook' | ||
| + | |||
| + | ===== FAQ ===== | ||
| - | | + | For the academic year 2025/2026, we make available a document containing **frequently asked questions (FAQs)** about the project at the end of the lecture. |
| + | Please consult this document first, as your question may already be answered there. | ||
| + | The FAQ will be updated regularly after each lecture with new relevant questions from students. | ||
| + | |||
| + | Check the document: | ||
| + | https:// | ||
| ===== Software===== | ===== Software===== | ||
| Line 127: | Line 135: | ||
| * Didactic Data Mining [[http:// | * Didactic Data Mining [[http:// | ||
| - | ====== Class Calendar (2023/2024) ====== | + | ====== Class Calendar (2025/2026) ====== |
| ===== First Semester (DM1 - Data Mining: Foundations) ===== | ===== First Semester (DM1 - Data Mining: Foundations) ===== | ||
| ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ||
| - | |01.| 18.09.2023 | 11-13 |C1| Overview, Introduction | + | | |
| - | | | + | | |
| - | |02.| 25.09.2023 | 11-13 |C1| Lab. Introduction to Python | + | | |
| - | |03.| 27.09.2023 | 11-13 |C1| Lab. Data Understanding | + | | |
| - | |04.| 02.10.2023 | 11-13 |C1| Data Understanding | {{ : | + | |01.| 29.09.2025 | 09-11 | E | Overview, Introduction |
| - | |05.| 04.10.2023 | 11-13 |C1| Data Understanding & Preparation | + | |02.| 02.10.2025 | 09-11 | E | The KDD process |
| - | |06.| 09.10.2023 | 11-13 |C1| Data Preparation & Data Similarity | + | |03.| 06.10.2025 | 09-11 | E | Introduction to Python |
| - | |07.| 11.10.2023 | 11-13 |C1| Data Similarity & Lab. Data Understanding | {{ : | + | | | 09.10.2025 | | | No Lecture (UNIPI Orienta) |
| - | |08.| 16.10.2023 | 11-13 |C1| Introduction to Clustering, K-Means | + | |04.| 13.10.2025 | 09-11 | E | Data Understanding |
| - | |09.| 18.10.2023 | 11-13 |C1| Clustering Validation, Hierarchical Clustering | + | |05.| 14.10.2025 | 09-11 | C1 | Data Preparation |
| - | |10.| 23.10.2023 | 11-13 |C1| Density-based Clustering | + | |04.| 16.10.2025 | 09-11 | E | Data Understanding Lab| {{ :dm:16.10.25_data_understanding_2025_lecture_in_class.zip |}} | Guidotti, Cascione |
| - | |11.| 25.10.2023 | 11-13 |C1| Lab. Clustering | {{ : | + | |06.| 20.10.2025 | 09-11 | E | Data Similarity and Introduction to Clustering |
| - | |12.| 30.10.2023 | 11-13 |C1| Ex. Clustering | + | |07.| 23.10.2025 | 09-11 | E | Centroid-based Clustering Algorithm |
| - | | | 01.11.2023 | 11-13 | | No Lecture | | | | + | |08.| 27.10.2025 | 09-11 | E | Hierarchical Clustering Algorithm |
| - | |13.| 06.11.2023 | 11-13 |C1| Intro Classification, kNN[[https:// | + | |09.| 27.10.2025 | 09-11 | E | Density-based Clustering Algorithm |
| - | |14.| 08.11.2023 | 11-13 |C1| Naive Bayes, Exercises | {{ : | + | |
| - | |15.| 13.11.2023 | 11-13 |C1| Model Evaluation | + | |
| - | |16.| 15.11.2023 | 11-13 |C1| Model Evaluation Exercises & Lab | {{ : | + | |
| - | | | + | |
| - | |17.| 22.11.2023 | 11-13 |C1| Decision Tree Classifier | + | |
| - | |18.| 27.11.2023 | 11-13 |C1| Decision Tree Classifier | {{ :dm:12_dm1_decision_trees_2023_24.pdf | Decision Tree}} | Pedreschi| | + | |
| - | |19.| 29.11.2023 | 11-13 |C1| Exercises and Lab. Decision Tree Classifier | {{ : | + | |
| - | |20.| 04.12.2023 | 11-13 |C1| Decision Tree Classifier, Exercises and Lab | {{ :dm:12_dm1_decision_trees_2023_24.pdf | Decision Tree}} | Pedreschi| | + | |
| - | |21.| 06.12.2023 | 11-13 |C1| Intro Regression & Lab. Regression | {{ : | + | |
| - | |22.| 11.12.2023 | 11-13 |C1| Intro to Pattern Mining and Apriori | + | |
| - | |23.| 13.12.2023 | 16-18 |C1| Apriori & Lab. Pattern Mining | {{ : | + | |
| - | |24.| 18.12.2023 | 11-13 |C| FP-Growth and Exercises | + | |
| ===== Second Semester (DM2 - Data Mining: Advanced Topics and Applications) ===== | ===== Second Semester (DM2 - Data Mining: Advanced Topics and Applications) ===== | ||
| ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ^ ^ Day ^ Time ^ Room ^ Topic ^ Material ^ Lecturer ^ | ||
| - | |01.| 19.02.2024 | 14-16 |C| Overview, | + | |01.| 18.02.2025 | 14-16 |A1| Overview, |
| - | | | 21.02.2024 | | | No Lecture | | | | + | |
| - | | | 26.02.2024 | | | No Lecture | | | | + | |
| - | |02.| 28.02.2024 | 11-13 |C| Sequential Pattern Mining | {{ : | + | |
| - | |03.| 04.03.2024 | 9-11 |C| Sequential Pattern Mining | {{ : | + | |
| - | |04.| 06.03.2024 | 11-13 |C| Transactional Clustering | {{ : | + | |
| - | |05.| 11.03.2024 | 9-11 |C| Time Series Similarity | {{ : | + | |
| - | |06.| 13.03.2024 | 11-13 |C| Time Series Approximation | {{ : | + | |
| - | |07.| 18.03.2024 | 9-11 |C| Time Series Clustering & Motifs| {{ : | + | |
| - | |08.| 20.03.2024 | 11-13 |C| Time Series Classification | {{ : | + | |
| - | |09.| 25.03.2024 | 9-11 |C| Imbalanced Learning | {{ : | + | |
| - | |10.| 27.03.2024 | 11-13 |C| Dimensionality Reduction | {{ :dm:23_dm2_dimred_2023_24.pdf | + | |
| - | |11.| 03.04.2024 | 11-13 |C| Outlier Detection | {{ : | + | |
| - | |12.| 08.04.2024 | 9-11 |C| Outlier Detection | {{ : | + | |
| - | |13.| 10.04.2024 | 11-13 |C| Outlier Detection | {{ : | + | |
| - | |14.| 15.04.2024 | 14-16 |C| Gradient Descent, MLE | {{ : | + | |
| - | |15.| 17.04.2024 | 11-13 |C| Odds, LogOdds, Logistic Regression| {{ : | + | |
| - | |16.| 22.04.2024 | 9-11 |C| Support Vector Machine | {{ : | + | |
| - | |17.| 24.04.2024 | 11-13 |C| Perceptron, Neural Networks| {{ : | + | |
| - | |18.| 29.04.2024 | 9-11 |C| Deep Neural Networks | {{ : | + | |
| - | |19.| 06.05.2024 | 9-11 |C| CNN, RNN, DL-TS, Ensemble Intro | {{ : | + | |
| - | |20.| 08.05.2024 | 11-13 |C| Ensemble, Boosting, Adaboost | {{ : | + | |
| - | |21.| 13.05.2024 | 9-11 |C| Ensemble-TS, | + | |
| - | |22.| 15.05.2024 | 11-13 |C| Extreme Gradient Boosting | {{ : | + | |
| - | |23.| 20.05.2024 | 9-11 |C1| eXplainable Artificial Intelligence | {{ : | + | |
| - | |24.| 22.05.2024 | 11-13 |C1| eXplainable Artificial Intelligence | {{ : | + | |
| ====== Exams ====== | ====== Exams ====== | ||
| Line 206: | Line 178: | ||
| - Understanding of the theoretical aspects of the topics addressed during the course. The student may be required to write formulas or pseudocode. During the explanations, | - Understanding of the theoretical aspects of the topics addressed during the course. The student may be required to write formulas or pseudocode. During the explanations, | ||
| - Understanding of the algorithms illustrated during the course and their practical implementation. You will be asked to perform one or more simple exercises. The text will be shown on the teacher' | - Understanding of the algorithms illustrated during the course and their practical implementation. You will be asked to perform one or more simple exercises. The text will be shown on the teacher' | ||
| - | - Discussion of the project with questions from the teacher regarding unclear aspects, | + | - Discussion of the project with questions from the teacher regarding unclear aspects, questionable steps or choices. |
| - | questionable steps or choices. | + | |
| ** Final Mark: ** for the 12-credit exam, the final mark will be obtained as the | ** Final Mark: ** for the 12-credit exam, the final mark will be obtained as the | ||
| average mark of DM1 and DM2. | average mark of DM1 and DM2. | ||
| + | |||
| + | *** Exams Registration Instructions for DM1*** | ||
| + | - Use the Google registration form: TBD if you cannot register on Esami on Data Mining for year 2025/ | ||
| + | - When the registration closes you will receive a link to the Agenda | ||
| + | - Register on the Agenda selecting day and time (do not change your choice or cancel: by booking, you commit to taking the exam) | ||
| + | - Submit the project at least 1 week before the day you selected (or by 31/12 to get the +0.5 extra mark) | ||
| ===== Exam Booking Periods ===== | ===== Exam Booking Periods ===== | ||
| * Exam portal link: [[https:// | * Exam portal link: [[https:// | ||
| - | * 1st Appello: from 09/ | + | |
| - | * 2nd Appello: from 01/ | + | |
| - | * 3rd Appello: from 05/ | + | * 2nd Appello: from TBD to TBD |
| - | * 4th Appello: from 02/ | + | * 3rd Appello: from TBD to TBD |
| - | * 5th Appello: from 19/ | + | * 4th Appello: from TBD to TBD |
| - | * 6th Appello: | + | * 5th Appello: from TBD to TBD |
| + | * 6th Appello: | ||
| - | ===== Exam Booking Agenda ===== | ||
| - | When registering for the oral exam please specify in the notes DM1 if you do not want to do DM2 (that is assumed by default). After having booked for DM1 please contact Prof. Pedreschi to agree on the exam date (put Prof. Guidotti and Andrea Fedele in cc). There will be no agenda for DM1. | ||
| - | * 1st Appello - DM1: https:// | ||
| - | * 2nd Appello - DM1: https:// | ||
| - | * 3rd Appello: - DM1 & DM2: from 04/06/2024 to 13/06/2024 (deliver project by 29/ | ||
| - | * 4th Appello: - DM1 & DM2: from 02/07/2024 to 11/07/2024 (deliver project by 25/ | ||
| - | * 5th Appello: - DM1 & DM2: from 22/07/2024 to 25/07/2024 (deliver project by 15/07/2024) | ||
| - | * 6th Appello: | ||
| - | |||
| - | **Do not forget to complete the course evaluation!** | ||
| ===== Exam DM1 ====== | ===== Exam DM1 ====== | ||
| Line 238: | Line 206: | ||
| * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | ||
| - | * A **project**, | + | * A **project**, |
| * **Dataset** | * **Dataset** | ||
| - | - Assigned: | + | - Assigned: |
| - | - MidTerm Submission: 15/11/2023 (+0.5) (half project required, i.e., Data Understanding & Preparation and Clustering) | + | - MidTerm Submission: 15/11/2025 (+0.5) (half project required, i.e., Data Understanding & Preparation and Clustering) |
| - | - Final Submission: 31/12/2023 (+0.5) one week before the oral exam (complete project required). | + | - Final Submission: 31/12/2025 (+0.5) one week before the oral exam (complete project required). |
| - | - Dataset: {{ :dm:std.zip | STD}} | + | - Dataset: |
| ** DM1 Project Guidelines ** | ** DM1 Project Guidelines ** | ||
| - | See {{ :dm:dm1_project_guidelines_23_24.pdf | Project Guidelines}}. | + | See {{ :dm:dm1_project_guidelines_25_26.pdf |}} |
| - | |||
| - | |||
| - | |||
| ===== Exam DM2 ====== | ===== Exam DM2 ====== | ||
| Line 259: | Line 224: | ||
| * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | * An **oral exam**, that includes: (1) discussing the project report; (2) discussing topics presented during the classes, including the theory and practical exercises. | ||
| - | * A **project**, | + | * A **project**, |
| * **Dataset** | * **Dataset** | ||
| - | - Assigned: | + | - Assigned: |
| - | - MidTerm Submission: 07/05/2024 (Modules 1 and 2 (for TS classification non DL-based models)) | + | - MidTerm Submission: 07/05/2026 |
| - | - Final Submission: one week before the oral exam (complete project required, also with DL-based models for TS classification). | + | - Final Submission: one week before the oral exam (complete project required). |
| - | - Dataset: | + | - Dataset: |
| ** DM2 Project Guidelines ** | ** DM2 Project Guidelines ** | ||
| - | See {{ : | + | See TBD. |
| Line 293: | Line 258: | ||
| ====== Previous years ===== | ====== Previous years ===== | ||
| + | * [[dm_ds2024-25]] | ||
| + | * [[dm_ds2023-24]] | ||
| * [[dm.2022-23ds]] | * [[dm.2022-23ds]] | ||
| * [[dm.2021-22ds]] | * [[dm.2021-22ds]] | ||
dm/start.1723120692.txt.gz · Last modified: 08/08/2024 at 12:38 (15 months ago) by Salvatore Ruggieri
