dm:start:guidelines
Differences
These are the differences between the selected revision and the current version of the page.
dm:start:guidelines [17/10/2016 at 14:08 (8 years ago)] – [Guidelines for the task on data understanding] Anna Monreale | dm:start:guidelines [03/10/2018 at 23:31 (6 years ago)] – [Guidelines for the task on Data Understanding] Anna Monreale | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Guidelines for the task on data understanding | + | ====== Guidelines for the task on Data Understanding |
* Data understanding (30 points) | * Data understanding (30 points) | ||
- | | + | |
- | * Distribution of the variables and statistics (7 points) | + | - Distribution of the variables and statistics (7 points) |
- | * Assessing data quality (missing values, outliers) (7 points) | + | - Assessing data quality (missing values, outliers) (7 points) |
- | * Variable transformations (6 points) | + | - Variable transformations (6 points) |
- | * Pairwise correlations and possible elimination of redundant variables (7 points) | + | - Pairwise correlations and possible elimination of redundant variables (7 points) |
Line 26: | Line 26: | ||
====== Guidelines for the task on Association Rules Mining ====== | ====== Guidelines for the task on Association Rules Mining ====== | ||
- | | + | * Frequent |
- | * ** Association | + | * Discussion of the most interesting frequent patterns and analysis of how the number of patterns changes w.r.t. the min_sup parameter (7 points) |
- | | + | |
+ | * Discussion | ||
+ | | ||
+ | | ||
====== Guidelines for the task on Classification ====== | ====== Guidelines for the task on Classification ====== | ||
- | * ** Learning of different decision trees (12 points) ** | + | * Learning of different decision trees/ |
- | * ** Decision | + | * Decision trees interpretation (6 points) |
- | * ** Discussion | + | |
+ | | ||
+ | |||
+ | |||
+ | ====== Guidelines for the Project ====== | ||
+ | * The title page does not count toward the 20-page limit, i.e., you can have 20 pages + 1 title page. The page limit is strict: additional pages (21, 22, 23, etc.) will not be read or evaluated. | ||
+ | * The project size must not exceed 25 MB, i.e., you must be able to send it by email without compression. | ||
+ | * Only PDF files are allowed; you do not have to submit Python code or KNIME workflows. | ||
+ | * The final paper must be easily readable, i.e., it is better to use a font size larger than 9pt. | ||
+ | * Use a readable font type and size, e.g. Arial or Times New Roman. | ||
+ | * You can use multiple columns and change the margin size, but the project must remain readable. | ||
+ | * It is NOT required to include Python code, KNIME workflows, or theoretical descriptions of the algorithms in the final paper. | ||
+ | * You must justify every choice you make regarding the features used and selected for each algorithm and the parameters you tune. Discuss every result: plots without any comment are useless. Even if you find a top configuration for your algorithm (e.g. K-Means with k=5), you MUST list the different parameter values you tested and justify your choice. | ||
+ | * You can earn up to 3 extra points on the final mark according to the following criteria: | ||
+ | - Innovation (0.5 points) | ||
+ | - Experimentation (0.5 points) | ||
+ | - Performance (0.5 points) | ||
+ | - Appearance (0.5 points) | ||
+ | - Organization (0.5 points) | ||
+ | - Summary (0.5 points) |
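
The guidelines above ask you to list every parameter value you tested and, for association rules, to show how the number of patterns changes with min_sup. A minimal sketch of such a sweep follows. It is an illustration only, not part of the course material: the toy transactions, the function names `frequent_itemsets` and `sweep_min_sup`, and the threshold values are all assumptions made for this example, and the brute-force enumeration stands in for whatever Apriori or FP-Growth implementation you actually use.

```python
from itertools import combinations

# Toy transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def frequent_itemsets(transactions, min_sup):
    """Return {itemset: support} for every itemset with support >= min_sup.

    Brute-force enumeration: acceptable for a toy example, far too slow
    for a real dataset.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            # Support = fraction of transactions containing the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_sup:
                result[candidate] = support
                found_any = True
        if not found_any:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return result


def sweep_min_sup(transactions, thresholds):
    """Return a list of (min_sup, number of frequent itemsets) pairs."""
    return [(s, len(frequent_itemsets(transactions, s))) for s in thresholds]


# Record every threshold tested, not only the one finally chosen.
sweep = sweep_min_sup(transactions, [0.2, 0.4, 0.6, 0.8])
for min_sup, n_patterns in sweep:
    print(f"min_sup={min_sup:.1f}  frequent itemsets={n_patterns}")
```

A table or plot built from `sweep` still needs a written comment in the report explaining which threshold was chosen and why.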
dm/start/guidelines.txt · Last modified: 23/11/2020 at 10:34 (4 years ago) by Riccardo Guidotti