dm:start:guidelines
Differences
These are the differences between the selected revision and the current version of the page.
dm:start:guidelines [03/10/2018 at 23:29 (6 years ago)] – [Guidelines for the task on Data Understanding] Anna Monreale | dm:start:guidelines [23/11/2020 at 10:34 (4 years ago)] (current version) – [Guidelines for the task on Classification] Riccardo Guidotti | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Guidelines for the task on Data Understanding ====== | ====== Guidelines for the task on Data Understanding ====== | ||
* Data understanding (30 points) | * Data understanding (30 points) | ||
- | | + | |
- | * | + | - Distribution of the variables and statistics (7 points) |
- | * | + | - Assessing data quality (missing values, outliers) (7 points) |
- | * | + | - Variables transformations (6 points) |
- | * | + | - Pairwise correlations and eventual elimination of redundant variables (7 points) |
Line 36: | Line 36: | ||
====== Guidelines for the task on Classification ====== | ====== Guidelines for the task on Classification ====== | ||
* Learning of different decision trees/ | * Learning of different decision trees/ | ||
- | * Decision trees interpretation (6 points) | + | * Decision trees interpretation, validation with test and training set (6 points) |
- | | + | |
* Discussion of the best prediction model (6 points) | * Discussion of the best prediction model (6 points) | ||
dm/start/guidelines.1538609398.txt.gz · Last modified: 03/10/2018 at 23:29 (6 years ago) by Anna Monreale