dm:start:guidelines

Differences

These are the differences between the selected revision and the current version of the page.

dm:start:guidelines [14/11/2016 at 14:13 (6 years ago)]
Anna Monreale [Guidelines for the task on data understanding]
dm:start:guidelines [03/10/2018 at 23:30 (4 years ago)]
Anna Monreale [Guidelines for the task on Data Understanding]
Line 1: Line 1:
 ====== Guidelines for the task on Data Understanding ====== ====== Guidelines for the task on Data Understanding ======
    * Data understanding (30 points)    * Data understanding (30 points)
-   Data semantics (3 points) +   Data semantics (3 points) 
-   * Distribution of the variables and statistics (7 points) +   *   * Distribution of the variables and statistics (7 points) 
-   * Assessing data quality (missing values, outliers) (7 points) +   *   * Assessing data quality (missing values, outliers) (7 points) 
-   * Variables transformations (6 points) +   *   * Variables transformations (6 points) 
-   * Pairwise correlations and eventual elimination of redundant variables (7 points)+   *   * Pairwise correlations and eventual elimination of redundant variables (7 points)
  
    
Line 26: Line 26:
  
 ====== Guidelines for the task on Association Rules Mining ====== ====== Guidelines for the task on Association Rules Mining ======
-  * Frequent patterns extraction with different values of support and different types  (i.e. frequent, close maximal), (points) +  * Frequent patterns extraction with different values of support and different types (i.e. frequent, closed, maximal), (points) 
-  * Discussion of the most interesting frequent patterns (points) +  * Discussion of the most interesting frequent patterns and analysis of how the number of patterns changes w.r.t. the min_sup parameter (points) 
-  * Association rules extraction with different values of confidence (points) +  * Association rules extraction with different values of confidence (points) 
-  * Discussion of the most interesting rules (points) +  * Discussion of the most interesting rules and analysis of how the number of rules changes w.r.t. the min_conf parameter, with histograms of the rules' confidence and lift (points) 
-  * Use the most meaningful rules to replace missing values and evaluate the accuracy  (points) +  * Use the most meaningful rules to replace missing values and evaluate the accuracy (points) 
-  * Use the most meaningful rules to predict if it is a bad buy or not and evaluate the accuracy (points)+  * Use the most meaningful rules to predict the target variable and evaluate the accuracy (points)
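
The analysis requested in the list above (how the number of patterns and rules changes with min_sup and min_conf, plus confidence and lift) can be sketched in plain Python. This is only an illustration under stated assumptions: the toy transactions and the helper names `support`, `frequent_itemsets`, and `rules` are hypothetical and not part of the course material, and the brute-force enumeration is suitable for tiny data only, not a replacement for a real miner such as Apriori.

```python
from itertools import combinations

# Toy transactions (hypothetical data, not the course dataset).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration of all itemsets with support >= min_sup."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, k):
            s = support(combo, transactions)
            if s >= min_sup:
                result[frozenset(combo)] = s
                found = True
        if not found:
            # Anti-monotonicity: if no k-itemset is frequent,
            # no larger itemset can be frequent either.
            break
    return result


def rules(freq, min_conf):
    """Rules X -> Y built from frequent itemsets: (X, Y, confidence, lift)."""
    out = []
    for itemset, sup in freq.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                rhs = itemset - lhs
                conf = sup / freq[lhs]   # P(Y | X)
                lift = conf / freq[rhs]  # confidence relative to P(Y)
                if conf >= min_conf:
                    out.append((lhs, rhs, conf, lift))
    return out


# How the number of patterns and rules reacts to the thresholds.
for min_sup in (0.2, 0.4, 0.6, 0.8):
    freq = frequent_itemsets(transactions, min_sup)
    for min_conf in (0.6, 0.8, 1.0):
        n_rules = len(rules(freq, min_conf))
        print(f"min_sup={min_sup} min_conf={min_conf} "
              f"patterns={len(freq)} rules={n_rules}")
```

The confidence and lift values returned by `rules` are the quantities one would plot as histograms; raising either threshold can only keep or shrink the corresponding counts.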
  
  
 ====== Guidelines for the task on Classification ====== ====== Guidelines for the task on Classification ======
-   * Learning of different decision trees with different parameters and gain formulas with the objective of maximizing performance (12 points)+   * Learning of different decision trees/classification algorithms with different parameters and gain formulas with the objective of maximizing performance (12 points)
    * Decision trees interpretation (6 points)    * Decision trees interpretation (6 points)
    * Decision trees validation with test and training set (6 points)    * Decision trees validation with test and training set (6 points)
Line 46: Line 46:
    * Only PDF files are allowed; you do not have to submit Python code or KNIME workflows.    * Only PDF files are allowed; you do not have to submit Python code or KNIME workflows.
    * The final paper must be easily readable, i.e., it is better to use a font size larger than 9pt.    * The final paper must be easily readable, i.e., it is better to use a font size larger than 9pt.
-   * Use a readable font size, e.g. Arial, Times New Roman+   * Use a readable font type and size, e.g. Arial, Times New Roman
    * You can use multiple columns and change the margin size but the project must be readable.    * You can use multiple columns and change the margin size but the project must be readable.
    * It is NOT required to put Python code, KNIME workflows, or theoretical descriptions of the algorithms in the final paper.    * It is NOT required to put Python code, KNIME workflows, or theoretical descriptions of the algorithms in the final paper.
dm/start/guidelines.txt · Last modified: 23/11/2020 at 10:34 (19 months ago) by Riccardo Guidotti