dm:temp

Differences

These are the differences between the selected revision and the current version of the page.

dm:temp [14/02/2014 at 11:05 (11 years ago)] – created Mirco Nanni
dm:temp [04/04/2014 at 07:30 (11 years ago)] (current version) Mirco Nanni
Line 1: Line 1:
 +====== Assignment for the second exercise, DM2 ======
 +
 +  * **Sequential pattern analysis: WarLogs Dataset. Assigned on: 02.04.2014. To be completed by: 21.04.2014. Send papers (max 3 pages of text, figures excluded) by email to datamining [dot] unipi [at] gmail [dot] com. Use "[DM] exercise 6" in the subject.** Download the dataset here in CSV format: {{:dm:warlogs.csv.zip| warlogs.csv.zip}}. A description of the variables is [[dm:warlogs2013-14|here]]. **Problem**: Build a dataset of sequences that describe, **for each day** and **for each geographical area**, the sequence of **events** that happened there. The **geographical areas** to adopt can be the ones indicated in the "region" attribute already in the dataset, or they can be obtained by partitioning the territory in some other way, for instance to obtain more balanced areas. The **events** to consider can be, for instance, represented by the "category" or "type" attributes in the dataset, or they can be computed from other information (kind of casualties, number of wounded or killed victims, etc.). Use this dataset to extract a set of frequent sequential patterns. **Tools for sequential patterns.** Among the possible alternatives, we suggest adopting one of the following:
 +    * **Weka**: use the GeneralizedSequentialPatterns associator. Each line of the input dataset should contain a pair <sequence ID><Event ID>, and the lines should be in temporal order (there is no explicit timestamp in the data). Here is an example: {{:dm:sequence_data.csv.zip|}}.
 +    * **Spam**: a command-line tool that can be downloaded {{:dm:spam_bin.zip|here}} (binaries for Windows and Linux, including a sample input file). Note that the input should contain only numeric (integer) values, so some encoding is needed. Also, input sequences longer than 64 transactions are not allowed, so they should be truncated.
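The sequence-building step described above can be sketched as follows. This is only an illustrative outline: the column names "date", "region" and "category", the timestamp format, and the sample rows are assumptions made for the example, and should be checked against the actual variable description. It groups events by day and region, orders each group by time, and flattens the result into (sequence ID, event ID) pairs in temporal order, as the Weka option expects.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: column names, timestamp format and values are
# assumptions for illustration, not taken from the real warlogs.csv file.
SAMPLE = """date,region,category
2009-01-01 08:00,RC EAST,Direct Fire
2009-01-01 06:30,RC EAST,IED Explosion
2009-01-01 09:15,RC SOUTH,Direct Fire
2009-01-02 10:00,RC EAST,Indirect Fire
"""

def build_sequences(rows):
    """Group events by (day, region) and order each group by timestamp."""
    groups = defaultdict(list)
    for row in rows:
        day = row["date"][:10]  # keep only the YYYY-MM-DD part
        groups[(day, row["region"])].append((row["date"], row["category"]))
    # Sorting the (timestamp, event) tuples puts each group in temporal order.
    return {key: [event for _, event in sorted(events)]
            for key, events in groups.items()}

def to_pairs(sequences):
    """Flatten to (sequence ID, event ID) pairs, one per output line."""
    pairs = []
    for seq_id, key in enumerate(sorted(sequences), start=1):
        for event in sequences[key]:
            pairs.append((seq_id, event))
    return pairs

sequences = build_sequences(csv.DictReader(io.StringIO(SAMPLE)))
pairs = to_pairs(sequences)
```

To work on the real file, the in-memory sample would be replaced by an opened CSV file, and the grouping key could use any alternative partition of the territory instead of "region".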
 +
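The integer coding and truncation needed for the Spam option can be sketched as below. The helper names are hypothetical, and the sketch assumes that every event forms a single-item transaction. It does not write the tool's input file, whose exact layout should be taken from the sample input included in the download.

```python
MAX_TRANSACTIONS = 64  # limit on sequence length stated in the text

def encode_events(sequences):
    """Map every distinct event label to a positive integer code."""
    codes = {}
    encoded = []
    for seq in sequences:
        # setdefault assigns the next free code the first time a label is seen.
        encoded.append([codes.setdefault(ev, len(codes) + 1) for ev in seq])
    return encoded, codes

def truncate(sequences, limit=MAX_TRANSACTIONS):
    """Keep only the first `limit` transactions of each sequence."""
    return [seq[:limit] for seq in sequences]

# Made-up event labels, used only to exercise the helpers.
raw = [["A", "B", "A"], ["B", "C"], ["A"] * 70]
encoded, codes = encode_events(raw)
ready = truncate(encoded)
```

Keeping the `codes` dictionary makes it possible to translate the integer patterns found by the tool back into event labels.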
 +
 +====== Ignore everything below ======
 +
 +
 +{{ :dm:c641dad03aacb3d94ad6f575d6a43ac4.jpg?nolink&300 |}}
 +
 ^ ^ Day ^ Room ^ Topic ^ Learning material ^ Instructor ^
-|1.| 18.02.2013 9:00-11:00 | N1 | Introduction |  | Giannotti |
+|1.| 17.02.2014 9:00-11:00 | N1 | Introduction |  | Giannotti |
-|2.| 27.02.2013 9:00-11:00 | N1 | Frequent patterns and association rules / 1 | {{:dm:2-3tdm-restructured_assoc_2013.pdf|Association Rules -- Slides}} | Giannotti
+|2.| 19.02.2014 9:00-11:00 | L1 | Frequent patterns and association rules / 1 | | Giannotti
-|3.| 04.03.2013 9:00-11:00 | N1 | Frequent patterns and association rules / 2 |  | Giannotti
+|3.| 24.02.2014 9:00-11:00 | N1 | Frequent patterns and association rules / 2 |  | Giannotti
-|4.| 06.03.2013 9:00-11:00 | N1 | Frequent patterns and association rules / 3 |  | Giannotti
+|4.| 26.02.2014 9:00-11:00 | L1 | Frequent patterns and association rules / 3 |  | Giannotti
-|5.| 11.03.2013 9:00-11:00 | N1 | Introduction to CRM and Churn analysis | {{:dm:1.dm2_crm_customersegmentation-airmiles_2013.pdf|}} {{:dm:3.dm2012_st_events.pdf|}} {{:dm:4.dm2_churn_coop_2013.pdf|}} {{:dm:4.dm2_churn_intro_2013.pdf|}} | Giannotti
+|5. | 3.03.2014 9:00-11:00 | N1 | Association rules on DM tools | | Giannotti
-| 6. | 13.03.2013 9:00-11:00 | N1 | Association rules on DM tools | {{:dm:en_tanagra_assoc_rules_comparison.pdf|}} [[http://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes]]| Giannotti
+|6.| 5.03.2014 9:00-11:00 | L1 | Sequential patterns / 1 |  | Nanni  |
-|7.| 18.03.2013 9:00-11:00 | N1 | Sequential patterns / 1 | Textbook, Sect. 7.4 {{:dm:sequential_patterns.pdf|Sequential Patterns - Slides}} [1-12] | Nanni  |
+|7.| 10.03.2014 9:00-11:00 | N1 | Sequential patterns / 2 |  | Nanni  |
-|8.| 20.03.2013 9:00-11:00 | N1 | Sequential patterns / 2 | Sequential Patterns - Slides [13-24] | Nanni  |
+|8.| 12.03.2014 9:00-11:00 | L1 | Time series / 1 + Data exploration: assignments | | Nanni  |
-|9.| 25.03.2013 9:00-11:00 | N1 | Time series / 1 + Data exploration: assignments | {{:dm:time_series_from_keogh_tutorial.pdf|Time Series - Slides}} [1-34] | Nanni  |
+|9.| 17.03.2014 9:00-11:00 | N1 | Time series / 2 |  | Nanni  |
-|10.| 27.03.2013 9:00-11:00 | L1 | Time series / 2 | Time Series - Slides [35-84] | Nanni  |
+|10.| 19.03.2014 9:00-11:00 | L1 | Classification: evaluation methods + Case study: Fraud detection|  | Giannotti |
-|11.| 08.04.2013 9:00-11:00 | N1 | Classification: evaluation methods + Case study: Fraud detection| {{:dm:fraud_detection.pdf|}}{{:dm:dm2-fraudedetection1.ppt.pdf|}} | Giannotti |
+|11.| 24.03.2014 9:00-11:00 | N1 | Network diffusion and Virality Marketing|  | Giannotti   |
-|12.| 10.04.2013 9:00-11:00 | L1 | Network diffusion and Virality Marketing| {{:dm:7.mains_crm_innovatori.pdf|}} | Giannotti   |
+|12.| 26.03.2014 9:00-11:00 | L1 | Mobility Data Mining / 1 |  | Nanni  |
-|13.| 15.04.2013 9:00-11:00 | N1 | Mobility Data Mining / 1 | {{:dm:spatio-temporal-dm_2012.pdf|Mobility DM - Slides}} [1-33] + Reference book chapter (ask to instructor) | Nanni  |
+|13.| 7.04.2014 9:00-11:00 | N1 | Mobility Data Mining / 2 |  | Nanni  |
-|14.| 17.04.2013 9:00-11:00 | L1 | Mobility Data Mining / 2 |  | Nanni  |
+|14.| 9.04.2014 9:00-11:00 | L1 | Case study: Mobility Data Mining |  | Nanni  |
-|15.| 22.04.2013 9:00-11:00 | N1 | Case study: Mobility Data Mining | {{:dm:slides20120229.pdf|MDM case study}} {{:dm:d4d.pdf|GSM for transport plannig}}  | Nanni  |
+|15.| 14.04.2014 9:00-11:00 | N1 | Case study: Mobility Data Mining/2 |  | Giannotti - Nanni  |
-|16.| 24.04.2013 9:00-11:00 | L1 | Case study: Mobility Data Mining/2 |  | Giannotti - Nanni  |
+|16.| 16.04.2014 9:00-11:00 | L1 | Data exploration: results of assignments + Presentation of projects |  | Nanni  |
-|17.| 06.05.2013 9:00-11:00 | N1 | Data exploration: results of assignments + Presentation of projects | {{:dm:project_1_solution.pdf|Project 1 sample solution}} | Nanni  |
+|17.| 28.04.2014 9:00-11:00 | N1 | Data Mining and Privacy/1 |  | Giannotti
-|18.| 08.05.2013 9:00-11:00 | L1 | Data Mining and Privacy/1 | {{:dm:privacy_lezione14-16.ppt.pdf|Privacy}} {{:dm:capprivacy.pdf|Mobility Data & Privacy}} | Giannotti
+|18.| 30.04.2014 9:00-11:00 | L1 | Case study: Mining official data ed health data |  | Nanni  |
-|19.| 13.05.2013 9:00-11:00 | N1 | Case study: Mining official data ed health data | {{:dm:5.dm2-miningofficialdata.pdf|Mining Official Data}} | Nanni  |
+|10.| 5.05.2014 9:00-11:00 | N1 | Data Mining and Privacy/2 | | Giannotti  |
-|20.| 15.05.2013 9:00-11:00 | L1 | Data Mining and Privacy/2 | | Giannotti  |
+|20.| 7.05.2014 9:00-11:00 | L1 |  | |   |
 +|21.| 12.05.2014 9:00-11:00 | N1 |  | |   | 
 +|22.| 14.05.2014 9:00-11:00 | L1 |  | |   | 
 +|23.| 19.05.2014 9:00-11:00 | N1 |  | |   | 
 +|24.| 21.05.2014 9:00-11:00 | L1 |  | |   | 
 +|25.| 27.05.2014 9:00-11:00 | N1 |  | |   | 
dm/temp.1392375919.txt.gz · Last modified: 14/02/2014 at 11:05 (11 years ago) by Mirco Nanni
