Questa è una vecchia versione del documento!
Indice
Year 2012-2013
Prof. Roberto Grossi
Announcements
- Calendario orali (venire comunque per fissare anche le successive date di quel mese):
- per il mese di gennaio: ven. 18.01.2013 ore 9:00 nello studio del docente;
- per il mese di febbraio: lun. 11.02.2013 ore 9:00 nello studio del docente.
- Examination dates: Dec. 20 at 9:30 (“compitino”, aula C1), Jan. 15 at 9:30 (aula A), Feb. 5 at 9:30 (aula A).
- The class lectures will be (mainly) given in English.
Overview
We will study, design and analyze advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. This course deepens and extends the algorithmic notions of students.
The syllabus is structured to highlight the applicative scenarios in which the studied algorithms and data structures can be successfully applied. The level of detail with which each argument will be dealt with can change year-by-year, and will be decided according to requests coming from other courses and/or specific issues arising in, possibly novel, applicative scenarios.
The advanced nature of this course focuses on developing algorithmic design skills. The students are exposed to complex problems that require a significant effort in problem solving. One “brainstorming” lecture per week is devoted to a full immersion in this activity, and it is highly recommend to attend it. The purpose is not that of teaching/learning further N algorithms, but to refine students' skills. The final written exam will be based on the topics discussed during the “brainstorming” lectures.
We like students that make mistakes in a constructive way, since this is the starting point of our brainstorming to drive their reasoning paths, thus learning by mistakes. We are interested in the route that leads to the algorithmic solution, rather than the solution itself. Regarding mistakes and intelligence, Alan Turing wrote in 1947 that “… if a machine is expected to be infallible, it cannot also be intelligent”. This is a motto for this class, and an invitation to express your own ideas even if they could be apparently wrong.
Topics
Algorithms for massive data sets and hierarchical memories
Date | Topics | References and notes |
---|---|---|
Oct. 2 | Introduction to the course. Problem solving: ideas and examples. Toy problem: maximal interval of positive integers with bounded pairwise difference. | [Notes] |
Oct. 3 | External memory model. Case study for sequential access versus random access. Description of the model. CPU complexity and I/O complexity. Cost of scanning and sorting. K-way merge sort. | Introduction, chapt.2 and 3 (set D=1 disks)] |
Oct. 4 | Lower bounds for sorting and permuting in the external memory model. K-way distribution sort. | [Introduction, chapt.2 and 3 (set D=1 disks), chapt.5 (set D=1 disks, no randomization, till Sect.5.1.1 included)] Notes[Notes] |
Oct. 9 | Problem solving. K-way merging, permuting, distributing, median, selection and order statistics. | Notes Notes |
Oct. 10 | k-way distributing. Scan+sort paradigm: case study of map-reduce. | Notes [Notes] |
Oct. 11 | Cache-oblivious (CO) model. Motivations and simple examples: multiple scanning, inverting an array, median and order statistics, binary search (lower bound). | [Sect.4.1] |
Oct. 12 | Cache-oblivious model. k-way search search and van Emde Boas layout. Paging of arbitrary binary trees. | [Sect.4.1] [Sect.3.1] [Sect.5.1] [Notes] |
Oct. 16 | Problem solving. Implicit van Emde Boas layout of binary complete trees. | Notes Notes Code Code |
Oct. 17 | Problem solving. Map-reduce computations. Cache-oblivious layout of arbitrary trees. | Code Code Gallery Notes |
Oct. 18 | Suffix sorting. DC-3 algorithm for RAM model, EM model, CO model. | DC3 algorithm (up to Section3) |
Oct. 19 | Suffix tree. How to make it cache-oblivious. | Notes |
Randomization, hashing and data streaming
Date | Topics | References and notes |
---|---|---|
Oct. 23 | Data Streaming Model. Motivations and examples. Basic technique: Count-Min Sketches. K-wise independence. | Site Notes |
Oct. 24 | Problem solving. Filling the details of the DC3 suffix sorting. Proving k-wise independence for a family of hash functions. | Notes. |
Oct. 25 | Count-Min Sketches. Full proof and analysis. Indicator variables, Markov's inequality. | Paper Site |
Oct. 30 | Problem solving. Space lower bound for Count-Min Sketches. | Notes App.C CLRS on Probability |
Oct. 31 | Problem solving. Chernoff bounds, indicator variables, and applications of count-min sketches. Designing a data streaming algorithm to estimate the number of distinct elements. | Sect.4.1, Motwani-Raghavan book Notes Notes |
Nov. 9 | Using a family of random hash functions: Cuckoo hashing. | Notes |
Nov. 13 | Problem solving. Count-Min Sketch. Cuckoo hashing. | Notes |
Nov. 14 | Class canceled. | Transportation strike |
Nov. 16 | Bloom filters. Design parameters and probabilistic analysis, with applications. | Survey |
Nov. 20 | Problem solving. Count-Min Sketch for interval queries and inner product. | Paper |
Nov. 21 | Randomized quicksort. Skiplists. Randomized binary search trees. | [sez.2.5.4] [sez.3.3] Paper (pages 288-302 |
Data compression and string algorithms
Nov. 23 | Class canceled. | Examination committee |
Nov. 27 | Shannon's entropy and Kolmogorov's complexity. Elias' coding: gamma- and delta-codes. Huffman coding. Lempel-Ziv dictionary-based parsing. | [cap.2:da inizio fino a pag.36; sez.2.6] [cap 3: fino pag.119] |
Hard problems
Dec. 7 | NP-Hard problems. Approximation algorithms. Traveling Salesman Problem (TSP): hardness of approximation and 2-approximation for metric instances. | [chapt.9 (Italian)] |
Dec. 11 | Further examples: approximation for Maximal independent sets (MIS) and Vertex Cover (VC). | [chapt.2: pp.39-40, par. 2.1.2] [Approximate evaluation of VC] |
Dec. 14 | Knapsack problem: bad example for greedy and its refinement for 2-approximation. Approximation for Min Bin Packing using Next Fit and First Fit Decreasing strategies. | [chapt.2: par. 2.1.1, par. 2.2.2] |
Dec. 18 | Problem solving. Distinct elements in a stream and general discussion of the exercises presented during the semester. |
Excercises discussed in class
List of problems discussed during the problem solving session (in Italian).
Official class log (registro)
Exams
20.12.2012 (visione della correzione: venerdì 11.1.2013 ore 11:00-13:00):
242180 25 303756 27 407947 28 409125 18 437532 22 437749 29 438125 21 438422 27 438591 24 439415 ins 443065 29 443495 24 451371 28 452095 27 453278 25 454413 29 490104 25 490537 27