Strumenti Utente

Strumenti Sito


magistraleinformatica:alg2:algo2_14:start

Questa è una vecchia versione del documento!


Year 2014-2015

Prof. Roberto Grossi

Announcements

  • The final list of problems discussed in class is available.
  • Office hours: Thu 11-14 (Dipartimento di Informatica)
  • Next written examination dates (please register): Room C1, 09:00-11:00, Jan. 20, 2015. Room C1, 15:00-17:00, Feb. 11, 2015.
  • Next oral examination dates (meeting to fix dates): Teacher's office, 09:00, Jan. 14, 2015. Teacher's office, 09:00, Jan. 22, 2015. Teacher's office,, 09:00, Feb. 16, 2015.

Overview

We will study, design and analyze advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. This course deepens and extends the algorithmic notions of students.

The syllabus is structured to highlight the applicative scenarios in which the studied algorithms and data structures can be successfully applied. The level of detail with which each argument will be dealt with can change year-by-year, and will be decided according to requests coming from other courses and/or specific issues arising in, possibly novel, applicative scenarios.

The advanced nature of this course focuses on developing algorithmic design skills. The students are exposed to complex problems that require a significant effort in problem solving. One “brainstorming” lecture per week is devoted to a full immersion in this activity, and it is highly recommend to attend it. The purpose is not that of teaching/learning further N algorithms, but to refine students' skills. The final written exam will be based on the topics discussed during the “brainstorming” lectures.

We like students that make mistakes in a constructive way, since this is the starting point of our brainstorming to drive their reasoning paths, thus learning by mistakes. We are interested in the route that leads to the algorithmic solution, rather than the solution itself. Regarding mistakes and intelligence, Alan Turing wrote in 1947 that “… if a machine is expected to be infallible, it cannot also be intelligent”. This is a motto for this class, with a kind invitation to express your own ideas even if they could be apparently wrong.

Topics

Please note that several topics are the outcome of recent advancement in algorithms and data structures, and thus most of the course material consists in research papers or book chapters.

Date Topics References and notes
Sept. 23 Introduction to the course. Problem solving: ideas and examples. -

Algorithms for massive data sets and hierarchical memories

Massive data sets are one of the key elements in our so-called big data society. In this scenario the design and analysis of algorithms and data structures adopt the external and cache-oblivious models, which take into account the memory hierarchy that provides performance to modern computers. We study basic problems on sorting and searching in these models, comparing the results with the classical RAM model of computation.

Date Topics References and notes
Sept. 24 External memory model (EMM). Case study for sequential access versus random access. Description of the model. CPU complexity and I/O complexity. K-way mergesort. Introduction, chapt.2 and 3 (set D=1 disks) notes
Sept. 25Cost of scanning, sorting and permuting. Scan-and-sort paradigm. Lower bounds for sorting in the external memory model. Introduction, chapt.2 and 3 (set D=1 disks), chapt.5 (set D=1 disks, no randomization, till Sect.5.1.1 included)NotesNotes
Sept. 30 Problem solving. External memory K-way merge and mergesort. External memory permuting. list of problems
Oct. 1 Scan-and-sort paradigm: map-reduce. Lower and upper bounds for static searching in external-memory. paper slides slides sect.11.1
Oct. 2 Selection of splitters for the distribution sort. Notes wikipedia
Oct. 7 Problem solving. Implicit static search in external memory. External memory distribution sort. list of problems
Oct. 8 Dynamic searching in external memory: (a,b)-trees and B-trees. Definition, search and insert operations. sect.1 and 2
Oct. 9 Dynamic searching in external memory: (a,b)-trees and B-trees. Deletion, construction, and I/O analysis. sect.2
Oct. 14 Problem solving. Number of split/fuse operations for $(a,b)$-trees. 1-D range queries. list of problems
Oct. 15 Suffix sorting and suffix array. Relation with standard sorting. DC-3 algorithm for RAM model. DC3 algorithm (up to Section3)
Oct. 16 Suffix sorting and suffix array. DC-3 algorithm for RAM model (continued). DC3 algorithm (up to Section3)
Oct. 21 Problem solving. Filling the details of the DC3 suffix sorting. Suffix array searching. list of problems
Oct. 22 Cache-oblivious (CO) model. van Emde Boas (vEB) layout of complete binary (search) trees. Sect.4.1 Notes
Oct. 23 Text indexing and searching: suffix trees. wikipedia
Oct. 28 Problem solving. Computing the LCP and suffix array construction. Navigating in the implicit vEB layout. list of problems

Randomization, hashing and data streaming

Randomization is now a common powerful tool to solve large-scale problems. After introducing the concept of randomized algorithms and hashing, we apply them to solve data streaming problems, a field emerged in the last decade. Here data flow as a stream and one-pass algorithms with limited memory can process it. We focus on the count-min sketch paradigm and its applications.

Date Topics References and notes
Oct. 29 Randomness and Kolmogorov complexity. Randomized quicksort. Notes (first 2 pages) par. 7.3
Oct. 30 Las Vegas and Montecarlo algorithms. Analysis of randomised quick sort. Randomized algorithm for min-cut. par.5.1 par. 7.3
Nov. 11 Problem solving. list of problems
Nov. 12 Data Streaming Model. Motivations and examples. sect.1 Site Notes
Nov. 13 Count-Min Sketches and applications - part 1 sect.2,3 Site
Nov. 18 Problem solving. list of problems
Nov. 19 Count-Min Sketches and applications - part 2 sect.2,3 Site
Nov. 20 Count-Min Sketches and applications - part 3 sect.2,3 Site
Nov. 25 Problem solving. list of problems

Enumeration, hardness and approximation of some combinatorial problems

NP-hard problems are important but difficult to solve and no deterministic polynomial algorithms are currently known for them. Even basic problems, such as finding a simple path of vertices in a graph, are NP-hard. We discuss how to attack these problems (a) by counting and listing their solutions where the cost is proportional to the output (it can be exponential in the worst case but we only pay for what we get) and (b) by approximating their solutions with polynomial-time algorithms when the problem requires to minimize a cost or maximize a benefit. Most of the addressed problems are on graphs, which are popular representations for modern networked data.

Date Topics References and notes
Nov. 26 Complexity classes: P, NP, NPC. Polynomial reductions (prof. Pagli) Turing Machines pp.255-260, 264-276
Nov. 27 Example of reductions: SAT, 3-SAT, NE-3-SAT. Notes pp.282-283
Dec. 2 Problem solving list of problems
Dec. 3 NP-completeness of MAX-CUT. Notes
Dec. 4 Coloring problems. Reduction from 3-coloring of planar maps to 3-SAT. Notes
Dec. 4 Problem solving list of problems
Dec. 9 Notion of r-approximation algorithm. TSP: NP-hardness and difficulty of approximation. 2-approximation for metric TSP. Except Sect.35.1
Dec. 9 Problem solving (4 hours) list of problems
Dec. 10 2-approximation for MAX-CUT: local search, greedy. 2-approximation for minimum vertex cover. Notes Sect.35.1
Dec. 11 Approximation for bin packing and knapsack. chapt.2: par. 2.1.1, par. 2.2.2

Tutoring (prof. Giuseppe Prencipe)

Date Topics References and notes
Nov. 17 Review (part I) of topics from Chapt. 1, Chapt. 2, Sect. 3.1, Sect. 4.5. Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press.
Nov. 26 Review (part II) of topics from Chapt. 1, Chapt. 2, Sect. 3.1, Sect. 4.5. Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press.
Dec. 3 Review of topics from Chapt. 6, Sect. 7.1, 7.2, Sect. 8.1, Chapt.9. Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press.
Dec. 10 Review of topics from Chapt. 10, Chapt. 11 (except Sect.11.5), Chapt. 12 (execpt Sect. 12.4), Sect. 15.3 and 15.4. Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press.
Excercises discussed in class

This is the final list of problems. Some of the screen snapshots shown in the classroom.

Official class documents
Examination outcomes (in Italian)

Examination date 17.12.2014

Matricola Score/30 Comments
42345829Ex.1 missing the part on showing how to implement EM mergesort and analyzing its cost. Ex.2 done, with a small typo. Ex.3 done.
45161929Ex.1 done with some issues in the analysis of the CPU time: it is not specified which choice of k and, although the CPU complexity of the k-way merge is proven to be O(N log k), it is then used O(N k) in the rest of the analysis. Ex.2-3 done.
46422924Ex.1 discusses the analysis in a too light fashion. Ex.2 does not use dyadic intervals, so the solution is good only if the range is very small. Ex. 3 done.
47112228Ex.1 done. Ex.2 needs to introduce a global variable for the sum of the counters as suggested in the text of the exercise. Ex.3 done.
47371129Ex.1 does not specify the choice of k, and does not conclude the analysis for the CPU time. Ex.2 has an unclear step using the complement. Ex.2 done.
47875030-Ex.1 done, the proposed algorithm is for main memory but then it is said how to use it in EM. Ex.2-3 done.
48047429Ex.1 done but the proposed algorithm is for main memory and it does not specify how to get it for EM; the analysis is correct but cannot be referred otherwise to that algorithm. Ex.2 done with a small typo; appreciated that it is proved why at most 2 log n intervals are needed. Ex.3 done.
49943420Ex.1 starts from a concrete example but then it does not generalize; also, analysis is missing. Ex.2 not done. Ex.3 same as the first exercise, some effort is needed to learn how to generalize the ideas.
50871029Ex.1 missing the part on showing how to implement EM mergesort and analyzing its cost. Ex.2 done, one step is not completely clear. Ex.3 done.
525804Ex.1 done correctly. Ex.2-3 not done.
525820Ex.1 done partially, the few figures are explanatory, but more explanatory text would be appreciated. Ex.3 started with the definition of max cut but not completed. Ex.2 not done.
525822Ex.1 done correctly, the figures are explanatory, more explanatory text would be appreciated. Ex.2-3 not done.
527114Need student's help to read it, I cannot decode some parts of the text.
magistraleinformatica/alg2/algo2_14/start.1421075219.txt.gz · Ultima modifica: 12/01/2015 alle 15:06 (9 anni fa) da Roberto Grossi