Strumenti Utente

Strumenti Sito


magistraleinformatica:alg2:algo2_15:start

Questa è una vecchia versione del documento!


Year 2015-2016

Prof. Roberto Grossi
Tutor: Dr. Filippo Geraci

Announcements

  • New exercises (Dec.3) added to the partial list of problems.
  • Final term: Dec. 16 at 16:00 in room A1
  • Office hours: Tue 14-16 (Dipartimento di Informatica)

Overview

We will study, design and analyze advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. This course deepens and extends the algorithmic notions of students.

The syllabus is structured to highlight the applicative scenarios in which the studied algorithms and data structures can be successfully applied. The level of detail with which each argument will be dealt with can change year-by-year, and will be decided according to requests coming from other courses and/or specific issues arising in, possibly novel, applicative scenarios.

The advanced nature of this course focuses on developing algorithmic design skills. The students are exposed to complex problems that require a significant effort in problem solving. One “brainstorming” lecture per week is devoted to a full immersion in this activity, and it is highly recommend to attend it. The purpose is not that of teaching/learning further N algorithms, but to refine students' skills. The final written exam will be based on the topics discussed during the “brainstorming” lectures.

We like students that make mistakes in a constructive way, since this is the starting point of our brainstorming to drive their reasoning paths, thus learning by mistakes. We are interested in the route that leads to the algorithmic solution, rather than the solution itself. Regarding mistakes and intelligence, Alan Turing wrote in 1947 that “… if a machine is expected to be infallible, it cannot also be intelligent”. This is a motto for this class, with a kind invitation to express your own ideas even if they could be apparently wrong.

Topics

Please note that several topics are the outcome of recent advancement in algorithms and data structures, and thus most of the course material consists in research papers or book chapters.

Date Topics References and notes
Sept. 22 Introduction to the course. Entrance exam. -
Sept. 24 Discussion of the solutions for the exercises of the entrance exam. see snapshots
Sept. 28 Problem solving. Statistics on arrays (top-k elements). see snapshots
Sept. 29 Linear-time worst-case selection. par.9.3

Randomization, hashing and data streaming

Randomization is now a common powerful tool to solve large-scale problems. After introducing the concept of randomized algorithms and hashing, we apply them to solve data streaming problems, a field emerged in the last decade. Here data flow as a stream and one-pass algorithms with limited memory can process it. We focus on the count-min sketch paradigm and its applications. [Note: to refresh the basic notions on counting and probability, please refer to Appendix C in Cormen-Leiserson-Rivest-Stein's book “Introduction to Algorithms”, 3rd ed., MIT Press.]

Date Topics References and notes
Oct. 1 Las Vegas and Montecarlo algorithms. Indicator variables and randomised quicksort (Las Vegas). par. 7.3
Oct. 5 Problem solving. Randomized selection (Las Vegas). list of problems
Oct. 6 Random indicator variables: secretary problem (and random permuting), birthday paradox. par. 5.1-5.3, 5.4.1
Oct. 8 Karp-Rabin fingerprints: randomized checking and pattern matching (Montecarlo and Las Vegas) par.7.4-7.6
Oct. 12 Randomized min-cut in unweighted graphs (Montecarlo) par.1.1
Oct. 13 Randomness and Kolmogorov complexity Notes
Oct. 15 Lecture by prof. Ferragina: issue of hierarchical memories: I/O-model. Index construction: multi-way mergesort, BSBI and SPIMI. Sketch on MapReduce. Distributed indexing: Term-based vs Doc-based partitioning. Slides Chapt.4
Oct. 19 Problem solving. Randomized min-cut in graphs. list of problems
Oct. 20 General assembly
Oct. 22 Data Streaming Model. Motivations and examples. Count-Min Sketches sects.1-3 Site Notes
Oct. 26 Problem solving. Data streaming. list of problems
Oct. 27 Chernoff bounds, and applications of count-min sketches. par.4.1,Th.4.1 (proof is optional) sect.4 Notes
Oct. 29 Using a family of random hash functions: Cuckoo hashing. Notes
Nov. 9 Problem solving. Hash functions and data streaming. list of problems

Algorithms for massive data sets and hierarchical memories

Massive data sets are one of the key elements in our so-called big data society. In this scenario the design and analysis of algorithms and data structures adopt the external and cache-oblivious models, which take into account the memory hierarchy that provides performance to modern computers. We study basic problems on sorting and searching in these models, comparing the results with the classical RAM model of computation.

Date Topics References and notes
Nov. 10 Review of the external memory model (EMM): sequential access versus random access, I/O complexity. Static and dynamic searching in external memory. (a,b)-trees and B-trees: definition, search and insert operations. Introduction, chapt.2 and 3 (set D=1 disks)sect.1 and 2
Nov. 12 Dynamic searching in external memory: (a,b)-trees and B-trees. Deletion, construction, and I/O analysis. Lower bound for the I/O complexity of sorting. sect.2 Notes
Nov. 16 Problem solving. Hash functions and data streaming. list of problems
Nov. 17 Cache-oblivious (CO) model. Difference from the EM model and examples. Sect.1-3
Nov. 17 Problem solving. Algorithms for external memory (continued). list of problems
Nov. 19 Cache-oblivious searching: van Emde Boas layout and CO B-trees. [Sect.4.1]
Nov. 23 Problem solving. CO algorithms. list of problems
Nov. 24 Cache-oblivious layout of binary trees. Notes Sect.5 (see 5.4)
Nov. 26 Problem solving. CO algorithms (continued). list of problems
Nov. 30 Map-Reduce paradigm. Suffix array construction (DC3). paper example DC3 algorithm (up to Section3)

Enumeration, hardness and approximation of some combinatorial problems

NP-hard problems are important but difficult to solve and no deterministic polynomial algorithms are currently known for them. Even basic problems, such as finding a simple path of vertices in a graph, are NP-hard. We discuss how to attack these problems (a) by counting and listing their solutions where the cost is proportional to the output (it can be exponential in the worst case but we only pay for what we get) and (b) by approximating their solutions with polynomial-time algorithms when the problem requires to minimize a cost or maximize a benefit. Most of the addressed problems are on graphs, which are popular representations for modern networked data.

Date Topics References and notes
Dec. 1 Quick review of NP-completeness and NP-hardness. Euler tour vs Hamiltonian tour vs TravelSalesman Problem (TSP). [CLRS, chap.34 (up to 34.4 plus 34.5.4, plus 35.2.2]
Dec. 1 Problem solving. MapReduce. Euler tour. list of problems
Dec. 3 2-approximation for metric TSP and for Min Vertex Cover. [CLRS, 35.1, 35.2]
Dec. 7 Approximations for the bin packing problem. chapt.2: par. 2.2.2
Dec. 10 Problem solving. Hard problems. list of problems
Dec. 14 Approximation for the MAX-CUT and knapsack problems. Notes chapt.2: par. 2.1.1
Dec. 15 Problem solving. Approximation algorithms. list of problems
Activity in class
Official documents for the class
Spot yourself in the class

Examination outcomes (in Italian)

Examination date to be fixed

magistraleinformatica/alg2/algo2_15/start.1450164164.txt.gz · Ultima modifica: 15/12/2015 alle 07:22 (8 anni fa) da Roberto Grossi