Indice

Algorithm Engineering A.A. 2011-2012

Thanks to P. Sanders Teacher: Paolo Ferragina

CFU: 6 (second semester).

Course ID: 283AA.

Language: English.

Degree: All master degrees in Computer Science, and the one in CS&Networking in collaboration with the Scuola Normale Superiore “S. Anna”.

Lectures schedule: Tuesday and Thursday 11-13, room N1.

Question time: After lectures.

News: about this course will be distributed via a Tweeter-channel.

Official lectures schedule: The schedule and content of the lectures is available with the official registro.

Goals

In this course we will study, design and analyze (theoretically and experimentally) advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. These algorithmic tools will be designed and analyzed in several models of computation— such as RAM, 2-level memory, cache-oblivious, streaming— in order to take into account the architectural features and the memory hierarchy of modern PCs.

Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces algorithmic solutions aimed at minimizing the use of some computational resources like time, space, communication, I/O, energy, etc. Some of these solutions will be discussed at an experimental level, in order to introduce proper engineering and tuning tools for algorithmic development.

Exam

Dates Room
08/06/2012 Text
28/06/2012 Text
23/07/2012 Text
03/09/2012 Text

Background

If you wish to refresh your mind on Algorithms and Data Structures, I suggest you to follow the Video Lectures by Erik Demaine and Charles Leiserson, specifically Lectures 1-5, 7 and 10. There it is missing the part on basic graph problems (representation, DFS, BFS, topological sort) which you may browse in any book, such as Introduction to Algorithms by Cormen-Leiserson-Rivest-Stein, third edition.

Books, Notes, etc.

I'll project no slides for teaching, and use just the old-fashioned blackboard.

Since the course covers many different algorithmic themes and will deal with advanced results and issues which are not yet part of books, I'll distribute notes and copies of papers/books which will constitute the reading material of the course.

List of Lectures

Date Lecture Biblio
15/09/11 Refreshing: Analysis of algorithms, (asymptotic) time/space complexity, Big-Oh notation, Divide-and-Conquer, Binary search, Recurrences, and few other examples. Lectures 1, 2 and 3 of Demaine-Leiserson's course at MIT
16/09/11 Refreshing: Mergesort and Quicksort. Heap: properties and operations. Lectures 4 of Demaine-Leiserson's course at MIT
19/09/11 Refreshing: Hashing: hash functions and their properties. Hashing with chaining. Binary Search Trees: recursive visit, pre/in/post visits. Red-Black Trees. Lectures 7 and 10 of Demaine-Leiserson's course at MIT
20/09/11 Refreshing: Graphs: how to represent it in memory. DFS and BFS visits, properties and applications. part of Lecture 17 of Demaine-Leiserson's course at MIT. Look at the book CLR.
21/02/12 Introduction to the course. Models of computation: RAM, 2-level memory, cache-oblivious models. The role of the Virtual Memory system. Chap. 1-2 of the notes.
23/02/12 Random sampling: data-size known and unknown, disk model and streaming model. Chap. 3 of the notes.
28/02/12 Sorting and permuting atomic items on a two-level memory: 2-way mergesort, multi-way mergesort, lower-bounds.
01/03/12 Partitioning items cache-efficiently, selecting the item of rank k, average-time analysis of randomised quicksort. Chap. 4 of the notes.
08/03/12 Quicksort: How to bound the memory consumption in the recursive calls. Multi-way Quicksort. Disk striping and comments on the difficulty to use D parallel disks in sorting. Sorting strings: some preliminary considerations, lower bound, MSD-radix sort and its trie implementation.
13/03/12 Sorting strings: LSD-radix sort, Multi-way Quicksort. Chap. 5 of the notes.
15/03/12 Prefix search for strings: Array of pointers, Front-coding, Locality-preserving FC, Compacted trie, Patricia Trie, two-level indexing. Chap. 6 of the notes.
16/03/12 Front-coding and rear coding. Full-text indexing: suffix array, lcp array, text mining with SA, suffix tree. Slides of the lecture. String B-tree is optional. Chap 7 which contains much more material, but you have to read only what has been done in class.
22/03/12 Suffix array construction: The Skew algorithm. The Bloom filter and its applications Slides and paper on the BF.
27/03/12 Counting/Spectral Bloom filter. Hashing: hash table with chaining, issues in the use of simple hash functions, universal hashing - definition and examples.
29/03/12 Perfect hashing, cuckoo hashing, d-left hashing, min-ordered perfect hashing. Slides, CLR chapter, part of WMB and Cuckoo.
12/04/12 Data Compression: entropy, prefix-free coding, Huffman coding with proof of optimality. chapter of WMB, pag 21-41, 52-56, 74-79.
17/04/12 Canonical Huffman, Arithmetic Coding
19/04/12 Integer Encoding: gamma, delta, Rice, Variable byte, (s,c)-dense codes, PForDelta Chap 8, no “interpolative coding”
24/04/12 Dictionary-based compressors: LZ77, gzip. An example of application: Rsync. Streaming compressors: MTF, RLE. Slides
26/04/12 Burrows-Wheeler Transform and Bzip
03/05/12 Randomized data structures: Treaps some notes
10/05/12 Skip Lists
15/05/12 Exercises
17/05/12 Exercises
22/05/12 Exercises
24/05/12 Exercises