Complementi di piattaforme abilitanti distribuite

Teacher: Nicola Tonellotto

Question time: Please contact the teacher

Tuesday 11:30-13:30 Room 10B
Wednesday 16:00-18:00 Room 10B
Friday 9:30-11:30 Room 10B

Teaching rooms: Room 10B, S.Anna/CNIT building in CNR Research Area, ground floor.
For this first year, the course is co-organized with Strumenti di programmazione per sistemi paralleli e distribuiti, taught by Dr. Massimo Coppola.

Syllabus

24/02: Grid Computing (I) Slides Student Notes

  • Large-scale problems in research and production environments
  • How to approach these problems
  • Preliminary definitions: resources, protocols, services, APIs and SDKs
  • A simple example: web services and their protocols

25/02: Grid Computing (II) Slides Student Notes

  • Virtual Organizations
  • The Grid vision and its requirements
  • The Grid architecture: fabric, connectivity, resource, collective and application layers
  • Using the Grid: scenarios and examples
  • Open Grid Service Architecture and its capabilities
  • The eight fallacies of Grid computing

09/03: Grid Computing (III) Slides Student Notes

  • The Globus Project
  • Public Key Infrastructure (concepts)
  • Grid Security Infrastructure
  • Certificates
  • Single Sign On and Delegation

10/03: Grid Computing (IV) Slides Student Notes

  • Grid Information Services
  • Lightweight Directory Access Protocol (concepts)
  • Monitoring and Discovery Service
  • IP, GRIS and GIIS
  • Grid Information Models: MDS-2 and GLUE schemata

19/03: Grid Computing (V) Slides Student Notes

  • HPC Resource Management
  • Grid Resource Management
  • Gatekeeper and Job Manager
  • Data Management
  • GASS and GridFTP
  • Replica Catalog and Replica Management Services

23/03: MapReduce: the programming model Slides Student Notes

  • Problem characterization
  • Map and Fold in LISP
  • Programming model: mappers and reducers
  • Programming model: partitioners and combiners
  • Example and data flow
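
As an illustration of how mappers, combiners, partitioners and reducers fit together, here is a minimal single-process sketch in Python. It is not taken from the course slides and does not use Hadoop: the function names (`mapper`, `combiner`, `partitioner`, `reducer`, `run_job`) and the toy partitioning rule are assumptions chosen only to show the data flow of a word count job.

```python
from collections import defaultdict


def mapper(doc_id, text):
    """Emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def combiner(word, counts):
    """Pre-aggregate the output of a single mapper before the shuffle."""
    yield word, sum(counts)


def partitioner(word, num_reducers):
    """Toy deterministic rule (an assumption): pick a reducer from the key."""
    return sum(ord(ch) for ch in word) % num_reducers


def reducer(word, counts):
    """Sum all partial counts received for one word."""
    yield word, sum(counts)


def group_by_key(pairs):
    """Collect the values of each key into a list, like the shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def run_job(documents, num_reducers=2):
    """Run map, combine, partition, shuffle and reduce in one process."""
    partitions = [[] for _ in range(num_reducers)]
    for doc_id, text in documents.items():
        # Map one document, then combine its output locally.
        local_groups = group_by_key(mapper(doc_id, text))
        for word, counts in local_groups.items():
            for key, value in combiner(word, counts):
                partitions[partitioner(key, num_reducers)].append((key, value))
    result = {}
    for partition in partitions:
        # Each reducer sees its keys in sorted order, with grouped values.
        groups = group_by_key(partition)
        for word in sorted(groups):
            for key, value in reducer(word, groups[word]):
                result[key] = value
    return result


docs = {"d1": "the grid and the cloud", "d2": "the cloud"}
counts = run_job(docs)
```

In this sketch the combiner reuses the reducer logic, which is only valid because summation is associative and commutative.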

24/03: Distributed File Systems: GFS and HDFS Slides Student Notes

  • Problem characterization
  • Blocks, Name nodes and Data Nodes
  • Master/server architecture
  • Master Server (namenode), Chunk Servers (datanode) protocols and responsibilities
  • Anatomy of a read
  • Anatomy of a write
  • Benchmarks

26/03: Lab Notes Data

  • Hadoop installation and setup
  • Standalone (single-node) and pseudo-distributed mode configuration
  • Grep application

13/04: Lab Notes Data

  • Word Count application (old Hadoop APIs)
  • Word Count application (new Hadoop APIs)
  • API usage
  • Using a large number of files

14/04: Lab Problem Solution (1/4)

  • Computing tf-idf with MapReduce
  • Word frequency in document

04/05: Lab Solution (2/4)

  • Computing tf-idf with MapReduce
  • Word count in document

05/05: Lab Solution (3/4) Solution (4/4)

  • Computing tf-idf with MapReduce
  • Word frequency in collection
  • Calculate TF-IDF
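
The four lab steps above (word frequency in document, word count in document, word frequency in collection, final TF-IDF) can be sketched sequentially in plain Python. This is not the lab solution and does not use Hadoop; the function name `tf_idf` and the weighting formula tf = n/N, idf = log(D/df) are assumptions, and the course material may use a different variant.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute a TF-IDF score for every (word, document) pair."""
    # Step 1: word frequency in document, n = occurrences of word in doc.
    word_freq = Counter()
    for doc, text in documents.items():
        for word in text.lower().split():
            word_freq[(word, doc)] += 1

    # Step 2: word count in document, N = total number of words in doc.
    doc_len = Counter()
    for (word, doc), n in word_freq.items():
        doc_len[doc] += n

    # Step 3: word frequency in collection, df = documents containing word.
    doc_freq = Counter(word for (word, doc) in word_freq)

    # Step 4: combine the three quantities (assumed formula, see lead-in).
    total_docs = len(documents)
    return {
        (word, doc): (n / doc_len[doc]) * math.log(total_docs / doc_freq[word])
        for (word, doc), n in word_freq.items()
    }


scores = tf_idf({"d1": "grid grid cloud", "d2": "cloud"})
```

Each step consumes only the output of the previous one, which is why the computation maps naturally onto a chain of separate MapReduce jobs.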

07/05: Autonomic Computing Slides Student Notes

  • Self management
  • Self properties
  • Feedback control of computing systems

11/05: Scheduling Slides Student Notes

  • Single processor scheduling: SJF, FCFS, RR, MLQ
  • Real time scheduling: RM, EDF
  • Cluster Scheduling: FCFS, Backfilling
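
To make the difference between FCFS and SJF concrete, the following Python sketch compares their average waiting times on one processor. It is an illustration only, not course material: the function names are hypothetical, and it assumes that all jobs arrive at time 0 and run without preemption.

```python
def average_waiting_time(burst_times):
    """Average time jobs spend waiting when run in the given order."""
    total_waiting = 0
    elapsed = 0
    for burst in burst_times:
        total_waiting += elapsed  # this job waited for all previous ones
        elapsed += burst
    return total_waiting / len(burst_times)


def fcfs(burst_times):
    """First Come First Served: run jobs in arrival (list) order."""
    return average_waiting_time(burst_times)


def sjf(burst_times):
    """Shortest Job First: run jobs in order of increasing burst time."""
    return average_waiting_time(sorted(burst_times))


bursts = [24, 3, 3]
fcfs_wait = fcfs(bursts)
sjf_wait = sjf(bursts)
```

With these burst times the long first job delays both short jobs under FCFS, while SJF runs the short jobs first.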

12/05: Scheduling Slides Student Notes

  • Grid Resource Management
  • Bag-of-tasks heuristics: Min-Min, Max-Min, Sufferage
  • Workflow heuristics: List, Multilevel, Clustering scheduling. HEFT
  • Economic Scheduling
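
As an example of a bag-of-tasks heuristic, here is a small Python sketch of Min-Min. It is not taken from the slides: the function name `min_min` and the `etc` matrix (expected time to compute, `etc[task][machine]`) are assumptions made for illustration. At each step it picks the (task, machine) pair with the smallest completion time, given the current machine ready times.

```python
def min_min(etc):
    """Schedule independent tasks with the Min-Min heuristic.

    etc[t][m] is the (assumed known) execution time of task t on machine m.
    Returns the list of (task, machine, completion_time) and the makespan.
    """
    num_machines = len(etc[0])
    ready = [0.0] * num_machines          # time at which each machine is free
    unscheduled = set(range(len(etc)))
    schedule = []
    while unscheduled:
        best = None                       # (completion_time, task, machine)
        for task in sorted(unscheduled):
            for machine in range(num_machines):
                completion = ready[machine] + etc[task][machine]
                if best is None or completion < best[0]:
                    best = (completion, task, machine)
        completion, task, machine = best
        ready[machine] = completion       # the machine is busy until then
        unscheduled.remove(task)
        schedule.append((task, machine, completion))
    return schedule, max(ready)


etc = [
    [4, 6],   # task 0 on machine 0 and machine 1
    [3, 5],   # task 1
    [10, 2],  # task 2
]
schedule, makespan = min_min(etc)
```

Max-Min differs only in the selection rule: among the per-task minimum completion times it schedules the task with the largest one first.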

magistraleinformaticanetworking/cpa/start/cpa2009.txt · Last modified: 04/10/2010 at 14:43 (14 years ago) by Nicola Tonellotto