Cilk Arts has teamed up with MIT Professional Education to put together a
2-day Multicore Programming workshop for developers, architects, and educators.
Where: MIT Campus, Cambridge, Mass.
When: June 8 & 9, 2009
Who Should Attend?
Software developers, architects, team leaders, project managers, and educators.
Early Bird Discount:
Apply by May 10th and save $100!
Registration Website:
http://web.mit.edu/professional/short-programs/courses/concepts_multicore_programming.html
Program Schedule
Day 1
Goals for the day: Understand multicore architecture trends (Moore's Law,
chip multiprocessors, etc.). Compile and run basic Cilk++ programs.
Demonstrate hands-on knowledge of basic multicore-programming concepts,
including nested and loop parallelism, serial semantics, and race conditions.
Describe performance concepts, including work, span, and parallelism.
Understand the practical implications of elementary scheduling theory.
Module 1 - The multicore-software challenge
Technology trends | Problems amenable to parallelism | Chip multiprocessors
and cache coherence | Leading multicore concurrency platforms, including
Pthreads/WinAPI threads, OpenMP, Threading Building Blocks, and Cilk++ |
Program correctness and race conditions
________________________________________
Module 2 (LAB) - Introduction to parallel programming
Parallelizing quicksort | Cilkscreen race detector | Cilk performance analyzer
________________________________________
Module 3 - Parallelism and performance
Nested and loop parallelism | Serial semantics and composability |
Programming examples | Measures of work, span, and parallelism | Scheduling
theory
________________________________________
Module 4 (LAB) - Matrix multiplication
________________________________________
Module 5 - How the Cilk++ concurrency platform works
Work stealing
________________________________________
Day 2
Goals for the day: Demonstrate familiarity with advanced parallel-programming
concepts, such as locking, deadlock, synchronizing through memory, and
reducers. Show the ability to deal with hurdles to parallelization, including
insufficient parallelism, loop-carried dependencies, grain size, burdened
parallelism, memory bandwidth, nondeterminism, and legacy threading.
Module 6 - Nonlocal variables and synchronization
Global and nonlocal variables | Locking | Deadlock, lock contention,
convoying | Synchronizing through memory | Memory models | Reducer hyperobjects
________________________________________
Module 7 (LAB) - Nonlocal variables and reducers
________________________________________
Module 8 - Practical issues in parallelization
Lack of parallelism | Loop-carried dependencies | Grain size | Burdened
parallelism | Memory bandwidth | Nondeterminism | Legacy threading
________________________________________
Module 9 (LAB) - Overcoming parallelization hurdles
________________________________________
Module 10 - Multicore Jeopardy!