Cilk Arts has teamed up with MIT Professional Education to put together a
2-day Multicore Programming workshop for developers, architects, and educators.
Where: MIT Campus, Cambridge, Massachusetts
When: June 8 & 9, 2009
Who Should Attend?
Software developers, architects, team leaders, project managers, and educators.
Early Bird Discount:
Apply by May 10th and save $100!
Goals for Day 1: Understand multicore architecture trends (Moore’s law,
chip multiprocessors, etc.). Exhibit the ability to compile and run basic
Cilk++ programs. Display hands-on knowledge of basic multicore-programming
concepts, including nested and loop parallelism, serial semantics, and race
conditions. Describe performance concepts, including work, span, and
parallelism. Show an understanding of the practical implications of
elementary scheduling theory.
Module 1 - The multicore-software challenge
Technology trends | Problems amenable to parallelism | Chip multiprocessors
and cache consistency | Leading multicore concurrency platforms, including
Pthreads/WinAPI threads, OpenMP, Threading Building Blocks, and Cilk++ |
Program correctness and race conditions
Module 2 (LAB) - Introduction to parallel programming
Parallelizing quicksort | Cilkscreen race detector | Cilk performance analyzer
Module 3 - Parallelism and performance
Nested and loop parallelism | Serial semantics and composability |
Programming examples | Measures of work, span, and parallelism | Scheduling
Module 4 (LAB) - Matrix multiplication
Module 5 - How the Cilk++ concurrency platform works
Goals for Day 2: Display familiarity with advanced parallel-programming
concepts, such as locking, deadlock, synchronizing through memory, and
reducers. Show an ability to deal with hurdles to parallelization, including
insufficient parallelism, loop-carried dependencies, grain size, burdened
parallelism, memory bandwidth, nondeterminism, and legacy threading.
Module 6 - Nonlocal variables and synchronization
Global and nonlocal variables | Locking | Deadlock, lock contention,
convoying | Synchronizing through memory | Memory models | Reducer hyperobjects
Module 7 (LAB) - Nonlocal variables and reducers
Module 8 - Practical issues in parallelization
Lack of parallelism | Loop-carried dependencies | Grain size | Burdened
parallelism | Memory bandwidth | Nondeterminism | Legacy threading
Module 9 (LAB) - Overcoming parallelization hurdles
Module 10 - Multicore Jeopardy!