CdS (degree programme): INFORMATICA E NETWORKING
|STRUMENTI DI PROGRAMMAZIONE PER SISTEMI PARALLELI E DISTRIBUITI (Programming Tools for Parallel and Distributed Systems)|INF/01|LEZIONI (lectures)|48|
The course deals with the design, evaluation and use of programming tools and environments for parallel and distributed applications. MPI, Threading Building Blocks and OpenCL are used as examples of programming tools addressing diverse kinds of architectural parallelism. oneAPI is presented as a unifying technology that aims at expressing parallelism over several distinct architectural layers.
The programming paradigms and the related cost models can be applied to achieve high performance and parallel efficiency on several types of systems, exploiting parallelism at diverse levels/scales in order to address:
- high-performance stream-parallel and data-parallel computations, distributed shared memory systems
- unifying methodologies and programming environments for parallel/distributed computing
- parallel systems with hierarchical/multilevel architecture
- adaptive and context-aware programming, event-based programming, fault-tolerance strategies for high-performance computing
For these paradigms, static and dynamic tools are defined and their performance is evaluated through case studies in experimental and laboratory activities. Tools for experiment management and application scripting are also discussed. Several of the case studies involve the parallelization of mining/KDD/data analysis algorithms.
Student knowledge is evaluated during the course through
- the hands-on activities in lab time,
- the exercises done at home,
- interaction during the lessons,
and after the course through
- evaluation of the project code and of the project report, and the final oral exam.
The student will achieve
- acquaintance with at least three different parallel/distributed programming environments, covering both shared-memory and distributed-memory systems (typically MPI, Threading Building Blocks, OpenCL and oneAPI)
- practical experience of applying analytical behavioural models for parallel patterns and full programs with respect to performance, reliability, memory/power efficiency
- practical experience of the full cycle of problem analysis / parallel solution modeling and design / model-driven implementation / cross evaluation of implementation and analytical models via benchmarking and test results
- critical and empirical reasoning when evaluating parallel programs to guide design and coding choices
- Hands-on activities during lab-time
- "Homework" exercises
- Through the final project: the project and the written report (describing the work performed, the testing activities and the evaluation) are discussed with the student as part of the final examination, focusing on the "a priori" modeling and implementation choices made by the student and on their ability to evaluate them "a posteriori" based on empirical results.
The course requires at least good proficiency in C and C++ programming in order to exploit the programming frameworks presented.
The course requires some previous knowledge of parallel and distributed computing system architecture (shared and distributed memory parallel systems, multiprocessors and multi-core processors), of structured parallel programming / behavioural skeletons and the associated basic analytical models, at least with respect to performance.
- attending lectures
- studying reference texts/papers
- attending lab time with hands-on activities
- personal study and coding experience with the programming tools presented
Lesson Attendance: Not mandatory
Slide-based lessons are complemented by classical blackboard presentation of auxiliary and additional material where needed.
- Lab-time with hands-on experience
Hands-on Lab time with assigned tasks and support from the teacher
Tasks are usually individual and carried out on the students' own devices, possibly connecting remotely to specific parallel/distributed systems
- Course site used for the distribution of studying material
Slides, papers, book references and the texts of the exercises for the practical sessions are made available on the dokuwiki page maintained by the Department.
- A final project is mandatory.
All teaching material is in English.
Tools and environments for parallel, high performance
- Message-passing programming
- Shared-memory, thread-based programming
- Shared-memory, stream-oriented multicore programming
Applications to case studies of
- high-performance stream- and data-parallel computations,
- distributed shared memory algorithms,
- adaptive and context-aware programming,
- high-performance event-based programming,
- programming of fault-tolerance strategies,
- run-time supports of languages/frameworks
- B. Wilkinson, M. Allen – Parallel Programming, 2nd edition. 2005, Prentice-Hall.
- Michael McCool, Arch D. Robison and James Reinders – Structured Parallel Programming: Patterns for Efficient Computation, 2012, Morgan Kaufmann.
- Lesson slides, papers, exercises -- made available via the Department's dokuwiki course official page.
- The MPI official standard, version 3.0 (as reference)
- James Reinders – Intel Threading Building Blocks, 2007, O'Reilly Media.
- M. Voss, R. Asenjo, J. Reinders – Pro TBB, with the book code samples ported to oneAPI (open-access book on Springer)
- J. Reinders et al. - Data Parallel C++ (Open access book on Springer)
- Slides/notes from the teacher (via the course web page).
The course web page lists slides and additional sources in the "course journal" sub-page.
Please contact the teacher, at least by email, when preparing for the course, in order to
- receive announcements and additional material that is occasionally sent by email
- obtain login credentials for the systems that are made available for homework and project work
- define the goal and tools for your personal homework
Students are also advised to contact the teacher to book a question-time meeting and discuss any issue they may experience.
- Project work comprising algorithm design, coding, testing and evaluating the code
- Final written examination (written report on project work and evaluation)
- Final oral exam (including project discussion)