Parallel and Distributed Programming 2018
Kenjiro Taura

What's New (newest first)

Reload this page periodically. Dates in parentheses indicate when each item was posted.


Slides will be significantly updated this year.
  1. Introduction
  2. OpenMP
  3. CUDA
  4. Divide and Conquer
  5. SIMD Programming
  6. How to get nearly peak FLOPS (with CPU)
  7. What You Must Know about Memory, Caches, and Shared Memory
  8. Understanding GPU performance
  9. Neural Network Basics
  10. Math notes on VGG
  11. Analyzing Data Access of Algorithms and How to Make Them Cache-Friendly
  12. Understanding Task Scheduling Algorithms

Course Objectives

The main objectives of this course are to gain hands-on experience in parallel programming and a good understanding of how to solve problems in parallel and of what determines the performance of parallel programs.

Lecture Plan

I plan to cover the topics below, which range from a very gentle introduction to parallel programming, through the performance of parallel programs, to fundamental topics that may be difficult for first-time learners. Due to time constraints, we are unlikely to cover all of them fully.


Programming Exercises Week(s)

NEW Announcement: How to Get the Credit

Here are the requirements for getting the credit.
  1. You have participated in the classes often enough to generally follow what was covered (this is a prerequisite for understanding what you are asked to do in what follows). As you know, I have not been keeping track of attendance, but I am pretty sure I can tell if somebody who has almost never been in the class suddenly sends a report.
  2. Finish the SpMV (sparse matrix-vector multiplication) hands-on exercises we had during the class.
  3. Write and submit a final report (term paper).
    • abstract deadline: January 18th (Sat), 23:59.
    • final deadline: February 9th (Sat), 23:59.

    Both are to be submitted through ITC-LMS.

The abstract can be plain text, a paragraph or two describing what you plan to do for the final report (detailed instructions will be announced later).

The final report must be a logical, consistent, and sufficiently self-contained document, submitted as a PDF file. Its topic can be chosen from the following.


Parallel Programming in Practice

Taxonomy of parallel machines and programming models

Understanding performance of parallel programs (and achieving high performance)

Fundamental Topics (as time permits)

Links and References