Welcome!
This is a course on high performance computing with an emphasis on scientific computing. It is designed for senior undergraduate students in computer science as well as graduate students interested in computational science and engineering.
Materials for this course were created and tested (i.e., taught!) by Edmond Chow at Georgia Tech. These materials are continually being revised. Check back often for improvements and additions!
Presentations (Graduate level)
- Forms of parallelism and POSIX/C++11 threads
- OpenMP - intro
- Accessing the clusters and git
- OpenMP - advanced
- Application: Molecular simulation and cell lists
- In-class discussion
- Intel compilers and multithreading environment
- SIMD vectorization
- Thread affinity and NUMA
- Tasking and Cilk Plus
- False sharing
- Notes on Intel MKL (Math Kernel Library)
- Notes on performance analysis
- MPI basics
- Application: iterative PDE solver
- Sparse matrix data structures for HPC
- Graph and hypergraph partitioning for parallel computing
- MPI remote memory access (one-sided)
- MPI shared memory programming
- Introduction to KNL (Intel Knights Landing)
Presentations (Undergraduate level)
- Introduction, roofline model
- Code optimization: cache, BLAS, SIMD
- OpenMP shared memory programming
- Measuring and reporting performance
- Interconnects
- MPI and algorithms for collective communication
- Distributed DGEMV and DGEMM algorithms
- Partitioning a problem for parallel computing
- Intel Xeon Phi
- NUMA and thread affinity
- Parallel data analysis: Netflix
- Parallel particle simulation
- Parallel quantum chemistry
- Review
Worksheets (Undergraduate level)
Assignments (Undergraduate level)
Acknowledgments
Georgia Tech is grateful to the Intel University Program Office for a grant providing hardware used in this course. Additional support was generously provided by an Intel Parallel Computing Center grant at Georgia Tech. For information about the center's activities, see the center's website.