Architecture of High Performance Computers

Course: COL718
Semester I, 2021-22
Credits: 4 (3-0-2)



Instructor: Dr. Smruti R. Sarangi

Lectures: [Online on YouTube] We will use Teams, Impartus, Acadly and Moodle.
Slot D: Tuesday, Wednesday, Friday: 9 to 10 AM
Course Description: This course will give an introduction to designing and programming high performance processors.

Evaluation
  1. Assignments (10% + 15% + 20%)
  2. Minor (25%)
  3. Major (30%)

Teaching Assistants
  1. Akshin Singh

Piazza:  piazza.com/iit_delhi/fall2021/col718

Textbook:
Main Textbook: Advanced Computer Architecture (homepage of the book)
Background on processors and caches: Basic Computer Architecture (link)


Date
Lecture
Slides
References
Week 1

Pipelining
Chapter 2: Out-of-order pipelines
Basic Computer Architecture
Chapter 9 of Basic Computer Architecture: All four parts
Chapter 2 of Advanced Computer Architecture
1. Summary of in-order pipelining [YouTube]
Week 2
Out-of-order pipelines

2. Motivation for out-of-order pipelining [YouTube]
3. Register renaming and precise exceptions [YouTube]
Week 3
Fetch logic
Chapter 3: The fetch and decode stages
Chapter 3 of Advanced Computer Architecture
1. Fetch logic and predicting if an instruction is a branch or not. [YouTube]
2. Branch prediction [YouTube]
3. Decode stage [YouTube]
Class notes: [pdf]
Week 3
Interaction between the architecture and the OS

Notes: [pdf]
Week 4
Issue, Execute, and Commit Stages
Chapter 4: Issue, execute, commit
Chapter 4:
1. Instruction renaming [YouTube]
2. Wakeup, select, and broadcast [YouTube]
3. Load store queue [YouTube]
Week 5
Alternative approaches to Issue and Commit
Chapter 5: Alternative approaches to issue and commit

4. Instruction commit [YouTube]
Chapter 5:
1. Aggressive Speculation [YouTube]
2. Replay schemes [YouTube]
Week 6
Compiler based techniques and caches
Chapter 7: Caches
Chapter 5:
3. Compiler based techniques [YouTube]
Chapter 7:
1. Overview of caches [YouTube]
2. Cache optimizations and virtual memory [YouTube]
3. SRAM and CAM arrays [YouTube]
Week 7
Rest of caches

4. Cacti tool, Elmore delay [YouTube]
5. Advanced cache optimizations [YouTube]
6. Trace caches, instruction, and data prefetching [YouTube]
Week 8
Multicore Systems
Chapter 9: Multicore Systems
Chapter 9:
1. Parallel programming and hardware threads [YouTube]
2. Theoretical foundations: overview of coherence and consistency [YouTube]
A. Introduction to OpenMP (pdf)  [self-study mode]
Week 9
Consistency

3. Sequential consistency, PLSC, and coherence [YouTube]
4. Execution witnesses, access graphs, causal graphs [YouTube]
Week 10
Multicore Systems

5. Cache coherence: snoopy and directory protocols [YouTube]
6. Advanced directory protocols and atomic operations [YouTube]
7. Memory models and data races [YouTube]
Week 11
Security
Chapter 13: Security
1. Cryptographic fundamentals and encryption [YouTube]
2. Hashing and secure processors [YouTube]