Architecture of High Performance Computers

Course: COL718
Semester I, 2020-21
Credits: 4 (3-0-2)



Instructor: Dr. Smruti R. Sarangi

Lectures: [Online on YouTube] We will use Teams, Impartus, Acadly, and Moodle.
Slot D: Tuesday, Wednesday, Friday, 9 to 10 AM
Course Description: This course introduces the design and programming of high-performance processors.

Evaluation
  1. Assignments (10% + 15% + 20%)
  2. Minor (25%)
  3. Major (30%)

Teaching Assistants
  1. Shubhankar Suman Singh


Textbook:
Main Textbook: Advanced Computer Architecture (to be published by McGrawHill, 2021). The link will be mailed to the registrants.
Background on processors and caches: Computer Organisation and Architecture, Smruti R. Sarangi, McGrawHill India (link to buy). Slides and videos (link).


S. No.
Date
Lecture
Slides
References
1
Week 1
Sept 29 - Oct 6
Pipelining
Chapter 2: Out-of-order pipelines
Computer Organisation and Architecture
Chapter 9 of Computer Organisation and Architecture: all four parts
Chapter 2 of Advanced Computer Architecture
1. Summary of in-order pipelining [YouTube]
2
Oct 7 - 11
Out-of-order pipelines

2. Motivation for out-of-order pipelining [YouTube]
3. Register renaming and precise exceptions [YouTube]
3
Oct 7 - 14
Fetch logic
Chapter 3: The fetch and decode stages
Chapter 3 of Advanced Computer Architecture
1. Fetch logic and predicting if an instruction is a branch or not. [YouTube]
2. Branch prediction [YouTube]
3. Decode stage [YouTube]
Class notes: [pdf]
4.
Oct 16
Interaction between the architecture and the OS

Notes: [pdf]
5.
Oct 20-25
Issue, Execute, and Commit Stages
Chapter 4: Issue, execute, commit
Chapter 4:
1. Instruction renaming [YouTube]
2. Wakeup, select, and broadcast [YouTube]
6.
Oct 27-31
--

3. Load store queue [YouTube]
4. Instruction commit [YouTube]
7.
Nov 2 - 7
Alternative approaches to Issue and Commit
Chapter 5: Alternative approaches to issue and commit

Chapter 5:
1. Aggressive Speculation [YouTube]
2. Replay schemes [YouTube]
8.
Nov 12-13
Compiler based techniques

Chapter 5:
3. Compiler based techniques [YouTube]
9.
Nov 16-27
Background of caches
Chapter 7: Caches
Chapter 7:
1. Overview of caches [YouTube]
2. Cache optimizations and virtual memory [YouTube]
3. SRAM and CAM arrays [YouTube]
4. Cacti tool, Elmore delay [YouTube]
10.
Dec 13-22
Network-on-chip
Chapter 8: NoC
5. Advanced cache optimizations [YouTube]
6. Trace caches, instruction, and data prefetching [YouTube]
Chapter 8:
Already shared on Dropbox
11.
Dec 23 - Jan 4
Multicore Systems
Chapter 9: Multicore Systems
Chapter 9:
1. Parallel programming and hardware threads [YouTube]
2. Theoretical foundations: overview of coherence and consistency [YouTube]
3. Sequential consistency, PLSC, and coherence [YouTube]
4. Execution witnesses, access graphs, causal graphs [YouTube]
5. Cache coherence: snoopy and directory protocols [YouTube]
6. Advanced directory protocols and atomic operations [YouTube]
7. Memory models and data races [YouTube]