Next-Gen Computer Architecture: Till the End of Silicon

Version 1.0 of the book
Publisher: McGrawHill


Buy version 1.0

1. Amazon India 
2. Flipkart India
3. McGrawHill Express library

Cite the book

Version 2.0.
Publisher: White Falcon


Cover of version 2

Download the pdf of version 2.0 of the book for free (CC-BY-ND 4.0 license).

New name: Next-Gen Computer Architecture: Till the End of Silicon
[The print version and the e-book are available worldwide via Amazon: Amazon India, Amazon US, ...]

Table of contents

Join the Google group to discuss concepts, share doubts, and get tips from other computer architecture enthusiasts.

First part of this two-book series
Basic Computer Architecture

old book


Chapter 1: Introduction
Chapter 2: Out-of-order Pipelines
Chapter 3: The Fetch and Decode Stages
Chapter 4: The Issue, Execute, and Commit Stages
Chapter 5: Alternative Approaches to Issue and Commit
Chapter 6: Graphics Processors
Chapter 7: Caches
Chapter 8: The On-Chip Network
Chapter 9: Multicore Systems
Chapter 10: Main Memory
Chapter 11: Power and Temperature
Chapter 12: Reliability
Chapter 13: Secure Processor Architectures
Chapter 14: Architectures for Machine Learning
Appendices: ISA, Tejas simulator, Intel, AMD, and Qualcomm processors


Slides and YouTube Videos (CC-BY 4.0 license)

YouTube Videos (click the YouTube link)
Chapter 1: Introduction
1. Introduction YouTube link
Chapter 2: Out-of-order pipelines
1. Summary of in-order pipelining YouTube link
2. Motivation for out-of-order pipelining YouTube link
3. Register renaming and precise exceptions  YouTube link
Chapter 3: The fetch and decode stages
1. Fetch logic YouTube link
2. Branch prediction YouTube link
3. Decode stage YouTube link

Notes on operating systems: [pdf]
Chapter 4: Issue, execute, commit
1. Instruction renaming YouTube link
2. Wakeup, select, and broadcast YouTube link
3. Load store queue YouTube link
4. Instruction commit YouTube link
Chapter 5: Alternative approaches to issue and commit
1. Aggressive speculation YouTube link
2. Replay schemes YouTube link 
3. Compiler based techniques YouTube link 
4. VLIW and EPIC processors YouTube link
Chapter 6: Graphics Processors
1. Traditional graphics pipeline YouTube link
2. The CUDA programming language YouTube link
3. Design of GPGPUs YouTube link
Chapter 7: Caches
1. Overview of caches YouTube link 
2. Cache optimizations and virtual memory YouTube link 
3. SRAM and CAM arrays YouTube link 
4. Cacti tool, Elmore delay YouTube link
5. Advanced cache optimizations YouTube link 
6. Trace caches, instruction, and data prefetching YouTube link
Chapter 8: NoC

1. Network topologies and basic concepts YouTube link
2. Flow control and flit/packet level switching YouTube link
3. Routing Algorithms and 5-Stage Router Pipeline YouTube link
4. Arbiters and Allocators, Pipeline Optimization YouTube link
5. Non-Uniform Caches, Synthetic Traffic YouTube link

Chapter 9: Multicore Systems
1. Parallel programming and hardware threads YouTube link
2. Theoretical foundations YouTube link
3. Sequential consistency, PLSC, and coherence YouTube link 
4. Execution witnesses, access graphs, causal graphs YouTube link
5. Cache coherence: snoopy and directory protocols YouTube link 
6. Advanced directory protocols and atomic operations  YouTube link
7. Memory models and data races YouTube link 
8. Methods to detect races YouTube link
9. Transactional memory YouTube link
Chapter 10: Main Memory

1. DRAM devices and arrays YouTube link
2. Synchronous and asynchronous transfer protocols YouTube link
3. DDR4 states and timing YouTube link
4. Flash and FeRAMs YouTube link
5. MRAMs, PCM, ReRAMs and the Roofline model YouTube link

Chapter 11: Power and Temperature

1. Dynamic and leakage power YouTube link
2. Temperature modeling YouTube link
3. Methods to manage power and temperature YouTube link

Chapter 12: Reliability
1. Soft errors and inductive noise YouTube link
2. Non-determinism and design faults YouTube link
3. Process variation, ageing and hard errors YouTube link
Chapter 13: Secure Architectures

1. Cryptographic fundamentals and encryption YouTube link
2. Hashing and secure processors YouTube link
3. Side channel attacks and oblivious RAM YouTube link

Chapter 14: Architectures for ML

1. Basic ML Concepts YouTube link
2. Mathematical representation of CNN computations, stationarity YouTube link
3. Hardware architectures for 1D/2D convolution YouTube link
4. Optimizations and memory systems YouTube link


Tejas Architecture Simulator -- can be used to simulate the behaviour of simple and complex multicore
processors, including their pipelines, memory hierarchies, and NoCs. The simulator can also run in parallel,
simulate GPUs, and simulate energy consumption. The latest version supports the ARM and RISC-V
ISAs. Written by the SRISHTI group.

Cite the book

author = {Smruti R. Sarangi},
title = {Next-Gen Computer Architecture},
date = {October 2023},
edition = {1st edition},
publisher={White Falcon},
isbn={8119510143} }