Content

1. Overview
2. Paged Memory System
    1. Private Address Space
3. Demand Paging
    1. Page Fault Handling
4. Page Tables
    1. Linear Page Table
    2. Hierarchical Page Table
    3. Translation Lookaside Buffer (TLB)
        1. TLB Designs
5. Address Translation
    1. TLB Miss Handling
        1. Software
        2. Hardware
        3. Page Table Walk
    2. Performance
        1. Physical Addressed Cache vs. Virtual Addressed Cache
        2. Virtual-Index Physical-Tag Caches
6. Historical Use

Overview

Virtualization
- Assume having access to infinite memory address space
Relocation
- Assume running at fixed memory addresses
Protection
- Assume being the only running program

Paged Memory System

Address -> | page number | offset |
Page size = 2^(number of offset bits) bytes
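The split above can be sketched in a few lines. This is a minimal illustration, assuming 12 offset bits (4 KiB pages); the names `PAGE_OFFSET_BITS`, `split_address` and `page_size` are made up for this example.

```python
PAGE_OFFSET_BITS = 12  # assumption for this sketch: 4 KiB pages


def split_address(vaddr, offset_bits=PAGE_OFFSET_BITS):
    """Split an address into (page number, offset within the page)."""
    page_number = vaddr >> offset_bits            # upper bits
    offset = vaddr & ((1 << offset_bits) - 1)     # lower offset_bits bits
    return page_number, offset


def page_size(offset_bits=PAGE_OFFSET_BITS):
    """Page size in bytes: 2 to the power of the number of offset bits."""
    return 1 << offset_bits


vpn, off = split_address(0x12345678)
print(hex(vpn), hex(off), page_size())  # 0x12345 0x678 4096
```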

Needs to be fast and space efficient.

Private Address Space

Demand Paging

Virtual memory requirements can exceed physical memory; pages that do not fit can be kept on hard disk and brought into memory on demand.

Page Fault Handling

Exception -> OS takes over
Create new page OR locate page on disk
Copy content to memory
- Swap pages to swap space on disk if no more physical space
- Thrashing: excessive swapping e.g. working set > memory
Update page tables
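The steps above can be sketched as a toy simulation. This is not how any particular OS implements it: the class `PagedMemory`, its fields, and the FIFO choice of victim page are all assumptions made for illustration.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy model: a fixed number of physical frames, with swap space on disk."""

    def __init__(self, num_frames):
        self.resident = OrderedDict()   # page table: vpn -> frame, in load order
        self.swap = set()               # vpns currently held in swap space
        self.free_frames = list(range(num_frames))
        self.faults = 0

    def access(self, vpn):
        if vpn in self.resident:        # page is in memory: no fault
            return self.resident[vpn]
        return self.handle_page_fault(vpn)

    def handle_page_fault(self, vpn):
        self.faults += 1
        if not self.free_frames:
            # No physical space left: swap the oldest page out (FIFO, assumed).
            victim, frame = self.resident.popitem(last=False)
            self.swap.add(victim)
            self.free_frames.append(frame)
        frame = self.free_frames.pop()
        # The page is either new or read back from swap; drop it from swap here.
        self.swap.discard(vpn)
        self.resident[vpn] = frame      # update the page table
        return frame


# Working set (3 pages) larger than memory (2 frames): every access faults.
mem = PagedMemory(num_frames=2)
for vpn in [0, 1, 2] * 3:
    mem.access(vpn)
print(mem.faults)  # 9
```

With a working set that fits (for example pages 0 and 1 only), the same model faults just twice, which is the contrast the thrashing bullet describes.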

Transferring a page takes a long time -> page faults are handled entirely by the OS, running in untranslated addressing mode. Other jobs can run on the CPU while the faulting job waits.

Page Tables

Too large to keep in registers -> keep in memory. Each request then needs 2 memory accesses: one to read the page table entry (page base), one to access the data at base + offset.

Linear Page Table

Page table entry (PTE)
1. Valid bit
2. Physical page number (PPN)
3. Disk page number (DPN)
4. Status bit (protection & usage)
Limitations
- Too big
- Larger pages?
  - Increased internal fragmentation
  - Increased page fault penalty
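The "too big" limitation can be made concrete with a quick calculation. This is a sketch under assumed parameters (32-bit virtual addresses, 4-byte PTEs); the function name `linear_page_table_bytes` is invented for this example.

```python
def linear_page_table_bytes(vaddr_bits, page_offset_bits, pte_bytes):
    """Size of a linear page table: one PTE per virtual page."""
    num_entries = 1 << (vaddr_bits - page_offset_bits)
    return num_entries * pte_bytes


# Assumed: 32-bit addresses, 4 KiB pages, 4-byte PTEs.
print(linear_page_table_bytes(32, 12, 4) // 2**20, "MiB")  # 4 MiB

# Larger pages (64 KiB) shrink the table, at the cost listed above.
print(linear_page_table_bytes(32, 16, 4) // 2**10, "KiB")  # 256 KiB
```

The table size is per address space, so the cost multiplies with the number of processes.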

Hierarchical Page Table

Translation Lookaside Buffer (TLB)

TLB hit: single-cycle
TLB miss: page table walk

TLB Designs

Typically 32-128 entries, fully-associative
- Each entry maps a large page -> less spatial locality across pages -> 2 entries are more likely to conflict
- Larger TLBs (256-512 entries) are sometimes 4-8 way set-associative
- Larger systems can have multi-level TLBs
Random or FIFO policy
Process information
- Process identifier: takes up space in every entry and may be of little use when running modern processes with large address spaces
TLB reach: size of virtual memory space that can be simultaneously mapped by TLB
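TLB reach follows directly from the entry count and the page size. A small sketch, with example numbers that are assumptions rather than figures from any specific processor:

```python
def tlb_reach_bytes(num_entries, page_size_bytes):
    """Virtual memory that can be mapped by the TLB at the same time."""
    return num_entries * page_size_bytes


KIB, MIB = 2**10, 2**20

# Assumed: 64 entries, each mapping a 4 KiB page.
print(tlb_reach_bytes(64, 4 * KIB) // KIB, "KiB")   # 256 KiB

# Same TLB with 2 MiB pages: larger pages extend the reach.
print(tlb_reach_bytes(64, 2 * MIB) // MIB, "MiB")   # 128 MiB
```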

Address Translation

TLB Miss Handling

Handling TLB miss: S/W or H/W
Handling page fault: restartable exception for S/W to resume
Handling protection fault: abort process or handle like a page fault

Software

TLB miss raises an exception; OS walks page table and reloads TLB
Privileged untranslated addressing mode

Hardware

Memory Management Unit walks page table and reloads TLB
If page not found, raise page fault exception
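The walk-and-reload sequence is the same whether the OS or the MMU performs it. The sketch below assumes a two-level table with a 10/10/12 bit split of a 32-bit address; nested dictionaries stand in for the in-memory tables, the TLB is an unbounded dictionary, and all names (`walk`, `translate`, `PageFault`) are hypothetical.

```python
class PageFault(Exception):
    """Raised when the walk finds no valid mapping."""


L1_BITS, L2_BITS, OFFSET_BITS = 10, 10, 12  # assumed 32-bit address split


def walk(root, vaddr):
    """Walk a two-level page table and return the physical page number."""
    l1 = (vaddr >> (OFFSET_BITS + L2_BITS)) & ((1 << L1_BITS) - 1)
    l2 = (vaddr >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    second_level = root.get(l1)
    if second_level is None or l2 not in second_level:
        raise PageFault(hex(vaddr))       # page not found
    return second_level[l2]


def translate(tlb, root, vaddr):
    """Translate vaddr; on a TLB miss, walk the table and reload the TLB."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if vpn not in tlb:                    # TLB miss
        tlb[vpn] = walk(root, vaddr)      # reload
    return (tlb[vpn] << OFFSET_BITS) | offset


root = {1: {2: 0x55}}                     # L1 index 1, L2 index 2 -> PPN 0x55
tlb = {}
print(hex(translate(tlb, root, 0x402ABC)))  # 0x55abc
print(tlb)                                   # {1026: 85}
```

A second access to the same page hits in `tlb` and skips the walk entirely.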

Page Table Walk

Performance

Additional latency of TLB
- Slow down clock
- Pipeline TLB and cache
- Virtual addressed cache
- Parallel TLB/cache access

Physical Addressed Cache vs. Virtual Addressed Cache

Physical addressed cache
Virtual addressed cache
- One-step process
- Need to flush on context switch or use address space identifiers (ASID) in tags
- Aliasing problem
  - 2 virtual addresses for the same physical data give 2 cached copies; a write to one copy is not visible through the other
  - Prevent aliases coexisting in cache OR shared pages must agree in index bits
- Cache coherence problem
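The aliasing condition can be checked numerically. This sketch assumes a 16 KiB direct-mapped, virtually indexed cache with 32-byte lines and 4 KiB pages; the constants and the helper names `cache_index` and `can_alias` are illustrative only.

```python
LINE_BYTES = 32
NUM_SETS = 512   # assumed: 16 KiB direct-mapped cache (512 lines of 32 B)


def cache_index(addr):
    """Set index selected by an address in the assumed cache."""
    return (addr // LINE_BYTES) % NUM_SETS


def can_alias(vaddr_a, vaddr_b):
    """True if two virtual names for one physical line use different sets,
    so both copies could coexist in the cache."""
    return cache_index(vaddr_a) != cache_index(vaddr_b)


# Same page offset (0x040), different virtual pages mapped to one physical page.
print(can_alias(0x1040, 0x2040))  # True: index bits differ
print(can_alias(0x1040, 0x5040))  # False: addresses agree in index bits
```

The second pair differs by a multiple of the cache size (16 KiB), which is what "shared pages must agree in index bits" amounts to in this configuration.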

Virtual-Index Physical-Tag Caches

Allows concurrent access to TLB and cache.
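Concurrent access works cleanly when the cache index is taken entirely from the page offset, since those bits are the same before and after translation. A minimal check, assuming that condition is the design goal; the function name and the example cache sizes are not from the notes.

```python
def vipt_index_within_page_offset(cache_bytes, ways, page_bytes):
    """True if index + line-offset bits fit inside the page offset,
    i.e. the bytes covered by one way do not exceed one page."""
    return cache_bytes // ways <= page_bytes


KIB = 2**10

# Assumed examples with 4 KiB pages.
print(vipt_index_within_page_offset(32 * KIB, 8, 4 * KIB))  # True
print(vipt_index_within_page_offset(64 * KIB, 4, 4 * KIB))  # False
```

When the check fails, some index bits come from the virtual page number, and the aliasing concerns of virtually addressed caches apply again.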

Historical Use

Bare machine with physical addresses only
- One program
Batch-style multiprogramming
- Multiple programs sharing CPU while waiting for I/O
- Base & bound: translation & protection between programs
- External fragmentation
Time sharing
- More interactive programs
- Motivated to move to fixed-size page translation & protection -> no external fragmentation, but internal fragmentation
- Motivated adoption of virtual memory
Virtual machine monitor
- Multiple OS sharing 1 machine
- Guest OS virtual -> Guest OS physical -> Host machine physical
Full demand-paged virtual memory
- Portability
- Protection
- Share small physical memory among active tasks
- Simplified implementation of some OS features
Vector supercomputers
- Translation & protection
- Rarely complete demand-paging
- Prevent thrashing
- Difficult to implement restartable vector instructions
Embedded systems & DSPs: physical address only
- Cannot afford area/speed/power of VM support
- No secondary storage to swap to
- Customised for specific memory configuration
- Difficult to implement restartable instructions for exposed architectures

Virtual Memory