Content
- Overview
- Branch Prediction
1. Static Branch Prediction 2. Dynamic Branch Prediction 1. Simple Dynamic Branch Prediction 2. 2-Bit Branch Prediction 3. Implementation Considerations 4. Branch History Table (BHT) 5. 2-Level Branch Predictor 6. Branch Target Buffer (BTB) 7. Combining BHT & BTB
- Interrupts & Exceptions
1. Asynchronous Interrupts 1. Interrupt Handler 2. Synchronous Interrupts 1. Exception Handling 3. Speculating on Exceptions
Overview
- Why instructions not dispatched every cycle?
- Full bypath expensive to implement
- Infrequently used paths may increase cycle time
- Loads have 2-cycle latency
- Conditional branches may cause bubbles
- Full bypath expensive to implement
- Branches & Jumps |Instruction|Taken Known|Target Known| |-----------|-----------|------------| |JAL|After decode|After decode| |JALR|After decode|After reg. fetch| |B|After exc.|After decode|
- Reduce control flow penalty
- S/W
- Eliminate branches: e.g. loop unrolling
- But increases run length
- Reduce resolution time: compute as soon as possible
- But branches often in critical path
- Eliminate branches: e.g. loop unrolling
- H/W
- Delay slots
- Requires software cooperation
- Speculate: branch prediction
- Delay slots
- S/W
Branch Prediction
- Required H/W support
- Prediction structure
- Branch history tables (BHT), branch target buffers (BTB)
- Mispredict recovery mechanism
- Keep result computation separate from commit
- Kill instructions following branch
- Restore state to that following branch
- Prediction structure
Static Branch Prediction
- ISA can attach semantics to branches to indicate preference of taken/not taken
- ISA can allow arbitrary choice of statically predicted direction
Dynamic Branch Prediction
- Temporal correlation
- The way a branch resolves -> good predictor of the next execution
- Spatial correlation
- Several branches may resolve in a highly correlated manner
Simple Dynamic Branch Prediction
Predict based on the immediate previous result.
Misses for loops & nested loops.
2-Bit Branch Prediction
Predict based on the 2 previous results of the same branch.
Implementation Considerations
Ideally, branch predictor exists for every branch.
- Challenges
- Branches may occur anywhere, impossible to have 2^32 predictors
- Dynamically mapping too slow
- Solution
- Map lower k bits of instruction address to a H/W table
Branch History Table (BHT)
- Updated when branch resolves in EX stage
- Limitations
- Cannot redirect fetch stream until target calculated
2-Level Branch Predictor
Use history register H recording the last N branches executed by the processor to exploit spatial correlation.
Branch Target Buffer (BTB)
- Use it only for control instructions (branch & jump)
- Only taken branches and jumps are held
- Others always PC + 4
- Next PC can be determined before the instruction is fetched & decoded
- Updated when branch resolves in EX stage
Combining BHT & BTB
Interrupts & Exceptions
- Interrupts: external event; asynchronous
- I/O device service request - Timer expiration - Power disruptions, hardware failure
- Exceptions: internal event; synchronous
- Undefined opcode, privileged instructions - Arithmetic overflow, FPU exception - Misaligned memory access - Virtual memory exceptions (page fault, TLB misses, protection violations) - System calls
Asynchronous Interrupts
- An I/O device requests attention by asserting one of the prioritized interrupt request lines
- Processor stops instructions at Ii, completing all previous instructions - Saves PC of Ii in special register (EPC) - Disables interrupts, transfer control to designated interrupt handler running in kernel mode
Interrupt Handler
- Saves EPC before enabling interrupts (nested interrupts)
- Need an instruction to move EPC to GPRs - Need to mask further interrupts until EPC moved
- Read interrupt cause from status register
- Use special indirect jump instruction RFE (return from exception)
- Enables interrupts - Restores processor to user mode - Restores H/W status & control state
Synchronous Interrupts
- Cause by particular instructions
- The instruction cannot be completed and needs to be restarted after the exception has been handled
- System call traps are considered to be completed
Exception Handling
- Hold exception flags in pipeline until commit point
- Exceptions in earlier pipeline stages override later ones for a given instruction
- Inject external interrupts at commit point (override others)
- If exception at commit:
- Update __Cause__ and __EPC registers__ - Kill all stages - Inject handler PC into fetch stage
Speculating on Exceptions
- Prediction
- No prediction required, since rare
- Check prediction
- Exceptions detected at end of instruction pipeline - Special H/W for various exception types
- Recovery
- Only write architectural states at commit point - Can throw away partially executed instructions - Launch exception handler after flushing