Content

  1. Instruction Scheduling
    1. H/W Resources
    2. Data Dependencies
  2. Software Pipelining
    1. Example
    2. Modulo Scheduling

Instruction Scheduling

  • Motivation
    • RISC architecture philosophy: expose H/W features to compilers for optimizations
  • Scheduling Constraints
    • H/W resources
    • Data dependencies
    • Control dependencies

H/W Resources

  • Conflict between register allocation & instruction scheduling
    • Register allocation: minimize # of registers
    • Instruction scheduling: reduce stalls by using more registers
  • Cache
    • e.g. loop reordering, cache blocking

Data Dependencies

Directed acyclic graph with edges indicating dependency. Find topological sort to produce better sequence.

  • Heuristics
    • Instructions that can run to completion without interference before those with interference
    • Instructions with more dependents before those with less dependents
    • Latency-weighted path

Software Pipelining

  • Hiding instruction latency
    • Increase distance between dependent operations by spreading them over more iterations; latency can be absorbed by other non-dependent operations

  • Often used in combination with loop unrolling
    • Variable latency of instructions?
    • Unroll multiple times, loop body huge; bad for cache

Example

for i = 1 to bignumber
  A(i) ; 3 cycle latency
  B(i) ; 3
  C(i) ; 12(perhaps a floating point operation)
  D(i) ; 3
  E(i) ; 3
  F(i) ; 3
end

->

prologue
for i = 1 to (bignumber - 6)
  A(i+6)
  B(i+5)
  C(i+4)
  D(i+2) ; note that we skip i+3
  E(i+1)
  F(i)
end
epilogue

Distance of 12 between C(i) and D(i).

; loop prologue (arranged on lines for clarity)
A(1)
A(2), B(1)
A(3), B(2), C(1)
A(4), B(3), C(2)
A(5), B(4), C(3), D(1)
A(6), B(5), C(4), D(2), E(1)

; loop epilogue (arranged on lines for clarity)
B(bignumber), C(bignumber-1), D(bignumber-3), E(bignumber-4), F(bignumber-5)
C(bignumber), D(bignumber-2), E(bignumber-3), F(bignumber-4)
D(bignumber-1), E(bignumber-2), F(bignumber-3)
D(bignumber), E(bignumber-1), F(bignumber-2)
E(bignumber), F(bignumber-1)
F(bignumber)

Modulo Scheduling

Interleaving instructions of an inner loop to achieve better parallelism of software pipelining.

Repeatedly schedule body of loop with different initiation intervals until perfect schedule reached.

References

results matching ""

    No results matching ""