CO C4 The Processor
RECALL
Problems with the single-cycle implementation
- Longest delay determines the clock period
- Critical path: load instruction
- Instruction memory – register file – ALU – data memory – register file
- Not feasible to vary the clock period for different instructions
- Violates the design principle: make the common case fast.
Performance can be improved by pipelining.
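As a rough illustration of why the load instruction sets the clock period, the sketch below sums the delays of the units each instruction class passes through. The unit delays (in picoseconds) and the simplified paths are assumed values chosen for illustration; they are not taken from these notes.

```python
# Assumed, illustrative unit delays in picoseconds (not from the notes).
DELAY_PS = {
    "imem": 200,       # instruction memory
    "reg_read": 100,   # register file read
    "alu": 200,        # ALU operation
    "dmem": 200,       # data memory access
    "reg_write": 100,  # register file write
}

# Simplified list of units each instruction class uses, in order.
PATHS = {
    "load":   ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "store":  ["imem", "reg_read", "alu", "dmem"],
    "r_type": ["imem", "reg_read", "alu", "reg_write"],
    "branch": ["imem", "reg_read", "alu"],
}


def path_delay(instr):
    """Total delay of the units one instruction class passes through."""
    return sum(DELAY_PS[unit] for unit in PATHS[instr])


def single_cycle_period():
    """In a single-cycle design, every instruction takes the longest path's time."""
    return max(path_delay(instr) for instr in PATHS)


if __name__ == "__main__":
    for instr in PATHS:
        print(instr, path_delay(instr), "ps")
    print("clock period:", single_cycle_period(), "ps")
```

Under these assumed numbers, an R-type instruction only needs 600 ps of logic but still occupies the full period dictated by the load.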
Pipelining
Pipelining is an implementation technique in which multiple instructions are overlapped in execution, each occupying a different stage at the same time. It is a key technique for making CPUs fast.
Why pipelining
- Does not improve the latency of an individual instruction
- Improves throughput (rather than individual execution time)
- Improves the utilization of resources (functional units)
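The overlap can be pictured with a small timeline sketch. It assumes the classic five stages IF, ID, EX, MEM, WB and an ideal pipeline with no hazards or stalls; the function names are hypothetical and used only for illustration.

```python
# Assumed five-stage pipeline; stage names are the conventional ones.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def total_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed to finish n instructions in an ideal (stall-free) pipeline."""
    return n_instructions + n_stages - 1


def timeline(n_instructions, stages=STAGES):
    """One row per instruction; entry c is the stage occupied in cycle c ('' if none)."""
    width = total_cycles(n_instructions, len(stages))
    rows = []
    for i in range(n_instructions):
        row = [""] * width
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows


if __name__ == "__main__":
    for i, row in enumerate(timeline(3)):
        print(f"instr {i}: " + " ".join(f"{cell:>3}" for cell in row))
```

Each instruction still spends five cycles in the pipeline (latency is unchanged), but once the pipeline is full one instruction completes per cycle, and every functional unit is busy in every cycle.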
Performance and Speedup
- Balanced stages make the speedup ideal: the ideal speedup equals the number of pipe stages.
- If the stages are not balanced, the speedup is less.
- The speedup is due to increased throughput. However, latency (the time for each instruction) does not decrease, and may even increase!
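The effect of stage balance can be checked with a short calculation. This is a sketch under stated assumptions: the stage delays are made-up picosecond values, the single-cycle period is taken as the sum of the stage delays, the pipelined period as the slowest stage, and pipeline register overhead and hazards are ignored.

```python
def pipeline_stats(stage_ps, n_instructions):
    """Compare a single-cycle design with an ideal pipeline built from the same stages."""
    k = len(stage_ps)
    single_period = sum(stage_ps)     # one long cycle covers all stages
    pipelined_period = max(stage_ps)  # the slowest stage sets the clock
    t_single = n_instructions * single_period
    t_pipelined = (k + n_instructions - 1) * pipelined_period
    return {
        "speedup": t_single / t_pipelined,
        "latency_single_ps": single_period,
        "latency_pipelined_ps": k * pipelined_period,
    }


if __name__ == "__main__":
    n = 1_000_000
    balanced = pipeline_stats([160, 160, 160, 160, 160], n)
    unbalanced = pipeline_stats([200, 100, 200, 200, 100], n)
    print("balanced:  ", balanced)
    print("unbalanced:", unbalanced)
```

With these assumed delays, the balanced pipeline approaches a speedup of 5 (the number of stages), while the unbalanced one approaches only 4, and its per-instruction latency grows from 800 ps to 1000 ps.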
Pipelining and ISA Design
Instruction set design affects complexity of pipeline implementation.
The RISC-V ISA is designed for pipelining.
- All instructions have a fixed length of 32 bits
- Easier to fetch and decode in one cycle
- cf. x86: 1- to 17-byte instructions
- Few and regular instruction formats
- Can decode and read registers in one step
- Load/store addressing
- Can calculate the address in the 3rd stage and access memory in the 4th stage
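The regularity of the formats can be seen by slicing fields out of a 32-bit instruction word. The sketch below uses the field positions of the RISC-V base formats (opcode in bits 6:0, rd in 11:7, funct3 in 14:12, rs1 in 19:15, rs2 in 24:20, funct7 in 31:25); the function name is hypothetical. Because the register fields sit at the same positions in every format that has them, the register file can be read while the rest of the instruction is still being decoded.

```python
def decode_fields(word):
    """Slice the fixed-position fields out of a 32-bit RISC-V instruction word.

    All fields are extracted unconditionally; which ones are meaningful
    depends on the instruction format selected by the opcode.
    """
    return {
        "opcode": word & 0x7F,          # bits 6:0
        "rd":     (word >> 7) & 0x1F,   # bits 11:7
        "funct3": (word >> 12) & 0x7,   # bits 14:12
        "rs1":    (word >> 15) & 0x1F,  # bits 19:15
        "rs2":    (word >> 20) & 0x1F,  # bits 24:20
        "funct7": (word >> 25) & 0x7F,  # bits 31:25
    }


if __name__ == "__main__":
    add_word = 0x002081B3  # add x3, x1, x2 (R-type)
    lw_word = 0x0080A283   # lw x5, 8(x1)   (I-type)
    print(decode_fields(add_word))
    print(decode_fields(lw_word))
```

For the `lw` word, the upper 12 bits hold the immediate rather than funct7/rs2, but `rs1` and `rd` are found in exactly the same place as in the `add` word.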
Pipelined datapath
Right-to-left flow leads to hazards.
- To keep the data of different instructions from mixing, we need pipeline registers (latches) between stages to hold the information produced in the previous cycle.
- However, a pipeline cannot be divided into arbitrarily fine (shorter in time) stages: some computations do not split any further, and the latches are not free, costing both area and time delay.
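The cost of the latches can be put into a simple model. As an assumption for illustration only, suppose the total logic delay can be split evenly into k stages and each pipeline register adds a fixed delay; the numbers below (800 ps of logic, 20 ps per register) are made up.

```python
def pipelined_period(logic_ps, n_stages, latch_ps):
    """Clock period when logic is split evenly and each stage ends in a latch."""
    return logic_ps / n_stages + latch_ps


def speedup(logic_ps, n_stages, latch_ps):
    """Steady-state speedup over the unpipelined design (period = logic_ps)."""
    return logic_ps / pipelined_period(logic_ps, n_stages, latch_ps)


if __name__ == "__main__":
    for k in (5, 10, 40, 400):
        print(k, "stages ->", round(speedup(800, k, 20), 2))
```

In this model the speedup can never exceed logic_ps / latch_ps (here 40) no matter how many stages are added, which is one way to see why ever deeper pipelines give diminishing returns.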