CO C3 Arithmetic

Introduction

Computer words are composed of bits;
- words: represented as binary numbers
- there are 32bits/word or 64bits/word in RISC-V
- Contains four bytes for 32bits word
Simplified to contain only in course:
memory-reference instructions: lw, sw
arithmetic-logical instructions: add, sub, and, or, xor, slt…
control flow instructions: beq, bne, jal
Generic Implementation:
use the program counter (PC) to supply instruction address
get the instruction from memory
read registers
use the instruction to decide exactly what to do
All instructions use the ALU after reading the registers
Why? memory-reference? arithmetic? control flow?

Numbers

Number systems

Representation

Signed Number formats

Sign and magnitude
1’s complement
2’s complement

Two’s Biased notation invert all bits and add 1 with end.
$[X]_b＝2^n＋Ｘ$

types

sign extension

Expansion e.g8 bit numbers to 64/32 bit numbers;
Required for operations with registers(32/64 bits) and
immediate operands (8 bits)

lb,Load byte: Take the lower 8 bits as they are and copy the highest bit to the remaining 24/56 bits.

e.g.

0000 0010 $\rightarrow$ 0000 0000 0000 0000 0000 0000 0000 0010

1111 1110 $\rightarrow$ 1111 1111 1111 1111 1111 1111 1111 0010

lbu,Load byte unsigned: Take the lower 8 bits as they are and copy the do zero-extension to the remaining 24/56 bits.

e.g.

0000 0010 $\rightarrow$ 0000 0000 0000 0000 0000 0000 0000 0010

1111 1110 $\rightarrow$ 0000 0000 0000 0000 0000 0000 1111 1110

Arithmetic

Addition and Subtraction

Adding bit by bit and take carries to next digit. Simply use addition of 2’s complement for Subtraction.

Overflow/underflow The sum of two numbers can exceed any representation, may happens when positve + positive, positve - negative, negative - negative or negative - positive.

Jugde by the double sign bits

Operation Operand A Operand B Result overflow double sign bits

Reaction

An exception (interrupt) occurs

Control jumps to predefined address for exception

Interrupted address is saved for possible resumption

Different language treats overflow differently.

Signaling to application (Ada, Fortran)

Ignore, don’t always want to detect overflow©

Comparison If rs < rt, rd=1, else rd=0
slt/sltu rd,rs,rt

Process

Hardware detection in the ALU and Generation of an exception (interrupt)

Save the instruction address (not PC) in special register(byte addressing!) SEPC

Jump to specific routine in OS (take instruction in SEPC and take back in PC)

Correct & return to program

Return to program with error code

Abort program

Arithmetic for Multimedia

Graphics and media processing operates on vectors of 8-bit and 16-bit data
- Use 64-bit adder, with partitioned carry chain
- Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
- SIMD (single-instruction, multiple-data)
Saturating operations
overflow: result is largest representable value

c.f 2s-complement modulo arithmetic
e.g clipping in audio, saturation in video

Logical operations

slli x11,x9,3         #take x9,lshift 3 bytes 
					#and save to x11
and x3,x10,x16        # and(&) x10 and x16 and save to x3
or x4,x10,x16         # or(|) x10 and x16 and save to x4

Logical shift filled with ‘0’

slli/srli register1, register2, num

bit-wise AND between registers

and register1, register2, register3

bit-wise OR between registers

or register1, register2, register3

Constructing a simple ALU

A half adder

\begin{align} Sum = \bar{a}b + a\bar{b} \\ Carry = ab \end{align}

A full adder

\begin{align} Sum &= A\oplus B\oplus Carry_{in} \\ Carry_{out} &= (A + B)\cdot Carry_{in} + AB \end{align}

Extended 1 bit ALU

AND,OR, ADD

Substraction: Inverting b and set 1st CarryIn= 1.

A - B = A + \overline{B} + 1 = A + B' + Carry_{in}

Comparison: Subtraction (rs - rt), if the result is negative→ rs < rt Use of sign bit as indicator

Most significant bit  Set for comparison  Overflow detect
 Cell
Cascade Element

Basic 32 bit ALU

Inputs parallel
Carry is cascaded: Ripple carry adder (行波进位加法器)
- Slow but simple
- 1st Carry In = 0

Multiplication

Binary multiplication

\rm Product = Multiplicand × Multiplier

At current bit position

If multiplier is 1, then add multiplicand Else add 0
shift multiplicand left by 1 bit

Version 1

As there are at most $m+n$ bits when multiply $m$ bit and $n$ bit number.

Deawbacks: Very big, Too slow!

For a 64-bit ALU, the multiplicant should be-128 bit.
Requires 64 iterations :Addition, Shift and Comparison
Almost 200 cycles

Version 2

Least significant bits of the product don’t change, hence Shift the product and the multiplier instead of the multiplicand !

For a 64-bit ALU, it is reduced to 64 bits, only left half of product register is changed.

Addition performed only on left half of product register
Shift of product register

Version 3

The product register contains only 0 at the initial state. And the lower 64 bits are simply shifted out Therefore use these lower 64 bits for the multiplier.

Set product register to 0
Load lower bits of product register with multiplier
Test least significant bit of product register

Signed multiplication

Basic approach:
- Store the signs of the operands  Convert signed numbers to unsigned numbers (most significant bit (MSB) = 0) Perform multiplication  If sign bits of operands are equal sign bit = 0, else sign bit = 1
Improved method: Booth’s Algorithm
Idea: If you have a sequence of '1’s  subtract at first ‘1’ in multiplier  shift for the sequence of '1’s
 add where prior step had last '1‘

Result: Possibly less additions and more shifts  Faster, if shifts are faster than additions

last1 last2 Operations

1 0 subtract multiplicand from left

1 1 no arithmetic operation-shift

0 1 add multiplicand to left half

0 0 no arithmetic operation-shift

Advantages:
- keeps the leftmost bit constant
- no change of sign bit !

last1	last2	Operations
1	0	subtract multiplicand from left
1	1	no arithmetic operation-shift
0	1	add multiplicand to left half
0	0	no arithmetic operation-shift

Faster Multiplication

Uses multiple adders

RISC-V Multiplication

Division

Binary division

\rm Quotient = \lceil\frac {Dividend}{Divisor}\rceil + Remainder

Check for 0 divisor
Long division approach ₠ If divisor ≤ dividend bits ☉ 1 bit in quotient, subtract
₠ Otherwise
☉ 0 bit in quotient, bring down next dividend bit
Restoring division ₠ Do the subtract, and if remainder goes < 0, add divisor back
Signed division ₠ Divide using absolute values ₠ Adjust sign of quotient and remainder
as required

Version 1

Version 2

Reduction of Divisor and ALU width by half
Shifting of the remainder
Saving 1 iteration
Remainder register keeps quotient - No quotient register required

Faster Division

Can’t use parallel hardware as in multiplier as Subtraction is conditional on sign of remainder
Faster dividers (e.g. SRT division) generate multiple quotient bits per step but Still require multiple steps

RISC-V Division

Floating point arithmetic

Adding all parts to get an ALU

Computer Science > Computer Organization

#Computer Hardware

CO C3 Arithmetic

http://example.com/2023/03/06/CO-3/

Author

Tekhne Chen

Posted on

March 6, 2023

Licensed under

DB C2 Relational Model Previous

CO C0,1 Introduction Next