HW2 Instructions

Exercise 2.3

[5] <§§2.2, 2.3> For the following C statement, write the corresponding RISC-V assembly code. Assume that the variables f, g, h, i and j are assigned to registers x5, x6, x7, x28 and x29, respectively. Assume that the base address of the arrays A and B are in registers x10 and x11, respectively.

B[8]=A[ij];\rm B[8] = A[i−j];

1
2
3
4
5
sub x30,x28,x29
slli x30,x30,3
add x30,x30,x10
ld x9,0(x30)
sd x9,64(x11)

Exercise 2.4

[10] <§§2.2, 2.3> For the RISC-V assembly instructions below, what is the corresponding C statement?Assume that the variables f, g, h, i and j are assigned to registers x5, x6, x7, x28 and x29, respectively. Assume that the base address of the arrays A and B are in registers x10 and x11, respectively.

1
2
3
4
5
6
7
8
9
10
slli x30, x5, 3 		# x30 = f*8 
add x30, x10, x30 # x30 = &A[f]
slli x31, x6, 3 # x31 = g*8
add x31, x11, x31 # x31 = &B[g]
ld x5, 0(x30) # f = A[f]

addi x12, x30, 8 # x12 = &A[f] + 8 i.e. x12 = &A[f + 1]
ld x30, 0(x12) # x30 = A[f + 1]
add x30, x30, x5 # x30 = A[f + 1] + A[f]
sd x30, 0(x31) # B[g] = x30 i.e. B[g] = A[f] + A[f + 1]

Exercise 2.5

[5] <§2.3> Show how the value 0xabcdef12 would be arranged in memory of a little-endian and a big-endian machine. Assume the data are stored starting at address 0 and that the word size is 4 bytes.

little-endian big-endian
Address(Byte) Data Data
3 0xab 0x12
2 0xcd 0xef
1 0xef 0xcd
0 0x12 0xab

Exercise 2.11

Assume that x5 holds the value 128ten128_{ten} .

2.11.1

[5] <§2.4> For the instruction add x30, x5, x6, what is the range(s) of values for x6 that would result in overflow?

There will be an overflow if either x5 ++ x6 >2631> 2^{63} - 1 or x5 ++ x6 <263< -2^{63}. That’s to say,

  1. x6 >263129> 2^{63} - 129
  2. x6 <263128< -2^{63} - 128

Given that x6 is between 263- 2^{63} and 26312^{63} - 1, x6 (263129,2631]z\in (2^{63} - 129,2^{63} - 1] \cap \mathbb{z}.

2.11.2

[5] <§2.4> For the instruction sub x30, x5, x6, what is the range(s) of values for x6 that would result in overflow?

There will be an overflow if either x5 - x6 >2631> 2^{63} - 1 or x5 - x6 <263< -2^{63}. That’s to say,

  1. x6 >263+128> 2^{63} + 128
  2. x6 <129263< 129 -2^{63}

Given that x6 is between 263- 2^{63} and 26312^{63} - 1, x6 [263,129263)z\in [- 2^{63},129 -2^{63}) \cap \mathbb{z}.

2.11.3

[5] <§2.4> For the instruction sub x30, x6, x5, what is the range(s) of values for x6 that would result in overflow?

There will be an overflow if either x6 - x5 >2631> 2^{63} - 1 or x6 - x5 <263< -2^{63}. That’s to say,

  1. x6 >263+127> 2^{63} + 127
  2. x6 <128263< 128 -2^{63}

Given that x6 is between 263- 2^{63} and 26312^{63} - 1, x6 [263,128263)z\in [- 2^{63},128 -2^{63}) \cap \mathbb{z}.

Exercise 2.12

[5] <§§2.2, 2.5> Provide the instruction type and assembly language instruction for the following binary value:

0000 0000 0001 0000 1000 0000 1011 0011two0000\ 0000\ 0001\ 0000\ 1000\ 0000\ 1011\ 0011_{two}

Hint: Figure 2.20 may be helpful.

image-20230430200434530

The operation code is 0110011, which tells the instruction type is R-type. The R-type format is

funct7 rs2 rs1 funct3 rd opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
0000 000 0 0001 0000 1 000 00001 0110011

The assembly code is

1
add x1,x1,x1

Exercise 2.13

[5] <§§2.2, 2.5> Provide the instruction type and hexadecimal representation of the following instruction:

1
sd x5, 32(x30)

The instruction type is S-type.

0000 0010 0101 1111 0011 0000 0010 0011\rm 0000\ 0010\ 0101\ 1111\ 0011\ 0000\ 0010\ 0011

The hexadecimal representation is 0x025F3023.

Exercise 2.23

Consider a proposed new instruction named rpt. This instruction combines a loop’s condition check and counter decrement into a single instruction. For example rpt x29, loop would do the following:

1
2
3
4
if (x29 > 0) {
x29 = x29 −1;
goto loop
}

2.23.1

[5] <§2.7, 2.10> If this instruction were to be added to the RISC-V instruction set, what is the most appropriate instruction format?

The most appropriate instruction format is J(UJ)-type.

2.23.2

[5] <§2.7> What is the shortest sequence of RISC-V instructions that performs the same operation?

1
2
3
4
loop: 	
addi x29,x29,-1
bge x29,x0,loop
addi x29,x29,1

Exercise 2.24

Consider the following RISC-V loop:

1
2
3
4
5
LOOP: beq x6, x0, DONE 
addi x6, x6, -1
addi x5, x5, 2
jal x0, LOOP
DONE:

2.24.1

[5] <§2.7> Assume that the register x6 is initialized to the value 10. What is the final value in register x5 assuming the x5 is initially zero?

The final value in register x5 is 20.

2.24.2

[5] <§2.7> For the loop above, write the equivalent C code. Assume that the registers x5 and x6 are integers acc and i, respectively.

1
2
3
4
5
6
i = 10;
acc = 0;
while(i != 0){
i --;
acc += 2;
}

2.24.3

[5] <§2.7> For the loop written in RISC-V assembly above, assume that the register x6 is initialized to the value N. How many RISC-V instructions are executed?

For every loop there are 4 instructions are executed. There will be N loops if x6 is initialized to the value N. and in the N+1N+1 loop there is still an instruction to jump out of the loop. Therefore, there 4N+14N + 1 instructions to be executed in total.

2.24.4

[5] <§2.7> For the loop written in RISC-V assembly above, replace the instruction beq x6, x0, DONE with the instruction blt x6, x0, DONE and write the equivalent C code.

1
2
3
4
5
6
i = 10;
acc = 0;
while(i >= 0){
i --;
acc += 2;
}

Exercise 2.25

[10] <§2.7> Translate the following C code to RISC-V assembly code. Use a minimum number of instructions. Assume that the values of a, b, i and j are in registers x5, x6, x7 and x29 respectively. Also, assume that register x10 holds the base address of the array D.

1
2
3
for(i=0; i<a; i++) 
for(j=0; j<b; j++)
D[4*j] = i + j;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
add x7,x0,x0
loop1:
bge x7,x5,exit1
add x29,x0,x0
loop2:
bge x29,x6,exit2
slli x30,x29,5
add x30,x30,x10
add x31,x7,x29
sd x31,0(x30)
addi x29,x29,1
beq x0,x0,loop2
exit2:
addi x7,x7,1
beq x0,x0,loop1
exit1:

Exercise 2.26

[5] <§2.7> How many RISC-V instructions does it take to implement the C code from Exercise 2.25? If the variables a and b are initialized to 10 and 1 and all elements of D are initially 0, what is the total number of RISC-V instructions executed to complete the loop?

There are 12 instructions. If the variables a and b are initialized to 10 and 1 and all elements of D are initially 0, 1+1+10×(4+1+1×7)=1221 + 1 + 10\times(4 + 1 + 1\times 7) = 122 instructions are to be executed in total.

Exercise 2.31

[20] <§2.8> Translate function f into RISC-V assembly language. Assume the function declaration for g is int g(int a, int b). The code for function f is as follows:

1
2
3
int f(int a, int b, int c, int d){
return g(g(a,b), c+d);
}
1
2
3
4
5
6
7
8
9
10
11
12
funcf:
add x20,
addi sp,sp,-16
sd x1,8(sp)
add x5,x12,x13 #x5 = c + d
sd x5,0(sp)
jal x1,g #results in x10: x10 = a,x11 = b
ld x11,0(sp) #x11 = c + d
jal x1,g #results in x10: x10 = g(a,b),x11 = c + d
ld x1,8(sp)
addi sp,sp,24
jalr x0,0(x1)

Exercise 2.35

Consider the following code:

1
2
lb x6, 0(x7) 
sd x6, 8(x7)

Assume that the register x7 contains the address 0×10000000 and the data at address is 0x1122334455667788.

2.35.1

[5] <§2.3, 2.9> What value is stored in 0x10000008 on a big-endian machine?

2.35.2

[5] <§2.3, 2.9> What value is stored in 0x10000008 on a little-endian machine?

big-endian little-endian
Address(Byte) Data Data
0x10000008 0x11 0xFF
0x10000007 0x00 0xFF
0x10000006 0x00 0xFF
0x10000005 0x00 0xFF
0x10000004 0x00 0xFF
0x10000003 0x00 0xFF
0x10000002 0x00 0xFF
0x10000001 0x00 0x88
0x10000000 0x11 0x88

Exercise 2.36

[5] <§2.10> Write the RISC-V assembly code that creates the 64-bit constant 0x1122334455667788
and stores that value to register x10.

1
2
3
4
5
6
lui x10,0x112233
ori x10,0x344
slli x10,32
lui x5,0x556677
ori x5,0x788
add x10,x10,x5

Exercise 2.40

Assume that for a given program 70% of the executed instructions are arithmetic, 10% are load/store, and 20% are branch.

2.40.1

[5] <§§1.6, 2.13> Given this instruction mix and the assumption that an arithmetic instruction requires two cycles, a load/store instruction takes six cycles, and a branch instruction takes three cycles, find the average CPI.

CPIavg=0.7×2+0.1×6+0.2×3=2.6CPI_{avg} = 0.7\times 2 + 0.1 \times 6 + 0.2\times3 = 2.6

2.40.2

[5] <§§1.6, 2.13> For a 25% improvement in performance, how many cycles, on average, may an arithmetic instruction take if load/store and branch instructions are not improved at all?

CPIavg=0.7×x+0.1×6+0.2×3=1.2+0.7x=0.75×CPIavgCPI'_{avg} = 0.7\times x + 0.1 \times 6 + 0.2\times3 = 1.2 + 0.7x = 0.75 \times CPI_{avg}

CPIavgCPIavg=1.2+0.7x2.6=0.75\frac{CPI'_{avg}}{CPI_{avg}} = \frac{1.2 + 0.7x}{2.6} = 0.75

Hence,

x=1.07x = 1.07

2.40.3

[5] <§§1.6, 2.13> For a 50% improvement in performance, how many cycles, on average, may an arithmetic instruction take if load/store and branch instructions are not improved at all?

CPIavg=0.7×x+0.1×6+0.2×3=1.2+0.7x=0.5×CPIavgCPI'_{avg} = 0.7\times x + 0.1 \times 6 + 0.2\times3 = 1.2 + 0.7x = 0.5 \times CPI_{avg}

CPIavgCPIavg=1.2+0.7x2.6=0.5\frac{CPI'_{avg}}{CPI_{avg}} = \frac{1.2 + 0.7x}{2.6} = 0.5

Hence,

x=0.14x = 0.14

HW3 Arithmetic

Exercise 3.1

[5] <§3.2> What is 5ED4 − 07A4 when these values represent unsigned 16bit hexadecimal numbers? The result should be written in hexadecimal. Show your work.

0x5ED40x07A4=0101 1110 1101 010020000 0111 1010 01002=0101 0111 0011 00002=0x5730\begin{align*} \rm 0x5ED4 - 0x07A4 &= 0101\ 1110\ 1101\ 0100_2 - 0000\ 0111\ 1010\ 0100_2\\ &= 0101\ 0111\ 0011\ 0000_2\\ &= \rm 0x5730 \end{align*}

Exercise 3.5

[5] <§3.2> What is 4365 − 3412 when these values represent signed 12-bit octal numbers stored in sign-magnitude format? The result should be written in octal. Show your work.

4365834128=100 011 110 1012011 100 001 0102=24510180210=2407=011 111 111 1118=7777\begin{align*} \rm 4365_8 - 3412_8 &= 100\ 011\ 110\ 101_2 - 011\ 100\ 001\ 010_2\\ &= -245_{10} - 1802_{10}\\ &= -2407\\ &= - 011\ 111\ 111\ 111_8 = 7777 \end{align*}

Exercise 3.8

[5] <§3.2> Assume 185 and 122 are signed 8-bit decimal integers stored in sign-magnitude format. Calculate 185 − 122. Is there overflow, underflow, or neither?

1851012210=1011 100120111 10102=57122=17910\begin{align*} \rm 185_{10} -122_{10} &= 1011\ 1001_2 - 0111\ 1010_2\\ &= -57 - 122\\ &= -179_{10} \end{align*}

There is an overflow.

Exercise 3.9

[10] <§3.2> Assume 151 and 214 are signed 8-bit decimal integers stored in two’s complement format. Calculate 151 + 214 using saturating arithmetic. The result should be written in decimal. Show your work.

15110+21410=1001 01112+1101 01102=0110 100120010 10102=105104210=14710\begin{align*} \rm 151_{10} + 214_{10} &= 1001\ 0111_2 + 1101\ 0110_2\\ &= -0110\ 1001_2 - 0010\ 1010_2\\ &= -105_{10} - 42_{10}\\ &= -147_{10} \end{align*}

Exercise 3.13

[20] <§3.3> Using a table similar to that shown in Figure 3.6, calculate the product of the hexadecimal unsigned 8-bit integers 62 and 12 using the hardware described in Figure 3.5. You should show the contents of each register on each step.image-20230501192629117

image-20230501192611280

iteration step multiplicand multiplier/ product
0 initial values 110 010 000 000 001 010
1 no op 110 010 000 000 001 010
Right shift Product 110 010 000 000 000 101
2 Product += Mcand 110 010 110 010 000 101
Right shift Product 110 010 011 001 000 010
3 no op 110 010 011 001 000 010
Right shift Product 110 010 001 100 100 001
4 Product += Mcand 110 010 111 110 100 001
Right shift Product 110 010 011 111 010 000
5 no op 110 010 011 111 010 000
Right shift Product 110 010 001 111 101 000
6 no op 110 010 001 111 101 000
Right shift Product 110 010 000 111 110 100

The answer is 0764.

Exercise 3.19

[30] <§3.4> Using a table similar to that shown in Figure 3.10, calculate 74 divided by 21 using the hardware described in Figure 3.11. You should show the contents of each register on each step. Assume A and B are unsigned 6-bit integers. This algorithm requires a slightly different approach than that shown in Figure 3.9. You will want to think hard about this, do an experiment or two, or else go to the web to figure out how to make this work correctly.

Hint: one possible solution involves using the fact that Figure 3.11 implies the remainder register can be shifted either direction.

image-20230501194230545

iteration step Divisor Remainder/Quotient
0 initial values 010 001 000 000 111 100
1 Left shift Remainder(Rem) 010 001 000 001 111 000
Rem -= Divisor 010 001 111 000 111 000
Rem < 0, then add back 010 001 000 001 111 000
2 Left shift Rem 010 001 000 011 110 000
Rem -= Divisor 010 001 110 010 110 000
Rem < 0, then add back 010 001 000 011 110 000
3 Left shift Rem 010 001 000 111 100 000
Rem -= Divisor 010 001 110 110 100 000
Rem < 0, then add back 010 001 000 111 100 000
4 Left shift Rem 010 001 001 111 000 000
Rem -= Divisor 010 001 111 110 000 000
Rem < 0, then add back 010 001 001 111 000 000
5 Left shift Rem 010 001 011 110 000 000
Rem -= Divisor 010 001 001 101 000 000
Rem > 0, then R0 = 1 010 001 001 101 000 001
6 Left shift Rem 010 001 011 010 000 010
Rem -= Divisor 010 001 001 001 000 010
Rem > 0, then R0 = 1 010 001 001 001 000 011

Exercise 3.20

[5] <§3.5> What decimal number does the bit pattern 0x0C000000 represent if it is a two’s complement integer? An unsigned integer?

  • two’s complement: 201,326,59210201,326,592_{10}
  • unsigned integer:201,326,59210201,326,592_{10}

Exercise 3.21

[10] <§3.5> If the bit pattern 0x0C000000 is placed into the Instruction Register, what RISC-V instruction will be executed?

sorry, i don’t know.

Exercise 3.22

[10] <§3.5> What decimal number does the bit pattern 0x0C000000 represent if it is a floating point number? Use the IEEE 754 standard.

  • exponent: 000 1100 0 - 127 = 24 - 127 = -103
  • fraction = 0
  • answer = 1.0×21031.0 \times 2^{-103}

Exercise 3.23

[10] <§3.5> Write down the binary representation of the decimal number 63.25 assuming the IEEE 754 single precision format.

  • 63.23 = 111111.01 = 1.1111101×251.1111101 \times 2^5
  • sign = 0
  • exp = 5 + 127 = 132
  • answer = 0 1000 0100 1111 1010 0000 0000 0000 000 = 0x427D0000

Exercise 3.24

[10] <§3.5> Write down the binary representation of the decimal number 63.25 assuming the IEEE 754 double precision format.

  • sign = 0
  • exp = 5 + 1023 = 1028
  • answer = 0 100 0000 0100 1111 1010 0000 ... 0000 = 0x404FA00000000000

Exercise 3.27

[20] <§3.5> IEEE 754-2008 contains a half precision that is only 16 bits wide. The leftmost bit is still the sign bit, the exponent is 5 bits wide and has a bias of 15, and the mantissa is 10 bits long. A hidden 1 is assumed. Write down the bit pattern to represent 1.5625×101−1.5625 × 10^{−1} assuming a version of this format, which uses an excess-16 format to store the exponent. Comment on how the range and accuracy of this 16-bit floating point format compares to the single precision IEEE 754 standard.

  • 1.5625×101−1.5625 × 10^{−1} = -0.00101 = 1.01×23-1.01 \times 2^{-3}
  • sign = 1
  • exp = -3 + 15 = 12
  • answer = 1 01100 0100 0000 00

Exercise 3.29

[20] <§3.5> Calculate the sum of 2.6125×1012.6125 × 10^1 and 4.150390625×1014.150390625 × 10^{−1} by hand, assuming A and B are stored in the 16-bit half precision described in Exercise 3.27. Assume 1 guard, 1 round bit, and 1 sticky bit, and round to the nearest even. Show all the steps.

2.6125×101+4..150390625×101=1.1010001000×24+1.01010100111×22=1.1010001000×24+0.00000101010100111×24=1.10101000101(G)0(R)×24\begin{align*} 2.6125 × 10^1 + 4..150390625 × 10^{−1} &= 1.1010001000 \times 2^4 + 1.01010100111\times 2^{-2}\\ &=1.1010001000 \times 2^4 + 0.00000101010100111\times 2^{4}\\ &=1.10101000101(G)0(R) \times 2^4 \end{align*}


http://example.com/2023/04/30/CO HW/
Author
Tekhne Chen
Posted on
April 30, 2023
Licensed under