Computer Architecture) 02 Instructions: Language of the Computer
CS/Computer Architecture 2023. 1. 2. 15:24
Introduction
- The words of a computer’s language are called instructions
- Its vocabulary is called an instruction set
- Levels of program code: high-level language program → assembly language program → binary machine language program
- An instruction is the basic command a processor executes
- An instruction consists of
- Opcode (operation code) : what the instruction does
- Operands: the object of an operation
- ex) add 2 4 2: add is the opcode; 2, 4, and 2 are the operands
- Instruction set architecture (ISA) is the set of instructions that a processor understands
- Different processors have different instruction set architectures
- However, ISAs are similar across different processors because
- They are built on similar hardware technology based on similar principles
- There are a few common and basic operations that all computers must provide
- There are many different ISAs
- ARM: ARMv7, ARMv8
- Sun SPARC
- IBM POWER
- MIPS
- RISC-V *
- Intel: x86, IA-64
- Microarchitecture
- Microarchitecture is an implementation of the ISA under specific design constraints and goals
- Implementations can vary as long as they satisfy the specification (ISA)
- It covers anything done in hardware without exposure to software
- Pipelining
- Speculative execution
- Memory access scheduling
- Arithmetic units
CISC (Complex Instruction Set Computer)
- Older design idea
- Complex and variable length instructions
- Many powerful instructions supported within the ISA
- Pros
- Makes assembly programming much easier → Compiler is also simpler
- Reduced instruction memory usage
- Cons
- Designing the CPU is much harder
RISC (Reduced Instruction Set Computer)
- Newer design idea than CISC
- Simple, standardized instructions
- Small instruction set: a CISC-type operation becomes a chain of RISC operations
- ex) ARM, MIPS, SPARC, RISC-V
- Pros
- Easier to design CPU
- Smaller instruction set → higher clock speed
- Cons
- Assembly programs are typically longer (more work for the compiler)
- Heavy use of memory
RISC-V
- RISC-V is an open standard ISA based on RISC principles
- Fifth RISC ISA design developed at UC Berkeley
- Typical of many modern ISAs
- Similar ISAs have a large share of the embedded core market
Operands of the Computer Hardware
- The RISC-V assembly language notation
- add a, b, c → a = b+c
- All RISC-V arithmetic operations have this form
- Three variables: two sources and one destination
- To place the sum of four variables b, c, d, and e into variable a, three instructions are required
- add a, b, c // a = b + c: the sum of b and c is placed in a
- add a, a, d // a = a + d: the sum of b, c, and d is now in a
- add a, a, e // a = a + e: the sum of b, c, d, and e is now in a
- Design Principle 1: Simplicity favors regularity
- Arithmetic instructions use register operands
- Registers are primitive storage used in hardware design
- RISC-V has a 32 × 32-bit register file: x0 to x31 (registers sit inside the CPU, so they are very fast)
- Design Principle 2: Smaller is faster
- A large number of registers may lengthen the clock cycle
- Tradeoff between the number of registers and performance
Memory Operands (register ≠ memory)
- Main memory is used for composite data, such as arrays and structures
- Registers are limited in number, so they can NOT hold such composite data
- Since arithmetic operations in RISC-V occur only on registers, applying them to composite data in memory takes two steps
- Load values from memory into registers
- Store results from registers back to memory
- Data transfer instructions are used to transfer data between registers and memory
Memory Addressing
- To access a word in memory, the instruction must supply the memory address
- Memory is just a large and single-dimensional array, with the address acting as the index to that array, starting at 0
- 8-bit bytes are useful in many programs → virtually all architectures address individual bytes
- The byte address of a word matches the address of one of the 4 bytes within the word
- Addresses of sequential words differ by 4
Endian
- RISC-V is a little-endian architecture
- The least significant byte is at the lowest address of a word
- Little Endian vs Big Endian
- Big-endian: the most significant byte of data is placed at the lowest address
- Little-endian: the least significant byte of data is placed at the lowest address
- RISC-V does NOT require words to be aligned in memory
- Words do NOT need to start at addresses that are multiples of 4
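The two byte orders can be compared with a small C sketch. This is only an illustration: the helper names (store_word_le, store_word_be, load_word_le) are hypothetical, and memory is modeled as a plain byte array.

```c
#include <stdint.h>

/* Store a 32-bit word at byte address addr, least significant byte first
   (little-endian, the order RISC-V uses). */
void store_word_le(uint8_t *mem, uint32_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[addr + i] = (uint8_t)(value >> (8 * i));
    }
}

/* Store the same word most significant byte first (big-endian). */
void store_word_be(uint8_t *mem, uint32_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[addr + i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}

/* Reassemble a little-endian word from its four bytes. */
uint32_t load_word_le(const uint8_t *mem, uint32_t addr) {
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        value |= (uint32_t)mem[addr + i] << (8 * i);
    }
    return value;
}
```

With the word 0x12345678, the little-endian store puts 0x78 at the lowest address and the big-endian store puts 0x12 there. Because the helpers work byte by byte, they also accept addresses that are not multiples of 4.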
Memory Operand Example
- Compiling Using Load and Store
- C code:
- A[12] = h + A[8];
- h is associated with register x21
- A is an integer (4-byte) array and its base address is in x22
- Compiled RISC-V code:
- lw x9, 32(x22) // Temporary reg x9 gets A[8]; 32 = 8 * 4, x9 = Memory[x22 + 32]
- add x9, x21, x9 // Temporary reg x9 gets h + A[8]; x9 = x21 + x9
- sw x9, 48(x22) // Stores h + A[8] back into A[12]; 48 = 12 * 4, Memory[x22 + 48] = x9
- lw: load word
- sw: store word
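The byte offsets 32 and 48 can be checked with a small C model of the load, add, store sequence. It is a sketch under assumptions: the helpers lw, sw, and compile_example are hypothetical names, memory is a byte array, and words are copied in the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* lw: read the 4-byte word at byte address base + offset. */
int32_t lw(const uint8_t *memory, uint32_t base, int32_t offset) {
    int32_t word;
    memcpy(&word, memory + base + offset, sizeof word);
    return word;
}

/* sw: write a 4-byte word to byte address base + offset. */
void sw(uint8_t *memory, uint32_t base, int32_t offset, int32_t value) {
    memcpy(memory + base + offset, &value, sizeof value);
}

/* Models A[12] = h + A[8], with h in x21 and the base of A in x22. */
void compile_example(uint8_t *memory, uint32_t x22, int32_t x21) {
    int32_t x9;
    x9 = lw(memory, x22, 32);  /* lw  x9, 32(x22)   32 = 8 * 4  */
    x9 = x21 + x9;             /* add x9, x21, x9               */
    sw(memory, x22, 48, x9);   /* sw  x9, 48(x22)   48 = 12 * 4 */
}
```

Each array index is multiplied by 4 because memory is byte addressed and an integer occupies 4 bytes.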
Register vs Memory
- Registers are much faster to access than memory
- In RISC-V, data in memory can NOT be directly accessed by arithmetic instructions
- Operating on memory data requires loads and stores
- The compiler must keep variables in registers as much as possible
- Register optimization is important
Constant or Immediate Operands
- Many times a program will use a constant in an operation
- Constant data can be specified in an instruction in RISC-V
- addi: addition with an immediate value (constant)
- addi x22, x22, 4 // x22 = x22 + 4
- 4: the immediate value (constant)
- Instructions that use an immediate operand are much faster and consume less energy than if the constant were loaded from memory
- RISC-V offers the constant zero by hardwiring register x0 to the value zero
- Constant zero is useful for common operations
- ex) sub x1, x0, x1 → x1 = -x1; x0 always holds the constant 0, so this is often used to flip the sign of a number
Binary Numbers
- Numbers may be represented in any base
- ex) 123 base 10 = 1111011 base 2
- A single digit of a binary number is the “atom” of computing
- All information is composed of binary digits, or bits
- Generalizing the point, in any number base, the value of the i-th digit d (counting from 0 at the rightmost digit) is
- d × Base^i
- MSB (most significant bit: the leftmost bit), LSB (least significant bit: the rightmost bit). This is not the same as endianness, which is about byte order!
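The positional rule d × Base^i can be checked with a short C sketch. The function name value_in_base is hypothetical, and only the digits 0 to 9 are handled, which is enough for bases up to 10.

```c
#include <stdint.h>
#include <string.h>

/* Sum d * base^i over all digits, where i = 0 is the rightmost digit.
   The string is written most significant digit first, e.g. "1111011". */
uint32_t value_in_base(const char *digits, uint32_t base) {
    size_t n = strlen(digits);
    uint32_t value = 0;
    uint32_t weight = 1;                  /* base^0 for the rightmost digit */
    for (size_t i = 0; i < n; i++) {
        uint32_t d = (uint32_t)(digits[n - 1 - i] - '0');
        value += d * weight;              /* d * base^i */
        weight *= base;                   /* next power of the base */
    }
    return value;
}
```

For instance, value_in_base("1111011", 2) and value_in_base("123", 10) both give 123.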
Unsigned Numbers
- The RISC-V word is 32 bits long
- 32-bit binary numbers can be represented in terms of the bit value times a power of 2
Signed Numbers
- There are several representations
- 1) sign and magnitude, 2) 1’s complement, and 3) 2’s complement
- The representation finally adopted is two’s complement
- With 32 bits, the range is -2^31 to 2^31 - 1
- MSB is the sign bit
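A minimal C sketch of how a 32-bit pattern is read as a two's complement number: bit 31 carries the weight -2^31 and every other bit i carries +2^i. The function name twos_complement_value is hypothetical.

```c
#include <stdint.h>

/* value = -b31 * 2^31 + sum of b_i * 2^i for i = 0..30.
   A 64-bit result is used so the intermediate sums cannot overflow. */
int64_t twos_complement_value(uint32_t bits) {
    int64_t value = 0;
    for (int i = 0; i < 31; i++) {
        if ((bits >> i) & 1u) {
            value += (int64_t)1 << i;     /* positive weight 2^i */
        }
    }
    if ((bits >> 31) & 1u) {
        value -= (int64_t)1 << 31;        /* the sign bit weighs -2^31 */
    }
    return value;
}
```

The pattern 0x7FFFFFFF gives the largest value, 2^31 - 1, and 0x80000000 gives the smallest, -2^31.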
Negation and Sign Extension
- To negate a number, complement every bit and add 1
- “~” is complement operator ( converting 0→1, 1→0 for every bit)
- ex) ~(0001) = 1110
- ~x + 1 = -x
- Sign extension represents a number using more bits
- Preserve the numeric value
- Replicate the sign bit to the left
- +2 : 0000 0010 → 0000 0000 0000 0010
- -2 : 1111 1110 → 1111 1111 1111 1110
- When the number of bits grows, the leading sign bit is copied into the new bits
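Both rules can be tried out in C. This is a sketch with hypothetical function names: negate works on 32-bit patterns, and sign_extend_8_to_16 mirrors the 8-bit to 16-bit examples above.

```c
#include <stdint.h>

/* Two's complement negation: complement every bit, then add 1. */
uint32_t negate(uint32_t x) {
    return ~x + 1u;
}

/* Widen an 8-bit pattern to 16 bits by replicating the sign bit (bit 7)
   into the eight new upper bits. */
uint16_t sign_extend_8_to_16(uint8_t x) {
    uint16_t wide = x;
    if (x & 0x80u) {
        wide = (uint16_t)(wide | 0xFF00u);
    }
    return wide;
}
```

Here sign_extend_8_to_16(0xFE) returns 0xFFFE, so -2 stays -2, and negate(2) returns 0xFFFFFFFE, the 32-bit pattern for -2.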
Representing Instructions in the Computer
- Instructions are encoded in binary, called machine code
- RISC-V instructions
- Encoded as 32-bit instruction words → Regularity!
- A small number of formats encode the operation code (opcode), register numbers, and so on
- RISC-V instruction : 32 bits
- The layout of the instruction is called instruction format
- R-type: most arithmetic and logical instructions (except for “immediate”)
- I-type: data transfer (load), arithmetic with immediate
- S-type: data transfer (store)
R-type Instructions

I-type Instructions

- Design Principle 3 : Good design demands good compromises
- Different formats complicate decoding, but allow all instructions to be a uniform 32 bits
- Keep formats as similar as possible
S-type Instructions

- The 12-bit immediate is split into two fields
- This keeps the rs1 and rs2 fields in the same place in all instruction formats
- Similarly, the opcode and funct3 fields are the same size and in the same place in all formats
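The field layouts of the three formats can be written out as C encoders. This is a sketch with hypothetical function names; the bit positions follow the standard RV32I formats: R-type is funct7 | rs2 | rs1 | funct3 | rd | opcode, I-type is imm[11:0] | rs1 | funct3 | rd | opcode, and S-type is imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode.

```c
#include <stdint.h>

/* R-type: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0] */
uint32_t encode_r(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t rd, uint32_t opcode) {
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* I-type: imm[11:0] sits in bits 31:20; the other fields stay where R-type has them. */
uint32_t encode_i(int32_t imm, uint32_t rs1, uint32_t funct3,
                  uint32_t rd, uint32_t opcode) {
    uint32_t imm12 = (uint32_t)imm & 0xFFFu;
    return (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode;
}

/* S-type: the immediate is split so that rs1 and rs2 keep their R-type positions. */
uint32_t encode_s(int32_t imm, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t opcode) {
    uint32_t imm12 = (uint32_t)imm & 0xFFFu;
    uint32_t imm_hi = imm12 >> 5;        /* imm[11:5] goes to bits 31:25 */
    uint32_t imm_lo = imm12 & 0x1Fu;     /* imm[4:0]  goes to bits 11:7  */
    return (imm_hi << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (imm_lo << 7) | opcode;
}
```

For example, encode_r(0, 21, 20, 0, 9, 0x33) encodes add x9, x20, x21; encode_i(32, 22, 2, 9, 0x03) encodes lw x9, 32(x22); and encode_s(48, 9, 22, 2, 0x23) encodes sw x9, 48(x22).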
Logical Operations
- RISC-V provides instructions for bitwise logical operations

Note: shift right arithmetic differs from shift right logical in whether the sign bit is extended; the duplicated xori entry has been resolved
- Shift instructions allow all the bits in a word to move to the left or right
- slli x11, x19, 4 // reg x11 = reg x19 << 4 bits

AND/OR Instructions
- AND instruction is useful to mask bits in a word (and, andi)
- Select some bits, and clear others to 0
- and x9, x10, x11 // reg x9 = reg x10 & reg x11
- OR instruction is useful to include bits in a word (or, ori)
- Set some bits to 1, and leave others unchanged
- or x9, x10, x11 // reg x9 = reg x10 | reg x11
XOR Instruction
- XOR instruction allows a differencing operation (xor, xori)
- The result bit is 0 if the two bits are the same, 1 if they differ
- xor x9, x10, x12 // reg x9 = reg x10 ^ reg x12
- In RISC-V, NOT operation can be achieved by XOR instruction
- ex) x9 becomes the bitwise inverse of x10 via “x10 XOR 111…1111”
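A small C sketch of the shift and XOR behaviour described above. The function names are hypothetical stand-ins for the instructions, and the shift amount is assumed to be in the range 0 to 31.

```c
#include <stdint.h>

/* Shift left logical: vacated low bits are filled with 0. */
uint32_t slli(uint32_t x, unsigned shamt) { return x << shamt; }

/* Shift right logical: vacated high bits are filled with 0. */
uint32_t srli(uint32_t x, unsigned shamt) { return x >> shamt; }

/* Shift right arithmetic: vacated high bits are copies of the sign bit. */
uint32_t srai(uint32_t x, unsigned shamt) {
    uint32_t shifted = x >> shamt;
    if ((x & 0x80000000u) && shamt > 0) {
        shifted |= ~(0xFFFFFFFFu >> shamt);   /* set the top shamt bits */
    }
    return shifted;
}

/* NOT built from XOR with all ones. */
uint32_t not_via_xor(uint32_t x) { return x ^ 0xFFFFFFFFu; }
```

Shifting 0x80000000 right by 4 gives 0x08000000 with the logical shift and 0xF8000000 with the arithmetic shift, which is the sign extension difference.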
Instructions for Making Decisions
- Branch to a labeled instruction if a condition is true
- Otherwise, continue sequentially

- These two instructions (beq and bne) are called conditional branches
Compiling if-then-else into Conditional Branches
if (i == j) f = g+h; else f= g-h;
- f, g, h, i, and j correspond to the registers x19, x20, x21, x22, and x23
- Compiled RISC-V code
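The compiled listing is not reproduced in this text, so here is a minimal sketch of the usual branch pattern, written as C with goto so that the control flow matches the assembly one to one. The instructions in the comments are an assumption based on the register assignment above, and the function name if_then_else is hypothetical.

```c
/* f, g, h, i, j stand in for registers x19, x20, x21, x22, x23. */
int if_then_else(int g, int h, int i, int j) {
    int f;
    if (i != j) goto Else;    /* bne x22, x23, Else    skip the then-part if i != j */
    f = g + h;                /* add x19, x20, x21                                  */
    goto Exit;                /* beq x0, x0, Exit      always taken                 */
Else:
    f = g - h;                /* sub x19, x20, x21                                  */
Exit:
    return f;
}
```

The test is inverted (branch if i != j) so that the then-part can follow the branch directly.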

Compiling a while Loop in C
while (save[i] == k) i += 1;
i, k and the base of array save correspond to x22/24/25, respectively
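The compiled loop is likewise missing from this text. A minimal sketch of the usual translation, again as C with goto: the instructions in the comments are an assumption (x10 and x9 are taken as temporaries), and the function name while_loop is hypothetical.

```c
/* i, k, and the base of save stand in for x22, x24, x25.
   Returns the final value of i. */
int while_loop(const int *save, int i, int k) {
    const int *addr;
    int value;
Loop:
    addr = save + i;             /* slli x10, x22, 2 ; add x10, x10, x25   byte offset = i * 4 */
    value = *addr;               /* lw x9, 0(x10)                                              */
    if (value != k) goto Exit;   /* bne x9, x24, Exit                                          */
    i += 1;                      /* addi x22, x22, 1                                           */
    goto Loop;                   /* beq x0, x0, Loop                                           */
Exit:
    return i;
}
```

The index is shifted left by 2 because each array element is 4 bytes, the same scaling used in the load and store example.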

More Conditional Operators
- Branch by comparison (blt/bltu and bge/bgeu)
- blt rs1, rs2, L1 // if (rs1 < rs2), branch to instruction labeled L1
- bge rs1, rs2, L1 // if (rs1 ≥ rs2), branch to instruction labeled L1
- Signed and unsigned comparison: blt/bge treat the registers as signed values, bltu/bgeu as unsigned values
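The same bit pattern can compare differently depending on the branch used. A sketch in C, with hypothetical function names that report whether each branch would be taken:

```c
#include <stdbool.h>
#include <stdint.h>

/* blt: signed comparison. Flipping the sign bit of both operands turns
   signed order into unsigned order without relying on signed casts. */
bool blt_taken(uint32_t rs1, uint32_t rs2) {
    return (rs1 ^ 0x80000000u) < (rs2 ^ 0x80000000u);
}

/* bltu: unsigned comparison of the raw bit patterns. */
bool bltu_taken(uint32_t rs1, uint32_t rs2) {
    return rs1 < rs2;
}
```

With rs1 = 0xFFFFFFFF and rs2 = 1, blt is taken because -1 < 1, while bltu is not taken because 4,294,967,295 > 1.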

Procedure Execution
- To execute a procedure, the program must follow six steps
- Put parameters in a place where the procedure can access them
- Transfer control to the procedure
- Acquire the storage resource needed for the procedure
- Perform the desired task
- Put the result value in a place where the calling program can access it
- Return control to the point of origin where the procedure was called

Supporting Procedures in Computer Hardware
- RISC-V allocates 9 registers for procedure calling
- x10 ~ x17: eight parameter registers in which to pass parameters or return values
- x10 and x11 are used to place the results
- x1: one return address register to return to the point of origin

- RISC-V provides an instruction just for procedures
- Jump-and-link instruction (jal) for procedure call
- jal x1, ProcedureAddress // jump to ProcedureAddress and write return address to x1
- It branches to an address and simultaneously saves the address of the following (next) instruction to the destination register rd
- Jump: transfer control to the procedure
- Link: make a link so that control can jump back to the point of origin
- This link is called the return address and is stored in register x1

- RISC-V uses an indirect jump to support the return from a procedure
- Jump-and-link register instruction (jalr) for procedure return
- jalr x0, 0(x1)
- Branches to the address stored in register x1 (i.e., 0 + address in x1)
- Uses x0 as the destination register → the effect is to discard the return address
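The call and return pair can be modeled with a tiny C sketch of the program counter and register file. The Cpu struct and function names are hypothetical, and only the pc and register effects of jal and jalr are modeled.

```c
#include <stdint.h>

typedef struct {
    uint32_t pc;       /* address of the current instruction */
    uint32_t x[32];    /* integer registers x0..x31 */
} Cpu;

/* Writes to x0 are discarded because x0 is hardwired to zero. */
void write_reg(Cpu *cpu, unsigned rd, uint32_t value) {
    if (rd != 0) {
        cpu->x[rd] = value;
    }
}

/* jal rd, target: save the address of the next instruction, then jump. */
void jal(Cpu *cpu, unsigned rd, uint32_t target) {
    write_reg(cpu, rd, cpu->pc + 4);
    cpu->pc = target;
}

/* jalr rd, offset(rs1): jump to rs1 + offset (lowest bit cleared),
   saving the address of the next instruction in rd. */
void jalr(Cpu *cpu, unsigned rd, unsigned rs1, int32_t offset) {
    uint32_t target = (cpu->x[rs1] + (uint32_t)offset) & ~1u;
    write_reg(cpu, rd, cpu->pc + 4);
    cpu->pc = target;
}
```

Calling jal(&cpu, 1, target) from pc = 0x100 leaves 0x104 in x1, and a later jalr(&cpu, 0, 1, 0) sets pc back to 0x104 while x0 stays zero.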

Using More Registers
- If a procedure uses more than eight arguments or two return values, memory is required to store the extra arguments and return values
- To execute a procedure, we need to move the caller’s local values in registers to memory due to the limited number of registers
- Register spilling: save values from the registers to memory
- The ideal data structure for spilling registers is a stack
- A stack is a last-in-first-out queue
- RISC-V has no explicit push and pop instructions
- Load and store instructions are used to access the stack in memory
- Stacks grow from higher to lower address
- In RISC-V, the stack pointer is register x2 (named sp)
- The stack pointer is adjusted by one word for each register that is saved (push) and restored (pop)
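Push and pop are therefore built from an sp adjustment plus a store or load. A C sketch under assumptions: the Stack struct, its size, and the function names are hypothetical, and sp is modeled as a byte offset into a small word array.

```c
#include <stdint.h>

enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t words[STACK_WORDS];  /* stack memory, one entry per 4-byte word */
    uint32_t sp;                  /* byte offset of the top of the stack */
} Stack;

/* The stack starts at the highest address and grows downward. */
void stack_init(Stack *s) {
    s->sp = STACK_WORDS * 4;
}

/* push: addi sp, sp, -4 ; sw reg, 0(sp) */
void push(Stack *s, uint32_t reg_value) {
    s->sp -= 4;
    s->words[s->sp / 4] = reg_value;
}

/* pop: lw reg, 0(sp) ; addi sp, sp, 4 */
uint32_t pop(Stack *s) {
    uint32_t value = s->words[s->sp / 4];
    s->sp += 4;
    return value;
}
```

Values come back in the reverse order they were pushed, and sp returns to its starting value once every push has a matching pop.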
Compiling a Leaf Procedure (a procedure that does not call another procedure)

Register Saving Convention
- To avoid saving and restoring a register whose value is never used, RISC-V separates 19 of the registers into two groups
- Caller-saved registers
- x5~x7 and x28~x31: temporary registers that are NOT preserved by the callee on a procedure call
- The caller saves temporary values on the stack before the call
- The contents of these registers can be modified as a result of a procedure call
- Callee-saved registers
- x8~x9 and x18~x27: saved registers that must be preserved on a procedure call (if used, the callee saves and restores them)
- The callee saves their values on the stack before using them
- The callee restores them before returning to the caller
- The contents of these registers are preserved across a procedure call
- Therefore, sw/lw for x5 and x6 are NOT required in the previous example, but sw/lw for x20 are required

Nested Procedures
- Leaf procedures do not call other procedures, but non-leaf procedures do
- For a nested call, the caller needs to save the values it will still need on the stack
- It restores them from the stack after the call
- For example
- Main program calls A with an argument of 3 → placing 3 in x10 → jal x1, A (these values will be overwritten)
- A calls B with an argument of 7 → placing 7 in x10 → jal x1, B
- What happens to x10 and x1 (i.e., the return address)?
- One solution is to push all the other registers that must be preserved on the stack
- Caller pushes x10~x17 (i.e., argument registers), x5~x7 and x28~x31
- Callee pushes x1 (i.e., return address register), x8~x9, and x18~x27


Allocating Space for New Data on the Stack
- The stack is also used to store variables that are local to the procedure, such as local arrays and structures
- The segment of the stack containing the saved registers and local variables is called a procedure frame or activation record
- A frame pointer (fp, or x8) points to the first word of the frame of a procedure

Allocating Space for New Data on the Heap
- The RISC-V convention for allocation of memory
- Text: the home of the RISC-V machine code
- Static data: the place for constants and other static variables
- Heap (Dynamic data): the place for dynamically allocated memory
- ex) linked list, tree → malloc()/free() in C
- Stack: the place for local variables and register savings
- Stack and heap grow towards each other to maximize storage use before collision

Wide Immediate Operands
- Although constants are frequently short and fit into the 12-bit field, sometimes they are bigger
- The RISC-V instruction set includes the instruction load upper immediate (lui: U-type) to load a 20-bit constant into bits 12 through 31 of a register
- The rightmost 12 bits are filled with zeros
- An example to load the following 32-bit constant into x19
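The constant from the original figure is not reproduced in this text, so the sketch below uses 0x003D0500 purely as an illustrative value. It splits a 32-bit constant into the 20-bit part for lui and the 12-bit part for addi; the function names are hypothetical. Because addi sign-extends its 12-bit immediate, the upper part is bumped by one whenever bit 11 of the constant is set.

```c
#include <stdint.h>

/* Split value so that (upper20 << 12) + lower12 == value (mod 2^32),
   with lower12 in the signed 12-bit range -2048..2047. */
void split_constant(uint32_t value, uint32_t *upper20, int32_t *lower12) {
    int32_t low = (int32_t)(value & 0xFFFu);
    if (low >= 0x800) {
        low -= 0x1000;               /* addi will see this as a negative immediate */
    }
    *lower12 = low;
    *upper20 = ((value - (uint32_t)low) >> 12) & 0xFFFFFu;
}

/* Model of: lui reg, upper20 ; addi reg, reg, lower12 */
uint32_t lui_addi(uint32_t upper20, int32_t lower12) {
    uint32_t reg = upper20 << 12;    /* lui: the rightmost 12 bits are zero */
    return reg + (uint32_t)lower12;  /* addi: add the sign-extended immediate */
}
```

For 0x003D0500 the split is 976 and 1280, which corresponds to lui x19, 976 followed by addi x19, x19, 1280.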

Addressing in Branches
- RISC-V branch instructions use the SB-type format
- This format can represent branch addresses from -4096 to 4094, in multiples of 2 → only possible to branch to even addresses
- Example (opcode : 1100011, funct3 code: 001 for bne)

- The unconditional jump-and-link instruction (jal) is the only instruction that uses the UJ-type format
- 20-bit address immediate → larger range
- Example (opcode: 1101111 for jal)
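The encoded examples are not reproduced in this text. As a sketch, the scrambled immediate layouts of the two formats can be written as C encoders; the function names are hypothetical, and the offset 2000 in the usage note is only an illustrative value. The layouts are SB-type: imm[12] | imm[10:5] | rs2 | rs1 | funct3 | imm[4:1] | imm[11] | opcode, and UJ-type: imm[20] | imm[10:1] | imm[11] | imm[19:12] | rd | opcode. Bit 0 of the offset is never stored, which is why only even offsets are possible.

```c
#include <stdbool.h>
#include <stdint.h>

/* A branch offset must be even and fit in the range -4096..4094. */
bool branch_offset_ok(int32_t offset) {
    return offset >= -4096 && offset <= 4094 && (offset % 2) == 0;
}

/* SB-type: imm[12] imm[10:5] rs2 rs1 funct3 imm[4:1] imm[11] opcode */
uint32_t encode_sb(int32_t offset, uint32_t rs2, uint32_t rs1,
                   uint32_t funct3, uint32_t opcode) {
    uint32_t imm = (uint32_t)offset;
    return (((imm >> 12) & 0x1u) << 31) | (((imm >> 5) & 0x3Fu) << 25) |
           (rs2 << 20) | (rs1 << 15) | (funct3 << 12) |
           (((imm >> 1) & 0xFu) << 8) | (((imm >> 11) & 0x1u) << 7) | opcode;
}

/* UJ-type: imm[20] imm[10:1] imm[11] imm[19:12] rd opcode */
uint32_t encode_uj(int32_t offset, uint32_t rd, uint32_t opcode) {
    uint32_t imm = (uint32_t)offset;
    return (((imm >> 20) & 0x1u) << 31) | (((imm >> 1) & 0x3FFu) << 21) |
           (((imm >> 11) & 0x1u) << 20) | (((imm >> 12) & 0xFFu) << 12) |
           (rd << 7) | opcode;
}
```

For example, encode_sb(2000, 11, 10, 1, 0x63) encodes bne x10, x11 with an offset of 2000, and encode_uj(2000, 0, 0x6F) encodes jal x0 with the same offset.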
