[Lecture Notes] CMU 15-513 Intro to Computer Systems
Disclaimer: 记得不全,已经会的都没记,不建议用作参考。
Lec 3. Machine-Level Programming I
- x86-64 integer registers
%rax
64-bit%eax
32-bit,%ax
16-bit,%ah
%al
high 8-bit and low 8-bit respectively- backwards compatibility 向下兼容❎ 历史包袱✅
Register | Origin (许多已废弃) | ||
---|---|---|---|
rax (eax, ax, ah, al) | Accumulate 累加器 | ||
rbx ebx | Base 基地址(例如数组首地址 和si di差不多) | ||
rcx ecx | Counter | ||
rdx edx | Data | ||
rsi esi | Source Index (源变址) | ||
rdi edi | Destination Index (目的变址) | ||
rsp esp | Stack Pointer | ||
rbp ebp | Base Pointer | ||
r8 r8d | |||
r9 , r10 , r11 , …, r15 |
关于 bx/bp/si/di 寄存器:
1 |
|
Operands 操作数
- Immediate: e.g.
$0x400
,$-533
(1, 2 or 4 bytes) - Register: e.g.
%rax
,%r13
.- But
%rsp
reserved for special use.
- But
- Memory: 8-byte mem at addr given by reg: e.g.
(%rax)
.
- Immediate: e.g.
movq
operand combinations- ✅ imm to reg:
movq $0x4, %rax
- ✅ imm to mem:
movq $-147, (%rax)
- ✅ reg to reg:
movq %rax, %rdx
- ✅ reg to mem:
movq %rax, (%rdx)
- ✅ mem to reg:
movq (%rax), %rdx
- ❌ Cannot do mem-mem transfer!
- ✅ imm to reg:
Complete mem addressing modes
D(Rb, Ri, S) = MEM[Reg[Rb] + S*Reg[Ri] + D] (类似二维数组)
D: constant 1, 2, or 4 bytes
Rb: base register
Ri: index register (except
%rsp
)S: scale 1, 2, 4, or 8
二维数组array[10][20] 访问 a[2][3] = a[2*20+3] a=Reg[Rb]基地址, 2=Reg[Ri](下标i), 20=S (数组是int还是short等) 二维数组 基址变址寻址 (基址rb bx bp, 变址si di) 一维数组
Special case:
- (Rb, Ri) = MEM[Reg[Rb] + Reg[Ri]]
- D(Rb, Ri) = MEM[Reg[Rb] + Reg[Ri] + D]
- (Rb, Ri, S) = MEM[Reg[Rb] + S*Reg[Ri]]
Some arithmetic operations
add
,sub
,imul
(imul
带符号乘法;mul
无符号乘法)sal
shl
(left shift),sar
(arithmetic right shift),shr
(logical)xor
,and
,or
inc
(++dest),dec
(–dest),neg
(-dest),not
(~dest)lea
(load effective address 把存储单元的地址给dst)- CPU designers’ inteneded use: calculate pointer
- Compiler often do: ordinary arithmetic
- e.g.
lea (%rbx, %rbx, 2), %rax
meansrax = rbx * 3
.
- e.g.
In most instructions, a memory operand access memory. LEA is exception!
Lec 4. Machine Control
EFLAGS (RFLAGS in 64-bit) register
- CF (Carry Flag): unsigned overflow 进位(+)或借位(-)
- e.g. uint8_t 255+1=0
- ZF (Zero Flag): zero
- SF (Sign Flag): negative 最高位为1 / 判断负数
- OF (Overflow Flag): signed overflow
- 两个正数相加得到负数,或两个负数相加得到正数
- 最高位w,y,
w == y && w != z
- e.g. int8_t 127+1=-128
- PF (Parity Flag)
- 一般算术运算会影响所有标志位
- 一般逻辑运算会更改eflags然后把CF和OF置零(因为:逻辑运算只和本位有关)
- 传送类指令
mov
,lea
等不影响标志位
- CF (Carry Flag): unsigned overflow 进位(+)或借位(-)
Compare
cmp a, b
: compute b-a, update condition codessub
和cmp
唯一的区别在于减出来的差值是否保留
test a, b
: compute b&a, update SF and ZFtest
和and
唯一的区别在于结果是否保留- e.g. 测试某一个bit是否为1,可使用 test reg, 0b000100 指令,接下来使用 jz 判断是否全零
- e.g. test reg, reg 将自己和自己相与,可判断是否自己是零(和 cmp reg 0一致,但硬件层面逻辑运算速度很快,逻辑运算(与门)比算术运算(sub)运算快,因此选择test指令)
Jump
指令 和等效标记 判断条件 (不用记) jmp
true jump 无条件跳转 je
jz
ZF Equal / Zero jne
jnz
~ZF Not Equal / Not Zero js
SF 负数 jns
~SF 非负数 jg
jnle
ZF=0 and SF=OF > (signed) jge
jnl
SF=OF >= (signed) jl
jnge
SF != OF < (signed) jle
jng
ZF=1 or SF != OF <= (signed) ja
jnbe
CF=0 and ZF=0 > (unsigned) jb
jnae
jc
CF < (unsigned) jae
jnb
jnc
CF=0 >= (unsigned) jbe
jna
CF=1 or BF=1 <= (unsigned) Set
- 将condition codes赋值给寄存器低1字节? 高7字节不会修改
Lec 5. Machine-Level Programming - Procedures
x86-64 stack
pushq
src: decrement rsp by 8popq
dest: increment rsp by 8, copy valuecall label
: push return address on stack (address of the next instruction after call), jump to labelret
: pop address from stack, jump to that address
传参
- 前六个参数:
rdi
,rsi
,rdx
,rcx
,r8
,r9
- 更多的参数放在stack中
- 返回值:
rax
Managing local data / 递归函数 Frame
- Stack allocated in Frames
Register Saving Conventions
- Caller Saved (Call-Clobbered)
- rax (return value)
- rdi, rsi, rdx, rcx, r8, r9 (arguments)
- Callee Saved (Call-Preserved)
- rbx, r10, r11, r12, r13, r14, r15 (temporaries)
- rbp (maybe used as frame pointer, can mix & match)
- rsp (special: restored to original value upon exit)
Lec 6. Machine-Level Programming - Data
Arrays 略
Structures 略
Floating Point
- Arguments passed in
%xmm0
,%xmm1
, … - All XMM reg are call-clobbered (caller saved)
Lec 7. Machine Advanced
Buffer overflow
- #1 technical cause of security vulnerabilities
- #1 overall cause is social engineering / user ignorance
- Stack smashing attacks: overwrite return address, to jump to other code
- code injection attacks: input string contains byte representation of executable code
- How to avoid?
- Avoid overflow vulnerabilities: e.g.
fgets(buf, 4, stdin)
instead ofgets
- Randomized stack offsets: at the start of program, reposition stack. Makes it difficult for hacker to predict beginning of inserted code.
- Non-executable memory: mark regions of mem as not executable. If jump into such region, immediate crash.
- Stack canaries: 在stack的局部变量和返回地址之间插入一个特殊的标记值canary, 函数返回前检查该值是否被修改.
- Avoid overflow vulnerabilities: e.g.
- Return-oriented Programming attacks (ROP)
- 不插入恶意代码,而是利用现有程序中的代码片段 (gadgets) 通常以
ret
结尾
- 不插入恶意代码,而是利用现有程序中的代码片段 (gadgets) 通常以