[Lecture Notes] CMU 15-513 Intro to Computer Systems

Disclaimer: 记得不全,已经会的都没记,不建议用作参考。

Lec 3. Machine-Level Programming I

  • x86-64 integer registers
    • %rax 64-bit
    • %eax 32-bit, %ax 16-bit, %ah %al high 8-bit and low 8-bit respectively
    • backwards compatibility 向下兼容❎ 历史包袱✅
RegisterOrigin (许多已废弃)
rax (eax, ax, ah, al)Accumulate 累加器
rbx ebxBase 基地址(例如数组首地址 和si di差不多)
rcx ecxCounter
rdx edxData
rsi esiSource Index (源变址)
rdi ediDestination Index (目的变址)
rsp espStack Pointer
rbp ebpBase Pointer
r8 r8d
r9, r10, r11, …, r15

关于 bx/bp/si/di 寄存器:

1
2
3
4
5
6
7
8
9
10
11
a[10] --> b[10]
for (int i = 0; i < 10; i++) {
b[i] = a[i];
}
/**
a -- source index -- si register
b -- destination index -- di register
现在 bx bp si di差不多
以前 si di用于数组中(用户), bx操作系统解决程序重定向问题使用 (程序每次加载到内存中都是不同地址 relative addressing)
bx和bp段不一样, bx si di数据段, bp 堆栈段(ss, stack segment)
*/
  • Operands 操作数

    • Immediate: e.g. $0x400, $-533 (1, 2 or 4 bytes)
    • Register: e.g. %rax, %r13.
      • But %rsp reserved for special use.
    • Memory: 8-byte mem at addr given by reg: e.g. (%rax).
  • movq operand combinations

    • ✅ imm to reg: movq $0x4, %rax
    • ✅ imm to mem: movq $-147, (%rax)
    • ✅ reg to reg: movq %rax, %rdx
    • ✅ reg to mem: movq %rax, (%rdx)
    • ✅ mem to reg: movq (%rax), %rdx
    • Cannot do mem-mem transfer!
  • Complete mem addressing modes

    • D(Rb, Ri, S) = MEM[Reg[Rb] + S*Reg[Ri] + D] (类似二维数组)

      • D: constant 1, 2, or 4 bytes

      • Rb: base register

      • Ri: index register (except %rsp)

      • S: scale 1, 2, 4, or 8

      • 二维数组array[10][20]
        访问 a[2][3] = a[2*20+3]
        a=Reg[Rb]基地址, 2=Reg[Ri](下标i), 20=S (数组是int还是short等)
        
        二维数组  基址变址寻址 (基址rb bx bp, 变址si di)
        一维数组  
        
    • Special case:

      • (Rb, Ri) = MEM[Reg[Rb] + Reg[Ri]]
      • D(Rb, Ri) = MEM[Reg[Rb] + Reg[Ri] + D]
      • (Rb, Ri, S) = MEM[Reg[Rb] + S*Reg[Ri]]
  • Some arithmetic operations

    • add, sub, imul (imul 带符号乘法; mul 无符号乘法)
    • sal shl (left shift), sar (arithmetic right shift), shr (logical)
    • xor, and, or
    • inc (++dest), dec (–dest), neg (-dest), not (~dest)
    • lea (load effective address 把存储单元的地址给dst)
      • CPU designers’ inteneded use: calculate pointer
      • Compiler often do: ordinary arithmetic
        • e.g. lea (%rbx, %rbx, 2), %rax means rax = rbx * 3.
  • In most instructions, a memory operand access memory. LEA is exception!

Lec 4. Machine Control

  • EFLAGS (RFLAGS in 64-bit) register

    • CF (Carry Flag): unsigned overflow 进位(+)或借位(-)
      • e.g. uint8_t 255+1=0
    • ZF (Zero Flag): zero
    • SF (Sign Flag): negative 最高位为1 / 判断负数
    • OF (Overflow Flag): signed overflow
      • 两个正数相加得到负数,或两个负数相加得到正数
      • 最高位w,y, w == y && w != z
      • e.g. int8_t 127+1=-128
    • PF (Parity Flag)
    • 一般算术运算会影响所有标志位
    • 一般逻辑运算会更改eflags然后把CF和OF置零(因为:逻辑运算只和本位有关)
    • 传送类指令mov, lea 等不影响标志位
  • Compare

    • cmp a, b: compute b-a, update condition codes
      • subcmp唯一的区别在于减出来的差值是否保留
    • test a, b: compute b&a, update SF and ZF
      • testand唯一的区别在于结果是否保留
      • e.g. 测试某一个bit是否为1,可使用 test reg, 0b000100 指令,接下来使用 jz 判断是否全零
      • e.g. test reg, reg 将自己和自己相与,可判断是否自己是零(和 cmp reg 0一致,但硬件层面逻辑运算速度很快,逻辑运算(与门)比算术运算(sub)运算快,因此选择test指令)
  • Jump

    指令 和等效标记判断条件 (不用记)
    jmptruejump 无条件跳转
    je jzZFEqual / Zero
    jne jnz~ZFNot Equal / Not Zero
    jsSF负数
    jns~SF非负数
    jg jnleZF=0 and SF=OF> (signed)
    jge jnlSF=OF>= (signed)
    jl jngeSF != OF< (signed)
    jle jngZF=1 or SF != OF<= (signed)
    ja jnbeCF=0 and ZF=0> (unsigned)
    jb jnae jcCF< (unsigned)
    jae jnb jncCF=0>= (unsigned)
    jbe jnaCF=1 or BF=1<= (unsigned)
  • Set

    • 将condition codes赋值给寄存器低1字节? 高7字节不会修改

Lec 5. Machine-Level Programming - Procedures

x86-64 stack

  • pushq src: decrement rsp by 8

  • popq dest: increment rsp by 8, copy value

  • call label: push return address on stack (address of the next instruction after call), jump to label

  • ret: pop address from stack, jump to that address

传参

  • 前六个参数: rdi, rsi, rdx, rcx, r8, r9
  • 更多的参数放在stack中
  • 返回值: rax

Managing local data / 递归函数 Frame

  • Stack allocated in Frames

Register Saving Conventions

  • Caller Saved (Call-Clobbered)
    • rax (return value)
    • rdi, rsi, rdx, rcx, r8, r9 (arguments)
  • Callee Saved (Call-Preserved)
    • rbx, r10, r11, r12, r13, r14, r15 (temporaries)
    • rbp (maybe used as frame pointer, can mix & match)
    • rsp (special: restored to original value upon exit)

Lec 6. Machine-Level Programming - Data

Arrays 略

Structures 略

Floating Point

  • Arguments passed in %xmm0, %xmm1, …
  • All XMM reg are call-clobbered (caller saved)

Lec 7. Machine Advanced

Buffer overflow

  • #1 technical cause of security vulnerabilities
    • #1 overall cause is social engineering / user ignorance
  • Stack smashing attacks: overwrite return address, to jump to other code
    • code injection attacks: input string contains byte representation of executable code
    • How to avoid?
      • Avoid overflow vulnerabilities: e.g. fgets(buf, 4, stdin) instead of gets
      • Randomized stack offsets: at the start of program, reposition stack. Makes it difficult for hacker to predict beginning of inserted code.
      • Non-executable memory: mark regions of mem as not executable. If jump into such region, immediate crash.
      • Stack canaries: 在stack的局部变量和返回地址之间插入一个特殊的标记值canary, 函数返回前检查该值是否被修改.
  • Return-oriented Programming attacks (ROP)
    • 不插入恶意代码,而是利用现有程序中的代码片段 (gadgets) 通常以ret结尾

Union


[Lecture Notes] CMU 15-513 Intro to Computer Systems
https://www.billhu.us/2024/051-cmu-15513/
Author
Bill Hu
Posted on
September 3, 2024
Licensed under