Linköping Studies in Science and Technology
Dissertations, No. 1993

Scalable Dynamic Analysis of Binary Code

Ulf Kargén

Linköping University
Faculty of Science and Engineering
Department of Computer and Information Science
Division of Database and Information Techniques
SE-581 83 Linköping, Sweden
www.liu.se

Linköping 2019
1. r2 = load r0
2. r3 = load r1
3. r2 = add r2, r3
4. r4 = cmp_greater r2, 0
5. branch_if r4, line_7
6. r2 = 1
7. r3 = load r5
8. r0 = div r3, r2
9. call print
 1. float power(int base, int exp) {
 2.   if(exp == 0)
 3.     return 1.0;
 4.
 5.   int result = base;
 6.   for(int i = 1; i < abs(exp); i++)
 7.     result = result * base;
 8.
 9.   if(exp < 0)
10.     return 1.0/result;
11.   else
12.     return result;
13. }

Statement sets from the figure: {1, 1, 2, 3, 5, 6, 7, 9, 12}, {1, 5, 7, 12}, {1, 2, 3, 6, 7, 9, 12}, {1, 2, 6, 9}
(a)
  ...
  a = read_one_value(file_name)
  b = a * 2
  c = a + b
  print(c)
  ...

(b)
  ...
  a = read_one_value(file_name)
  shadow_a = check_and_taint(file_name)
  b = a * 2
  shadow_b = shadow_a
  c = a + b
  shadow_c = union(shadow_a, shadow_b)
  print(c)
  taint_sink(shadow_c)
  ...
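The instrumented listing (b) can be made concrete with a small Python sketch. This is only an illustration under stated assumptions: the bodies of read_one_value, check_and_taint, union and taint_sink are hypothetical stand-ins (the listing gives only their names), a shadow value is modeled as a set of source labels, and TAINTED_FILES is an invented policy for deciding which inputs are taint sources.

```python
# Minimal sketch of shadow-variable taint tracking.
# All helper bodies below are assumptions; the original listing only names them.

TAINTED_FILES = {"untrusted.txt"}  # hypothetical policy: inputs treated as taint sources


def read_one_value(file_name):
    # Stand-in for real file I/O so that the sketch stays self-contained.
    return 21


def check_and_taint(file_name):
    # A shadow value is the set of taint-source labels the data derives from.
    return frozenset({file_name}) if file_name in TAINTED_FILES else frozenset()


def union(*shadows):
    # A result carries every label found on any of its operands.
    return frozenset().union(*shadows)


def taint_sink(shadow):
    # Report which sources reach the sink; a real tool might raise an alert here.
    return sorted(shadow)


def run(file_name):
    a = read_one_value(file_name)
    shadow_a = check_and_taint(file_name)
    b = a * 2
    shadow_b = shadow_a  # the constant 2 carries no taint of its own
    c = a + b
    shadow_c = union(shadow_a, shadow_b)
    return c, taint_sink(shadow_c)
```

Calling run("untrusted.txt") returns the computed value together with the list of taint labels that reached the sink; for a file outside TAINTED_FILES the list is empty, while the computed value is unchanged.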
push   %rbp
mov    %rsp,%rbp
sub    $0x30,%rsp
mov    %edi,-0x24(%rbp)
mov    %rsi,-0x30(%rbp)
movl   $0x0,-0x14(%rbp)
mov    -0x30(%rbp),%rax
add    $0x8,%rax

sub    $0x8,%rsp
mov    0x8(%rsi),%rdi
mov    $0xa,%edx
xor    %esi,%esi
callq  0x400490
lea    -0x1(%rax),%edx
cmp    $0x1,%edx
jbe    0x40050e
sub    $0x2,%eax
test   %eax,%eax
jle    0x400515
mov    $0x1,%edi
4.3. Improving Fuzzing using Dynamic Slicing

Figure 4.2: Conceptual difference between mutational fuzzing (a) and MutaGen (b), for the example case of generating test inputs for PDF readers. [Panel labels: (a) Mutational Fuzzer, Valid PDF file; (b) MutaGen, PostScript to PDF converter, Valid PostScript file.]

and also allows MutaGen to support closed-source generating programs. Using Valgrind allows us to avoid the intricacies of the x86 instruction set and to apply mutations to Valgrind's simplified IR. We use several mutation operators from mutation testing, such as switching addition and subtraction. In contrast to mutation testing, we apply mutations to computations rather than branch predicates, since our goal is to mutate the computed output of the generating program rather than to drastically change its internal logic. Therefore, we also use several arithmetic mutation operators, for example adding a constant to, or subtracting a constant from, different instruction operands.

Since applying mutations to every executed instruction of the generating program would be very time-consuming, we also use our backwards dynamic slicer from Paper I to limit the set of instructions that are viable for mutation. We instrument system calls for writing output to a file, and treat every byte of the generating program's output as a combined slicing criterion for backwards slicing. This means that the slice will contain every instruction that is directly involved in computing the program's output (or, more precisely, every instruction that at least one byte of output has a transitive data dependency on). We found
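The idea of restricting mutations to a backward data slice of the output can be illustrated with a toy Python sketch. It is not MutaGen's implementation: the real tool works on Valgrind's IR, whereas the Instr record, the register and location names, the TRACE and INPUTS data, and the single add/sub swap operator below are all assumptions made for illustration.

```python
from collections import namedtuple

# One executed instruction in a recorded trace (hypothetical toy format).
Instr = namedtuple("Instr", "idx op dst srcs")

# Assumed mutation operator: switch addition and subtraction.
MUTATION_SWAPS = {"add": "sub", "sub": "add"}


def backward_data_slice(trace, criteria):
    """Indices of instructions that some criterion location transitively data-depends on."""
    wanted = set(criteria)  # locations whose most recent definition is still sought
    in_slice = set()
    for ins in reversed(trace):
        if ins.dst in wanted:
            in_slice.add(ins.idx)
            wanted.discard(ins.dst)  # found the latest definition of this location
            wanted.update(ins.srcs)  # its operands must now be explained as well
    return in_slice


def mutation_candidates(trace, criteria):
    """Arithmetic instructions inside the slice; only these are worth mutating."""
    in_slice = backward_data_slice(trace, criteria)
    return [i.idx for i in trace if i.idx in in_slice and i.op in MUTATION_SWAPS]


def mutate(trace, idx):
    """Return a copy of the trace with the operator at position idx swapped."""
    return [i._replace(op=MUTATION_SWAPS[i.op]) if i.idx == idx else i for i in trace]


def execute(trace, inputs):
    """Tiny interpreter for the toy trace format; load and store act as moves."""
    env = dict(inputs)
    for ins in trace:
        vals = [env[s] for s in ins.srcs]
        if ins.op in ("load", "store"):
            env[ins.dst] = vals[0]
        elif ins.op == "add":
            env[ins.dst] = vals[0] + vals[1]
        elif ins.op == "sub":
            env[ins.dst] = vals[0] - vals[1]
    return env


TRACE = [
    Instr(0, "load", "r0", ("in0",)),
    Instr(1, "load", "r1", ("in1",)),
    Instr(2, "load", "r5", ("in2",)),
    Instr(3, "add", "r2", ("r0", "r1")),
    Instr(4, "add", "r6", ("r5", "r5")),  # never reaches the output
    Instr(5, "store", "out0", ("r2",)),   # models one written output byte
]
INPUTS = {"in0": 3, "in1": 4, "in2": 10}
```

With the output location "out0" as the slicing criterion, instruction 4 falls outside the slice even though it is arithmetic, so mutation_candidates(TRACE, {"out0"}) yields only instruction 3. Passing several output locations in the criterion set corresponds to treating every output byte as one combined criterion.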
Papers

The papers associated with this thesis have been removed for copyright reasons. For more details about these, see: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157626