6.858 Lecture Bugs and Symbolic Execution today's topics: finding bugs, symbolic execution, EXE can we automate the search for security vulnerabilities? basically the same as finding bugs; let's borrow bug-finding ideas symbolic execution powerful method of finding "deep" bugs you'll use it in lab 3 bugs: major source of security exploits bugs ~ exploit maybe hard, but should assume it is possible e.g., not buffer overrun is an exploit, but it maybe possible to exploit encourage researchers/security practitioners/enthusiasts to find exploits protocol for disclosing exploits contact vendor after some time disclose bug to the public (e.g., 90 days) Example policy (https://insights.dice.com/2020/01/15/google-project-zero-bug-disclosure-etiquette/) CVE: community effort to record bugs (https://cve.mitre.org/) CVE = Common Vulnerabilities and Exposures many companies have bounty programs today's paper: a novel approach to finding bugs in applications application has inputs, which the attacker can provide (e.g. web requests) we add checks (e.g. assertions) for bad situations that might arise the checks could be too expensive to use in the real system paper provides method to find inputs the attacker could provide that would cause the code to fail the checks can the attacker use this too? yes! what kinds of bugs? divide by zero, null-pointer dereference, out-of-bounds array access application-specific bugs (through asserts) bank transaction changed sum of all accounts X's transaction decreased Y's balance program opened a file with name determined by input request sets cookie to impersonate another user application-specific asserts in programs often turned off during deployment they are sometimes expensive we rely on testing with asserts turned on why not directly prove that the program is correct? this can be done but it's a lot of work not yet practical for big programs someday you could write a suite of tests (and you should) each test runs the program with some input the test knows the expected output the asserts in the code get a chance to catch errors pro: good for intended functionality pro: good for bugs that are already known con: not so good for unintended functionality == vulnerabilities con: not so good for as-yet-unknown bugs con: hard to write tests that get complete "coverage" == cause every line of code to execute if no test causes some line of code to execute, you can't spot bugs in it con: takes effort to write tests can we automate test-case generation? can we automate search for as-yet-unknown bugs? can we automate generation of tests that achieve good coverage? fuzzers idea: execute program on lots of randomly-generated inputs 1. find input sources command-line arguments HTTP requests 2. write input generation code completely random inputs unlikely to get very "deep" if method == "GET" ... usually need smart input generator: generate syntactically correct input with random content where freedom is allowed GET /xxx GET /transfer?from=xxx&to=yyy&amount=zzz use test suite to help generate inputs 3. execute with random inputs until you get bored maybe some random inputs will trigger an assert that you didn't think an attacker could violate e.g. maybe bank code doesn't check that "from" account has enough balance to cover a transfer; fuzzing may discover relevant arguments, triggering an assert that checks for negative balances or conservation of balances. demo: cd fuzz cat simple_test.go go test -fuzz . fix bug (return if divide by zero) go test -fuzz . fix bug (return if OOB) go test -fuzz . do fuzzers work? they are widely used and have found lots of bugs particularly good at bugs like buffer overflow that may not need a specific input value to trigger them many out-of-bound values are likely to generate crash don't even have to write assert advantage: better than programmer at testing for unexpected behavior advantage: no need to control entire execution env can fuzz black-box systems over the network as long as there's a way to detect errors advantage: may not need source code disadvantage: uses lots of CPU time disadvantage: hard to cover everything can miss bugs because didn't happen to try a particular input e.g. if command == "credit" or if inputs must have complex structure e.g. hard to test a compiler with purely random input symbolic execution: a more sophisticated testing scheme e.g. EXE goal: find interesting/deep bugs drive program along all paths in program ideas: compute on symbolic values (instead of concrete values) branch on each if statement create a path condition for the if branch and one for the else branch use constraint solver to see if branch is possible overview of EXE At compile time: C input ---> [C-to-C translator] --> instrumented if, assignments, expressions + table: memory range -> symbolic value At runtime -> [ application process ] -------> path condition ----> [constraint solver (STP)] | at if statement, | construct path condition is this path condition satisfiable? | for each branch | <--------- YES/NO/DON'T KNOW (timeout) --- | | | | | | If unsatisfiable, don't explore branch | If satisfiable, then fork to explore branch | | | | | V |--- [ scheduler process ] selects which application process to explore example: 1. read x, y 2. if x > y: 3. x = y 4. if x < y: 5. x = x + 1 6. if x + y == 7 7. error() line 6+7 is the expansion of assert x + y != 7 this is a simple example of code that's hard for humans or fuzzers to test i.e. to decide if attacker could trigger the error requires some careful thought by human requires a lot of luck in a random fuzzer EXE executes with symbolic values i.e. variable/memory holds an *expression* in terms of inputs x and y hold the example's inputs they hold symbolic values -- not concrete values let's say x=alpha and y=beta EXE remembers which memory locations hold symbolic values and what each location's current symbolic value is EXE remembers "constraints" imposed by executed if statements the "path constraints" (pc) EXE views execution as a tree, which splits at each if EXE executes down one side of each if, then down the other the "if" adds the condition to one execution's pc, and "not" to the other when EXE gets to the error() call, it checks whether the current path constraints can be satisfied EXE uses the STP constraint solver if yes, STP indicates the satisfying values of input variables if attacker inputs those values, program will call error() one path: 1. pc = { }, x = a, y = b (alpha, beta) 2. pc = { a > b }, x = a, y = b (fork) 3. pc = { a > b }, x = b, y = b 4. same (no fork) 6. cannot satisfy b + b == 7 skips if, continues executing Demo: python import z3 a = z3.BitVec('a', 8) b = z3.BitVec('b', 8) z3.solve(a>b) z3.solve(a>b, bb, b+b==7) another path: 2. pc = { a <= b }, x = a, y = b 4. pc = { a <= b AND a < b }, x = a, y = b 5. pc = { a <= b AND a < b }, x = a + 1, y = b 6. (a + 1 + b) == 7 for a < b? lots of solutions e.g. a=0 b=6 EXE would report an assert failure w/ inputs e.g. a=0 b=6 Demo: z3.solve(z3.Not(a>b)) z3.solve(z3.Not(a>b), ab), a