Introduction
============

Administrivia
  Lectures will be MW 11-12:30, in 36-156.
  Each lecture will cover a paper in systems security (except today).
    Preliminary paper list is posted online; likely to change a bit.
    If you are interested in specific topics or papers, send us email.
    Read the paper before lecture.
    Turn in an answer to a short homework question about the paper before lecture.
    Will discuss the paper in class.
    Interrupt, ask questions, point out mistakes.
  Two quizzes during the regular lecture time slot.
    No "final exam" during finals week; the second quiz is near end-of-term.
  Assignments: 4 labs + final project.
    Lab 1 out today: buffer overflows.  Start early.
    Labs will look like real-world systems: many interacting parts
      written in different languages.
    Will look at/write x86 asm, C, Python, PHP, JavaScript, ...
  Final project spans the second half of the course (groups of 2-3 people).
    Presentations on the last 1 or 2 days of class.
    Think of projects you'd like to work on as you're reading papers.
  Sign up for the course email list.
    Send an email to 6.858@mit.edu.
  Warning about security work/research on MITnet (and in general).
    Know the rules: http://ist.mit.edu/services/athena/olh/rules
    Just because something is technically possible doesn't mean it's legal.
    Ask course staff for advice if in doubt.

What is security?
  Achieving some goal in the presence of an adversary.
  Many systems are connected to the internet, which has adversaries.
    Thus, the design of many systems might need to address security:
    will the system work when there's an adversary?
  High-level plan for thinking about security:
    Policy: the goal you want to achieve.
      e.g., only Alice should read file F.
    Threat model: assumptions about what the attacker could do.
      e.g., can guess passwords, cannot physically grab the file server.
    Mechanism: knobs that your system provides to help uphold the policy.
      e.g., user accounts, passwords, file permissions, encryption.
    Goal: no way for an adversary within the threat model to violate the policy.
      Note that the goal says nothing about the mechanism.

Why is security hard?
  It's a negative goal.
    Need to guarantee the policy, assuming the threat model.
    Difficult to think of all possible ways an attacker might break in.
    Realistic threat models are open-ended (almost negative models).
    Contrast: easy to check whether a positive goal is upheld,
      e.g., that Alice can actually read file F.
    The weakest link matters.

What goes wrong #1: bugs/problems with the policy.
  Real-world example: Fairfax County, VA school system.
    Each user (student, teacher, superintendent) is a principal with
      their own files and password.
      (Just to be clear: "principal" is the technical term for an entity
       that can be granted access, not the job of school principal.)
    A student can access only his/her own files.
    A teacher can access only the files of students in his/her class.
    The superintendent has access to everyone's files.
    Teachers can add students (principals) to their class.
    Teachers can change the password of students in their class.
    What's the worst that could happen if a student gets a teacher's password?
      Acting as the teacher, the student can add the superintendent to the
        class, reset the superintendent's password, and log in with access
        to everyone's files.
      So the policy amounts to: teachers can do anything.
        (This chain is sketched in code at the end of this section.)
  Real-world example: Sarah Palin's email account.
    Yahoo email accounts have a username, password, and security questions.
    A user can log in by supplying username and password.
    If the user forgets the password, they can reset it by answering security Qs.
    Security questions can sometimes be easier to guess than the password.
    Some adversary guessed Sarah Palin's high school, birthday, etc.
    The policy amounts to: can log in with either the password or the security Qs.
      (No way to enforce "only if the user forgets the password, then ...".)
  How to solve?
    Think hard about the implications of policy statements.
    Some policy-checking tools can automate this process.
      Automation requires a higher-level goal (e.g., no way for a student to do X).
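
  A concrete sketch of the Fairfax escalation.  Nothing below is real code
  from the school system; the struct and functions are hypothetical and
  simply encode the policy statements above.

      #include <stdio.h>
      #include <string.h>

      struct principal {
          const char *name;
          const char *password;
          const char *class_name;   /* which class this principal is in, if any */
      };

      /* Policy: teachers can add any principal to their class. */
      void add_to_class(struct principal *teacher, struct principal *p)
      {
          p->class_name = teacher->class_name;
      }

      /* Policy: teachers can change the password of anyone in their class. */
      void set_password(struct principal *teacher, struct principal *p,
                        const char *pw)
      {
          if (p->class_name && strcmp(p->class_name, teacher->class_name) == 0)
              p->password = pw;
      }

      int main(void)
      {
          struct principal teacher = { "teacher", "apple", "room-101" };
          struct principal superintendent = { "superintendent", "secret", NULL };

          /* A student who has guessed the teacher's password logs in as
           * the teacher and then: */
          add_to_class(&teacher, &superintendent);           /* 1. enrolls the superintendent */
          set_password(&teacher, &superintendent, "pwned");  /* 2. resets their password */

          /* 3. logs in as the superintendent, who can read everyone's files. */
          printf("superintendent's password is now %s\n", superintendent.password);
          return 0;
      }

  Each policy statement looks reasonable on its own; it is their composition
  that is dangerous.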

What goes wrong #2: bugs/problems with the threat model.
  Real-world example: human factors not accounted for.
    Phishing attacks.
      User gets email asking to renew their email account, transfer money, or ...
      Tech support gets a call from a convincing-sounding user asking to
        reset a password.
    "Rubber-hose cryptanalysis": the adversary simply coerces the key holder.
  Real-world example: subverting military OS security.
    In the '80s, the military tried to encourage research into secure OSes.
    One unexpected way in which OSes were compromised:
      the adversary gained access to the development systems and modified the OS code.
  Real-world example: subverting firewalls.
    Adversaries can connect to an unsecured wireless network behind the firewall.
    Adversaries can trick a user behind the firewall into disabling it.
      Might suffice just to get the user to click on a link like
        http://firewall/?action=disable
      Or maybe buy an ad on CNN.com pointing to that URL (effectively)?
  How to solve?
    Think hard (unfortunately).
    Use simpler, more general threat models.

What goes wrong #3: bugs/problems with the mechanism.
  Real-world example: PayMaxx W2 form disclosure.
    Web site designed to allow users to download their tax forms online.
    Login page asks for username and password.
    If the username and password are OK, the user is redirected to a new page.
    The link to print a W2 form was of the form:
      http://paymaxx.com/print.cgi?id=91281
    Turns out 91281 was the user's ID; print.cgi did not require a password.
    Can fetch any user's W2 form by going directly to the print.cgi URL
      with a different id.
    Possibly a wrong threat model: doesn't match the real world?
      System is secure if the adversary browses the web site through a browser.
      System is not secure if the adversary synthesizes new URLs on their own.
    Hard to say if the developers had the wrong threat model or a buggy mechanism...
  Real-world example: Debian PRNG weakness.
    Debian shipped with a library called OpenSSL for cryptography.
    Used to generate secret keys (for signing or encrypting things later).
    A secret key is generated by gathering some random numbers.
    A developer accidentally "optimized away" part of the random number generator.
    No one noticed for a while, because the library could still generate secret keys.
    Problem: many secret keys were identical, and not so secret as a result.
      (A toy version of this bug is sketched after this section.)
  How to solve?
    Use common, well-tested security mechanisms ("economy of mechanism").
    Audit these common security mechanisms (lots of incentive to do so).
    Avoid developing new, one-off mechanisms that may have bugs.
    A good mechanism supports many uses and policies (more incentive to audit).
    Examples of common mechanisms:
      OS-level access control (but, could often be better)
      network firewalls (but, could often be better)
      cryptography, cryptographic protocols
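
  The Debian bug is easiest to see in a toy version.  The code below is
  not OpenSSL's actual implementation; it is a minimal sketch, assuming
  (as reported for the real incident) that the only entropy still
  reaching the pool after the change was the process ID.

      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>

      static unsigned int pool;

      static void add_entropy(unsigned int bits)
      {
          pool ^= bits;   /* roughly the step that was "optimized away" */
      }

      static unsigned int key_material(void)
      {
          srand(pool);    /* key derived entirely from the pool */
          return rand();
      }

      int main(void)
      {
          /* Intended: add_entropy() called with timings, /dev/urandom
           * output, etc.  Actual: only the PID made it in, so an
           * adversary can regenerate every possible "secret" key by
           * looping over the ~32,768 possible PIDs. */
          add_entropy(getpid());
          printf("\"secret\" key material: %u\n", key_material());
          return 0;
      }

  Note that the bug was in a Debian-specific patch; the widely shared
  upstream code was fine, which reinforces the point about auditing
  common mechanisms.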

Open vs. closed design or mechanism.
  Why not make everything closed or secret (design, implementation, code, ...)?
  Threat model: best to have the most conservative threat model possible.
  What if you assume your implementation or design is secret?
    If the implementation is revealed, it is hard to re-gain security!
      Must re-implement or re-design the system.
    If the only assumption was that a password/key/... is secret, you can change it.
  What if you don't assume the design/implementation is secret?
    Often a good idea to publish the design/implementation to get more review.
    If others use your mechanism, it will be much better tested!
    Helps ensure you don't accidentally assume the design/implementation is secret.

What goes wrong #4: bugs/problems with implementations.
  Policy often talks about users performing actions, e.g., reading a file.
    A user usually cannot read a file on disk directly.
    Instead, some code runs on behalf of some user (principal).
    Ideally, the code performs only the actions that the user requested.
    Hard to achieve this ideal, because code can be buggy.
  Real-world example: buffer overflow vulnerability.
    Imagine an application that adds numbers for a spreadsheet has this code:

      int read_num(void) {
        char buf[128];        /* fixed-size buffer on the stack */
        gets(buf);            /* reads input with no length limit! */
        return atoi(buf);
      }

    Look at what's going on in terms of memory layout on a call to read_num().
    What happens when gets() reads more than 128 bytes?
      The extra bytes overwrite other data on the stack, including the
      saved return address, so the attacker can redirect execution.
    Could happen if a user feeds a "maliciously constructed" file to the program.
  Happens in practice: the Google China compromise.
    Adversary found a target victim (a software developer at Google China).
    Adversary found the victim's friends online and compromised one friend's account.
    Used the friend's account to send the victim a message asking them to click a link.
    The link contained a buffer overflow similar to the example above.
    The buffer overflow exploited a bug in IE and ran a backdoor program on the computer.
    The backdoor connected to the adversary's server and asked for commands.
    The adversary instructed it to (presumably) read/modify Gmail source code.
  What to do about bugs in implementations?
    One of the biggest problems in practice!
    In the terms we've been using so far, we can think of it as a buggy mechanism:
      except that this part of the mechanism has to do with interpreting what
      operations the user wants to perform, as opposed to checking an action
      against the policy under the assumption that the action was issued by the user.
    For example, imagine we run the buggy spreadsheet on a Linux system:

      +-------------+      +-----------+      +-----+      +------+
      | File system | <--- | OS kernel | <--- | App | <--- | User |
      +-------------+      +-----------+      +-----+      +------+

    Terminology:
      X is trustworthy: means that X is worth trusting -- good!
        X is not buggy, will not be compromised, will not fail.
      X is trusted: means that X _must_ be trustworthy -- bad!
        Something will fail if X is buggy, compromised, or fails.
    In our example, the OS kernel is trusted by everyone (users, apps).
      The application is trusted by the user (w.r.t. the user's files).
    "Trusted" is often transitive -- bad again!
      If Alice has to trust Google, and Google has to trust X,
      then Alice might have no choice but to trust X too.
    High-level plan:
      Reduce the amount of trusted code
        .. by enforcing the policy using a more trustworthy mechanism
        .. e.g., the OS kernel should let this app write to only one file?
        "Principle of least privilege."
        Doing this requires well-designed, flexible security mechanisms.
      Improve the trustworthiness of trusted code
        .. by auditing, program analysis, better languages, etc.

References
  http://catless.ncl.ac.uk/Risks/26.02.html#subj7.1
  http://en.wikipedia.org/wiki/Sarah_Palin_email_hack
  http://www.thinkcomputer.com/corporate/whitepapers/identitycrisis.pdf
  http://en.wikipedia.org/wiki/Operation_Aurora
  http://www.zdnet.co.uk/news/security-management/2010/01/26/report-cyberattackers-hit-google-staff-via-friends-40005860/
  http://www.wired.com/threatlevel/2010/03/source-code-hacks/