Race conditions on Facebook, DigitalOcean and others (fixed) (josipfranjkovic.blogspot.com)

294 points by franjkovic 11 years ago | 88 comments

ejcx 11 years ago |

I actually fixed the issue that was reported to LastPass.

I could be mistaken but I believe he reported the security issue through our regular support channel which is why it took three days to see (instead of our security channel). From the time I saw it, I fixed it with the patch going live within an hour or two.

When I DID see it, I tried it myself with a quick shell script that curled and backgrounded the same request a bunch of times, and I just kind of chuckled. It was a good bug. Josip is top notch.

franjkovic 11 years ago | |

Thanks! I reported the bug to the security@ email, and one of your team's members replied on the same day (January 6th). Either way, good job on fixing this really fast. I wish more teams were as responsive as yours.

ejcx 11 years ago | | |

Oh okay I was mistaken then.

I believe the race condition is on the rise in terms of severity and importance. Developers are aware of common OWASP bugs, but this type of race condition is often overlooked, and developers are going to NEED to be just as aware of it. Way to go.

homakov 11 years ago | |

> When I DID see it, tried it myself with a quick shell script that curled and backgrounded the same request a bunch of times, I just kind of chuckled. It was a good bug

That's the problem with OWASP: a developer from a big company sees a race condition for the first time and is surprised.

monksy 11 years ago | |

BTW: I just subscribed to LastPass a few days ago. I'm pretty happy with the service.

jdubs 11 years ago | | |

LastPass is awesome but I hate their website login process! It bothers me to no end that if I type in my email address with a wrong password, it pops back with "Invalid password", while if I type in an obviously random email, it pops back with "Unknown email address. Would you like to create an account now?"

I worry that a malicious attacker could finger the service for potential victims.

MichaelGG 11 years ago |

We should see lots more of these if people embrace eventual consistency instead of "slow" ACID transactions. And interestingly, the larger the scale of a system, the more likely it is that globally consistent operations are too expensive to enable in general, and that developers will overlook cases where they must implement some locking or double checking.

partisan 11 years ago | |

I would have thought the opposite would be true. A CQRS/event sourcing system with eventual consistency would let you avoid posting duplicates to your database:

1. The user submits X number of requests within a second.
2. The system puts the request in a command queue that synchronizes the commands by coupon code, for example.
3. The command is popped off the queue and an event is generated and saved saying the coupon was redeemed.
4. The next command is picked up and all events are applied before processing. At this point, the command is no longer valid so you reject and send an event saying that an attempt was made to redeem a redeemed coupon.
5. Do the same for subsequent requests.

To me, this approach is safer and easier to reason about. You have a log of the fact that someone made the attempt so you can report on this. Not sure you get that benefit from a stored procedure and a transaction unless you build it in and then increase the running time of the open transaction.
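
The steps above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: a single in-process queue with one consumer stands in for the "command queue that synchronizes the commands by coupon code", and the class and field names (`CouponCommandProcessor`, `events`) are made up for the example.

```python
import queue
import threading


class CouponCommandProcessor:
    """Handles redeem commands one at a time and records every outcome as an event."""

    def __init__(self):
        self.commands = queue.Queue()  # thread-safe command queue
        self.events = []               # append-only event log

    def submit(self, coupon_code, user):
        # Step 2: requests only enqueue a command; they never touch state directly.
        self.commands.put((coupon_code, user))

    def run(self):
        # Single consumer: commands are processed strictly one after another.
        while True:
            try:
                coupon_code, user = self.commands.get_nowait()
            except queue.Empty:
                return
            # Step 4: rebuild current state by replaying the events so far.
            redeemed = {e["coupon"] for e in self.events if e["type"] == "redeemed"}
            kind = "rejected" if coupon_code in redeemed else "redeemed"
            self.events.append({"type": kind, "coupon": coupon_code, "user": user})


processor = CouponCommandProcessor()
senders = [
    threading.Thread(target=processor.submit, args=("SAVE10", f"user{i}"))
    for i in range(10)
]
for t in senders:
    t.start()
for t in senders:
    t.join()
processor.run()
```

However many requests arrive at once, only the first command for a given code produces a "redeemed" event, and the rejected attempts stay in the log for reporting.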

pyvpx 11 years ago | |

when did eventual consistency equate to race conditions, or even increased susceptibility to race conditions? I don't follow. could you explain your reasoning further?

MichaelGG 11 years ago | | |

It's probably just an ease-of-use question. The more guarantees your database can deliver, the easier it is to reason about things and make sure you aren't being caught on a gotcha.

It's not necessarily different than using a normal RDBMS, right - you could do a check in SQL outside a transaction and end up writing multiple times. But with an RDBMS, you can easily solve the situation by turning on a transaction and leaving no question about things.
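
As a rough illustration of "turning on a transaction", here is a minimal sketch using Python's built-in `sqlite3` module. The `coupons` table and the `redeem_in_transaction` helper are hypothetical, and `BEGIN IMMEDIATE` is SQLite-specific; other databases would use their own locking or isolation settings.

```python
import sqlite3


def redeem_in_transaction(conn, code):
    """Check and mark a coupon as redeemed inside a single write transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before the check
    try:
        row = conn.execute(
            "SELECT redeemed FROM coupons WHERE code = ?", (code,)
        ).fetchone()
        if row is None or row[0]:
            conn.execute("ROLLBACK")
            return False  # unknown coupon, or already redeemed
        conn.execute("UPDATE coupons SET redeemed = 1 WHERE code = ?", (code,))
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE coupons (code TEXT PRIMARY KEY, redeemed INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO coupons (code) VALUES ('SAVE10')")
```

The point of the sketch is only that the check and the write sit inside one transaction; doing the `SELECT` outside it would reopen the window described above.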

This is why things like VoltDB ("NewSQL") are pushing to keep SQL and ACID, and figure out a way to scale, instead of throwing it all aside and making the developer deal with consistency issues.

It's not that you can't end up with the same functionality using eventual consistency, just that it's harder. Just look at Amazon's "apology based computing" (I think that was the name) and how they structure their systems to be resilient by being able to resolve multiple conflicting commands in a proper way (deciding, without communication, which command wins, figuring out rollbacks, etc.) It's fantastic, and perhaps it's the only feasible way to operate at their scale. But it's also a hell of a lot more complicated than "UseTransaction = true".

(So my predictions/guesses: If developers that'd otherwise use a traditional ACID RDBMS switch to non-ACID (BASE?) systems, they'll end up introducing bugs due to the shifted responsibility of handling data consistency. And seeing how big servers are, and even how far sharding can take you with normal RDBMS, the scale at which people "need" to drop ACID is probably far higher than the point at which people are dropping it.)

janoelze 11 years ago |

appreciating the joke (?) in the comments. https://i.imgur.com/zWE5ABQ.png

unclesaamm 11 years ago |

Wow, it seems like there is room here for a 3rd party vendor to implement promo code handling as a service, and to do it right once and for all.

d_luaz 11 years ago |

No bounty for bug report? Should at least have a nominal fee of $100 (else no one would bother to report it).

reagan83 11 years ago | |

The economics of bug bounty programs could lead to misaligned incentives. Because the overhead cost to validate and communicate around bug reports isn't zero, the % of non-bugs submitted could become imbalanced because it is free to submit.

In most systems the reward is zero, so you can infer if a person has taken the time to submit a bug report it is because he/she is invested in seeing it fixed.

Context: I work at a decent sized company in SV on this type of problem.

nmjohn 11 years ago | | |

So when I find a bug in, say, PayPal which allows complete account takeover, and I could sell it to an organized hacker group for, say, $100,000 or report it to PayPal "because I'm invested in seeing it fixed" and receive nothing, that is only an easy decision for the whitest of white hat hackers.

Properly designed bug bounty programs are a cornerstone to any company who remotely cares about the security of their product, period.

The idea of misaligned incentives due to poor bug reports being free to submit is ignorant and, worse, toxic, because it sounds so true to an executive who has no actual understanding of the issue.

A quality bug report should take no more than 1 minute for a reviewer to look at and know if it's really a bug or not. If it takes longer, it should be rejected with a request for clearer details. For example, a DOM-based XSS attack could be reported with just a target URL, and it is quite clear what the problem is. That would take 10 seconds to analyze.

Additionally, most bugs reported to most decent sized companies are reported by someone who has already reported a bug to the company before. If someone is constantly reporting good bugs or the opposite, it's quite easy to prioritize which of those individuals gets their emails read first.

d_luaz 11 years ago | | |

So the best solution is not to have a reward? Or not to have a publicized reward? Or not to depend on the public for bug hunting? Or just to hope for goodwill?

squiguy7 11 years ago | |

I agree. If I had my own company I would surely provide some incentive for bugs found in the product. Whether that incentive was monetary, a free membership, etc. I think it's important to acknowledge that all software systems are imperfect.

diminoten 11 years ago | |

Clearly not, though.

Kiro 11 years ago |

I'm a novice but would like to know how these issues can arise. What kind of backend setup is needed for it to be a problem? What is happening when a race condition occurs in these examples?

emmab 11 years ago |

It would be cool if there was a browser addon that let you submit a form N times in parallel.

ejcx 11 years ago | |

I do a lot of App Sec related things and I actually use mostly Chrome dev tools and the command line instead of Burp and other tools. The way I reproduced the bug when it was reported was by using the "Copy as cURL" feature in Chrome, and then using it as follows:

    for i in $(seq 1 16); do
        curl.* &              # stands in for the command copied from Chrome dev tools; & backgrounds it
    done
    wait                      # block until all 16 background requests have finished
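
For anyone who prefers a script over a shell loop, a similar effect can be sketched in Python. This is an assumption-laden illustration, not the method used above: `fire_in_parallel` is a hypothetical helper, and `send` is whatever zero-argument function performs the request (for example, a wrapper around `urllib.request.urlopen` built from the copied request).

```python
import threading


def fire_in_parallel(send, n):
    """Call send() from n threads that are all released at the same moment."""
    barrier = threading.Barrier(n)
    results = [None] * n

    def worker(i):
        barrier.wait()        # hold here until all n threads are ready
        results[i] = send()   # then every thread fires its request together

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The barrier narrows the gap between requests compared with starting them one by one, though network jitter still decides the order in which they actually arrive.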

bburky 11 years ago | | |

Also, curl gained a --next command line option somewhat recently. It lets you send off multiple requests in the same curl invocation. These requests will all be pipelined in the same HTTP connection, which might trigger slightly different behavior in the website.

I have considered writing a program that will let me send off a bunch of HTTP requests at once, but wait to close all the connections at the exact same time. That would probably be the most effective way to trigger race conditions.

odonnellryan 11 years ago | |

If you go down to the "proof of concept" here it's not hard to test this: https://defuse.ca/race-conditions-in-web-applications.htm

SixSigma 11 years ago | |

why would it ?

andersonmvd 11 years ago |

More interesting than the bounty itself is understanding which defense works best at scale, and the nitty-gritty details of these kinds of attacks. Intuitively, I think we just need to avoid inconsistencies between the Time of Check (TOC) and Time of Use (TOU), so verifying the existence of a discount coupon while inserting it in one query should do the trick (INSERT INTO coupons (...) SELECT ... WHERE NOT EXISTS (SELECT 1 FROM coupons WHERE (...))), instead of increasing the time between the TOC and TOU, e.g. with one query to check if the coupon exists and a second one to insert the coupon. Besides that, I am wondering if I am missing something, e.g. is this really a problem limited to the application layer, or are the databases unable to prevent such attacks? I think I am right regarding the app protection, but let's see what people have to say :)

pilif 11 years ago | |

In many databases, your suggested "where not exists" subquery might not actually protect you, but just make the possible window to hit the race much smaller. What can happen is that your database evaluates the subquery and the rest of the where clause, another transaction commits in the meantime, and only then does the insert part of your query run.

There are no guarantees in the SQL standard that queries with subqueries should be atomic.

The only truly safe way to protect yourself is to fix the schema in a way that you can make use of unique indexes. Those are guaranteed to be unique no matter what.
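
A minimal sketch of the unique-index approach, using Python's built-in `sqlite3` module. The `redemptions` table, its columns, and the `redeem_with_unique_index` helper are assumptions made up for the example; the idea is simply to attempt the insert and let the index reject duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE redemptions (coupon_code TEXT NOT NULL, customer_id INTEGER NOT NULL)"
)
# The database itself now enforces at most one redemption per (coupon, customer).
conn.execute(
    "CREATE UNIQUE INDEX one_redemption_per_customer "
    "ON redemptions (coupon_code, customer_id)"
)


def redeem_with_unique_index(conn, coupon_code, customer_id):
    """Try to record a redemption; return False if one already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO redemptions (coupon_code, customer_id) VALUES (?, ?)",
                (coupon_code, customer_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the unique index rejected a duplicate
```

There is no separate check query here, so there is no gap between check and use for a concurrent request to exploit.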

inportb 11 years ago |

So the review bug was a security issue but the username bug wasn't? I wonder what else the review bug affected.

franjkovic 11 years ago | |

I think they did not reward me because you cannot really hurt anyone by having multiple usernames.

joshschreuder 11 years ago | | |

What about squatting on valuable ones? But probably not a big deal unless it relates to Pages.

georgerobinson 11 years ago |

Can anyone comment on how the author flooded HTTP requests to the endpoint URLs? Did he use developer tools in his browser and execute his own JavaScript, or use CURL in a tight loop with the cookie and CSRF token from his browser session?

gislifb 11 years ago | |

Without knowing exactly how he did it I assume this is possible by doing a POST with cURL inside a loop or with parallel.

You can then get the exact request by using Chrome developer-tools. (Find the POST-request in the network-tab, right-click and select copy as cURL)

Rafert 11 years ago |

I reported the same issue to DigitalOcean (security) in November 2014, and they told me I was using the wrong address and that they had forwarded it to the proper team. I triggered it by accident, using the same GitHub code twice, and I (or the DO staffer) didn't realize it was a race condition. I never heard back but they let me keep the balance :)

numair 11 years ago |

I would be really interested to know how various forms of this bug are resolved. This is a problem that, on its surface, seems easy to fix, but isn't. Especially if you've designed your architecture for real-time-ness and global redundancy. Google's servers with atomic clocks come to mind...

ekimekim 11 years ago | |

cynical answer: I've seen a lot of races get "fixed" by adding a sleep() or similar

less cynical answer: Commonly you already have some kind of means to handle races - locking, transactions, some other variety of extra check - and the fix for newly discovered races is "oh, I didn't realise that could happen. add lock"
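
The "add lock" fix might look something like this in its simplest form. This is a sketch assuming a single process with in-memory state; `CouponStore` is a hypothetical class, and a multi-server deployment would need a database or distributed lock instead of `threading.Lock`.

```python
import threading


class CouponStore:
    """Keeps unused coupon codes in memory and redeems each at most once."""

    def __init__(self, codes):
        self._unused = set(codes)
        self._lock = threading.Lock()

    def redeem(self, code):
        # The check and the update run under one lock, so no other
        # thread can slip in between them.
        with self._lock:
            if code not in self._unused:
                return False
            self._unused.remove(code)
            return True
```

Without the lock, two threads could both pass the `not in` check before either removes the code, which is the same check-then-act gap as in the web cases discussed in this thread.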

hobarrera 11 years ago | | |

If you get three requests in at the same time, and sleep all three for N (say, 400) milliseconds, they'll all still run concurrently.

Adding a random time to sleep might work, but some requests would run noticeably slower.

jbkkd 11 years ago |

Now that race condition bugs have been widely exposed, I have a feeling we'll start seeing more of these "attacks" in the near future. They are relatively easy to execute and don't raise a high suspicion.

tomcam 11 years ago |

Now please fix race conditions everywhere else, like Baltimore.

yesmade 11 years ago |

$3k for the facebook review bug. that's a little bit too much

- update

thanks for the downvotes guys. keep up the good work

franjkovic 11 years ago | |

The bounty actually surprised me, too. I expected between $1000 and $2000. That is one of the reasons I like reporting bugs to Facebook: they pay really well, and critical bugs are fixed really fast (<1 day).

One time they paid me $5000 for a bug I never could have found, but they did internally based on my low severity report. (http://josipfranjkovic.blogspot.com/2013/11/facebook-bug-bou...)

mwsherman 11 years ago | | |

It’s impressive that they are able to fix them so quickly – one needs to imagine they get a non-trivial number of reports, and that some majority of them are junk. They have a good triage + repro + escalation system.

yesmade 11 years ago | | |

congratulations on both findings

totony 11 years ago | |

This bug actually seems quite critical imo, defeats the purpose of a feature and permits abuse/cheating

mikeash 11 years ago | |

Who are you to say that it's "too much," when it's their money that they can spend as they wish?

yesmade 11 years ago | | |

> seems > too > much

relax guy nobody here is angry at the amount he made

Gigablah 11 years ago | |

Instead of questioning why others are getting so much, question why you're getting so little.

yesmade 11 years ago | | |

chill out man. you are turning this into something personal. it was only a comment at the amount he got for cheating the review system. even the OP said he wasn't expecting that much.

stop jumping into the hate wagon everybody

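    # Assumes a Rails ActiveRecord model.
    class Discount < ActiveRecord::Base
      belongs_to :promo_code
      belongs_to :customer
      belongs_to :order

      validates_presence_of :promo_code, :customer, :order
      validates_associated :promo_code
      # Application-level check only: without a matching unique database
      # index, concurrent requests can still create duplicates.
      validates_uniqueness_of :promo_code_id, :scope => [:customer_id, :order_id]
    end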