Race conditions on Facebook, DigitalOcean and others (fixed) (josipfranjkovic.blogspot.com)
I could be mistaken but I believe he reported the security issue through our regular support channel which is why it took three days to see (instead of our security channel). From the time I saw it, I fixed it with the patch going live within an hour or two.
When I DID see it and tried it myself with a quick shell script that curled and backgrounded the same request a bunch of times, I just kind of chuckled. It was a good bug. Josip is top notch.
I believe the race condition is on the rise in terms of severity and importance. Developers are aware of common OWASP bugs, but this type of race condition is often overlooked, and developers are going to NEED to be just as aware of it. Way to go.
That's the problem with OWASP: a developer from a big company sees a race condition for the first time and is surprised.
I worry that a malicious attacker could finger the service for potential victims.
1. The user submits X number of requests within a second. 2. The system puts the request in a command queue that synchronizes the commands by coupon code, for example. 3. The command is popped off the queue and an event is generated and saved saying the coupon was redeemed. 4. The next command is picked up and all events are applied before processing. At this point, the command is no longer valid so you reject and send an event saying that an attempt was made to redeem a redeemed coupon. 5. Do the same for subsequent requests.
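The steps above could be sketched roughly like this in Python. This is only an illustration under simplifying assumptions: a single in-process queue and one worker stand in for the command queue (so everything is serialized, not just commands for the same coupon code), and the names `run_redemptions`, `coupon_redeemed` and `redeem_rejected` are made up for the example.

```python
import queue
import threading


def run_redemptions(commands):
    """Process (user, code) redeem commands one at a time; return the event log."""
    pending = queue.Queue()
    events = []     # append-only log of everything that happened
    redeemed = {}   # coupon code -> user, the state derived from the events

    def worker():
        while True:
            command = pending.get()
            if command is None:      # sentinel: no more commands
                break
            user, code = command
            if code in redeemed:
                # The command is no longer valid, but the attempt is still logged.
                events.append(("redeem_rejected", user, code))
            else:
                redeemed[code] = user
                events.append(("coupon_redeemed", user, code))

    consumer = threading.Thread(target=worker)
    consumer.start()
    for command in commands:
        pending.put(command)
    pending.put(None)
    consumer.join()
    return events


# Five identical requests arriving within the same second.
events = run_redemptions([("alice", "SAVE10")] * 5)
```

Because only the worker touches `redeemed`, there is no check-then-act window, and the rejected attempts remain in `events` for reporting.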
To me, this approach is safer and easier to reason about. You have a log of the fact that someone made the attempt so you can report on this. Not sure you get that benefit from a stored procedure and a transaction unless you build it in and then increase the running time of the open transaction.
It's not necessarily different than using a normal RDBMS, right - you could do a check in SQL outside a transaction and end up writing multiple times. But with an RDBMS, you can easily solve the situation by turning on a transaction and leaving no question about things.
This is why things like VoltDB ("NewSQL") are pushing to keep SQL and ACID, and figure out a way to scale, instead of throwing it all aside and making the developer deal with consistency issues.
It's not that you can't end up with the same functionality using eventual consistency, just that it's harder. Just look at Amazon's "apology based computing" (I think that was the name) and how they structure their systems to be resilient by being able to resolve multiple conflicting commands in a proper way (deciding, without communication, which command wins, figuring out rollbacks, etc.) It's fantastic, and perhaps it's the only feasible way to operate at their scale. But it's also a hell of a lot more complicated than "UseTransaction = true".
(So my predictions/guesses: If developers that'd otherwise use a traditional ACID RDBMS switch to non-ACID (BASE?) systems, they'll end up introducing bugs due to the shifted responsibility of handling data consistency. And seeing how big servers are, and even how far sharding can take you with normal RDBMS, the scale at which people "need" to drop ACID is probably far higher than the point at which people are dropping it.)
In most systems the reward is zero, so you can infer if a person has taken the time to submit a bug report it is because he/she is invested in seeing it fixed.
Context: I work at a decent sized company in SV on this type of problem.
Properly designed bug bounty programs are a cornerstone to any company who remotely cares about the security of their product, period.
The idea of misaligned incentives due to poor bug reports being free to submit is ignorant - and worse, toxic, because it sounds so true to an executive who has no actual understanding of the issue.
A quality bug report should take no more than 1 minute for a reviewer to look at and know whether it's really a bug or not. If it takes longer, it should be rejected with a request for clearer details. For example, a DOM-based XSS attack could be reported with just a target URL, and it is quite clear what the problem is. That would take 10 seconds to analyze.
Additionally, most bugs reported to most decent sized companies are reported by someone who has previously reported a bug to the company. If someone is constantly reporting good bugs, or the opposite, it's quite easy to prioritize which of those individuals gets their emails read first.
  for i in `seq 1 16`; do
    curl.*& # copied from Chrome dev tools; & to background
  done
I have considered writing a program that will let me send off a bunch of HTTP requests at once, but wait to close all the connections at the exact same time. That would probably be the most effective way to trigger race conditions.
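A rough sketch of that idea in Python, assuming threads and a barrier are good enough for the synchronization. `fire_together`, `prepare` and `finish` are hypothetical names; in a real tool `prepare` would open the connection and send most of the request, and `finish` would send the last bytes. Here they are plain stand-ins so the example runs without a network.

```python
import threading


def fire_together(prepare, finish, count):
    """Run prepare(i) in `count` threads, then release every finish() at once."""
    barrier = threading.Barrier(count)
    results = [None] * count

    def run(index):
        state = prepare(index)            # e.g. connect and send all but the final bytes
        barrier.wait()                    # hold here until every thread is equally far along
        results[index] = finish(state)    # e.g. send the final bytes

    threads = [threading.Thread(target=run, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Stand-ins for the real network calls.
results = fire_together(
    prepare=lambda i: f"request {i}",
    finish=lambda state: state + " completed",
    count=16,
)
```

The barrier only lines up the moment each thread starts its final step; network jitter still decides how closely the requests actually land on the server.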
There are no guarantees in the SQL standard that queries with subqueries should be atomic.
The only truly safe way to protect yourself is to fix the schema in a way that you can make use of unique indexes. Those are guaranteed to be unique no matter what.
You can then get the exact request by using Chrome developer-tools. (Find the POST-request in the network-tab, right-click and select copy as cURL)
less cynical answer: Commonly you already have some kind of means to handle races - locking, transactions, some other variety of extra check - and the fix for newly discovered races is "oh, I didn't realise that could happen. add lock"
Adding a random time to sleep might work, but some requests would run noticeably slower.
- update
thanks for the downvotes guys. keep up the good work
One time they paid me $5000 for a bug I never could have found, but they did internally based on my low severity report. (http://josipfranjkovic.blogspot.com/2013/11/facebook-bug-bou...)
relax guy nobody here is angry at the amount he made
stop jumping into the hate wagon everybody
1. Code sent.
2. Check if valid.
3. Redeem code.
4. Invalidate code.
Now if I send 10 requests at the same time with the same code, maybe 4-6 will hit the code path after step 2.
And your window of opportunity is the time it takes to go from 3 to 4. Sometimes certain tasks are put inside an async queue, you have a slight delay to your database server, or you need to wait for DB replication to kick in.
Because normally there is no code part to recheck how often this code was used.
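To make the window visible, here is a small Python sketch of the check-then-redeem flow. It is an in-memory stand-in, not anyone's real code: the barrier is added on purpose to force the worst case, where every request passes the validity check before any of them invalidates the code. The names `valid_codes`, `credits` and `redeem` are made up for the example.

```python
import threading

REQUESTS = 10
valid_codes = {"FREE10"}
credits = []                                  # one entry per successful redemption
all_checked = threading.Barrier(REQUESTS)     # artificial: widens the race window


def redeem(code):
    is_valid = code in valid_codes    # step 2: check if valid
    all_checked.wait()                # every request finishes its check first
    if is_valid:
        credits.append(code)          # step 3: redeem
        valid_codes.discard(code)     # step 4: invalidate the code


threads = [threading.Thread(target=redeem, args=("FREE10",)) for _ in range(REQUESTS)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(credits))  # 10: a single-use code was redeemed ten times
```

In a real service nothing forces the interleaving like this, which is why only some of the parallel requests usually get through.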
The problem is concurrency. Whenever you have multiple things happening at once, you have concurrency, and programming concurrent systems is always really hard.
Unfortunately the software industry has never really got a grip on this problem and there are lots of developers who have never really studied multi-threading at all. That's a problem, because it's something that takes a lot of practice and you have to just incorporate it into the way you think. After a while you do get a sixth sense for race conditions, but you'll still write racy code from time to time anyway. It's just tricky to get it right 100% of the time.
spdy has already outlined what is happening here, but this problem is something that is covered in literally any introductory course to database systems or multi-threaded programming. If you have two threads (or processes) in flight simultaneously that are reading and writing to a shared data store, then you need some kind of mutual exclusion. That can mean a lock:
1. Request for /reviews/add is received.
2. Database lock on the page reviews table is acquired.
3. Check if the user has already posted a review. If so, release the lock and abort (any good framework will release locks for you automatically if you throw an exception).
4. Add review to table.
5. Release lock.
At the point where the lock is acquired if another web server is in the middle of the operation, conceptually speaking the first one will stop and wait for the table to become available.
Real implementations don't actually "stop and wait" - that would be too slow. They use database transactions instead, where both web server processes/threads proceed optimistically, and at the end the database will undo one of the changes if it detects that there was a conflict ... but you can imagine it as being like stop and wait.
Of course once you have concurrency, you have all the joy that comes with it like various kinds of deadlock.
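The lock-based steps (acquire, check, add, release) might look something like this in Python. It is a sketch under obvious assumptions: a dictionary stands in for the reviews table and a `threading.Lock` stands in for the database lock. The names `reviews`, `reviews_lock` and `add_review` are invented for the example.

```python
import threading

reviews = {}                       # (user, page) -> review text
reviews_lock = threading.Lock()    # stands in for the lock on the reviews table


def add_review(user, page, text):
    """Return True if the review was stored, False if the user already posted one."""
    with reviews_lock:                   # step 2: acquire (released on exit, even on an exception)
        if (user, page) in reviews:      # step 3: already reviewed, so abort
            return False
        reviews[(user, page)] = text     # step 4: add the review
        return True                      # step 5: the lock is released when the block exits


outcomes = []
threads = [
    threading.Thread(target=lambda: outcomes.append(add_review("bob", "/page/1", "Great")))
    for _ in range(10)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Only one of the ten concurrent calls returns True, because the check and the write happen while the lock is held.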
It's funny, a few days ago I was letting my mind wander and ended up thinking about web apps for some reason. Oh yes, now I remember, I was thinking about concurrency strategies in a software library I maintain and how to explain it to people. And then I was thinking how hard multi-threading is and how many people are selling snake-oil silver bullets to it, and started to wonder how many web apps had race conditions in them. And then I just discarded the thought as one of no consequence and got on with my day, haha :) Perhaps I should have tried to earn a bit of money this way instead.
Or put the whole thing in a transaction, right?
I guess if you were using an append-only log that recorded the exact timestamp of the transaction, your datastore would eventually reconcile that, for example, promo code 1 was applied twice. But what do you do then? Roll back the 2nd application of the promo code and deduct the credit from the user account?
Where would the logic for that be programmed?
It'd be much better to make sure you're updating the same unique key and/or use the DB's conflict resolution system.
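One way to read "updating the same unique key" is a single conditional UPDATE on the promo code's row, so the check and the write are one statement. A minimal sketch using Python's built-in sqlite3 module; the table and column names (`promo_codes`, `redeemed_by`) and the `claim` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE promo_codes (code TEXT PRIMARY KEY, redeemed_by TEXT)")
conn.execute("INSERT INTO promo_codes (code) VALUES ('PROMO1')")
conn.commit()


def claim(conn, code, user):
    """Claim the code for `user`; returns True only for the first successful caller."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE promo_codes SET redeemed_by = ? "
            "WHERE code = ? AND redeemed_by IS NULL",
            (user, code),
        )
        # rowcount is 1 if this call changed the row, 0 if someone got there first
        return cursor.rowcount == 1


first = claim(conn, "PROMO1", "alice")
second = claim(conn, "PROMO1", "bob")
```

The database decides the winner, so there is nothing to reconcile or roll back afterwards.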
(https://www.fb.com/818902394790655)
They probably got a couple people working exclusively on bug bounty reports. I also have to say they did a great job changing communication channels from emails to tickets which show in /support/, it is way easier now. The downside is that you must have a Facebook account, not sure if it was needed before the change.
HN can be frustrating if you provoke it. The problem isn't so much what you said as how you put it: the combination of dismissive tone and superficial content puts readers here on edge, because too many comments are like that and we all find them annoying. As a result it's easy to have your good intentions misread. If you had explained the thought process behind your comment, I think it would have been received better.
You could put the transaction in SERIALIZABLE mode, but that would mean that your database has a lot of additional locking to do which you might or might not want to pay the price for:
Your two-part query now blocks all other transactions from writing to the table(!) and conversely also has to wait until everybody else has finished their write operation.
Doing an opportunistic attempt with READ COMMITTED and reacting to the unique index violation (official SQLSTATE 23505) is probably the better option.
Resist the temptation of READ UNCOMMITTED in this case, because that might lead to false positives, as competing transactions might yet be aborted in the future.
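The opportunistic approach (just try the insert and react to the unique index violation) could be sketched like this. It uses Python's sqlite3 module so that it runs anywhere; SQLite has no READ COMMITTED setting and raises `sqlite3.IntegrityError` where another database would report SQLSTATE 23505, so treat it as an illustration of the pattern only. The schema and the `redeem` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (customer_id INTEGER PRIMARY KEY, credit INTEGER NOT NULL);
    CREATE TABLE discounts (promo_code TEXT NOT NULL, customer_id INTEGER NOT NULL);
    CREATE UNIQUE INDEX one_use_per_customer ON discounts (promo_code, customer_id);
    INSERT INTO accounts VALUES (1, 0);
""")


def redeem(conn, promo_code, customer_id, amount):
    """Apply the credit and record the redemption; False if it was already redeemed."""
    try:
        with conn:  # one transaction: both statements commit, or neither does
            conn.execute(
                "UPDATE accounts SET credit = credit + ? WHERE customer_id = ?",
                (amount, customer_id),
            )
            conn.execute(
                "INSERT INTO discounts (promo_code, customer_id) VALUES (?, ?)",
                (promo_code, customer_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Unique index violation: the credit update above was rolled back too.
        return False


first = redeem(conn, "PROMO1", 1, 10)
second = redeem(conn, "PROMO1", 1, 10)
credit = conn.execute("SELECT credit FROM accounts WHERE customer_id = 1").fetchone()[0]
```

The second attempt fails on the index and leaves the credit untouched, without any table-wide locking.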
If you're trying to build a social network that's full of graphs and edges between them ...... good luck. Google developed technologies like MegaStore and Spanner to handle this. Before it had those, it used huge sharded MySQL instances.
If you introduced some combination of a user ID and promo code, then it won't prevent a race of one user firing many queries with different promo codes and stacking them up. It would, however, fix the original problem.
  class Discount < ActiveRecord::Base
    belongs_to :promo_code
    belongs_to :customer
    belongs_to :order
    validates_presence_of :promo_code, :customer, :order
    validates_associated :promo_code
    # A given promo code may be used once per customer and order
    validates_uniqueness_of :promo_code_id, :scope => [:customer_id, :order_id]
  end
Limiting down to a single promo code per order:
  class Discount < ActiveRecord::Base
    # ...
    validates_uniqueness_of :order_id, :scope => :customer_id
  end
You need to enforce the uniqueness in the DB.
  add_index :discounts, [:promo_code_id, :customer_id, :order_id], :unique => true
http://blogs.msdn.com/b/oldnewthing/archive/2011/12/15/10247...
(I've never been on the receiving end of a security mailbox, so I have no personal testimony as to the reasonableness of this approach.)
The bugs expressed in this post, such as the duplicate account creation one, are not going to impact a company's bottom or top lines, so it is questionable whether it warrants a small merit award which is the idea that spawned this thread.
Now that I've sufficiently named my experience, allow me to give my side:
1. You will never receive $100,000 for selling a vulnerability in PayPal. You probably couldn't even find a buyer for it on the "black market." I have explained why repeatedly on Hacker News before, so I'm just going to link this: http://breakingbits.net/2015/04/01/understanding-vulnerabili...
2. Bug bounties are not always a net positive for an organization. They are also not a cornerstone of good security posture. A foundational focus on robust software security would start with various other things until the financials are worked out and there is someone knowledgeable to read incoming reports.
Only 7% of reports submitted to companies' responsible disclosure programs are valid. This is especially true for paid programs, where the validity percentage often drops to 3% or 4%. Loads of people who know nothing about software security try to find bugs, desperate for the gold rush of bounties they see headlining places like HN. They submit spurious reports and as a result the signal to noise ratio of responsible disclosure is fantastically bad. What this means practically is that the average organization spends between 50 and 300 hours a month investigating incoming security reports.
You can quickly see how the cost adds up here. I'm not an executive or manager trying to cut costs - I've managed bug bounties for plenty of startups and Fortune 500 companies. I've also reported bugs that loads of people tell me I could have sold for "millions" of dollars - and received nothing for it.
I love bug bounties. I run them, I participate in them. But they can be a frivolous waste of time for development teams without a solid enough grasp of security to review incoming reports, and a waste of money in the worst case.
3. I'm sorry, but you lose credibility by claiming most security reports can be qualified in a minute or less. You can certainly throw out many in a similar time frame, perhaps five minutes, but real vulnerabilities? No.
If I report a server-side request forgery in your API that requires a very specific set of actions to occur in an obscure, undocumented application function, you will not qualify this quickly. Unless you are literally verifying a CSRF issue, it is completely unrealistic to assume this.
A race condition will not be qualified in a minute. A buffer overflow will not be qualified in a minute. Budget an hour per report, and be happy when you come across the reports that take you a few minutes. XSS and CSRF are comparatively simple to verify with a good report, yes, but most other classes aren't.
Let's add to this the folks who can find great vulns but write bad reports. No exploit code, but he found something real? Good luck verifying. I spoke to a fellow infosec engineer the other day and he told me he spent an entire morning out of his work day verifying a report that came in. Not patching or even triaging mind you - verifying. Most security teams do not have the olympic level efficiency and skillset diversity that Google's and Facebook's do - it is unreasonable to assume a report can or even should be verified quickly.
This is all to say that I believe your outlook is not consistent with reality, with all due respect. Bug bounties are not a simple decision to make. I've seen development teams swamped, overwhelmed and jaded from the reports they receive.
That was exactly what I was trying to convey: real vulnerabilities can easily be separated from non-issues quite quickly because the latter mostly entail things which can be checked in a matter of minutes.
> You will never receive $100,000 for selling a vulnerability in PayPal. You probably couldn't even find a buyer for it on the "black market."
$100,000 may have been slightly inflated, but with a bit of creativity it isn't that hard to believe. Orchestrated correctly one could walk away with a few million dollars from exploiting such a vulnerability.
> This is all to say that I believe your outlook is not consistent with reality, with all due respect. Bug bounties are not a simple decision to make. I've seen development teams swamped, overwhelmed and jaded from the reports they receive.
Or perhaps you and I simply have experienced different realities - I have not seen development teams swamped by them, and have seen major security improvements come about as a direct result of a bug bounty program.
Of the perhaps 15-20 companies (albeit all < $1b market cap) I've spoken/worked with in regards to bug bounty programs or security in general - none of them were receiving more than a handful of reports a week which took up perhaps 2 hours of an engineer's time.
What about the non-issues that are reported with complicated conditions but don't actually work? Just because you can throw out the obviously bad items doesn't mean the rest are real.
>Orchestrated correctly one could walk away with a few million dollars from exploiting such a vulnerability.
Exploiting it is rather different from selling it, though, right? And since a vuln in a website can literally be closed immediately, and PayPal's got whole divisions dedicated to preventing and undoing the damage you can do even with "account takeover", it'd be rather much a risk to pay someone cash for a vulnerability. At the first slip, the value drops to $0. Plus all the issues of verifying the bug and establishing trust for both parties. Seems rather difficult.
With the state of the media in the infosec industry, having your finding widely publicized doesn't mean much, either.
I would know far more millionaire engineers/hackers if that was actually true.
> go to federal prison for 30 years
If one was talented enough to find such a vuln, it is hardly a stretch to say they would be smart enough to avoid getting caught.
... This is plainly not true. First, the ease of finding a bug in a web app varies considerably. This article, for instance, was simply about resending requests quickly. It doesn't necessarily require amazing intellect to come across such a bug. Look at famous "hackers" that dicked around with querystrings and got into all sorts of fun.
Second, even if someone is smart and figures out how to solve a certain problem to gain root, it does not mean they're clever, aware, or dedicated enough to maintain opsec. One mistake, any time, and you're toast.
Yes. There will be some which don't fit into the overly simplistic categories I provided.
However, in what I've seen, reports that require complicated conditions and turn out not to be bugs are rare enough that they aren't relevant to the discussion.
> Exploiting it is rather different from selling it, though, right? And since a vuln in a website can literally be closed immediately, and PayPal's got whole divisions dedicated to preventing and undoing the damage you can do even with "account takeover", it'd be rather much a risk to pay someone cash for a vulnerability. At the first slip, the value drops to $0. Plus all the issues of verifying the bug and establishing trust for both parties. Seems rather difficult.
You are hung up on what was an arbitrary example.
My point is simply that if the reward for serious vulnerabilities is orders of magnitude higher when the researcher chooses the black hat instead of the white one, the overall result is a huge net negative for the world.
Do you have first hand knowledge of selling such an exploit?