Zoom: Remote Code Execution with XMPP Stanza Smuggling (bugs.chromium.org)

231 points by Flowdalic 4 years ago | 90 comments

twoodfin 4 years ago |

The XML parsing/validation bugs are, I suppose, not shocking, but deeply disappointing.

The one thing XML & its tooling were supposed to get right was document well-formed-ness. Sure, it might be a mess of a standard in other ways, but at least we could agree what a parser should and shouldn’t accept! (Not the case for the HTML tag soup of then or now.)

That, 25 years on, a popular XML processor can’t even meet that low bar for tag names is maddening.

Diggsey 4 years ago | |

There are just so many issues here.

1) Don't rely on two parsers having identical behaviour for security. Yes parsers for the same format should behave the same, but bugs happen, so don't design a system where small differences result in such a catastrophic bug. If you absolutely have to do this, at least use the same parser on both ends.

2) Don't allow layering violations. All content of XML documents is required to be valid in the configured character encoding. That means layer 1 of your decoder should be converting a byte stream into a character stream, and layers 2+ should not even have the opportunity to mess up decoding a character. Efficiency is not a justification, because you can use compile-time techniques to generate the exact same code as if you combined all layers into one. This has the added benefit that it removes edge-cases (if there is one place where bytes are decoded into characters, then you can't get a bug where that decoding is only broken in tag names, and so your test coverage is automatically better).

3) Don't transparently download and install stuff without user interaction, regardless of where it comes from!

4) Revoke certificates for old compromised versions of an installer so that downgrade attacks are not possible.
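To make point 2 concrete, here is a minimal sketch in Python, assuming a hypothetical `decode_stanza_bytes` helper (illustrative only, not taken from Zoom, Expat, or Gloox). Bytes are turned into characters in exactly one place, with strict validation, so nothing downstream ever handles raw bytes:

```python
def decode_stanza_bytes(raw: bytes) -> str:
    """Layer 1: turn bytes into characters, rejecting invalid UTF-8.

    Hypothetical helper for illustration; later layers only ever see `str`.
    """
    # errors="strict" raises UnicodeDecodeError on any malformed sequence,
    # e.g. a 3-byte lead byte that is not followed by continuation bytes.
    return raw.decode("utf-8", errors="strict")


def is_decodable(raw: bytes) -> bool:
    """Return True if `raw` is well-formed UTF-8, False otherwise."""
    try:
        decode_stanza_bytes(raw)
        return True
    except UnicodeDecodeError:
        return False
```

With this in place, input such as `b"<a\xeb>"` is rejected before any tag-name logic runs, so a decoding bug cannot exist in tag names alone.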

iancarroll 4 years ago | | |

> Revoke certificates for old compromised versions of an installer so that downgrade attacks are not possible.

Worth noting that Windows accepts signatures from revoked code signing certificates so long as the signature carries a signed timestamp from before the revocation.

CaliforniaKarl 4 years ago | | |

> 4) Revoke certificates for old compromised versions of an installer so that downgrade attacks are not possible.

I suggest the following alternative: When your own software is triggering the upgrade process, don't allow triggering an upgrade to an older version of the software.

In other words: If a user wants to downgrade, they will have to do the work of running the installer for the older version (and possibly uninstalling the newer version first).

This modified behavior addresses the problem mentioned in the article (a newer version of software running the installer for an older version), but still gives users the power to install an older version if they want.
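A rough sketch of that rule in Python, assuming plain dotted numeric version strings; `parse_version` and `should_auto_install` are hypothetical names, not anything from Zoom's updater:

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '5.10.4' into (5, 10, 4)."""
    # Comparing integers avoids the string-comparison trap where
    # '5.9' sorts after '5.10'.
    return tuple(int(part) for part in text.split("."))


def should_auto_install(installed: str, offered: str) -> bool:
    """Allow an automatic update only if it is strictly newer."""
    return parse_version(offered) > parse_version(installed)
```

A manual install of an older version stays possible because this check only gates the updater path, not the installer itself.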

joefkelley 4 years ago | | |

> 3) Don't transparently download and install stuff without user interaction, regardless of where it comes from!

This is an interesting one. I totally get your point. But also users are terrible about updating their software if you give them the choice. Automatic updates have very practical security benefits. I've witnessed non-technical folks hit that "remind me later" button for years.

bombcar 4 years ago | | |

I doubt anyone ever actively revokes certificates, except perhaps the game console makers.

jerf 4 years ago | |

Unfortunately, the problem here is programmers more so than formats. It literally doesn't matter what you specify, programmers will not implement it to a T. Most programmers simply don't know that every single detail matters. Many of those who may have some idea don't really care, since they can't imagine how something like this could happen.

It's not just XML. It's every ecosystem I've ever used. Push it around the edges and you will find things.

This is neat, not because it is special to JSON in particular but because it's an example of examining a good chunk of a large ecosystem: https://seriot.ch/projects/parsing_json.html Consider this is likely to be true in any ecosystem that doesn't make it a top priority to avoid.

IshKebab 4 years ago | | |

I disagree. The way the format is designed has a direct effect on how likely implementors are to implement it correctly. So the format designers bear some responsibility.

For example how many Protobuf parser libraries have security bugs? I'm guessing very few because the standard is nice and simple, and it's very clearly defined without much "it's probably like this" wiggle room (much easier for binary formats!).

XML had a ton of unnecessary complexity that could have been avoided to make implementations simpler. I haven't actually read this bug so let's see if it was one of:

* Closing tags having to repeat the name / two different ways of closing tags.

* CDATA

* Namespaces (especially how they are defined)

* &entities;

Edit: Ha it wasn't any of those - but it was still an issue with text based formats. Seems like Expat assumes the content is valid UTF-8 (and doesn't validate it), while Gloox assumes it is ASCII. Obviously this couldn't have happened with binary formats.

If you care about security DON'T USE TEXT FORMATS!

twoodfin 4 years ago | | |

This is just so basic a screwup though. The W3C spec for XML has had a formal syntactic description of valid tag names for decades:

https://www.w3.org/TR/2006/REC-xml11-20060816/#sec-common-sy...

Plenty of libraries get this right because it’s so easy. You’d almost have to try—probably by being “clever”—to get it wrong.
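For illustration, the Name production can be written down almost directly as a character class. This Python sketch transcribes the ranges as I recall them from the XML 1.0 (Fifth Edition) / XML 1.1 grammar, so check them against the spec before relying on it; `is_xml_name` is a hypothetical helper name:

```python
import re

# NameStartChar ranges, transcribed from the spec grammar (verify before use).
NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# NameChar adds '-', '.', digits, and a few combining/extender ranges.
NAME_CHAR = NAME_START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile("[" + NAME_START + "][" + NAME_CHAR + "]*")


def is_xml_name(text: str) -> bool:
    """Return True if `text` (already decoded to characters) is a valid Name."""
    return NAME_RE.fullmatch(text) is not None
```

Note that this operates on decoded characters, not bytes, which is the part the vulnerable code reportedly got wrong.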

mwcampbell 4 years ago | | |

I suppose it's safest to use a binary format where variable-length fields are prefixed with their length.
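A minimal sketch of that idea in Python, assuming a made-up framing of a 4-byte big-endian length followed by the payload (the function names are hypothetical). The reader never scans the payload for delimiters, so nothing inside a field can be mistaken for structure:

```python
import struct
from typing import List


def encode_field(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def decode_fields(buf: bytes) -> List[bytes]:
    """Split a buffer of length-prefixed fields, rejecting malformed input."""
    fields = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        # Bounds-check the declared length before slicing.
        if length > len(buf) - offset:
            raise ValueError("declared length exceeds remaining data")
        fields.append(buf[offset:offset + length])
        offset += length
    return fields
```

The bounds check is the part that matters: a length-prefixed format is only safer if every declared length is validated against the data actually present.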

lmm 4 years ago | | |

Programmers respond to their incentives. Like most security bugs, this one happened because someone was dumb enough to use C for something connected to the internet. But the reason programmers do that is because of a culture that rewards fast and insecure more than slightly less fast and correct.

Flowdalic 4 years ago |

It appears that Gloox, a relatively low-level XMPP client C++ library, rolled much of its Unicode and XML parsing itself, which made such vulnerabilities more likely. There may be good reasons not to reuse existing modules or rely on external libraries, especially if you target constrained low-end embedded devices, but you should always be aware of the drawbacks. And the Zoom client typically does not run on those.

dgellow 4 years ago |

Some relevant info in case you don’t want to read the whole description but wonder if you’re concerned by the issue:

> Zoom fixed the server-side issues in February and client-side issues on April 24 in version 5.10.4.

> Zoom published a security bulletin about client-side fixes at https://explore.zoom.us/en/trust/security/security-bulletin

CVE-2022-25235 CVE-2022-25236 Fixed-2022-Apr-24 CVE-2022-22784 CVE-2022-22785 CVE-2022-22786 CVE-2022-22787

kevincox 4 years ago |

This is another lesson that you should always parse+serialize rather than just validate. It is much harder to smuggle data this way to exploit different parsers.

Basically the set of all messages that will satisfy your validator is far larger than the set of all messages that will be produced by your serializer.
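As a small illustration, here is a sketch using Python's standard `xml.etree.ElementTree` (the `reserialize` helper is a hypothetical name, and this says nothing about how Zoom's servers work). Forwarding the re-serialized form means the next hop only ever sees what your serializer can emit:

```python
import xml.etree.ElementTree as ET


def reserialize(raw_xml: str) -> str:
    """Parse the input, then emit it again from the parsed tree.

    Anything the parser rejects raises ET.ParseError; anything it accepts
    is forwarded in the serializer's own normalized form, not the sender's.
    """
    tree = ET.fromstring(raw_xml)
    return ET.tostring(tree, encoding="unicode")


# Unusual but legal input: extra whitespace, single quotes, a CDATA section.
forwarded = reserialize("<a  x='1'><![CDATA[<b/>]]></a>")
```

Here the CDATA section does not survive as such; its text comes back out as escaped character data. This only helps if the parser is itself strict: a parser that accepts malformed characters may well serialize them straight back out.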

fsflover 4 years ago | |

Or, it's another lesson that you should not completely trust any code but compartmentalize instead. Thanks to Qubes OS, I am still safe, since Zoom is running in a hardware-virtualized VM.

JoshTriplett 4 years ago | | |

I'm safe as well, because I only use the web version of Zoom. Code you don't trust should always run in a sandbox, if it runs at all.

jeffbee 4 years ago | | |

How is that helpful? This exploit completely replaces the Zoom software with arbitrary attacker software and it executes in your VM that has access to camera, microphone, network, and presumably screen recording. It sounds to me like the highest possible level of access and your VM is just performative.

autoexec 4 years ago | | |

The real lesson is not to use Zoom. Anyone who does deserves everything they get. There have been so, so many red flags that using Zoom will leak your data to 3rd parties (often in China) and compromise your security that people using it now must simply not care if it happens. So no surprise, it's happened yet again, and you can bet it will again and again in the future.

There are other options besides Zoom. They are different from Zoom, each with their own strengths and weaknesses, but they don't have example after example showing total incompetence and/or malicious intent the way Zoom does.

lovasoa 4 years ago | |

I am not sure this applies in this case. I don't know how Zoom's XMPP backend works, but it could very well parse and serialize and still be vulnerable. If the XML library accepts invalid 3-byte UTF-8 characters on parse, then its internal representation supports these characters, and I don't see why they would not be serialized just as well.

ifratric1 4 years ago | |

XMPP servers (including Zoom's) already parse + serialize ;)

bobbylarrybobby 4 years ago |

Having multiple, potentially different parsers is incredibly dangerous. One person used the fact that different plist parsers in the macOS kernel choked in different ways when interpreting malformed XML: some parsers concluded the plist was "safe" because it did not grant certain permissions, while others trusted this "safe" plist but read it as granting those permissions.

https://blog.siguza.net/psychicpaper/

dqv 4 years ago |

I didn’t even consider the existence of XMPP vulns until I listened to the Darknet Diaries episode about Kik[0]. It’s a really interesting class of vulnerabilities.

[0]: https://darknetdiaries.com/episode/93/

robertlagrant 4 years ago |

This vuln writeup is extremely well written. Actually quite interesting to read!

rektide 4 years ago |

How much of Zoom is powered by XMPP? Do we know much about these internals? This would be super cool to learn about.

henearkr 4 years ago |

Good thing that I never used the standalone client and always the in-browser webapp instead.

user23894295637 4 years ago | |

How do you do that? On any OS I tried (Debian, Windows) it always *forces* me to download the standalone client, otherwise I can't join. There's no alternative link ("Join via web") like MS Teams has for example.

I really feel uncomfortable each time I have to install the client on a machine for my relatives :/

ydant 4 years ago | | |

I've always been able to use the in-browser client, but you have to download the client once or twice before the page will update to show the alternative "use browser". It's definitely an intentional dark pattern.

mehagar 4 years ago | | |

Check out https://github.com/arkadiyt/zoom-redirector. You can also join meetings from https://pwa.zoom.us/wc/.

woojoo666 4 years ago | | |

After you click "download Zoom client", the button will turn into "use Web app". You don't even need to download anything if you cancel the system dialog asking you where to save the download. However, I still find this UX pattern incredibly deceptive. People and companies seriously need to stop using Zoom.

0daystock 4 years ago | |

Unfortunately they don't allow you to both speak and present using the webapp - forcing desktop client use.

thinkmassive 4 years ago |

Heh, it’s like an AIM punter, but better!

pabs3 4 years ago |

Are these issues bugs in libxml, gloox, ejabberd? Or just in the Zoom client and server?

jeffbee 4 years ago |

At some point we are going to need enforceable professional standards that effectively deal with commercial software publishers who choose to parse untrusted inputs in non-performance-sensitive contexts with C libraries.

turminal 4 years ago | |

This bug has nothing to do with language choice.

I agree that better professional standards and accountability should be introduced for software like Zoom, though.

userbinator 4 years ago | |

No. We don't need more authoritarian dystopia.

TedDoesntTalk 4 years ago | |

We are? Why?

defen 4 years ago | | |

Since most software users are not tech-savvy and care about convenience and price significantly more than they care about security (revealed preference), the "worse is better" phenomenon incentivizes commercial developers to implement the minimum security practices that their customers will bear. This is individually rational for the developers and the users, but the result is untold billions of dollars of costs. Regulation would be one way to change the incentives.

spyc 4 years ago |

Thanks to Ivan Fratric and Google Project Zero!