Reverse Engineering TikTok's VM Obfuscation (Part 2) (ibiyemiabiodun.com)
Isn't that supposed to be prevented by the os via the permissions thing?
There's not a huge performance hit, there's a portion that runs on page load and a smaller portion that runs on each HTTP request (which isn't too often).
With the obfuscated fingerprinting demonstrated by the virtualized code, I reckon they're doing something more malicious.
That is not really 'Obfuscation', at least to the degree that TikTok is doing.
In fact the source language is likely to be JS itself; a JS-to-some-sort-of-vm-bytecode-to-JS compiler is made. I know that Tencent has a similar VM; an interesting aspect of that VM is that the instruction set is dependent on the code being compiled (and the opcodes are dynamically generated and shuffled when compiling), so unused instructions are not generated.
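As a rough illustration of the "instruction set depends on the code being compiled" idea, here is a toy sketch. It is not TikTok's or Tencent's actual VM; the instruction names, the program format, and the seeded shuffle are all assumptions made up for this example. Only instructions that appear in the program get an opcode, and the opcode numbers are shuffled per build.

```javascript
// Toy sketch only: instruction names, program format, and seeding are hypothetical.

// Semantics for every instruction the toy "compiler" knows about.
const SEMANTICS = {
  PUSH: (vm, arg) => vm.stack.push(arg),
  ADD: (vm) => vm.stack.push(vm.stack.pop() + vm.stack.pop()),
  MUL: (vm) => vm.stack.push(vm.stack.pop() * vm.stack.pop()),
  SUB: (vm) => { const b = vm.stack.pop(); const a = vm.stack.pop(); vm.stack.push(a - b); },
};

// Small linear congruential generator so a build is reproducible from its seed.
function lcg(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

function compile(program, seed) {
  // Only instructions that actually occur in the program get an opcode.
  const used = [...new Set(program.map(([name]) => name))];

  // Fisher-Yates shuffle of the opcode numbers, driven by the seed.
  const rand = lcg(seed);
  const opcodes = used.map((_, i) => i);
  for (let i = opcodes.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [opcodes[i], opcodes[j]] = [opcodes[j], opcodes[i]];
  }
  const opcodeOf = new Map(used.map((name, i) => [name, opcodes[i]]));

  // Handler table indexed by opcode; unused instructions never appear in it.
  const handlers = [];
  for (const [name, op] of opcodeOf) handlers[op] = SEMANTICS[name];

  // Flat bytecode: [opcode, arg, opcode, arg, ...].
  const bytecode = program.flatMap(([name, arg = 0]) => [opcodeOf.get(name), arg]);
  return { bytecode, handlers };
}

function run({ bytecode, handlers }) {
  const vm = { stack: [] };
  let pc = 0;
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    const arg = bytecode[pc++];
    handlers[op](vm, arg);
  }
  return vm.stack.pop();
}

// (2 + 3) * 4, compiled twice with different seeds.
const program = [["PUSH", 2], ["PUSH", 3], ["ADD"], ["PUSH", 4], ["MUL"]];
const buildA = compile(program, 1);
const buildB = compile(program, 2);
console.log(run(buildA), run(buildB)); // 20 20
```

Both builds compute the same result, but neither ships a handler for SUB, and the opcode numbering is decided per build rather than fixed in advance, so a disassembler written against one build may not carry over to the next.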
I don't know what they're trying to obfuscate but it must be worth hiding to allow such inefficient javascript to run on clients around the world. I can't think of any non malicious reason to develop such a system for a website about silly videos.
Yes, this should set off big warning bells.
while (hasData) {
  switch (code) {
    case INSTR1: …; break;
    case INSTR2: …; break;
    …
  }
}

Sometimes I wonder if we could just make everything open. No obfuscation, no captchas, just a neat API for everything. Of course that wouldn't work ceteris paribus, everything else unchanged, due to bad actors, spam, or just competitors who want to take your work. But if you changed the incentives - made society non-adversarial, non-profit oriented - then all that gating and obfuscation would become unnecessary.
The relation between this article and the one it's based upon is clearly indicated in the first paragraph. I don't think this is disingenuous at all.
I just feel it would have been more tasteful to choose a title different than the one that the original author is obviously going to use for their next blog post.
The good news is that the scope of "malicious activity" is (at least in theory) much smaller when you constrain it to what web sites can do, as opposed to the scope of what can be done by executing ARM instructions and making syscalls.
The bad news is that the scope of "things web sites can do" keeps growing and is fingerprintable.
This isn't regarding the app at all, which is likely not as heavily obfuscated as this (mostly because you can't just "view source" on an app).
They couldn't. Apple does not perform any meaningful review of apps for malicious activity; they do it for rent seeking.
Everyone expects these sites to scrape as much personal information as possible (China did not invent that, they are following), but beyond that any additional imagined state-ran initiative would be server side, right? What is worth hiding in the front end beyond preventing people re-using their code? (which would be overkill to use a VM for, as light obfuscation would be enough)
Note that the parent article is about the website, not mobile app.
Let me give you a few examples of cases where obfuscated or inefficient code exists:
1. re-captcha is "obfuscated", and so difficult that computers aren't meant to be able to compute it. It is used to rate-limit login attempts to secure people's accounts, among many other uses.
2. Cloudflare runs some obfuscated "are you a human" javascript to fingerprint you for the purpose of DDoS protection. This one is arguably less noble, but still not obviously malicious.
3. Games obfuscate code all the time to make cheats harder to develop and maintain
4. Youtube, netflix, etc have obfuscated code (DRM code) because legally they have to make an attempt to protect copyrighted works from being downloaded etc, or else they lose access to said copyrighted works. Tiktok also allows using copyrighted music I'll point out.
Depending on your viewpoint, perhaps all of those are also malicious, but I think many people view them as non-malicious, though perhaps dumb.
That said, I personally wouldn't give tiktok the benefit of the doubt here. All the other large social media companies maliciously capture as much data as they can about me in order to sell it to brainwashing companies (ahem, advertisers), to the point where I think it should be illegal, so I don't really expect tiktok to be any different.
I accept DRM for online games where one modified client can ruin the fun for everyone else, but this isn't a multiplayer game. It's a video player with comments.
As for DRM, that's just plain malicious in my eyes. I need to go through the effort of pirating Netflix content I pay for because my OS isn't in the magical white list. It's not as if DRM is preventing piracy either, because pirated content in 1080p is available the day a show goes live on Netflix, sometimes even earlier because of bullshit International contracts. There's no legal requirement for their DRM either, that's all part of the contracts they sign with media companies who are just as responsible.
I don't think Tiktok cares about copyright law in the slightest, as far as I can tell they just started using copyrighted music and dealt with license deals after the fact. Bytedance knows the MPAA isn't going to get the app banned, not without running the risk of American artists losing the lucrative Chinese market, so I very much doubt that they'll use this for any kind of DRM.
Perhaps this is just an in house recaptcha alternative or a way to block youtube-dl, but so far I haven't seen this stop redistribution of the videos on their platform in any meaningful way. Tiktok videos get downloaded all the time and even youtube-dl has support built in these days so I don't see what it's trying to accomplish.
Every proper obfuscator with virtualization should be auto-generated. Of course, when I reverse it, my scripts are also automated as much as they can be.
I agree with your main point, but I would like to point out that the definition of "bad actor" can be different, including "they are bad because they have something we want, so we need to deploy our military towards them".
The other comments here remind me of how pessimistic and lacking in imagination the tech crowd tends to be.
Because earlier attempts failed, does that make an endeavor "impossible"/"naive"?
People cite "human nature", but forget that there exist and have existed systems of living, radically different from what they are used to, which gave rise to completely different "human nature" - or perhaps merely gave it a healthier environment, so it manifested very differently. (This "problem" is frequently refuted in anarchist texts.)
I'll leave some links which I think are relevant to this quest. I hope they provide some answers, or at least provoke further searching.
1. https://donellameadows.org/archives/leverage-points-places-t...
2. https://theanarchistlibrary.org/library/the-anarchist-faq-ed...
3. https://www.youtube.com/watch?v=l7TONauJGfc
4. https://en.wikipedia.org/wiki/Tao_Te_Ching
I hope we find a way. And for those looking and trying to move beyond the status quo, I supply a maxim to bear in mind when facing the unimaginative and the complacent - "It's always impossible until it's done."
No. It is far better to recognize that there is atm no simple way we can rid the world of bad actors, i.e. not naïvely hope that some rather simple technical adjustment would somehow make society non-adversarial. Having fallen into that trap at least once, I for one have come to realize the profound wisdom that the founding fathers possessed: the best we can do at this point and in the foreseeable future is to create checks and balances.
Which in this case means making it difficult for bots and other bad actors.
In the meantime we have had incredible technological advancements. Electrification, modern chemistry, computers, the internet. What was the last comparable political and social disruption? I'd say the civil rights movement (1960s) or women's suffrage (1920s) were very important. But apart from that, we have mostly been running the same social OS since the industrial revolution and the birth of modern nation states.
The ideas of the founding fathers of the US and other thinkers of that era were great in their time - republicanism and a bourgeois revolution - compared to the absolutism that came before. But where are the new ideas of this caliber? What is going to be the disrupting new 'tech' in political theory?
(I realize it seems silly to jump from TikTok to systemic criticism, but I think the fringes of the tech sector are a really good place to see fast moving technology chafing against fossilized society. And a lot of people here have the desire to make the world a better place - my point is that the way to go is not to invent radical new technology, but maybe to invent new political ideas.)
Arguably the world would be a much better place if corporations, governments and people were both unable and unwilling to keep secrets.
Although, again, with all things being equal, good luck finding people to "go first".
https://www.theinformation.com/articles/facing-hostile-chine...
Note: This is the same as having no ethics.
The very rare times I've had to watch a video on there, I didn't even need to run their JS. The URL was just in the source of the page. I don't think they were trying to stop that, given that they clearly could've tried to.
I have a very short JS whitelist, and it's not getting lengthened just to watch a video. I'd sooner extract the URL and stick that in my own media player.
As for disruptions in social/technological matters, there is no denying that we live in interesting times, but as always it is perilous to ascertain if most relevant things are the same, or some things are the same, or no such things are the same as before (sameness is notoriously difficult to assess, at the extreme one might argue that all discernible things are ultimately if not categorically different).
We can however unequivocally state that the argument (that things are now categorically different) has been heard before, many times, and has always turned out to be largely false - self interested deceptions even. Recent history has the "new economy" around the turn of the century claiming that everyone would from now be rich because internet, even more recently the crypto currency claims of a new social order. Going back a few hundred years we had the emerging laissez faire economists claiming that a totally free economy without state interventions would lead to world peace, a claim parroted by the socialists (socialism would lead to peace). Later the women's rights movement made the same claim - if only women would be in charge then ...
Make what one will out of this, to me it is but one long list of falsehoods and half truths, and I've personally swiveled back to a position which is close to that of classical economy: humans are essentially (in a statistical mean sense) self interested, and will from that simple fact come into conflict with both other humans, the environment and the common good - and will skew everything in their perceived favor unless there are appropriate checks and balances. This is the ultimate factual circumstance that I see no sign of changing in the foreseeable future - through technological breakthroughs or anything else. We are still essentially in the age of the founding fathers in that regard.
As for peace, the only path that has proven to be successful is the ad hoc and "organic" creation of ever larger political units: neighboring villages that were at each other's throats during prehistory made friends under the rule of warlords, warlords became friends under the rule of a feudal power, feudal powers became friends after the conquest by empires, empires morphed into global trade organizations that require global legal frameworks to work efficiently. That in my mind is incidentally the breakthrough for global peace that one might hope for in the next few hundred years - if we can keep the belligerents on all sides in sufficient check while global inequalities in power and wealth are evened out through ever more trade and international cooperation.
But right now the odds seem to be that we blow everything up before that TBH.
(And sure, there are degrees to this.)
We used minification on government apps I've worked on before, despite the source being available on GitHub.
This is typically done at build time, so there's little or no impact on developers writing code.
Minification is the frontend equivalent of compiling code. We typically don't think of building a Go program as 'obfuscating' it when compiling it down to machine code, even though the resulting artefact is harder to read than the original source code.
1. That wasn't a bug for "over a decade," as far as I'm aware. It was only introduced with an OS update
2. I could be wrong here, but it required you to already have created a root user with a blank password, which isn't possible unless you know how to trigger another bug that does it.
Lots of "studios" use these APIs for for-profit astroturfing, sometimes spreading misinformation, with tons of fake accounts.
A layer of VM obfuscation helps, like a DRM or packer.
Of course obfuscating the APIs is still an attempt to trust the client, which is not secure in a strict sense, but might slow people down.
It's because minification is far more intelligent than it's made out to be. Your code may pull in a dependency like lodash for a single function in a number of places. gzip would be able to notice that and reduce the total amount of data sent over the wire by de-duplicating and compressing lodash, but minification would notice that you're only using a single function, and only that function would end up being included in the minified source (and similarly de-duplicated).
For webpack, tree shaking is the removal of the declarations of exported values that are never imported. Further, imported values that were used only in such removed exports can also be removed (although removing the last import from a file entirely requires knowing the imported file has no side effects).
Separately from that, the minimizer (terser) has its own dead code elimination process that performs more in-depth analysis. Terser cannot do the tree shaking part itself because of how webpack structures modules that it was unable to concatenate, and because code splitting might make the importing code live in a different file. But if webpack can eliminate unused exports and ensure that split files only access each other via exports, then terser can handle the remaining dead code elimination within each module.
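To make the "unused exports" part concrete, here is a toy reachability sketch. It is not webpack's implementation; the graph format and the module and export names are invented for illustration, and it ignores side effects entirely. Starting from what the entry point imports, it marks every export that is transitively referenced and reports the rest as removable.

```javascript
// Toy model: each export lists the other exports it references.
// All names here are hypothetical.
const exportsGraph = {
  "app/main": ["utils/debounce"],
  "utils/debounce": ["utils/now"],
  "utils/now": [],
  "utils/chunk": ["utils/slice"],
  "utils/slice": [],
};

// Mark phase: collect everything reachable from the entry points.
function liveExports(graph, entryPoints) {
  const live = new Set();
  const pending = [...entryPoints];
  while (pending.length > 0) {
    const name = pending.pop();
    if (live.has(name)) continue;
    live.add(name);
    for (const dep of graph[name] ?? []) pending.push(dep);
  }
  return live;
}

// Sweep phase: anything never reached is a candidate for removal.
function removableExports(graph, entryPoints) {
  const live = liveExports(graph, entryPoints);
  return Object.keys(graph).filter((name) => !live.has(name));
}

const removed = removableExports(exportsGraph, ["app/main"]);
console.log(removed); // [ 'utils/chunk', 'utils/slice' ]
```

Note that utils/slice is dropped only because its sole user, utils/chunk, is itself unused. In a real bundler that step is where the side-effect question mentioned above comes in, and anything left inside a surviving module is what the minimizer's own dead code elimination would then look at.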