Hacking YouTube with a MP4

dylan604 4 years ago |

If you've played around with video formats long enough, you'll have seen something like this. This is the basis for most speed change "filters". Only the high end ones do any kind of pixel based motion estimation so that super slo-mo does not look like a slide show.

Also, it's not uncommon to get odd frame rates in the containers. Even something as "innocent" as listing the frame rate as 29.97 vs 30000/1001 will affect timing (depending on usage). The variations on 23.976 are fun too: 24000/1001, 2997/125.
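To put a rough number on that, here is a minimal sketch using exact rational arithmetic. The one-hour frame count is an illustrative assumption, not taken from any real file.

```python
from fractions import Fraction

# Two ways a container might declare the "NTSC" frame rate.
exact = Fraction(30000, 1001)      # 29.970029970...
rounded = Fraction("29.97")        # exactly 2997/100

# Frames in one hour of nominal 30 fps material (illustrative assumption).
frames = 30 * 60 * 60              # 108000

# Time at which the last frame ends under each declared rate, in seconds.
t_exact = frames / exact
t_rounded = frames / rounded

# How far the two timelines have diverged after those frames.
drift = t_rounded - t_exact

# The 23.976 family: 2997/125 is exactly 23.976, while 24000/1001 is not.
film_exact = Fraction(24000, 1001)
film_decimal = Fraction(2997, 125)

print(float(t_exact), float(t_rounded), float(drift))
print(film_decimal == Fraction("23.976"), film_exact == film_decimal)
```

The drift is only a few milliseconds per hour, which is harmless for casual playback but enough to matter once audio sync or frame-accurate edits depend on it.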

The muxer is an important step. When using software decoders, things can be a lot more flexible. Back when shiny round discs were popular, there were verifiers that ensured your muxed data was correct. When your decoders are in hardware, there is a very strict set of parameters the input is expected to meet. Any deviation means the hardware cannot play the video. In the early days, "cheaper" DVD software had issues with the muxing.

kmeisthax 4 years ago | |

Video editors (or at least Adobe Premiere) have similar problems: they ignore the timestamps entirely, and any clips you import into them will desynchronize unless you've either recorded them from a known-good source with a constant timebase, or re-encoded them at a constant frame rate.

oefrha 4 years ago |

Video timestamps are weird. Years ago I routinely pulled event VODs from an HLS source and re-uploaded them to YouTube. To speed up downloading, I downloaded the MPEG-TS segments in parallel and assembled them with FFmpeg. Initially I used the basic and familiar concat demuxer during assembly. The results were fine locally. Months in, a visitor told me that all my VODs had subtle yet frequent stutters. It turned out the videos played perfectly fine in any libavcodec-based (i.e. FFmpeg-based) video player, and still played fine even after libavcodec re-encoding, yet once they went through YouTube’s encoder, which AFAIK was also libavcodec-derived, subtle stutters appeared at segment boundaries. I then switched to the hls demuxer during assembly and the YouTube problem went away. I never got to the bottom of this, so to this day it’s still a mystery to me.

thiscatis 4 years ago | |

This is what I love about tech. I have 15 years of experience as a software developer but I have no clue what any of these words mean. Amazing that you can have such specialised knowledge about something.

Omniusaspirer 4 years ago | | |

Your comment gave me perspective on how far down the rabbit hole my media server has taken me, as I was nodding along to everything the parent poster said having encountered similar issues with FFmpeg in the past.

Niche knowledge really can creep up on you over the years as you gradually encounter problems and work to solve them a few hours at a time.

diftraku 4 years ago | | |

While I have not been this deep into the inner workings, I personally have skimmed that rabbit hole when I started a side-project that was essentially grep but for (mainly) MKV files.

Idea was that you could "grep" by specific text in the subtitles and automatically create a clip of every occurrence of the text (by looking at the subtitle timing and padding that in both directions).

The biggest source of my frustration was that I was unable to get the clipping to work exactly as I wanted, where the start or end of the clip would seemingly drift back and forth. That was until I realized it boiled down to how the different seek modes in ffmpeg handled keyframes.

I still haven't gotten the clipping to work exactly as I want but I figured doing two passes might be the way to go: first pass would do a fuzzy match and ensure there is enough extra on both ends of the desired clip and the second pass could re-encode the fuzzy-matched clip to shuffle the keyframes around, allowing more accurate clipping.
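The "fuzzy" first pass can be sketched roughly as follows. This is only an illustration under assumptions: the subtitles have already been extracted to SRT text, and `find_clips`, `SAMPLE` and the `pad` value are hypothetical names and numbers, not anything from the actual project. It only computes padded time ranges; the cutting itself (and the keyframe handling) is left to ffmpeg.

```python
import re

# Matches an SRT timing line such as "00:01:10,250 --> 00:01:12,000".
SRT_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

def _seconds(h, m, s, ms):
    """Convert SRT time fields (strings) to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

def find_clips(srt_text, needle, pad=1.0):
    """Return padded (start, end) seconds for every cue containing `needle`."""
    clips = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = SRT_TIME.search(line)
            if not m:
                continue
            # Everything after the timing line is the cue text.
            text = " ".join(lines[i + 1:])
            if needle.lower() in text.lower():
                g = m.groups()
                start = max(0.0, _seconds(*g[:4]) - pad)
                end = _seconds(*g[4:]) + pad
                clips.append((start, end))
            break
    return clips

SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello there.

2
00:01:10,250 --> 00:01:12,000
General Kenobi, hello again.
"""

print(find_clips(SAMPLE, "hello"))
```

Padding generously here leaves room for the second pass to trim to the exact boundaries after re-encoding.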

IanCal 4 years ago | | |

It's tricky. Subs aren't always brilliantly timed either.

I did something similar many moons ago to auto create summary videos based on changing sentiment in the subtitles for the BBC. Worked... interestingly.

If you want to nail scene changes, one thing you can do is look for sudden changes to the histogram frame by frame. It'll change pretty smoothly as people move, cameras move etc but there's a discontinuity when there's a cut. One issue though then is that there are a lot of camera cuts! Surprising how many there are that you don't really notice.
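A minimal sketch of the histogram idea, assuming each frame is already available as a flat list of 0-255 grayscale values. The bin count, the threshold and the toy frames are arbitrary illustrative choices, not tuned values.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalised intensity histogram of a non-empty frame (flat list of ints)."""
    counts = [0] * bins
    for px in frame:
        counts[px * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]

def detect_cuts(frames, threshold=0.5):
    """Return indices i where frame i differs sharply from frame i - 1."""
    cuts = []
    prev = None
    for i, frame in enumerate(frames):
        hist = histogram(frame)
        if prev is not None:
            # L1 distance between normalised histograms ranges from 0 to 2.
            distance = sum(abs(a - b) for a, b in zip(hist, prev))
            if distance > threshold:
                cuts.append(i)
        prev = hist
    return cuts

# Toy frames: gradual change between the first three, then a hard cut.
dark = [10] * 64
slightly_lighter = [12] * 60 + [40] * 4
bright = [240] * 64
frames = [dark, dark, slightly_lighter, bright, bright]

print(detect_cuts(frames))
```

On real footage the threshold needs tuning, and fades or flashes will still produce false positives or misses.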

rozab 4 years ago |

I've seen many strange mp4s and webms floating around various discord communities. Some crash your client at a fitting moment in the video, some appear to be thousands of hours long, some appear to be seconds long but are actually hours long, some even loop! somehow.

jart 4 years ago | |

Do you still have copies of them? Could you send them to jtunney@gmail.com? I'd like to set up a web page hosting MPEG torture tests, since there doesn't appear to be one already. This is actually a very common practice for things like RFCs written for text-based protocols. We should ideally have more accessible information online that helps video software authors to harden their implementations against these sorts of busy beaver attacks.

Rompect 4 years ago | |

I even saw videos that play something entirely different the second time you play them!

retox 4 years ago |

On some sites with a video duration limit that don't do transcoding, at least those that allow vp8 WEBM uploads, you can change a few bytes on the input to report a false duration and upload longer videos. If you're uploading audio only, with a static image, you can sometimes upload hours of audio before you hit the filesize limit.

koprulusector 4 years ago |

I have no idea how YouTube’s backend works, but I thought it would be useful to share here that if using ffmpeg one can use the argument -vsync drop to generate fresh timestamps based on the frame rate.

deathanatos 4 years ago |

It's almost like we didn't learn from the days of MP3. I have several MP3s that, in certain players, are like a half hour long, despite being only 2 minutes long. My best guess was that they were assumed to be CBR, despite nothing about MP3 implying CBR… (there's not a flag or anything that says "this is a VBR file"; CBR files are just special…)
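If that guess is right, the arithmetic would look something like the sketch below. The numbers are made up for illustration, and the "use the first frame's bitrate for the whole file" behaviour is an assumption about those players, not something verified.

```python
def naive_duration_seconds(file_size_bytes, first_frame_kbps):
    """Duration a player would report if it assumed the first frame's
    bitrate holds for the entire file (i.e. treated the file as CBR)."""
    return file_size_bytes * 8 / (first_frame_kbps * 1000)

# Illustrative assumption: a 2-minute track averaging 128 kb/s ...
true_seconds = 120
size_bytes = 128_000 // 8 * true_seconds   # 1,920,000 bytes

# ... whose first frame is near-silence encoded at 8 kb/s.
guess = naive_duration_seconds(size_bytes, 8)

print(guess / 60)   # reported length in minutes
```

Encoders commonly write a Xing or VBRI header carrying the real frame count so that players don't have to guess; a player that ignores it would fall back to arithmetic like the above.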

Nowadays it's mostly moot since MP3 is obsolete.

dleslie 4 years ago | |

> MP3 is obsolete

What should we be using instead for lossy audio?

jchw 4 years ago | | |

The current state of the art is Opus, but HE-AAC is also superior, and then there’s always the appeal of lossless which is a lot more practical than it once was.

LinAGKar 4 years ago | | |

HE-AAC is only useful at low bitrates though (below 64 kb/s), and supposedly never reaches transparency. Above that, you should use AAC-LC (or, of course, Opus if you can).

Vorbis is also notable as a better format than MP3, although that too is made obsolete by Opus.

hulitu 4 years ago | | |

A format which cannot deliver quality is not state of the art. Opus is the Internet Explorer 6 of musical and video formats.

Thorrez 4 years ago | | |

Opus is an audio format, not a video format. Opus is better than MP3. Wouldn't MP3 actually be the Internet Explorer 6 of audio formats?

https://sound.stackexchange.com/questions/26167/opus-vs-mp3-...

bjoli 4 years ago | | |

What on earth are you talking about? For lossy formats, there is currently nothing better than Opus in actual use.

jchw 4 years ago | | |

Are you mixing up Opus with something else? By many metrics, it is better at delivering quality than just about any other lossy audio codec.

Synaesthesia 4 years ago | | |

AAC is far superior, as is OGG, on a technical basis

rwmj 4 years ago | | |

On the basis of "can it play in my car", MP3 is the only winner. My car's player has one of those baseline decoding chips that can only do MP3.

LinAGKar 4 years ago | | |

That's true, MP3 is by far the most widely supported lossy audio format (except presumably MP1/MP2, since MP3 decoders have to support them), so it will live on for a long time, although Opus is the best one nowadays. Just like with PNG and JPEG for images, which will live on for a long time even though we have WebP, AVIF and JPEG XL. And AVC will probably live on for a long time even though we have HEVC, VP9 and AV1.

thaumasiotes 4 years ago | | |

I mean, what I want is for the car to have an audio input that I plug a cable into. There's no reason for the car to be decoding audio at all.

rwmj 4 years ago | | |

"Now you have two problems." Specifically the steering wheel controls won't work and I'd have to deal with charging the second device.

herbst 4 years ago | | |

Same here. And it doesn't even do that very well. Imagine spending 15k or more on a brand new car in 2021 just to realize that the sound tech is borrowed from a $5 MP3 player from the early 2000s.

rwmj 4 years ago | | |

Almost exactly the same situation as me. The worst part of it is it doesn't sort the directory entries! It displays them in the same order they are written to the directory (ie. usually random). Luckily there is https://fatsort.sourceforge.io/

dleslie 4 years ago | | |

This is why I prefer cars where the stereo is replaceable.

NamTaf 4 years ago | |

My Diamond Rio PMP300 would play VBR but would shit the bed on displaying duration and seeking because it assumed CBR, as you suggested. When VBR was new, this was a pretty familiar situation and oldschool mp3 encoding standards for share sites would give the option to stick to CBR for that reason - they'd specify alt preset standard for VBR and a couple of CBR options, generally around 256kbps.

pjmlp 4 years ago | |

Plenty of obsolete audio files on my phone.

smoldesu 4 years ago |

What's the bug here? It looks like you fooled the container codec with an incorrect timecode and then when it was uploaded to YouTube, the file was rasterized into a sane format. I don't really see an attack here, nor do I see a mitigation.

uptown 4 years ago |

Came across a video on YouTube recently that I think may be misreporting its length due to this issue:

https://www.youtube.com/watch?v=5Grsvyt5xps

The video is 22 minutes but it's reported at nearly 3 hours in length.

thrdbndndn 4 years ago | |

But OP's video is just a video with very low frame rate (reported as 0.030 FPS from `mediainfo`). There is nothing broken about it.

Just because its file size is small does not mean it can't be 15 hours long (one of the author's takeaways is "[t]he size of a video file is not an proper indicator for how long it is", but even without this hack you can't rely on that either, since a video can have any bitrate).

loo 4 years ago | |

I saw this one misreported as 22 minutes in Firefox and Discord. Only 1:43 in reality: https://www.youtube.com/watch?v=RerbrfVd1nI

> Herbie Hancock on Miles: Don't play the butter notes!

bluedino 4 years ago |

One of my favorite "breaking YouTube" (jpeg, really) demos was the slow motion glitter

https://youtu.be/BtYKDamqo2I

silisili 4 years ago | |

I spent time skipping back and forth in the video looking for the side-by-side (raw vs YT) until it dawned on me. I may in fact be an idiot.

1023bytes 4 years ago |

>Regards, Google Security Bot

AtlasBarfed 4 years ago |

So it's basically a compression bomb? Like those small zip files that can expand to gigantic sizes?

0x000000001 4 years ago | |

42.zip, yep. i have a copy if anyone wants it

swyx 4 years ago | | |

never heard of this, TLDR on how it works?

jffry 4 years ago | | |

From https://www.unforgettable.dk/ :

"The file contains 16 zipped files, which again contains 16 zipped files, which again contains 16 zipped files, which again contains 16 zipped, which again contains 16 zipped files, which contain 1 file, with the size of 4.3GB. So, if you extract all files, you will most likely run out of space :-)"

Why recursively extract zip files? Well, maybe a security tool is trying to inspect or process zip file contents.
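The arithmetic behind that description, taking the quoted figures at face value (five nested levels of 16 archives, 4.3 GB per innermost file, decimal units assumed):

```python
LEVELS = 5               # "16 zipped" appears five times in the quote
FANOUT = 16              # archives per level
LEAF_SIZE_GB = 4.3       # size of each innermost file, per the quote

# Number of innermost files after fully recursive extraction.
leaf_files = FANOUT ** LEVELS

# Total extracted size of those files, in gigabytes.
total_gb = leaf_files * LEAF_SIZE_GB

print(leaf_files)
print(round(total_gb / 1e6, 2), "PB")
```

That is over a million files and several petabytes, from an archive small enough to attach to an email.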

swyx 4 years ago |

i thought this was well explained. title a bit clickbaity, but it got me to click.

i'm interested in learning more about the mp4 format. where can I read more? is there a canonical read that everyone but me knows about?

OP seems like he has some kind of file explorer UI for it - also interested in that

99112000 4 years ago | |

MP4 Inspector (Windows)

The MP4 format is fundamentally pretty easy, at least the box structure. But there have been so many standards that overlap that the MP4 format is also really messy. Aaand you need to pay to get access to the specs of the format..
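To give a feel for how simple the box structure is, here is a minimal sketch of a box walker. It relies only on the basic layout (a 32-bit big-endian size, a four-character type, size 1 meaning a 64-bit size follows, size 0 meaning "to the end"). `iter_boxes` and the `box` helper are hypothetical names, and the sample bytes are dummy payloads, not a playable file.

```python
import io
import struct

def iter_boxes(stream, end=None):
    """Yield (type, payload_offset, payload_size) for each box at this level."""
    while end is None or stream.tell() < end:
        start = stream.tell()
        header = stream.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            large = stream.read(8)
            if len(large) < 8:
                return
            (size,) = struct.unpack(">Q", large)
            header_len = 16
        elif size == 0:
            # Box extends to the end of the enclosing container or file.
            if end is None:
                here = stream.tell()
                limit = stream.seek(0, io.SEEK_END)
                stream.seek(here)
            else:
                limit = end
            size = limit - start
        if size < header_len:
            return  # malformed size; stop rather than loop forever
        yield box_type.decode("latin-1"), start + header_len, size - header_len
        stream.seek(start + size)

def box(box_type, payload=b""):
    """Build a box with a plain 32-bit size header (helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Dummy file: ftyp, a moov container holding one child, and mdat.
sample = (
    box(b"ftyp", b"isom\x00\x00\x02\x00")
    + box(b"moov", box(b"mvhd", b"\x00" * 4))
    + box(b"mdat", b"\x00" * 16)
)

stream = io.BytesIO(sample)
top = list(iter_boxes(stream))
print(top)

# Container boxes such as moov are walked the same way, bounded by their payload.
_, moov_offset, moov_size = top[1]
stream.seek(moov_offset)
print(list(iter_boxes(stream, end=moov_offset + moov_size)))
```

The messy part is everything this skips: knowing which box types are containers, and interpreting each payload according to whichever overlapping spec defines it.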

nightpool 4 years ago | |

"This is clickbait-y enough that I fell for it" is uh. Not exactly an endorsement? It seems like kind of the opposite of what you'd want to encourage?

rendall 4 years ago | | |

I didn't like the title either. On the other hand, I doubt the fellow expected much of an audience but hit HN's front page. Also, the article is pretty great

chx 4 years ago |

Remember zip quines? Ah, good old days.

dapids 4 years ago |

"Hacking YouTube" is a stretch description ...

amelius 4 years ago | |

Yes:

> To the best of my knowledge, the impact was rather low because their transcoders are setten up in such a way that they will eventually give up on file if it takes too many resources.

leafandcoffee 4 years ago |

Recently been playing with this. Using FFmpeg to generate videos from a series of stills, I assumed a frame rate of 1 and a fixed video length would be suitable... Turns out a lot of players are very particular about how they like their files to be set up. Windows couldn't open the file, VLC could, Google couldn't generate thumbnails, but could show the video. Playing them on a Pi led to more fun and games.

In the end I just encoded them the 'correct' way, but it was eye opening to the wildness going on in video files. I just assumed I would be able to set a duration, a frame rate, and things would "work".

galad87 4 years ago |

And wait until you get to the edit lists part of the MP4 specs. Some real powerful stuff in that.

_Gyan_ 4 years ago |

With ffmpeg, use the below command to calculate maximum timestamps

    ffmpeg -i INPUT -map 0:v -map 0:a -enc_time_base -1 -c copy -f null -

Note the time= value at the end of the process.

Ginden 4 years ago |

I'm a bit surprised about the lack of a financial reward: it can realistically be used to take down processing servers in a rather simple DDoS attack.

ademarre 4 years ago |

Earlier this year people were setting false video metadata to bypass TikTok's duration limit and upload very long videos.

jaimex2 4 years ago |

Zip bombed Youtube. Nice.

phit_ 4 years ago |

looks like Discord is vulnerable to this too, oopsie

LinuxBender 4 years ago | |

Not Discord, but the default player is vulnerable to many different crash shenanigans. I get them sent to me all the time to look into and it's usually just people using bogus timestamps, bogus seek times or concatenating multiple videos of different resolutions/rates that the player can't handle. If there was a way to get Discord to spawn VLC for playing videos by default this would be less of a problem.

ronsor 4 years ago | | |

> get discord to spawn VLC

So rather than loading the bogus videos in a sandboxed Chromium instance, you want to load them in an unsandboxed VLC instance? I smell eventual RCE.

LinuxBender 4 years ago | | |

Yes.

- VLC has decades of battle hardening and entirely discards all the aforementioned nonsense. In a perfect world, both Discord and VLC would be sandboxed themselves, but I accept that this world is far from perfect. Discord could at least sanitize anything that strays from a filename when passed to VLC.

- Discord is already vulnerable to crashes from multimedia. This has been a long running problem that has not been resolved by sandboxing in Electron. The folks at Discord will not be able to resolve this with code changes in Electron AFAIK. If you can crash it, there is potential for an RCE. What that RCE can effectively accomplish will entirely depend on sandboxing boundaries external to the application, not sandboxing within the application.

In reference to sandboxing, I could make a document that explains how to enable the OS wide sandboxing features of Windows 10 [1] VirtualSecureMode / DeviceGuard / CredentialGuard and Linux SELinux / AppArmor. I don't have one for MacOS. I should add, don't enable the Windows 10 security features if you depend on any virtualization outside of Hyper-V. Enabling those will break all hypervisors that don't rhyme with Hyper-V.

I should add that my solution for Discord is to not preview videos or play them in the client. I click on the links and VLC plays them but that is not the default behavior of the application.

[1] - https://techcommunity.microsoft.com/t5/iis-support-blog/wind...

foepys 4 years ago | | |

Aren't quite a few Android security fixes every month related to the media framework? Are those not severe in a browser context because it's sandboxed?

jhgg 4 years ago | |

We don't transcode video, so no.

deathanatos 4 years ago | | |

I presume you're Discord eng. You must do some sort of pass or parse of it, because every now and then I'll upload something and it will fail to process and result in what I'll call "the sad Discord poop"…

jhgg 4 years ago | | |

We try to grab the first frame to show a preview. But if we can't for whatever reason, that's when the sad poop appears :(

phit_ 4 years ago | | |

the player is malfunctioning anyway, similarly to those videos that report short runtime and then go on forever that get passed around quite frequently

jhgg 4 years ago | | |

This is how the video element works in chromium. I suspect it looks at the same metadata field. Beyond leading to a bit of absurd UI state though it's not the same kind of issue that this post describes, which deals with trying to transcode these kinds of videos which could multiply storage utilization on the backend.

warent 4 years ago |

Folks, just because the author wasn't wearing a Guy Fawkes mask with a black hoodie and made no mention of gaining access to the Central Meme Database, that doesn't mean they weren't hacking.

They were hacking around with MP4 muxers and YouTube. This is definitely the hacker spirit. The word doesn't need to be re-appropriated by Hollywood caricatures.

shultays 4 years ago | |

Does anyone else really use "hack" in the way HN uses it, ie with its original meaning?

For your average person a hacker is a person in a Guy Fawkes mask with a black hoodie who steals your Facebook password.

For people in the industry a "hack" is code that works but might be a placeholder or potentially dangerous. The author would want to write a better version of it but perhaps is not able to due to time or design constraints.

I don't think anyone other than HN is using "hacker" the way it was meant to be used; perhaps it is time to catch up with the times.

atherton33 4 years ago | | |

Literally everyone who uses the word "lifehack"

herbst 4 years ago | | |

To be fair, thanks to lifehacks, growth hacks and whatever, it seems "hack" is slowly fading back into its original meaning. At least from my POV.

fzzzy 4 years ago | | |

Growth hacking