Xbox 360 Architecture (copetti.org)
(works better with accessibility tools, like text to speech; and translators)
There are also new PDF and EPUB editions here: https://www.patreon.com/posts/67035520 (it's a public post, I just needed a place to upload those files without consuming my hosting's bandwidth).
If you spot a mistake, please log an issue on the repo (https://github.com/flipacholas/Architecture-of-consoles) so I can review it. Thanks!
Otherwise this is a great comprehensive rundown!
I was just recently talking to a coworker too about how I think the Xbox 360 was the first consumer device to have the following types of attacks done to it:
1. Hypervisor attack to then reboot the console into a newer system OS version to retain the vulnerable hypervisor but be able to play new games and get online. This required soldering a separate flash chip to hold the newer system files.
2. Fault injection (reset glitch hack) to attack the system's bootloader
As a teenager who learned programming/hacking by messing with the Xbox 360, I'm thankful that Nintendo is still keeping the dream alive for our next generation of hackers.
I wish I could remember enough to contribute, especially to the DEV/XBL online chapters.
If I ever recover my old /modding/ folder, I will reach out [OP].
I was pseudo-privy to a lot of non-public stuff that I think the public could benefit from now. With a decade having passed, I think it would be appropriate to share.
This was Frostbite/Battlefield 3 era. Good Times.
This is especially true for things you intend to compare. For example, I was awkwardly flipping between the "Xenon" and "Cell" tabs a lot, to compare the two block diagrams.
Then I think it got the red ring? That was the end for me. Only recently got into gaming a bit more now. I still like how consoles are mostly set and forget compared to pc based gaming. Looks like the ps4 is getting homebrew which is interesting as it is basically a pc?
Do the major manufacturers just copy each other?
In aerospace applications, it's very popular to have triple-redundant systems, which is why IBM designed an unusual 3-core PowerPC CPU. Later, when Microsoft came into the picture, they used the design with essentially minimal modifications.
I haven't followed the F-35's development that closely, so I have no idea what they ended up using, but it's an interesting rumor nonetheless.
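For anyone unfamiliar with the triple-redundancy idea mentioned above, here is a minimal sketch of a 2-of-3 majority voter. It is only a generic illustration of triple modular redundancy; the function name and the sample values are made up, and nothing here is a claim about how Xenon or any avionics computer actually works.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit matches at least two inputs."""
    return (a & b) | (a & c) | (b & c)


# Hypothetical scenario: three units compute the same word,
# and one of them suffers a single bit flip.
good = 0b1011_0010
faulty = good ^ 0b0000_1000  # flip one bit in one unit's result

# The voter masks the fault because the other two units still agree.
assert majority_vote(good, faulty, good) == good
```

The voter only masks a fault in one unit at a time; if two units fail in the same bit position, the wrong value wins the vote.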
I am curious though, other than one brief mention, why didn't you touch on the early hardware reliability issues at all?
Also, either the site has been swarmed and it's down right now, or the links in footers 117, 118, and 119 are bad.
I guess because they likely didn't have anything to do with the architecture? Or were there really reliability issues that were a result of the architecture, instead of physical/electrical hardware issues?
It's funny to me that IBM, a company not known for promoting fun and games, ended up providing the processors that powered the PS3, Xbox 360, and I think the GameCube and Wii, too.
Thank you very much for the deep dive, I imagine the research must have been exhausting (but also rewarding, I hope!).
Though by not as much of a dramatic margin (Nintendo stayed with IBM for the Wii U and Nvidia for the Switch), one could say AMD won the eighth- and ninth-gen console wars in a similar fashion.
Even with the advances in the M* chips from Apple, it seems the "chip wars" are basically over, at least on the design choice side of things.
The old screenshots of the dash bring back lots of memories. The releases were all named after European cities at the time; Berlin, Geneva, Stockholm, Madrid, etc.
Thanks for compiling this, the quality is excellent and the content is approachable.
It could also use a round of fact checking. Some of the info appears to be based on third-hand speculation.
https://www.gamasutra.com/view/feature/3687/sponsored_featur...
This led the Wii U to be able to do things like run Mass Effect 3 and Deus Ex better (arguably) than the PS3 and 360 most of the time. The Wii U was probably the better hardware platform in hindsight, but it came too late and the development tools were not as robust, so ports were just kind of afterthoughts.
The idea behind using it in Xenon without the SPUs was that, this being a console, high-perf code could be tailored specifically to this core's uarch, so the in-order nature wasn't the worst thing, and the gate savings are pretty huge.
Statements denoting speculation start with words like 'presumably' or 'it is assumed that'.
I also took an extra month to send the draft to many experts (some of whom are still active in the Xbox 360 scene) to gather feedback and make all the appropriate corrections. See the 'Changelog' at the end of the article.
I'm afraid I can only do so much. Just like you said, this is all voluntary work. I also keep a repo with all the manuscripts to correct all the mistakes after publishing any content (https://github.com/flipacholas/Architecture-of-consoles).
I don't know how else I could improve this, but I'm always open to suggestions.
I don't know what you mean with reorganization and editing.
The overwhelming tone of the comments here is positive. You took a complex subject, researched it deeply, and open sourced it. That’s so rare.
Ignore the haters, always.
I issued a minor correction on the GameCube article and you were extremely quick to correct it.
Op: if you’ve found something, please post it. Everyone benefits :)
You should ask yourself “why am I writing this document”, and then check to see if the document is achieving your goals. I am guessing your goal is “I want to serve as the definitive publicly available textbook for the Xbox 360”. If that’s your goal, I would split the doc into chapters by topic, and then edit the heck out of each chapter. Look at a model textbook like Computer Architecture: A Quantitative Approach, and see if you can imitate its style.
I think a good technical editor could help you edit this down to about 25% of the words, and still cover the same information.
The job is too big for drive-by code review style comments. You’re going to have to put in the work yourself if you want to improve this.
The book "The Race for a New Game Machine" goes over the timing pretty well.
> Showing the 'Xenon' revision (the first one), taken from my model from 2005. Xenon motherboards are also famous for being defective by design (they get too hot to play games with!). Remaining GDDR3 chips are found on the back.
Btw, I've replaced the bad links with archived ones. I triggered an archive of that site back in December just in case such a great source disappeared; I never expected it to happen just after releasing the article!
For context, Nintendo has always been weirdly quirky and low-buck when it comes to core silicon engineering. The Switch is a Tegra X1 in a trenchcoat, the SNES used a 65C816 at about half the clockspeed it needed to be[0] and had half the VRAM removed at the last minute, and the NES stole[1] the 6502 masks so they didn't have to pay MOS for legit chips. All of those design decisions were made purely to improve margins and genuinely constrained game developers in the process. "Lateral thinking with withered technology" is kind of just their thing.
At least now they're 100% on board with a silicon vendor with a sane roadmap, so they'll at least have a steady supply of backwards-compatible last-gen chips to repackage.
[0] At least it wasn't as slow as the Apple IIgs they pulled it from
[1] Technically legal as IC maskwork rights did not exist yet. This is also why decimal mode was removed - it was literally the only thing MOS had a patent on in the 6502 design.
I've never heard anything suggesting that video RAM was removed. AFAIK, the SNES was planned to have only 8KB of main RAM, which was increased to 128KB by release. I think any support for 128KB VRAM was for future proofing, like if the SNES's hardware was reused for arcade systems, or something.
Source: https://www-chrismcovell-com.translate.goog/secret/sp_sfcpro...
The Genesis's video chip can support 128KB video RAM as well, which, besides allowing a larger variety of tiles on screen, doubles DMA bandwidth. It was used in the System C2 arcade board. The Genesis was originally designed to use 64KB video RAM, but after hearing about the SNES, support for 128KB was added. Then they decided that the extra RAM didn't make enough of a difference to justify the cost, so they left it at 64KB.
Source: https://readonlymemory.vg/shop/book/sega-mega-drive-genesis-...
> which used the bespoke paired singles and locked cache line features of the GameCube's PPC 750 derivative
Where can I find some information on that specifically?
Narrator: It didn't solve anything.
Perhaps they do a good job on popular platforms like x86, because we can encode decades of experience, but not so great on brand new ones.
Our modern CPU cores have hundreds of instructions in flight at any one moment because of the depth of OoO execution they go to. You can only go that deep on OoO if you have the branch prediction accurate enough not to choke it.
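To put rough numbers on why accuracy matters that much, here is a back-of-the-envelope sketch. The assumptions are mine, not measurements: roughly one branch every 5 instructions (a common rule of thumb), every prediction independent of the others, and a hypothetical function name. It estimates the chance that a whole window of in-flight instructions contains no mispredicted branch.

```python
def clean_window_probability(accuracy: float, in_flight: int,
                             branch_every: int = 5) -> float:
    """Chance that no branch in the in-flight window was mispredicted.

    Assumes one branch per `branch_every` instructions and independent
    predictions; both are simplifications for illustration only.
    """
    branches = in_flight // branch_every
    return accuracy ** branches


# Compare a few illustrative accuracies for a 200-instruction window.
for acc in (0.90, 0.95, 0.99, 0.997):
    p = clean_window_probability(acc, in_flight=200)
    print(f"accuracy {acc:.3f}: P(no mispredict in window) = {p:.3f}")
```

Because the accuracy is raised to the number of branches in the window, small differences in per-branch accuracy compound quickly as the window gets deeper.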
I'd call the state of the field quite bad. For example, they do embarrassingly little to help you with the 2 main bottlenecks we've had for a long time: concurrency and data layout optimization. And even for a naive model (1 CPU / free memory) there is just so much potentially automatable manual toil in doing semantics-based transformations in perf work that it's not even funny.
A large part is using languages that don't support these kinds of optimizations. It's not just "C compiler improvements hit a wall"; the sentence continues "and we didn't develop & migrate to languages whose semantics allow optimizations". (There's a little of this in the GPU world, but the proprietary infighting there has produced a dev experience and app platform so bad that very few apps outside games venture there.)
There's a whole alternative path of processor history not taken, in which VLIW panned out instead of failing because of over-optimism about compiler optimizations.
LLVM and GCC both have models of out of order pipelines but other than making sure they don't stall the instruction decoder it's really hazy whether they actually do anything.
The processors themselves are designed around the code the compilers emit and vice versa.
Yep. For example, on this die shot of a Skylake-X core,[0] you can see the branch predictor is about the same area as a single vector execution port (about 8% of the non-cache area).
[0]: https://twitter.com/GPUsAreMagic/status/1256866465577394181
Zen in particular combines an L1 perceptron and L2 TAGE[0] predictor[1]. TAGE in particular requires an immense amount of silicon, but it has something like 99.7% prediction accuracy, which is... crazy. The perceptron predictor is almost as good: 99.5%.
I wrote a software TAGE predictor, but too bad it didn't perform as well as predicted (heh) by the authors of the paper.
[0]: https://doi.org/10.1145/2155620.2155635
[1]: https://fuse.wikichip.org/news/2458/a-look-at-the-amd-zen-2-...
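For readers who haven't seen one, here is a toy software perceptron predictor in the textbook style (one weight vector per table entry, dotted with a global history of +1/-1 outcomes). It is a sketch under my own assumptions: the table size, history length, PC hashing and training threshold constant are illustrative choices, weights are unbounded Python ints rather than saturating hardware counters, and it is in no way a model of what Zen implements. TAGE is considerably more involved and is not shown.

```python
class PerceptronPredictor:
    def __init__(self, history_len: int = 16, table_size: int = 256):
        self.table_size = table_size
        # weights[i][0] is the bias; the rest pair up with history bits.
        self.weights = [[0] * (history_len + 1) for _ in range(table_size)]
        self.history = [1] * history_len  # +1 = taken, -1 = not taken
        # Training threshold heuristic often quoted for this scheme.
        self.threshold = int(1.93 * history_len + 14)

    def _output(self, pc: int) -> int:
        w = self.weights[pc % self.table_size]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc: int) -> bool:
        return self._output(pc) >= 0

    def update(self, pc: int, taken: bool) -> None:
        y = self._output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is not confident yet.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            w = self.weights[pc % self.table_size]
            w[0] += t
            for i, hi in enumerate(self.history, start=1):
                w[i] += t * hi
        self.history = self.history[1:] + [t]


def accuracy(pattern, pc=0x40, rounds=200):
    """Replay a repeating taken/not-taken pattern for a single branch."""
    p = PerceptronPredictor()
    hits = total = 0
    for _ in range(rounds):
        for taken in pattern:
            hits += p.predict(pc) == taken
            total += 1
            p.update(pc, taken)
    return hits / total
```

Calling `accuracy([True, False])` replays a strictly alternating branch, which the predictor should learn once the history register fills with the pattern. Real workloads are far messier, so numbers from a toy like this say nothing about the accuracies quoted above.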