Some 4090s have this extra fp16 -> fp32 ALU disabled because it's defective on that chip.
Other 4090s have it disabled because it failed as an Ada 6000 for some other reason, but NVIDIA didn't want to make a 4095 SKU to sell it under.
Or if you generalize this for every fusable part of the chip: NVIDIA didn't want to make a 4094.997 SKU that only one person gets to buy (at what price?)
For the purposes you tested it, sure. Maybe some esoteric feature you don't use is broken. NVIDIA still can't sell it as the higher end SKU. The tests a chip maker runs to bin their chips are not the same tests you might run.
I'm sure chip makers make small adjustments to supply via binning to satisfy market demand, but if the "technical binning" is too far out of line from the "market binning", that's a lot of money left on the table that will get corrected sooner or later.
edit: And that correction might be in the form of removing redundancies from the chip design, rather than increasing the supply/lowering the price of higher end SKUs. The whole point here is, that's two sides of the same coin.
Runs fast? i9. Slower? i7. Missing cores? i5. Slowest? i3.
Perfect chips probably not only have all their cores working, they also run at low voltages, so they don't get as hot.
I wonder if they can figure out what parts of the chip run at what speeds, and disable the ones that run slow/hot
I'm pretty sure gpus are overclocked by vendors, so there must be some sort of binning either by the vendors or they buy binned parts. I'll bet if parts could go faster, you would have an ASUS/MSI/etc 4090-2x-max-$$$$
https://www.tomshardware.com/reviews/glossary-binning-defini...
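To make the "runs fast? i9, missing cores? i5" rule of thumb concrete, here is a minimal sketch of a binning function. The `Die` fields, the 8-core die, and every threshold are made up for illustration; real binning flows test far more than two numbers and are not public.

```python
from dataclasses import dataclass


@dataclass
class Die:
    working_cores: int      # cores that passed test (hypothetical field)
    max_stable_ghz: float   # highest clock that passed test (hypothetical field)


def assign_bin(die: Die, total_cores: int = 8) -> str:
    """Map a tested die to a SKU tier. All thresholds are invented for illustration."""
    if die.max_stable_ghz < 4.0:
        return "i3"   # slowest dies, regardless of core count
    if die.working_cores < total_cores:
        return "i5"   # some cores defective, so they get fused off
    if die.max_stable_ghz >= 5.0:
        return "i9"   # all cores working and fast
    return "i7"       # all cores working, but slower
```

The point of the sketch is only that the SKU is a function of test results, and the manufacturer picks both the tests and the thresholds.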
The logic people have on /r/pcmasterrace seems to be that if they didn't bin, they would just release all 4090s at 4080 prices. No, there would just be fewer 4080s for people to buy. No chip maker is going to sell their chips at sub-market rates just because they engineered them to/have a fab that can produce them at very low defect rates.
Now, Nvidia certainly has done dubious things. They've hurt their partners (EVGA, you're missed), and skyrocketing the baseline GPU prices is scummy as hell. But binning isn't anything I necessarily consider dubious.
Sure, and more 4090s at a lower price.
Different markets, same techniques. It's in the company DNA. That and their aversion to open source drivers and various other bad decisions around open source support that make maintaining a working Linux GPU setup way harder than it should be even to this day.
My point is these things don't seem to last in tech.
AMD and Intel might only be competing in the entry-to-midrange market sector but my needs aren’t likely to exceed what RX 8000 or next-gen Intel cards are capable of anyway.
On one hand, it's very bad because it reduces economic output from the exact same input resources (materials and labor and r&d).
On the other hand, allowing market segmentation, and more profits from the higher segments, allows more progress and scaling for the next generation of parts (smaller process nodes aren't cheap, and neither is chip R&D).
man.
This is not true with the economies of scale in the semiconductor industry.
We should demand that it's unlockable after a certain time.
So I don't think the nerfing here was to lower power consumption. It's just market segmentation to extract maximum $$$$ from ML workloads.
nvidia have always been pretty open about this stuff - they have EULA terms saying the GeForce drivers can't be used in data centres, software features like virtual GPUs that are only available on certain cards, difficult cooling that makes it hard to put several cards into the same case, awkward product lifecycles, contracts with server builders not to put gaming GPUs into workstations or servers, removal of nvlink, and so on.
In a competitive market, if you have a surplus of top-tier chips, you lower prices and make a little more $$ selling more power.
With a monopoly (customers are still yours next upgrade cycle), giving customers more power now sabotages your future revenue.
By the way, AMD also uses fuse blowing if you e.g. overclock some of their CPUs to mark them as warranty voided. They give you a warning in the BIOS and if you resume a fuse inside the CPU gets blown that will permanently indicate that the CPU has been used for overclocking (and thus remove the warranty)
I wouldn't dismiss this so aggressively.
Frequently (more frequently than not), efuses are simply used as configuration fields checked by firmware. If that firmware can be modified, the value of the efuse can be ignored. It's substantially easier to implement a fused feature as a bit in a big bitfield of "chicken bits" in one-time programmable memory than to try to physically fuse off an entire power or clock domain, which would border on physically irreversible (this is done sometimes, but only where strictly necessary and not often).
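A minimal sketch of what "a bit in a big bitfield checked by firmware" means in practice. The bit names, bit positions, and the rate numbers are all hypothetical; this is not any vendor's actual fuse map.

```python
from enum import IntFlag


class ChickenBits(IntFlag):
    """Hypothetical one-time-programmable feature bits (not a real layout)."""
    HALF_RATE_FP16_FP32_ACC = 1 << 0   # assumed: set => throttle this path
    DISABLE_NVLINK = 1 << 1            # assumed name, illustration only
    DISABLE_VGPU = 1 << 2              # assumed name, illustration only


def blow(otp_word: int, bit: ChickenBits) -> int:
    """OTP bits only ever go from 0 to 1; there is no operation to clear one."""
    return otp_word | int(bit)


def fp16_fp32acc_rate(otp_word: int, full_rate: float) -> float:
    """Firmware-style check: the fuse is just a value the code chooses to honor."""
    if (otp_word & int(ChickenBits.HALF_RATE_FP16_FP32_ACC)) != 0:
        return full_rate / 2
    return full_rate


def patched_fp16_fp32acc_rate(otp_word: int, full_rate: float) -> float:
    """What modified firmware could do if the check lives only in software."""
    return full_rate
```

If the limit is enforced like `fp16_fp32acc_rate`, the fuse itself never needs to be "un-blown"; whether the firmware can be modified (signing, secure boot) is the real question.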
Bullshit. There will be hackers in the future who can do it in their garage. Just... not anytime soon.
> By the way, AMD also uses fuse blowing if you e.g. overclock some of their CPUs to mark them as warranty voided. They give you a warning in the BIOS and if you resume a fuse inside the CPU gets blown that will permanently indicate that the CPU has been used for overclocking (and thus remove the warranty)
Emphasis on "some." You can buy plenty of CPUs from them made for overclocking.
There was a time when Intel was top dog, and it stayed there for a long time; now AMD is competing for 1st place.
AMD and/or other competitors will have their day, but today it's Nvidia.
https://xcancel.com/realGeorgeHotz/status/1868356459542770087
If a product is made, and the cost to provide that product is the same one way or the other but you cripple it to create segmentation, then that is greed. Period. Objectively. And if you're okay with that, then fine, no problem. Just don't try to tell me it isn't maximization of profit.
There are no heroes in the megacorp space, but it would be nice for AMD and Intel to bring Nvidia to heel.
For this performance nerf, IDK, seems fine to me. Software companies do this all the time. Same piece of software but you have to pay to unlock features. I don't see why hardware should be any different.
Granted, even for the CAD nerf, it's a gray area. You pay for features, not for silicon, and NVidia is clear about what you have to pay for what features, so. But I'm a bit more biased on that one because my 10-person HW company had to spring for several of those workstation cards.
AI and crypto aside, this so-called price discrimination was what kept the company afloat and let it keep spending stupid money (according to many) on CUDA rather than on making gaming better.
And often when people say "X" needs competition, what they really want is just a cheaper price for the same thing. Would it be great if Nvidia had more competition? Absolutely. But is Nvidia failing to make progress and just milking everything it has? Absolutely not. They invested even more in CUDA, Large Die Size Correction Tooling, Assisted EDA Design and many more additions to their arsenal to build their moat.
I have also often found that when a successful founder is still working at a company, that company is pushed far harder by its founder than by whatever market force is driving it. So we needed competition for Intel in 2009 - 2021, Microsoft in 2000, and any company that just sits there and no longer improves or fails to execute.
Nvidia? They are doing fine, if not better than I could imagine. (I just wish they would spend a little more money to compete in the Mobile and Desktop Consumer SoC space. I guess that is coming soon.)
They'd probably be happy with competitive market pricing, which is unlikely to happen without a competitive market.
Nvidia's gross margins are extraordinary vs. AMD or intel margins (or even "overpriced" Apple.)
There is competition of a sort - for example at the low end with intel ARC and at the high end with AMD Instinct based supercomputers. However, competitors don't seem to be able to match the CUDA software platform, particularly for AI/ML. The deepening CUDA moat is hard for competitors to cross and for customers to escape.
In the cloud space it seems that Nvidia may face more competition with platforms like GCP/Cloud TPU and AWS/Trainium.
we're lucky they still do gaming by limiting the datacenter chips
it's like getting a ferrari speed limited for usd20,000 and then I complain I don't get the acceleration of a usd100,000 model. they sold the product cheaper, they cared, they adapted. I'm happy they are still improving year after year for the same dollar value
No, they don't.
That they are able to price discriminate this way is a sign that they are functionally a monopoly exercising pricing power, otherwise, they would be easily undercut in the market where they charge premium prices by a competitor.
What makes Nvidia seem unbeatable is that Nvidia does the best job on hardware design, does a good job on the software for the hardware and gets its designs out quickly such that they can charge a premium. By the time the competition makes a competitive design, Nvidia has the next generation ready to go. They seem to be trying to accelerate their pace to kill attempts to compete with them and so far, it is working.
Nvidia does not just do the same thing better in a new generation, but tries to fundamentally change the paradigm to obtain better than generational improvements across generations. That is how they introduced SIMT, tensor cores, FP8 and more recently FP4, just to name a few. While their competitors are still implementing the last round of improvements Nvidia made to the state of the art, Nvidia launches yet another round of improvements.
For example, Nvidia has had GPUs on the market with FP8 for two years. Intel just launched their B580 discrete GPUs and Lunar Lake CPUs with Xe2 cores. There is no FP8 support to be seen as far as I have been able to gather. Meanwhile, Nvidia will soon be launching its 50 series GPUs with FP4 support. AMD’s RDNA GPUs are not poised to gain FP8 until the yet to be released RDNA 4 and I have no idea when Intel’s ARC graphics will gain FP8. Apple’s recent M4 series does have FP8, but no FP4 support.
Things look less bad for Nvidia’s competitors in the enterprise market. CDNA 3 launched with FP8 support last year. Intel had Gaudi 2 with FP8 support around the same time as Nvidia, and even launched Gaudi 3. Then there is Tenstorrent with FP8 on the Wormhole processors that they released 6 months ago. However, FP4 support is nowhere to be seen with any of them and they will likely not release it until well after Nvidia, just like nearly all of them did with FP8. This is only naming a few companies too. There are many others in this sector that have not even touched FP8 yet.
In any case, I am sure that in a generation or two after Blackwell, Nvidia will have some other bright idea for changing the paradigm and its competition will lag behind in adopting it.
So far, I have only discussed compute. I have not even touched on graphics, where Nvidia has had many more innovations, on top of some of the compute oriented changes being beneficial to graphics too. Off the top of my head, Nvidia has had variable rate shading to improve rendering performance, ray tracing cores to reinvent rendering, tensor cores to enable upscaling (I did mention overlap between compute and graphics), optical flow accelerators to enable frame generation and likely others that I do not recall offhand. These are some of the improvements of the past 10 years and I am sure that the next 10 years will have more.
We do not see Nvidia’s competition put forward nearly as many paradigm changing ideas. For example, AMD did “smart access memory” more than a decade after it had been standardized as resizeable bar, which was definitely a contribution, but not one they invented. For something that they actually did invent, we need to look at HBM. I am not sure if they or anyone else I mentioned has done much else. Beyond the companies I mentioned, there are Groq and Cerebras (maybe Google too, but I am not sure) with their SRAM architectures, but that is about it as far as I know of companies implementing paradigm changing ideas in the same space.
I do not expect Nvidia to stop being a juggernaut until they run out of fresh ideas. They have produced so many ideas that I would not bet on them running out of new ideas any time soon. If I were to bet against them, I would have expected them to run out of ideas years ago, yet here we are.
Going back to the discussion of Intel seeming to be unbeatable in the past, they largely did the same thing better in each generation (with occasional ISA extensions), which was enough when they had a process advantage, but it was not enough when they lost their process advantage. The last time Intel tried to do something innovative in its core market, they gave us Itanium, and it was such a flop that they kept doing the same thing incrementally better ever since then. Losing their process advantage took away what put them on top.
This is the most important point. Everyone seems to think that Nvidia just rests on its laurels while everyone and their dog tries to catch up with it. This is just not how (good) business works.
In summary the software story was very surprisingly better than I expected (no Jax though).
It's able to get the best process node from /whoever is willing to sell it to Nvidia/: it's vulnerable (however unlikely) to something very similar -- a competitor with a process advantage.
BK failed to understand the moat Intel had was the Fab. The moat is now gone and so is the value.
Rich webapps hadn't been invented. Smartphones? If you're lucky your flip phone might have a colour screen. If you've got money to burn, you can insert a PCMCIA card into your Compaq iPAQ and try out this new "802.11b" thing. Java was... being Java.
Almost all the software out there - especially if it had a GUI, and a lot of it did - was distributed as binaries that only ran on x86.
I'm thinking about buying Nvidia
(this is bullshit)
And this goes down to consumer drivers too. I've sworn to myself that I'm not buying AMD for my next laptop, after endless instability issues with the graphics driver. I don't care how great and cheap and performant and whatever it is when I'm afraid to open Google Maps because it might kernel panic my machine.
AMD is definitely not perfect but I don't think it's fair to say they decided not to invest in software. Better late than never, and I'm hoping AMD learned their lesson.
Perhaps they were distracted by dismantling Intel's CPU hegemony? I wouldn't fault them for that, fighting 2 Goliaths simultaneously isn't a sound strategy.
The hardware is a bit finicky, but honestly I prefer a thing to just be broken and tricky as opposed to Nvidia intentionally making my life hard.
Edit: the only downside is that the hardware H.265 encoder is pretty bad. AV1 is fine though.
The drivers are unreliable, Adrenalin is buggy, slow, and bloated; AMD's cards have poor raytracing, and AMD's compute is a dumpster fire, especially on Windows; ROCm is a joke.
None of the LLM or Stability Matrix stuff works on AMD GPUs under Windows without substantial tweaking, and even then it's unreliable garbage, whereas the NVIDIA stuff Just Works.
If you don't care about any of that and just want "better than integrated graphics", especially if you're on Linux where you don't need to worry about the shitshow that is AMD Windows drivers - then sure, go for AMD - especially the cards that have been put on sale (don't pay MSRP for any AMD GPU, ever. They almost always rapidly discount.)
AMD simply does not have the care to compete with NVIDIA for the desktop market. They have barely a few percent of the desktop GPU market; they're interested in stuff like gaming consoles.
Intel are the only ones who will push AMD - and it will push them to either compete or let their product line stagnate and milk as much profit out of the AMD fanboys as they can.
With that out of the way, market segmentation is often good for budget customers, who, in the case of Nvidia GPUs, are gamers. They get GPUs that run their games just as well as the uncrippled model, for a much lower price. Without market segmentation, all the GPUs would go to Amazon, Microsoft, Google, etc... since they are the ones with the big budget, gamers will be left with GPUs they can't afford, and Nvidia with less profits as they will lose most of the market for gamers.
With market segmentation, Nvidia wins, gamers win, AI companies and miners lose. And I don't know about you, but I think that AI companies and miners deserve the premiums they pay.
It sounds stupid to pay for crippled hardware, but when buying a GPU, the silicon is only a small part of the price, the expensive part is all the R&D, and that cost is the same no matter how many chips they sell, and it makes sense to maximize these sales, and segmentation is how they do it without sacrificing their profits.
Of course, should AMD or Intel come back, they would do their own market segmentation too, in fact, they already do.
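The fixed-R&D argument above can be put into toy numbers. This is a sketch with entirely invented figures (buyer counts, willingness to pay, unit cost, R&D cost); it only shows the shape of the argument, not anything about Nvidia's actual costs.

```python
def profit(price_by_segment, segments, unit_cost, fixed_cost):
    """Total profit when each segment buys only if the price is within its budget.

    segments maps a name to (number_of_buyers, max_price_they_will_pay).
    """
    total = -fixed_cost  # R&D is paid regardless of how many chips sell
    for name, (buyers, willingness) in segments.items():
        price = price_by_segment[name]
        if price <= willingness:
            total += buyers * (price - unit_cost)
    return total


# All numbers below are hypothetical.
segments = {"datacenter": (10, 200), "gamer": (100, 20)}
unit_cost, fixed_cost = 1, 1000

one_high_price = profit({"datacenter": 200, "gamer": 200}, segments, unit_cost, fixed_cost)
one_low_price = profit({"datacenter": 20, "gamer": 20}, segments, unit_cost, fixed_cost)
segmented = profit({"datacenter": 200, "gamer": 20}, segments, unit_cost, fixed_cost)
```

With these made-up inputs, a single high price shuts gamers out (profit 990), a single low price leaves datacenter money on the table (1090), and segmentation serves both (2890). Crippling the cheap part is what stops the datacenter buyers from just paying 20.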
French polymath (economist, engineer, bureaucrat) Jules Dupuit famously described this concerning railway carriage accommodations and the parlous state of third-class carriages:
It is not because of the several thousand francs which they would have to spend to cover the third class wagons or to upholster the benches. ... [I]t would happily sacrifice this [expense] for the sake of its popularity.
Its goal is to stop the traveler who can pay for the second class trip from going third class. It hurts the poor not because it wants them to personally suffer, but to scare the rich.
<https://www.inc.com/bill-murphy-jr/why-does-air-travel-suck-...>
More on Dupuit:
<https://en.wikipedia.org/wiki/Jules_Dupuit>
Market segmentation by performance is a long-standing practice in the information technology world. IBM would degrade performance of its mainframes by ensuring that a certain fraction of CPU operations were no-ops (NOPs), meaning that for those clock cycles the system was not processing data. A service engineer would remove those limits on a higher lease fee (IBM leased rather than sold machines, ensuring a constant revenue stream). It's common practice in other areas to ship products with features installed but disabled and activated for only some paying customers.
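As a toy model of that kind of throttling (not a description of IBM's actual mechanism, which I'm not claiming to know in detail), imagine a machine where every Nth cycle is forced to be a no-op. The function name and the `nop_every` parameter are hypothetical.

```python
def useful_ops(total_cycles: int, nop_every: int) -> int:
    """Count cycles that do real work when every `nop_every`-th cycle is a forced NOP.

    nop_every=0 models the machine after the limiter has been removed.
    """
    if total_cycles < 0 or nop_every < 0:
        raise ValueError("cycle counts must be non-negative")
    done = 0
    for cycle in range(1, total_cycles + 1):
        if nop_every and cycle % nop_every == 0:
            continue  # throttled slot: the hardware idles by design
        done += 1
    return done


throttled = useful_ops(1000, 4)   # every 4th cycle wasted
unlocked = useful_ops(1000, 0)    # limiter removed
```

Same hardware, same clock; the only difference between the two "models" is the limiter setting.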
Another classic example: the difference between Microsoft Windows NT server and workstation was the restriction of two registry keys:
We have found that NTS and NTW have identical kernels; in fact, NT is a single operating system with two modes. Only two registry settings are needed to switch between these two modes in NT 4.0, and only one setting in NT 3.51. This is extremely significant, and calls into question the related legal limitations and costly upgrades that currently face NTW users.
"NVIDIA is so far ahead that all the 4090s are nerfed to half speed.
There's an eFuse blown on the AD102 die to halve the perf of FP16 with FP32 accumulate. RTX 6000 Ada is the same die w/o the blown fuse.
This is not binning, it's segmentation. Wish there was competition."
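For anyone wondering why "FP16 with FP32 accumulate" is the mode ML people care about: the inputs are half precision, but the running sum is kept in single precision. A small sketch using only the standard library (`struct`'s `e` and `f` formats) to emulate the rounding; it is a sequential illustration of the numerics, not a model of how tensor cores actually schedule the work.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_acc):
    """Dot product of fp16 inputs, rounding the running sum with `round_acc`."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = round_acc(acc + to_fp16(x) * to_fp16(y))
    return acc


a = [1.0] * 4096
b = [1.0] * 4096
fp16_acc = dot(a, b, to_fp16)   # accumulator kept in half precision
fp32_acc = dot(a, b, to_fp32)   # accumulator kept in single precision
```

Here the half-precision accumulator stalls at 2048 (above that, fp16 can no longer represent odd integers, so adding 1 rounds back down), while the single-precision accumulator reaches the correct 4096. That is why the FP32-accumulate path is the one worth segmenting on.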
Professionally I usually see OTP referred to as "fuses," "OTP," "straps," or "chicken bits," with the specific word "eFuse" reserved for the current-limiting device. But in popular media the trend seems the opposite.
A vast amount of code was only intended to compile and run on a single OS and architecture (circa 2000, that was usually x86 Win32; Unix was dying and Wintel had taken over the world). If some code needed to be ported to another platform, it was as good as a from-scratch re-write.
[0] in case you wanted to use the thing in Visual Basic, which you very well might.
"had". That's what helped prop up their monopoly but it didn't last. These days if can't run your software on another architecture, like ARM, you can run at least on AMD. AMD can basically run the same software as Intel. This isn't the situation for NVIDIA vs everyone else, so far.
You do not revert a blown e-fuse with a software update.
They probably learnt their lesson in the early 2000s, when users could do a bit of soldering and apply a patch to unlock more performance on cards that had been artificially throttled.
The years it took them to get their Linux drivers into a usable shape are another issue.
Decoding is much more deterministic, so speed and power efficiency are the main ways hardware decoders can differ.
But he also made sure that resting won't ever happen again. Andy Grove from Intel once said that only the paranoid survive, and I bet Jensen is the most paranoid CEO alive. You won't see him that way in public, because Jensen in public is Nvidia's marketer.
Nvidia also understood early on how important marketing and brand recognition are. They learned it the hard way with the utter failure of their first chip, the NV1, which was a technological masterpiece that no one wanted. With the bad GeForce FX, Nvidia even made marketing videos to make fun of themselves, and that helped. ATI didn't crush Nvidia as much as expected because of such activities, despite having the ultra superior Radeon 9700/9800 series back then.
~"Good fortune favors those who are prepared to take advantage of it", etc.
Given that the issue (or variants thereof, because there were at least 10 different workarounds to try) was somewhat widely reported, the time it took to get this fixed far exceeded anything I would consider tolerable.
Artificially restricting supply of high-end chips and increasing supply of mid-range chips by disabling fully functional cores is how chip makers preserve their pricing structure. Without doing this, market pressures would force prices down on high-end chips and cause lower bins to mostly disappear from the market, leaving the product line with lower overall margins and a PR nightmare every time a new generation launches with pricing reset back to the initially high levels.
As a rule of thumb: if a chip product line goes a whole year without having new SKUs show up with a higher percentage of cores enabled or higher clock speeds for the same core count, then the manufacturer is artificially restricting supply to make more of the lower-bin parts than naturally occur in the fab output.
> And that correction might be in the form of removing redundancies from the chip design, rather than increasing the supply/lowering the price of higher end SKUs.
Those two courses of action take place on completely different timescales. Disabling cores and other binning tricks can be implemented in no more than a few months. Adding a new chip with a different number of copies of the same IP blocks takes well over a year. Removing redundancy within an IP block (eg. by having fewer spare SRAM blocks for a cache of a fixed capacity) isn't going to happen within a single chip generation.
In the semiconductor world, corrections of any kind tend toward "later" rather than "sooner".
There were also ISA extensions. Even if Intel had trouble competing on existing code, they would often extend the ISA to gain a temporary advantage over their competitors by enabling developers to write more optimal code paths that would run only on Intel’s most recent CPUs. They have done less of that ever since the AVX-512 disaster, but Intel still is the one defining ISA extensions and it historically gained a short term advantage whenever it did.
Interestingly, the situation is somewhat inverted as of late given Intel’s failure to implement the AVX-512 family of extensions in consumer CPUs in a sane way, when AMD succeeded. Intel now is at a disadvantage to AMD because of its own ISA extension. They recently made AVX-10 to try to fix that, but it adds nothing that was not already in AVX-512, so AMD CPUs after Zen 3 would have equivalent code paths from AVX-512, even without implementing AVX-10.
That's where Nvidia learned to "optimize" the CUDA software path: single-threaded x87 FPU code on SSE2-capable CPUs.
https://arstechnica.com/gaming/2010/07/did-nvidia-cripple-it...
https://www.realworldtech.com/physx87/3/ "For Nvidia, decreasing the baseline CPU performance by using x87 instructions and a single thread makes GPUs look better."
They doubled down on that approach with 'GameWorks', crippling performance on non-Nvidia GPUs; Nvidia paid studios to include GameWorks in their games.
Nvidia is helping power the next generation of big brother government programs.
And we just gave them billions in tax dollars. Failing upwards...
They're cutting down 4090s into 4080s to fulfill a demand for a cheaper chip, while still supplying their premium option. Your fanciful world concept of things being sold for no/minimum margins is just that: fanciful.
Granted, it's Nvidia, and they've been featured in devices that were notoriously hackable, but also, it's not 2018 anymore.
Needless to say, people should understand when they buy an Nvidia card, they should fully expect to use Nvidia firmware, with whatever that entails.
EDIT: I'd really like to remember the name of this device. It was the same era as Blackberry releasing... the Storm? Some resistive-touch device with a physically clickable screen. Motorola Storm? I really wish I could recall. (sub-edit: I think the Storm was the Blackberry device. So something else...)
I'll take your bet on this. Silicon designers aren't unaware of this potential vulnerability, and if you want to prevent eFuses from being un-blown, you can design for that. I would place money on there not being any commercially viable way to restore an eFuse in a 4090 die at any point in the future. You can probably do it, but it would require millions of dollars in FIB and SEM equipment and likely would destroy the chip for any useful purpose.
Usually the only useful reason to attempt to recover, read, or un-blow fuses is to read out private keys built into chips.
The price tag and size of these things are what I'm talking about. SOME day it will get much cheaper and smaller. A 4090 will be useless at that point, but I still play with 8086s and vacuum tubes, so...
No point in betting though. We'll both be dead by then.
So, is the market really absurd?
Just because some billionaires are desperate for growth to grow their hundred billions into trillions outbidding each other does not mean that 90% of humanity cannot make use of ML running locally on cheaper GPUs.
Also, housing is a basic human right, whereas fast GPUs probably are not.
That's how this scheme works. The card is most likely not profitable at consumer price points. Without this segmentation, consumer cards would trail many years behind the performance of datacenter cards.
But clearly, lack of competition is one thing that supports whatever rent Nvidia seeks.
And what does pytorch et al. use under the hood? cuBLAS and cuDNN, proprietary libraries written by NVidia. That is where most of the heavy lifting is done. If you think that replicating the functionality and performance that these libraries provide is easy, feel free to apply for a job at NVidia or their competitors. It is pretty well paid.
Anyone that wants off the shelf parts at scale is going to turn to Nvidia.
I think there might have been one or two OMX devices in this era with locked bootloaders that weren't bypassed due to a lack of research, but I actually find this example a bit amusing: early Qualcomm Motorola Android phones were touted as "unhackable" due to their use of fuses (Qualcomm even went on a marketing pitch calling them "Q-Fuses"), but were extremely quickly unlocked using trivial TrustZone supervisor vulnerabilities (iirc, there was an SMC that literally had a write-what-where primitive in it).
I never claimed it was easy. I meant that, in my opinion, it is on the order of tens of millions of dollars of investment, not the trillion-dollar CUDA moat that people here describe.
Just because Ferrari might be capable of making that car for $20k, I don't have a fundamental right to demand it from them any more than I have a fundamental right to demand that you make me a sandwich right now for $5.
> they can afford
Before using the word "they" in a prescriptive sentence, think about whether you could substitute "I" and you would still be happy with it.
I have no issue selling into a competitive market, that’s just how things work for individuals. It’s only at the scale of countries and giant companies that the ability for anti competitive behavior really shows up.
What about all other stuff? i.e. maybe you or somebody else can "afford" to sell their labour at 10-80% of what they are paid?
That a competitive market drives prices to zero economic profit is a fairly basic result; no active measures besides the existence of competition are necessary for this.
> What about all other stuff?
Yes, this applies in all competitive markets. If it doesn't apply in a market, there is a constraint on competition causing it.
Yes, and that's not necessarily a good thing in all markets. Very low profit margins can result in less innovation and would certainly discourage companies from taking risks (basically by definition).
In fact, you can argue that something is really wrong in our governance if housing is a human right and yet there are people profiteering from how unaffordable it has become.
I am more appalled at how long it has taken big tech other than Google to standardize ML workloads and not be bound by CUDA.
There is, however, a political barrier to increasing the housing supply.
There have been a few different stories like this lately:
https://interestingengineering.com/videos/guy-builds-integra...
People said the same thing you're saying now about computers. You're just being silly and forgetting history.
You’re trivializing the challenge of modifying something that is on the order of 50nm wide and specifically designed to not be able to be tampered with.