"Nvidia is so far ahead that all the 4090s are nerfed to half speed"
(twitter.com)

207 points by BIackSwan 1 year ago | 176 comments

creato 1 year ago |

NVIDIA is obviously not above market segmentation via dubious means (see: driver limitations for consumer GPUs), but I think binning due to silicon defects is a more likely explanation in this case.

Some 4090s have this extra fp16 -> fp32 ALU disabled because it's defective on that chip.

Other 4090s have it disabled because it failed as an Ada 6000 for some other reason, but NVIDIA didn't want to make a 4095 SKU to sell it under.

Or if you generalize this for every fusable part of the chip: NVIDIA didn't want to make a 4094.997 SKU that only one person gets to buy (at what price?)

gary_0 1 year ago | |

Depending on who you ask, binning is segmentation. Generally demand isn't going to exactly match how the yields work out, so companies often take a bunch of perfectly good high-end chips, nerf them, and throw them in the cheapo version. You used to be able to (and still can, in some cases) take a low-end device and, if you'd won the chip lottery, "undo" the binning and have a perfectly functional high-end version. For some chips, almost all the nerfed ones had no defects. But manufacturers like nVidia hated it when customers pulled that trick, so they started making sure it was impossible.

creato 1 year ago | | |

> You used to be able to (and still can, in some cases) take a low-end device and, if you'd won the chip lottery, "undo" the binning and have a perfectly functional high-end version.

For the purposes you tested it, sure. Maybe some esoteric feature you don't use is broken. NVIDIA still can't sell it as the higher end SKU. The tests a chip maker runs to bin their chips are not the same tests you might run.

I'm sure chip makers make small adjustments to supply via binning to satisfy market demand, but if the "technical binning" is too far out of line from the "market binning", that's a lot of money left on the table that will get corrected sooner or later.

edit: And that correction might be in the form of removing redundancies from the chip design, rather than increasing the supply/lowering the price of higher end SKUs. The whole point here is, that's two sides of the same coin.

wmf 1 year ago | |

It's more likely that any defect in a core causes the whole core to be disabled. Especially in this case, where I assume the FP16 x FP16 -> FP32 path uses the same hardware as the FP16 x FP16 -> FP16 path.

danjl 1 year ago | | |

Exactly. They can easily sell more Ada 6000s, and I'm pretty sure they would do so rather than sell them for much less as 4090s.

m463 1 year ago | |

I think this is just like intel does.

Runs fast? i9. Slower? i7. Missing cores? i5. Slowest? i3.

perfect chips probably not only have all the cores working, they also run at low voltages so don't get as hot.

I wonder if they can figure out what parts of the chip run at what speeds, and disable the ones that run slow/hot

I'm pretty sure gpus are overclocked by vendors, so there must be some sort of binning either by the vendors or they buy binned parts. I'll bet if parts could go faster, you would have an ASUS/MSI/etc 4090-2x-max-$$$$

https://www.tomshardware.com/reviews/glossary-binning-defini...

Arrath 1 year ago | | |

I recall reading before that as yields improved over process maturation Intel has ended up binning faster passing chips as lower SKUs just to meet demand.

monocasa 1 year ago | |

I'm not sure that it's implemented as a separate fp16 ALU. There are cute ways to share logic between a dual fp16 ALU and a single fp32 ALU such that it's really just one ALU with those being different ops.

a_e_k 1 year ago | | |

As I understand it, that's how the original MMX got started. It was largely reusing the x87 ALU, but breaking the carry chains at the obvious points.
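A rough software analogue of "breaking the carry chains", for anyone who hasn't seen the trick: one wide adder does several narrow adds if carries are stopped at the lane boundaries. This is only a sketch of the idea, not a claim about how MMX or any Nvidia ALU is actually wired; `packed_add16` and the four-lane 64-bit layout are assumptions made for illustration.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
HIGH = 0x8000_8000_8000_8000   # top bit of each 16-bit lane
LOW = MASK64 ^ HIGH            # the lower 15 bits of each lane


def packed_add16(a: int, b: int) -> int:
    """Add four 16-bit lanes packed into 64-bit words, wrapping per lane."""
    # Add only the low 15 bits of each lane. Any carry lands in that lane's
    # own top bit, so it can never ripple into the neighbouring lane.
    partial = (a & LOW) + (b & LOW)
    # Fold the top bits back in with XOR (an add without carry-out).
    return (partial ^ ((a ^ b) & HIGH)) & MASK64


a = 0x0001_FFFF   # lanes (low to high): 0xFFFF, 0x0001, 0, 0
b = 0x0000_0001   # lanes (low to high): 0x0001, 0x0000, 0, 0
print(hex(packed_add16(a, b)))   # lane 0 wraps to 0, lane 1 stays 1
print(hex((a + b) & MASK64))     # a plain add lets the carry leak into lane 1
```

The same adder width serves both the wide op and the packed op; the only difference is whether carries are allowed across the lane boundaries.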

ashoeafoot 1 year ago | |

There must be a whole layer of "reroute if defective" plumbing in the drivers.

deaddodo 1 year ago | |

I don't even understand why binning non-defect cards is dubious.

It's like the logic people have on /r/pcmasterrace is that if they didn't bin, they would just release all 4090s at 4080 prices. No, there would just be less 4080s for people to buy. No chip maker is going to sell their chips at sub-market rates just because they engineered them to/have a fab that can produce them at very low defect rates.

Now, Nvidia certainly has done dubious things. They've hurt their partners (EVGA, you're missed), and skyrocketing the baseline GPU prices is scummy as hell. But binning isn't anything I necessarily consider dubious.

revnode 1 year ago | | |

> No, there would just be less 4080s for people to buy.

Sure, and more 4090s at a lower price.

thatfrenchguy 1 year ago | |

This starts as binning and ends up at down binning :)

modeless 1 year ago |

This is why Nvidia needs competition. I love the performance of their hardware and the quality of their drivers, but I don't love being their customer. They have a long, long history of price discrimination using techniques like this. Back in the day it was "workstation" graphics for CAD programs that they would nerf for consumer cards by downgrading various features of OpenGL.

Different markets, same techniques. It's in the company DNA. That and their aversion to open source drivers and various other bad decisions around open source support that make maintaining a working Linux GPU setup way harder than it should be even to this day.

jsiepkes 1 year ago |

There was a time when Intel seemed unbeatable. In 2000 they had a 500 billion USD valuation. That's almost a trillion dollars in today's (2024) USD. Today they are valued at 90 billion USD and Broadcom was thinking about buying them...

My point is these things don't seem to last in tech.

cosmic_cheese 1 year ago |

Between EVGA getting out of the Nvidia card business, Nvidia continuing to be problematic under Linux (even if that’s improving), all the nonsense with the new power connector, and the company’s general sliminess, I’m increasingly leaning towards an AMD (or potentially Intel) card for my next tower upgrade.

AMD and Intel might only be competing in the entry-to-midrange market sector but my needs aren’t likely to exceed what RX 8000 or next-gen Intel cards are capable of anyway.

harshreality 1 year ago |

I'm ambivalent about this sort of thing (or, as another example, Intel's CPUs many years ago that offered paid firmware upgrades to enable higher performance).

On one hand, it's very bad because it reduces economic output from the exact same input resources (materials and labor and r&d).

On the other hand, allowing market segmentation, and more profits from the higher segments, allows more progress and scaling for the next generation of parts (smaller process nodes aren't cheap, and neither is chip R&D).

itsthecourier 1 year ago | |

I want to add that Nvidia's sales are 90% data center and 10% gaming. The author is part of the 10% who weren't abandoned, and is complaining that they got a product at half the speed for a way lower price, instead of the same specs as the data center client at half the price.

man.

zeusk 1 year ago | | |

Or the 90% are charged an absurd markup, because clearly Nvidia can deliver the hardware for the 10% use case at $1000 and still make money on top, but they would rather charge the data centers 50k for the same product.

mensetmanusman 1 year ago | |

“On one hand, it's very bad because it reduces economic output from the exact same input resources (materials and labor and r&d).”

This is not true with the economies of scale in the semiconductor industry.

chillingeffect 1 year ago | |

Interesting no one considers the environmental impact. This creates tonnes of e-waste with a shortened useful life.

We should demand that it's unlockable after a certain time.

dotancohen 1 year ago | | |

Maybe not demand that it be unlockable, but rather if Nvidia were to provide a paid upgrade path to unlock these features that would help. They would need some way to prevent the open source drivers from accessing the features, though.

wmf 1 year ago | | |

E-waste is mostly a fake concept. By the time a 4090 has outlived its usefulness for gaming it's also likely that no one wants it for AI (if they ever did).

rbanffy 1 year ago |

Halving the clock also reduces heat dissipation and extends component life.

michaelt 1 year ago | |

In this case, the 2-slot RTX 6000 consumes 300 W whereas the "nerfed" 3.5-slot 4090 can draw 450 W.

So I don't think the nerfing here was to lower power consumption. It's just market segmentation to extract maximum $$$$ from ML workloads.

nvidia have always been pretty open about this stuff - they have EULA terms saying the GeForce drivers can't be used in data centres, software features like virtual GPUs that are only available on certain cards, difficult cooling that makes it hard to put several cards into the same case, awkward product lifecycles, contracts with server builders not to put gaming GPUs into workstations or servers, removal of nvlink, and so on.

rbanffy 1 year ago | | |

I didn’t say they don’t do artificial segmentation. I just noted that, in this case, it might have an upside for the user. There might also be some binning involved- maybe the parts failed as A300 parts.

beefnugs 1 year ago | |

Yeah, somebody knew the new power connectors were going to be sus, so halving the power was at least a somewhat safe thing to do.

chrsw 1 year ago | |

Yeah, this is far from new too

kijin 1 year ago |

Binning and market segmentation are not mutually exclusive. Of course they're going to put their best-performing chips in the most expensive segment.

sdwr 1 year ago | |

The difference is whether chips with no defects get artificially binned.

In a competitive market, if you have a surplus of top-tier chips, you lower prices and make a little more $$ selling more power.

With a monopoly (customers are still yours next upgrade cycle), giving customers more power now sabotages your future revenue.

kube-system 1 year ago | | |

I don't think anyone has ever gone to TSMC and said "hey we're short on our low end chips, can you lower your yields for a bit?"

tetrisgm 1 year ago |

Can this be fixed by removing the efuse or having a custom firmware?

xvfLJfx9 1 year ago | |

You can't lol. How do you wanna restore a blown fuse at the nanometer level INSIDE the GPU die? It's simply not possible.

By the way, AMD also uses fuse blowing, e.g. if you overclock some of their CPUs, to mark them as warranty voided. They give you a warning in the BIOS, and if you continue, a fuse inside the CPU gets blown that will permanently indicate that the CPU has been used for overclocking (and thus void the warranty).

bri3d 1 year ago | | |

> You can't lol. How do you wanna restore a blown fuse at the nanometer level INSIDE the GPU die? It's simply not possible.

I wouldn't dismiss this so aggressively.

Frequently (more frequently than not), efuses are simply used as configuration fields checked by firmware. If that firmware can be modified, the value of the efuse can be ignored. It's substantially easier to implement a fused feature as a bit in a big bitfield of "chicken bits" in one-time programmable memory than to try to physically fuse off an entire power or clock domain, which would border on physically irreversible (this is done sometimes, but only where strictly necessary and not often).
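To make the "configuration field checked by firmware" case concrete, here is a toy model. Everything in it is hypothetical: the bit names, the bit positions and the two firmware functions are invented for illustration and do not describe any real GPU's fuse map or firmware.

```python
# Hypothetical "chicken bits" in a one-time-programmable fuse word.
FUSE_HALF_RATE_FP32_ACCUM = 1 << 3   # invented name and position
FUSE_NVLINK_DISABLED = 1 << 7        # invented name and position


def blow(fuse_word: int, bit: int) -> int:
    """One-time-programmable memory: a bit can go 0 -> 1, never back."""
    return fuse_word | bit


def stock_firmware(fuse_word: int) -> dict:
    """Stock firmware reads the fuse word and honors every bit."""
    return {
        "full_rate_fp32_accum": not (fuse_word & FUSE_HALF_RATE_FP32_ACCUM),
        "nvlink": not (fuse_word & FUSE_NVLINK_DISABLED),
    }


def patched_firmware(fuse_word: int) -> dict:
    """Modified firmware: the fuse word is unchanged, it is just ignored."""
    return {"full_rate_fp32_accum": True, "nvlink": True}


fuses = blow(0, FUSE_HALF_RATE_FP32_ACCUM)
print(stock_firmware(fuses))     # feature reported as off
print(patched_firmware(fuses))   # same fuses, feature reported as on
```

The fuse itself stays blown in both cases. Whether the feature comes back depends entirely on whether the check lives in modifiable firmware or the hardware block was physically cut off, and the sketch only models the first case.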

guerrilla 1 year ago | | |

> How do you wanna restore a blown fuse at the nanometer level INSIDE the GPU die? It's simply not possible.

Bullshit. There will be hackers in the future who can do it in their garage. Just... not anytime soon.

> By the way, AMD also uses fuse blowing, e.g. if you overclock some of their CPUs, to mark them as warranty voided. They give you a warning in the BIOS, and if you continue, a fuse inside the CPU gets blown that will permanently indicate that the CPU has been used for overclocking (and thus void the warranty).

Emphasis on "some." You can buy plenty of CPUs from them made for overclocking.

MPSimmons 1 year ago | |

e-fuses (https://en.wikipedia.org/wiki/EFuse) are typically etched into the silicon as an exactly-once operation, meant to irrevocably set a configuration. Some devices, for instance, have an e-fuse that makes it impossible to change cryptographic trust signatures after the e-fuse has been blown.

sabareesh 1 year ago | | |

That's intriguing; one might assume that a key feature of eFuse would be the ability to reset easily. But I guess it could be implemented without it

NoPicklez 1 year ago |

There is competition, but the competition isn't winning 1st, 2nd or 3rd.

There was a time when Intel was top dog, and for a long time; now AMD is competing for 1st place.

AMD and/or other competitors will have their day, but today it's Nvidia.

stuckkeys 1 year ago |

This is so sketch. I have not seen any reports of misrepresentation, but I hope the EU looks into this.

devops99 1 year ago |

  https://xcancel.com/realGeorgeHotz/status/1868356459542770087

ryao 1 year ago |

Did they do this to the 3090 Ti too?

knowitnone 1 year ago |

So the next question: how do you deposit a small bit of metal to fix the fuse?

tayo42 1 year ago |

How do you go from that screenshot to this conclusion?

zamadatix 1 year ago | |

You go from the eFuse mentioned at the beginning to the conclusion in the screenshot at the end, not the other way around.

tayo42 1 year ago | | |

Am I missing something then? Is there some context to the linked tweet and screen shot that didn't come up?

h_tbob 1 year ago |

I wonder why everyone on here isn’t saying “copyright is stupid”. You know it grants a monopoly?

throwaway314155 1 year ago | |

What?

Uw5ssYPc 1 year ago |

Imagine running this card at full speed, with fully unlocked potential. What would happen to new tiny power connectors? I am betting insta-fire.

wmf 1 year ago | |

It would just throttle like it already does.

Uw5ssYPc 1 year ago | | |

Throttling means the performance gain would not be there. A (hypothetical) unthrottled GPU chip needs unthrottled power, and that is impossible with the current power design, unless melted connectors are not a concern.

sitzkrieg 1 year ago |

stock must go up

daxfohl 1 year ago |

Yeah but we can only use 10% of our brain too, so.

unethical_ban 1 year ago |

It's interesting how we the people (broadly) accept this practice in software and even some hardware, but not in other areas. Note how frustrated people are when you hear about "unlocking" sensors and services available on cars.

If a product is made, and the cost to provide that product is the same one way or the other but you cripple it to create segmentation, then that is greed. Period. Objectively. And if you're okay with that, then fine, no problem. Just don't try to tell me it isn't maximization of profit.

There are no heroes in the megacorp space, but it would be nice for AMD and Intel to bring Nvidia to heel.