Nvidia announces new AI chip for personal computers

Nvidia announces new AI chip for personal computers (bbc.com)

41 points by rishikeshs 3 hours ago | 56 comments

eigenspace 1 hour ago |

I'm surprised they released this thing. Brand perception is probably a lot more important to Nvidia than whatever sales they could get from this thing, and if it's basically just DGX Spark, it's likely to underwhelm.

I've heard there's still a large backlog of both software and hardware problems with the platform. The software problems could be fixed with time, but they'll still give a shitty first impression. I'd have thought Nvidia would just bury this and try again with a successor run of silicon with a new design.

This thing seems practically destined to just be a repeat of the Snapdragon laptop debacle.

fg137 1 hour ago | |

I cannot think why someone would run those workflows on a Windows laptop, unless someone has way too much money to spend.

bigfishrunning 1 hour ago | | |

> someone has way too much money to spend.

that's what nvidia is hoping for

thinkingtoilet 1 hour ago | | |

If the workload is offloaded to the chip, why would the host platform matter?

aseipp 1 hour ago |

The GB10 itself is pretty good and I love using mine for broad Linux development. But it's too expensive for consumer-level pricing, and even for the "prosumer" the price is pretty stiff. Even if they dropped the CX-7, halved the RAM, and shipped a smaller hard drive, would it be below, say, $2500 USD? I guess we'll see, but this variant is coming out pretty late so maybe it's just best to wait for the 2nd generation.

h14h 1 hour ago | |

This feels like getting a foot in the door to ensure Apple doesn't entirely eat Nvidia's lunch if AI inference workloads start to shift from cloud to local.

With MLX, Apple is building an answer to CUDA, and if people start switching from ChatGPT & Claude to some app that runs on their M5, suddenly Apple starts to look like Nvidia's biggest competitor.

If Nvidia doesn't have a pathway towards getting hardware into the hands of consumers, it could be a really difficult road ahead for them.

comandillos 1 hour ago |

So they have basically reused the same hardware as in the DGX Spark (GB10)... That chip isn't great for LLM inference actually.

https://www.techpowerup.com/gpu-specs/gb10.c4342 https://www.nvidia.com/en-us/products/rtx-spark/

general_reveal 1 hour ago | |

The RTX GPU laptops run very hot. Even though they are pound for pound better, they just run too hot for local LLM usage, for me at least. I prefer Macs for this. A lot of AMD cards also run cooler. I wonder if undervolting would help with smaller models and heat.

comandillos 32 minutes ago | | |

I mean the GB10 is pretty efficient for the power it has, but imho it is nowhere near the power efficiency of Apple Silicon (it was never intended to be a chip used for mobile devices). I guess this is kind of the move Apple made with the A12Z and the Mini but... the other way around?

I think it's gonna be another failure, as we're used to seeing in the PC market these days.

ewklwekl 1 hour ago | |

It is great for single-user, single-session inference. It is not a replacement for a graphics accelerator that runs several concurrent inference sessions in parallel.

Basically the same tradeoff as a Mac mini with unified memory.

joe_mamba 1 hour ago | |

>That chip isn't great for LLM inference actually.

Why do I have the feeling it's been intentionally made to be bad in order to get you on to their most expensive datacenter gear?

ekidd 1 hour ago | | |

It's probably more that LLM inference speed comes from having a large amount of fast RAM. And fast RAM is brutally expensive right now.

At this point, your cost-efficient options include used 3090s, "frankenrigs" using recycled data center cards, and a handful of "workstation" class cards, where the originally high margins and the long enterprise purchasing cycles have kept prices from going up too fast.

In contrast, a lot of these "personal" AI systems are basically a GPU-like core wired to larger amounts of slow RAM. Which is still semi-affordable. Generally speaking, they make for OK chatbots but extremely slow coding agents. Whereas you can run a modestly useful coding agent at reasonable speed on a 3090.

So yeah, a lot of these systems are a bit scammy. But not because it's a secret conspiracy to protect data center cards. Rather, there simply isn't enough fast RAM in the entire world. So they'll flog you disappointingly slow RAM instead.

TL;DR: Might be useful for some use cases, but benchmark very carefully.

Tiberium 2 hours ago |

Also see https://news.ycombinator.com/item?id=48352939

dom96 1 hour ago |

I’m getting more and more convinced that we will end up running LLMs on our personal computers. Which makes me wonder where Anthropic's and OpenAI's moats will come from.

orthoxerox 1 hour ago |

Well, it was only a matter of time, since both AMD and now Intel are switching to APUs. Nvidia could either cede the desktop GPU market to them, going all-in on AI datacenter chips, or it could challenge them.

Maybe the Nth time's the charm and Microsoft+Nvidia will manage to make Windows on ARM a viable platform.

xeyownt 1 hour ago |

Great! More pressure on fabs, so prices of standard GPUs will rise again.

Guess I need to postpone my gaming PC renewal to the end of 2030.

chris_money202 1 hour ago |

It’s a step in the right direction, but there’s still a long way to go in terms of smaller LLMs' abilities and hardware costs.

koolkao 1 hour ago |

Very exciting! Sounds like we're finally leaving x86 behind.

lanycrost 1 hour ago |

I'm waiting for powerful on-device LLMs; until then, this isn't worth it.

Hugsun 1 hour ago | |

Have you tried Qwen 3.6 or Gemma 4? They're not frontier level but certainly have their uses.

agnosticmantis 1 hour ago |

How would these compare to a MacBook Pro M5 in terms of performance and price?

PunchyHamster 1 hour ago |

The fact they advertise it as some step forward in PCs is outright bizarre.

It's just a worse Strix Halo, since you land squarely in the middle of Windows on ARM problems.

Iolaum 1 hour ago | |

Strix Halo chips have around 210+ GB/sec of GPU memory bandwidth, and announcements put the new Nvidia chip at around 300 GB/sec.

I'd say that is an improvement if you want to run local LLM inference. Still well below what you can achieve with Apple chips though.
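
For a rough sense of what those bandwidth numbers mean in practice, here is a back-of-the-envelope sketch. It assumes single-stream decoding of a dense model is memory-bound, so each generated token needs roughly one full read of the weights. The 20 GB model size is an assumed example, not a figure from the announcement, and the function name is made up for illustration. Real throughput will be lower (KV cache reads, compute limits, software overhead), and mixture-of-experts models behave differently.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes every generated token requires reading all model weights
    from memory once, so throughput <= bandwidth / model size.
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


# Assumed example: a dense model whose quantized weights take about 20 GB.
MODEL_SIZE_GB = 20.0

# Bandwidth figures are the approximate ones quoted in this comment.
for name, bandwidth in (("Strix Halo", 210.0), ("new Nvidia chip", 300.0)):
    bound = decode_tokens_per_sec(bandwidth, MODEL_SIZE_GB)
    print(f"{name}: ~{bound:.1f} tokens/s upper bound")
```

Under those assumptions the bound moves from about 10.5 to 15 tokens/s, so the gain is real but it scales only linearly with bandwidth.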

ocdtrekkie 1 hour ago |

The thing I think is really funny is that if this takes off, frontier model companies and datacenters will end up holding the bag, and as per usual after the last few tech hype cycles, NVIDIA will still be selling.

Eventually a lot of inference will get right-sized into something you affordably run yourself.

LatencyKills 1 hour ago |

First:

> "Our goal is to deliver unmetered intelligence to every home and every desk with Windows," said Satya Nadella, chairman and head of Microsoft.

Then:

> However, Ian Fogg, Research Director at industry analyst firm FDM CCS Insight said the change was "likely to come with a significant price tag" and Nvidia would be targeting "those looking for workstation-class performance".

So... not every desk with Windows.

pitched 1 hour ago | |

First, make it possible. Then, expand the market. The early adopters help pay R&D for later efforts. Every desk is a good goal, even if not hit by the first doodad.

It just feels too much like what they said about the Apple II and early Windows. A play at nostalgia instead of putting real thought into it.

LatencyKills 1 hour ago | | |

I was an engineer at both MS and Apple, and wholeheartedly agree with you.

My question is, what happens to the people who use RTX cards for gaming? This new solution isn't meant for that. Do they need an "AI accelerator" and a gaming-centric GPU?

cryo32 1 hour ago | |

I don’t know anyone other than a very small but vocal minority who will give a shit about this.

Even on the analytics side, most of the stuff is some shonky ass numpy or Excel gank.

I don’t know what the market is. I just can’t see it.

netdevphoenix 1 hour ago | |

The constant deliberate conflation of LLMs with general intelligence is so grating.

sylware 3 hours ago |

Did they tell Trump that if you don't use chips with the latest silicon process, machine learning will just take a bit more time and more energy, but it will happen anyway and at the same level of quality if the machine learning "recipes" and training data are close enough?

twobitshifter 1 hour ago | |

Right, the export controls are only forcing Chinese AI to innovate, build their own fabs, and make training and inference more efficient. The end game of this will be that NVIDIA chips won’t be wanted, because you can get a $50 Chinese chip running a ternary model that is competitive with Claude in English and is much better in Mandarin.

adrian_b 1 hour ago | | |

The US government has failed to learn from its own history.

Sixty years ago the US government forbade the export of fast computers to France, in the hope that this sanction would prevent the French from developing thermonuclear bombs.

The result was that the French state (which at that time was led by de Gaulle, not much less autocratically than China) subsidized some of its computer manufacturers, which previously could not compete with American companies like IBM and CDC, and also its semiconductor manufacturing industry, which had to provide the components for the locally-made computers.

Eventually, the French produced TTL circuits and mainframe computers made with them, and finally they also made thermonuclear bombs.

So the American "sanctions" against France were a complete failure and were great for the French semiconductor and computer industries.

Many years later, when the USA no longer had export restrictions towards France and the French state no longer protected its industry, the French integrated circuit and computer industries were greatly reduced, their companies either going bankrupt or being bought by or merged into multinational companies.

pitched 1 hour ago | | |

I would order that in a heartbeat. Even if it required proprietary Chinese-government drivers. I would try to segregate in a VM without internet or something. Please make this happen! Tokens cost too much in the current system.

asimovDev 1 hour ago |

>Lenovo, HP, Dell and Apple accounted for almost 75% of the world's PC market in the first three months of this year, according to research firm Gartner.

https://www.gartner.com/en/newsroom/press-releases/2026-4-10...