AI chipmaker Cerebras files for IPO (cnbc.com)
I don’t think their value add is simply “single wafer” with all other variables the same. In fact, I think the block and system that get the most out of that form factor are the secret sauce and are not as easily replicated, especially since the innovations are almost certainly protected by an enormous moat of patents and guarded by a legion of lawyers.
So, performance is iffy. Density for density’s sake doesn’t matter since clusters are power-limited.
"56x the size of H100 but only 64x the performance improvement"
Doesn't sound too shabby.
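Taking the two ratios quoted above at face value (they are the commenter's figures, not measurements), a quick sketch of what they imply per unit of silicon area:

```python
# Ratios as quoted in the comment above; nothing here is measured.
size_ratio = 56   # chip area relative to an H100
perf_ratio = 64   # claimed performance relative to an H100

# Performance per unit area, relative to an H100.
perf_per_area = perf_ratio / size_ratio
print(f"performance per unit area vs H100: {perf_per_area:.2f}x")
```

So on these numbers the wafer-scale part is slightly ahead per unit area, which says nothing yet about power or cost.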
It doesn't sound too bad for a 9-year-old company. Nvidia had a 20-year head start. I would expect that they will continue to shrink it and increase performance. At some point, that might become compelling?
Now, is Cerebras going to eventually beat Nvidia, or at least compete healthily with Nvidia and other tech titans in the general market or in a lucrative niche of it? No idea. That'd be a cool plot twist, but it's hard to say. But it's worth acknowledging that investing in a company and buying its products are two entirely separate decisions. Many of Silicon Valley's success stories are the result of people investing in what a company could become, not in a company that was already the best on the market, and if nothing else, Cerebras's approach is certainly novel and promising.
What are they?
Is this related to defects? Can't they disable parts of defective chip just like other CPUs do? Sounds cheaper than cutting up and packaging chips individually!
My recollection is that there's speculation Cerebras is building in significant duplicate features to account for defects. They can't “bin” their wafers in the same way as packaged chips. That will reduce total yield/utilization of the surface area.
The actual packaging steps are relatively low tech/cost compared to the semiconductor manufacturing. They're commonly outsourced somewhere like Malaysia or Thailand.
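A rough sketch of why built-in spares would matter, assuming defects land independently so the defect count per wafer is Poisson-distributed and each defect knocks out one replaceable unit. The defect density, wafer area, and spare counts are illustrative assumptions, not Cerebras figures.

```python
import math

def prob_usable(spares: int, expected_defects: float) -> float:
    """P(defect count <= spares) for a Poisson-distributed defect count."""
    term = math.exp(-expected_defects)  # P(exactly 0 defects)
    total = term
    for k in range(1, spares + 1):
        term *= expected_defects / k    # P(k) from P(k - 1)
        total += term
    return min(total, 1.0)              # guard against float round-off

# Assumed, illustrative numbers (not from any filing or datasheet).
DEFECTS_PER_CM2 = 0.1
WAFER_AREA_CM2 = 460.0
expected = DEFECTS_PER_CM2 * WAFER_AREA_CM2

for spares in (0, 50, 100):
    p = prob_usable(spares, expected)
    print(f"{spares:3d} spare units: P(wafer usable) = {p:.3g}")
```

With zero spares a whole-wafer part is essentially never defect-free under these assumptions, which is why the redundancy costs area but is hard to avoid.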
get extorted by nvidia sales people for a 2026 delivery date that gets pushed out if you say anything about it or decline cloud services
or another provider delivering earlier
that's what the market wants, and even then, who cares? this company is trying to IPO at what valuation? this article didn't say but the last valuation was like $1.5bn? so you mean a 300x of delta between this and Nvidia’s valuation if these guys get a handful of orders? ok
I'm sure they test it thoroughly. /s
On the other, Nvidia is worth 3trn so they can sell a pretty good dream of what success looks like to investors.
Personally I would expect them to get a valuation well above the $4bn from the 2021 round, despite the financials not coming close to justifying it.
The more concerning thing is just not having diversity of revenue, since most of it comes from G42.
$24.6M $78.7M $270M($136.4M)
Sounds like a rocketship. You also get a better Sharpe ratio if you take some money off the table in the form of leverage and put it in other firms within the industry, e.g. leveraging your NVDA shares and buying Cerebras.
Please don't do this. Sell your Nvidia shares and rebalance to Cerebras, whatever. But financially leveraging a high-multiple play to buy a correlated asset (which is also high multiple) is begging for a margin call. You may wind up having been right. But leverage forces you to be right and on time.
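To put a number on "right and on time": a minimal sketch of how far a leveraged portfolio can fall before the equity share hits the maintenance level. The 25% maintenance requirement and the position sizes are assumptions for illustration; real brokers set their own, often higher, requirements for concentrated positions.

```python
def drop_to_margin_call(equity: float, loan: float, maintenance: float) -> float:
    """Fractional fall in portfolio value at which equity/value equals `maintenance`."""
    value = equity + loan
    # Margin call when (value' - loan) / value' < maintenance,
    # i.e. when value' < loan / (1 - maintenance).
    trigger_value = loan / (1.0 - maintenance)
    return max(0.0, 1.0 - trigger_value / value)

MAINTENANCE = 0.25  # assumed maintenance margin
EQUITY = 100.0      # hypothetical starting equity

for loan in (0.0, 50.0, 100.0):
    drop = drop_to_margin_call(EQUITY, loan, MAINTENANCE)
    print(f"borrow {loan:5.1f} against {EQUITY:.0f}: margin call after a {drop:.0%} fall")
```

Because the borrowed money goes into a correlated asset, both legs tend to fall together, so the whole portfolio moves toward that threshold at once.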
I don't think they did expect that.
https://www.zdnet.com/article/cerebras-did-not-spend-one-min...
To put my money where my mouth is I’m long TSMC and ASML among others, and (moderately) short NVidia. Very long the industry as a whole though.
"The documents also say that a single customer, G42, accounted for 83% of revenue in 2023 and 87% in the first half of 2024."
https://www.eetimes.com/cerebras-ipo-paperwork-sheds-light-o...
Their idea is to have 44 GB SRAM per chip. SRAM is _very_expensive_ compared to DRAM (about two orders of magnitude).
It's easy to design a larger chip. What determines the price/performance ratio are things like:
- performance per chip area.
- yield per chip area.
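The second bullet can be made concrete with the textbook Poisson yield model, where the chance of a defect-free die falls exponentially with its area. This is only a sketch: the defect density and die areas are assumed for illustration, and real foundry yield models are more involved.

```python
import math

def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Probability that a die of the given area has zero defects."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

DEFECTS_PER_CM2 = 0.1  # assumed defect density, for illustration only

for area in (1.0, 4.0, 8.0):  # hypothetical die areas in cm^2
    y = poisson_yield(area, DEFECTS_PER_CM2)
    print(f"{area:4.1f} cm^2 die: defect-free yield {y:.1%}")
```

Under this model the usable fraction of the wafer drops as dies get bigger unless the design can tolerate defects.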
"The cross scribe line wiring has been developed by Cerebras in partnership with TSMC. TSMC allowed us to use the scribe lines for tens of thousands of wires. We were also allowed to create certain keep-out zones with no TSCM test structures where we could embed Cerebras technology. The short wires (inter-die spacing is less than a millimeter) enable ultra-high bandwidth with low latency. The wire pitch is also comparable to on-die, so we can run the inter-die wires at the same clock as the normal wires, with no expensive serialization/deserialization. The overheads and performance of this homogeneous communication are far more attractive than those of multi-chip systems that involve communication through package boundaries, transceivers, connecters or cables, and communication software interfaces."
My guess is there are a lot of bespoke limitations that the software has to work around to run on a "whole wafer" chip, and even companies that have 99% similar designs to Nvidia already are struggling to deal with software incompatibilities, even with such a tiny difference.
I have never heard of any models trained on this hardware. How does a company IPO on the basis of having the "best tech" in this industry, when all the top models are trained on other hardware?
It just doesn't add up.
Tesla IPOed in 2010 after selling only a few hundred Roadsters.
From the article
>Cerebras had a net loss of $66.6 million in the first six months of 2024 on $136.4 million in sales, according to the filing.
That doesn't sound very good.
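For scale, the net margin implied by the two figures quoted from the filing, as a quick sketch (only those two numbers are used):

```python
# First-half 2024 figures as quoted above, in millions of USD.
net_loss = 66.6
sales = 136.4

net_margin = -net_loss / sales
print(f"net margin, H1 2024: {net_margin:.1%}")
```

That is a loss of roughly 49 cents per dollar of sales.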
What makes them think they can compete with Nvidia, and why IPO right now?
Are they trying to get government money to make chip fabs like Intel or something?
It’s fascinating.
"...next we have this rubber sheet, which is very clever, and very patented!"
Wow 200k amps in a chip. Whole thing looks like an early computer from 50s.
And especially don't tell them to start looking into who "sovereign clouds" actually are!
https://www.servethehome.com/cerebras-wafer-scale-engine-ai-....
I might be a bit suspicious if a company in some low-capital-intensive industry was IPOing while unprofitable, but this is chip making. Even if they're not making their own fabs this is still an industry with high capital requirements.
We should be thrilled at a company actually using an IPO for its original intended purpose as opposed to some financialization scheme.
They don't just make accelerators, they'll sell you the hardware too (unlike TPUs). They don't just sell you the hardware, the software ecosystem will work too (unlike AMD or Intel). That hardware won't just do a lot of computations, it'll also have a lot of off-chip memory bandwidth (vs Cerebras or others). Need to embed those capabilities in a device that can't fit a wafer cabinet or a server rack's worth of compute? Nvidia will sell you similar hardware that uses a similar stack, certified for your industry (e.g. automotive). Take any of that away and you're left with a significantly weaker offering.
Also they benefit from the priority of paying fabs a lot of money and placing a lot of orders.
If anything, Nvidia is less dominant than they should be because they've managed to ensure absolutely no one wants to buy from them when there are viable alternatives.
Yes, but you also need a lot of capital if you want node parity with them. Nvidia (supposedly) spent an estimated $9 billion getting onto TSMC's 4nm node. https://www.techspot.com/news/93490-nvidia-reportedly-spent-...
They get their chips from the same company that Nvidia does.
It's not necessarily to TSMC's advantage for Nvidia to become a monopolist either, although they wouldn't be totally dependent on Nvidia even if they did because TSMC serves every chip market.
The actual design and R&D is still done by Nvidia, Cerebras, AMD, Groq, etc.
Think of TSMC like Kinko's - they do printing and fabrication, which are very low-margin businesses.
The main PMF for Cerebras is in simulations, drug discovery, and ofc ML.
As I've mentioned before on HN, Public-Private Drug Discovery and NatLab research has been a major driver for HPC over the past 20 years.
Similar to Tenstorrent, who chose GDDR instead of HBM, they thought production AI models wouldn't get bigger than GPT-3.5 due to cost.
2. The price of the shares in private markets has been steadily rising, so I think there is demand.
You may have a hugely profitable idea that could realize crazy gains over a 5 year horizon, but if you get margin called and liquidated in year 3, you'll end up with nothing.
The magic of investment is compound returns, not crazy leverage. Take some of the crazy Nvidia profits and reinvest it elsewhere where you expect geometric growth. Keep things decently diversified.
That might mean VCs are turning them down, yeah, but that’s just one of many possible factors into “where do we raise money”
The second generation of Arc is called Battlemage and the successor to Ponte Vecchio is Falcon Shores, and both are promised in 2025.
This is later than originally expected by a few months but no reason to think they've abandoned it.
If they do ever decide to do that, it wouldn't be because the first gen didn't make money but because they no longer think they can subsidize it long enough for it to become profitable, which can either be because they run out of money or because it isn't gaining market share at the rate they expected.
https://www.eng.biu.ac.il/fishale/files/2020/12/A-1-Mbit-Ful...
It's not a simple process at all but requires a lot of engineering and engineers to do it.
https://companiesmarketcap.com/usa/largest-companies-in-the-... https://companiesmarketcap.com/tsmc/marketcap/
It only became profitable NOW in the last 2-3 years.
Before that, foundry after foundry was shutting down or merging.
TSMC, UMC, Samsung, Intel Foundry Services, and GloFo are the last men standing after the severe contraction in the foundry model in the 2000s-2010s due to its extremely high upfront costs and lack of moat to prevent commoditization.
[1] https://www.macrotrends.net/stocks/charts/TSM/taiwan-semicon...
Almost every other foundry system died because of low net margins.
Software (and fabless hardware like chip design) is expected to have 60-70% gross margins or the ability to reach that.
Semiconductors is part of TMT just like Software or Telecom, and this has an impact on available liquidity.
This is why TSMC is heavily subsidized by the Taiwanese government.
What an amazingly reductive analogy :)
At the end of the day it's all about performance per dollar/TCO, too, not just raw perf. A standardized benchmark helps to evaluate that.
My guess is that they neglected the software component (hardware guys always disdain software) and have to bend over backwards to get their hardware to run specific models (and only those specific models) well. Or potentially common models don't run well because their cross-chip interconnect is too slow.
Artificial Analysis does good API provider inference benchmarking and has evaluated Cerebras, Groq, SambaNova, the many Nvidia-based solutions, etc. IMO it makes way more sense to benchmark actual usable endpoints rather than submit closed and modified implementations to MLCommons. Graphcore had the fastest BERT submission at one point (when BERT was relevant lol) and it didn't really move the needle at all.
Without optimized implementations their performance will look like shit, even if their chip were years ahead of the competition.
Building efficient implementations with an immature ecosystem and toolchain doesn't sound like a good time. But yeah, huge red flag. If they can't get their chip to perform there's no hope for customers.
“nvidia’s chip is better than yours. If you can’t make your software run well on nvidia’s chip, you have no hope of making it run well on your chip, least of all the first version of your chip.”
That’s why tinycorp is betting on a simple ML framework (tinygrad, which they develop and make available open source) whose promise is, due to the few operations needed by the framework: it’ll be very easy to get this software to run on a (eg your) new chip and then you can run ML workloads.
I’m not a (real) expert in the field but find the reasoning compelling. And it might be a good explanation for the competition for nvidia existing in hardware, but seemingly not in reality (ie including software that does something with it).
This sounds easy in theory, but in reality, based on current models, the implementations are often tuned to make them work fast on the chip. As an engineer in the ML compiler space, I think this idea of just using small primitives, which comes from the compiler / bytecode world, is not going to yield acceptable performance.
I would love to try some of the stuff I do with CUDA on AMD hardware to get some first-hand experience, but it's a tough sell: they are not as widely available to rent, and telling my boss to order a few GPUs so we can inspect that potential mess for ourselves is not convincing either.
When a foundry wishes to raise capital from the private or public markets, it's bucketed under TMT - which includes software and fabless hardware as well.
This means it's almost impossible to raise capital without a near monopoly and/or government support and intervention - which is what Taiwan did for TSMC and UMC - because the upfront costs are too high and the margins are much lower compared to other subsegments in the same sector.
This is why industrial subsidies like the CHIPS Act are enacted - to minimize the upfront cost of some very CapEx-heavy projects (which almost everything foundry-related is).
[1]https://cerebras.ai/blog/cerebras-architecture-deep-dive-fir...
[2]https://www.eenewseurope.com/en/raaam-signs-lead-licensee-fo...
> There are many efforts that are ultimately going in this direction, from Google's Tensorflow to the community project Aesara/PyTensor (née Theano) to the MLIR intermediate representation from the LLVM folks.
The various GPU companies (AMD, NVIDIA, Intel) are some of the largest contributors to MLIR, so saying that they're going in the direction of standardization is not wholly true. They're using MLIR as a way to share optimizations (really to stay at the cutting edge), but, unlike tinygrad, MLIR has a much higher-level overview of the whole computation, and the companies' backends will thus be able to optimize over the whole model.
If tinygrad were focused on MLIR's ecosystem I'd say they had a fighting chance of getting NVIDIA-like performance, but they're off doing their own thing.