The sigmoids won't save you

The sigmoids won't save you (astralcodexten.com)

303 points by Tomte 2 days ago | 283 comments

stego-tech 2 days ago |

I felt the better takeaway from this was that it's impossible to know with certainty how long this will or will not continue, regardless of the data or models you're using, because anyone who could predict that accurately would be one of the richest people on the planet.

I don't know when (or if) AI will implode or succeed with any degree of provable certainty, because that's not my area of expertise. Rather, I can point out and discuss flaws in the common booster and doomer arguments, and identify problems neither side seems willing to discuss. That brings me cold comfort, but it's not enough to stake my money on one direction or another with any degree of certainty - thus I limit my exposure to specific companies, and target indices or funds that will see uplift if things go well, or minimize losses if things go pear-shaped.

I also think relying on such mathematics to justify a position in the first place is kind of silly, especially for technical people. Mathematical models work until they don't, at which point entirely new models must be designed to capture our new knowledge. On the other hand, logical arguments are more readily adapted to new data, and represent critical, rather than mathematical, thinking and reasoning.

Saying AI is going boom/bust because of sigmoids or Lindy's Law or what have you is not an argument, it's an excuse. The real argument is why those things may or may not emerge, and how we address their consequences inside and outside of AI through regulation, innovation, or policy.

bjackman 2 days ago | |

I think his agenda here is to point out that your probability distribution for AI outcomes should be broad (what you said), but most importantly: this means you must take seriously the possibility that we are gonna get superintelligence quite soon.

Basically a lot of people say "but isn't it also pretty likely that we DON'T get superintelligence?" And, yes, it is. But superintelligence being even a remotely plausible outcome is a big fucking deal. Your investment choices in that context are not important.

People really struggle to think rationally in the face of this shape of uncertainty.

torginus 1 day ago | | |

You want to go to the store to get ice cream. Ice cream is delicious and the value of eating ice cream is a small positive, let's say x. There's a one in ten million chance you'll get hit by a car on the way and die, and your life is infinitely precious. So the expected value of the ice cream is x times 1 = x, and the expected value of the accident is 1/10m times negative infinity, which is negative infinity. You are a rational person, so you don't go. In fact you don't do much of anything. Your value model of every activity has collapsed to a single value.

That's the problem with 'singularity' arguments. The people making them ignore the fact that the mathematical definition of the word means 'the model of outcomes collapses to a single value' therefore the model stops being useful, yet they somehow claim to be able to make predictions beyond the singularity. It's like those shitty Facebook math posts where they divide both sides of the equation by 0 (the fact hidden by some sleight of hand), to 'prove' that 2=1.

The formulation of the singularity involves putting outrageous values into the parameters of the model of reality, plus denominator ignorance, and then claiming to have 'rationally' determined that the consequences are too severe to ignore.
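To make the ice cream arithmetic concrete, here is a minimal Python sketch. The function name `expected_value`, the two activities, and all of the numbers are illustrative assumptions, not anything from the thread. It only shows that once one outcome is valued at negative infinity, every activity with a nonzero risk gets the same expected value, so nothing can be ranked.

```python
import math

def expected_value(benefit, p_death, value_of_death):
    """Benefit (received for certain) plus the probability-weighted cost of dying."""
    return benefit + p_death * value_of_death

# Hypothetical activities: (benefit, probability of a fatal accident).
activities = {
    "ice cream run": (1.0, 1e-7),
    "dream vacation": (1000.0, 1e-6),
}

# With an infinite cost, both activities collapse to the same value.
infinite = {name: expected_value(b, p, -math.inf)
            for name, (b, p) in activities.items()}

# With a large but finite cost (an assumed placeholder), they can be ranked again.
finite = {name: expected_value(b, p, -1e7)
          for name, (b, p) in activities.items()}

print(infinite)
print(finite)
```

With the infinite cost both entries are negative infinity; with the finite placeholder the vacation comes out ahead of the ice cream run, which is the kind of comparison the collapsed model can no longer make.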

coldtea 1 day ago | | |

>but most importantly: this means you must take seriously the possibility that we are gonna get superintelligence quite soon.

So, his point with all the demand for rigor is to end on a hand-waved leap of faith from "improved AI models" to the mythical "superintelligence"?

topherhunt 1 day ago | | |

I think his agenda / point is that, viewed from Lindy's Law, given the SOTA in 2026, superintelligent AI arriving soon is vastly more probable than not, right? To make the case that "sure, AI capability and intelligence have grown exponentially over the past several years, but don't worry, they're about to abruptly level off and in fact won't blatantly surpass human-level intelligence within the coming decades" seems to have a high burden of proof unless your model is less "sigmoid" and more "abrupt plateau".

stego-tech 2 days ago | | |

You’re 100% correct, which is why I opted for a broad investment approach rather than trying to pick “winners”.

My thought process RE: superintelligence/AGI is generally this:

* I personally don’t believe it’s likely to happen with silicon-based computing due to the immense power and resource costs involved just to get to where we are now; hence why I invest broadly to capitalize on what gains we actually attain using this current branch of AI research across all possible sectors and exposure rates

* If we do achieve AGI using silicon-based computing, its limited scale (requiring vast amounts of compute only deliverable via city-scale data centers) will limit its broader utility until more optimizations can be achieved or a superior compute platform delivered that improves access and dramatically lowers cost; again, investing broadly covers a general uplift rather than hoping for a specific winner

* If AGI is achieved, nobody - doomer or booster alike - will know what comes next other than complete and total destruction of existing societal structures or institutions. The stock market won’t explode with growth so much as immediately collapse from the disintegration of the consumptive base as a result of AGI quite literally annihilating a planet’s worth of jobs and associated business transactions. In this case, a broad spread protects me from harm by spreading the risk around; AGI will annihilate the market globally, but not all at once barring a significant global catastrophe instigated by it

* Which brings me to the worst outcome, where AGI follows the “if anybody builds it everyone dies” thought process: investment is irrelevant because we’re all fucked anyway.

And that’s just my investment approach. I’m too pragmatic to believe we’re at the bottom of the sigmoid curve, but too wise to begin guessing where we actually exist on it at present or how much is left in the current LLM-arm of AI research; I’m an IT dinosaur, not an AI scientist.

What I can point to is the continued demand destruction of consumer compute through higher costs and limited availability due to rampant AI speculation as proof that the harm is already here in a manner most weren’t predicting, while at the same time actual job displacement by AI is limited to the empty boasting of executives using it as a smoke screen for layoffs after RTO mandates failed to thin headcount sufficiently.

In the USA in particular, we’re facing a perfect storm of:

* consumer confidence collapse leading to a decline in spending on all goods, especially luxury ones, by all but the most monied demographics

* data center-driven cost increases (energy) and resource destruction (land, water, fossil fuel use)

* the eradication of government support for renewable energy that would’ve kept these costs in check

* the widening wealth gaps creating a new underclass not seen since before WW2

In other words, most of the discourse continues to revolve around hypotheticals of tomorrow rather than realities of today. That would be the lesson I’d hope more people take away from something like this, so we can finally begin addressing issues themselves rather than empty online circle jerking about who is right or wrong.

XorNot 1 day ago | | |

That's literally the singularity though - the point past which predictions are meaningless.

My "plan" is to hope for a benevolent intelligence that establishes a post-human government, and then enjoy post-scarcity society doing woodworking or something.

Billionaires should probably be more worried.

cellis 1 day ago | |

Not to argue your broader point, but how, exactly, would predicting this, in a macro sense, make you the “richest person on the planet”? A very uninformed person could tell you that “Cerebras will be the next NVIDIA”, put their life savings into it and in 10 years have 100 million. That's not going to make them anywhere close to the richest person on the planet. Unless you can do it in a micro sense, merely having the ability to “predict” the way the wind is blowing is meaningless.

par1970 1 day ago | |

Which doomer argument have you found what problem with?

btilly 2 days ago |

Lindy’s Law is an absolute gem, that I'm keeping.

If we don't understand the fundamental limits to any particular kind of trend, our default assumption should be that it will continue for about as long as it has gone on already.

We can, in fact, easily put a confidence interval on this. With 90% odds we're not in the first 5% of the trend or the last 5% of the trend. Therefore it will probably go on for between 1/19th as long and 19 times as long as it has gone on already, with a median of exactly as long as it has gone on so far.

This is deeply counterintuitive. When we expect something to last a finite time, every year it goes on, brings us a year closer to when it stops. But every year that it goes on properly brings the expectation that it will go on for a year longer still.

We're looking at a trend. We believe that it will be finite. Our intuition for that is that every year spent, is a year closer to the end. But our expectation becomes that every year spent, means that it will last yet another year more!

How can we apply that? A simple way is stocks. How long should we expect a rapidly growing company to continue growing rapidly?
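The 90% interval described above can be sketched in a few lines of Python. This assumes the usual framing behind the rule: that we are observing the trend at a uniformly random point in its total lifetime. The function name `lindy_interval` and the five-year example are hypothetical, chosen only for illustration.

```python
def lindy_interval(elapsed, confidence=0.90):
    """Remaining-duration interval, assuming we observe the trend at a
    uniformly random point of its total lifetime."""
    tail = (1.0 - confidence) / 2.0              # 5% in each tail for a 90% interval
    # If a fraction f of the lifetime has elapsed, remaining = elapsed * (1 - f) / f.
    shortest = elapsed * tail / (1.0 - tail)     # we are already 95% of the way through
    longest = elapsed * (1.0 - tail) / tail      # we are only 5% of the way in
    median = elapsed                             # f = 0.5: as long again as so far
    return shortest, median, longest

# Hypothetical company that has been growing rapidly for 5 years.
low, mid, high = lindy_interval(5.0)
print(f"{low:.2f} to {high:.1f} more years, median {mid:.1f}")
```

For five years of growth so far, this gives roughly a quarter of a year to 95 more years, with a median of five more years. The width of that range is the point: the rule is a default under ignorance, not a precise forecast.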

dreambuffer 2 days ago |

FYI: The author has predicted that "AGI" will be here in 1-2 years and has staked his public reputation on it. He is personally invested in trendlines being lindy rather than sigmoid.

I don't think you can use lindy on trends as if trends are static objects, but that's another conversation.

andy99 2 days ago |

AI has scaled well according to convenient measures. Neural networks have the property that whatever task you define, they can rapidly be trained to master it. We’re able to show that various tasks of increasing complication do not require intelligence and can be framed as autoregressive RL problems. I personally don’t think AI is any closer to sentient intelligence than LeNet; it’s almost trivially clear, since we know how it works. So we’re measuring something orthogonal, basically how well a universal function approximator can fit a function we define, given arbitrary computing power, and calling that progress. What will be really interesting is if we’re able to find a way to properly measure what they can’t do and what’s different about real intelligence.

Edit: in particular I don’t agree with

  But if someone claims that the trend toward increasing AI capabilities will never reach some particular scary level...

One has to agree that the benchmark results are getting “scarier”, which is not automatically implied by finding more goals to optimize for

stymaar 2 days ago |

I don't know when the sigmoid is going to kick in, but Nvidia's quarterly datacenter revenue has grown 15-fold over the past 3 years[1], and nobody, including Scott, believes this is sustainable for 3 more years; otherwise Nvidia's market cap would conservatively be at least an order of magnitude higher than it is.

Every exponential eventually becomes a sigmoid because exponential growth always exposes limiting factors that weren't limiting at the beginning. Silicon manufacturing had lots of room for high-margin customers like Nvidia even a year ago (by the mere virtue of outbidding lower-margin customers), but now it is mostly gone, and no amount of money will make fabs build themselves overnight.

[1]: https://stockanalysis.com/stocks/nvda/metrics/revenue-by-seg...
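For a sense of scale, the compounding behind the 15-fold figure can be worked out directly. This is a minimal sketch that takes the 15x-over-3-years number from the comment at face value; the function name `annual_factor` is hypothetical.

```python
def annual_factor(total_factor, years):
    """Constant yearly growth multiple that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)

past_factor = 15.0   # growth over the past 3 years, as quoted in the comment
years = 3

per_year = annual_factor(past_factor, years)

# If the same pace held for 3 more years, growth over all 6 years would compound.
combined = past_factor * past_factor

print(f"about {per_year:.2f}x per year; {combined:.0f}x over six years at the same pace")
```

In other words, 15x over three years is roughly 2.5x per year, and three more years at that pace would mean 225x the starting revenue, which is why the comment treats continuation as implausible.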

LarsDu88 2 days ago |

I think an interesting thing about recent AI developments is that it's all happening right as we hit the diminishing-returns side of another "exponential that's actually a sigmoid", which is Moore's law.

The naive expectation is that AI will slow down b/c Moore's law is coming to an end, but if you really think about the models and how they are currently implemented in silicon, they are still inefficient as hell.

At some point someone will build a tensor processing chip that replaces all the digital matmuls with analogue logamp matmuls, or some breakthrough in memristors will start breaking down the barrier between memory and compute.

With the right level of research funding in hardware, the ceiling for AI can be very high.

noosphr 2 days ago |

This article answers the question in the second paragraph then completely ignores the answer for the rest of it.

>My understanding is that this represents 3-4 “generations” of different technology (propellers, turbojets, etc). Each technology went through normal iterative improvement, then, when it reached its fundamental limits, got replaced by a better technology. The last technology, ramjets, reached its limit at about 3500 km/h, and there wasn’t the economic/regulatory will to develop anything better, so the record stands.

You don't have one sigmoid, you have multiple, each stacked on top of the last. Airplanes aren't just one technology; they are multiple technologies that happen to do the same thing.

Each one is following a sigmoid perfectly. It only looks exponential(ish) because of unpredictable discoveries that let you switch to another sigmoid that has a higher maximum potential.

The same is true in AI. If you used the same architecture as GPT2 today you'd be in for a bad time training a new frontier model. It's only because we have had dozens of breakthroughs that the capabilities of models have improved as much as they have.

That said, exponentials and sigmoids are the wrong model to use for growth. Growth is a differential equation. It has independent inputs, it has outputs, and some of those outputs are dependent inputs again through causal chains of arbitrary complexity. What happens depends entirely on what the specific DE that governs the given technology is. We can easily have a chaotic system with completely random booms and busts which have no deep fundamental rhyme or reason. We currently call that the economy.
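The stacked-sigmoid picture can be illustrated with a short Python sketch. The ceilings, midpoints, and rate below are made-up values for four hypothetical technology generations, not data from the article; the functions `logistic` and `stacked` are likewise illustrative names.

```python
import math

def logistic(t, ceiling, midpoint, rate):
    """A single sigmoid that saturates at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

# Assumed generations: each arrives 5 time units later with 4x the ceiling.
GENERATIONS = [(1.0, 0.0), (4.0, 5.0), (16.0, 10.0), (64.0, 15.0)]
RATE = 1.5

def stacked(t):
    """Frontier capability as the sum of every generation's sigmoid."""
    return sum(logistic(t, c, m, RATE) for c, m in GENERATIONS)

# Sample between generations: each step multiplies the total by a similar factor,
# even though every individual curve flattens out.
samples = [stacked(t) for t in (2.5, 7.5, 12.5)]
ratios = [later / earlier for earlier, later in zip(samples, samples[1:])]
print(samples, ratios)
```

Each component saturates, yet the sampled total grows by a roughly constant multiple per step, so it reads as exponential-ish for as long as new generations keep arriving. Nothing in the sum predicts whether a fifth generation shows up.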

gm678 2 days ago |

I don't know what the Y-axis is supposed to be on that Wharton AI capabilities graph, but I am not really convinced that Opus 4.6 has more than double the intelligence/capability/whatever of GPT 5.1 Max.

whatshisface 2 days ago |

If you want a model, here's one: LLMs have never demonstrated the ability to go obviously beyond interpolating their training data. It takes an army of paid data producers solving homework problems to give ChatGPT the ability to do your homework. All vibecoded apps that turned out to be successful could be put on a geological soil chart with other apps, probably on GitHub somewhere, at the corners. The prediction? They won't.

In this model, the exponential growth that everybody is freaking out about is only the realization of the modular software dream ("we'll only have to write an ORM once for all of human history!") and the sheer amount of knowledge in libraries.

It's at least falsifiable.

boxed 2 days ago | |

Just to play devil's advocate: are we sure humans have demonstrated the ability to go beyond their training data? Like.. are we sure-sure about that?

whatshisface 2 days ago | | |

I'm not asking for anything close enough to the boundary for questions like that to be difficult. There are some ML systems like AlphaGo that have crossed the line in specific domains. It's just that making self-play and online learning work for huge LLMs is highly non-obvious.

The idea is simply that the basic idea behind LLMs, that you're distilling the entropy out of the entire available world of text, is antithetical to creativity.

Further developing on the theme of self-play, humans have the ability to sense what we want (intellectually) and reach for it communally over thousands of years. It's an innate quality, and if AI starts participating (contrast to giving people psychosis) we will all be able to tell.

suddenlybananas 1 day ago | | |

Planes, trains, and automobiles are not natural features of the environment.

1attice 2 days ago | | |

Idk, I mean, Shakespeare never read Shakespeare, so, I mean, unless aliens?

rotis 1 day ago | | |

Are you suggesting intelligent design got us here?

ogogmad 1 day ago | |

See the recent breakthrough "Erdos Problem 1196" which experts couldn't solve for 60 years until ChatGPT Pro did. ChatGPT's key idea was the use of the "von Mangoldt function", which it showed could finally settle the problem. Terry Tao has condensed the AI's proof to around a page. The problem was well-known to experts (in the field of number theory) but it was a Large Language Model that ultimately solved it, which it did without human help!

jasongi 2 days ago |

The first sigmoid was transformers, which let us rapidly scale to our already abundant data until we tapped it out; the second is/was reasoning, which lets us scale to our available compute (and compute manufacturing capacity). Correct me if I'm wrong, but we don't have a candidate for the third sigmoid, and scaling inference is hitting real-world supply chain constraints: electricity and chips.

Short of a third sigmoid appearing in the ML CompSci space, perhaps in the form of ongoing, repeated step-optimisations which will also have diminishing returns, intelligence growth is now limited by a few scaling problems that have already been worked on for a very long time.

The first is transistors: counts have been doubling for decades, but Moore's Law has already plateaued and hit limits on energy efficiency, and simply building new fabs is not something that we can do exponentially. The other growth limiter is electricity: there is no exponential supply of fossil fuels or power plants. Although manufacturing has scaled, PV tech improvements are also plateauing, and while storage is getting cheaper, it's still not economical vs fossil fuels (meaning: when we have to switch to it, the growth slows down further), and we are unlikely to see battery efficiency sigmoid enough to maintain the AI sigmoid.

I don't mean to be bearish here. There's so much money sloshing around that we can afford to put the smartest people, using unlimited tokens, on the task of finding small, incremental gains on the CompSci side of things that will have large monetary payoffs - hopefully allowing further scaling and increased emergent abilities of LLMs. Maybe we can squeeze the algos for quite a while. But I don't see that maintaining the same level of exponential as unlocking unlimited data or maxxing out the world's energy/fab capacity for long.

And I don't see why this is a massive issue except for the people who want to have some god-like super AI? Frontier LLMs are genuinely magic. Not "won't delete your production database" magic, but definitely a massive productivity gain for competent knowledge workers.

foobar10000 2 days ago | |

We are multiple orders of magnitude away from Landauer limits - so next big thing in matmul could be photonic multipliers - there’s a bunch of them coming up in the next 3? years. So that’s a 2-4 order of magnitude improvement. Sigmoid?

ianm218 2 days ago | |

Reinforcement learning has become a huge portion of compute used during training runs [1] and synthetic data is letting us get lots more mileage out of the existing data. Additionally, there is lots of new, high quality data being created and collected each day. I think the "running out of data" thing was pretty poorly reported by mainstream media.

[1]. https://www.dwarkesh.com/p/dario-amodei-2

Brendinooo 2 days ago |

> then what is their model?

My mental model has been 3D computer graphics: doubling the polygon count had huge returns early on but delivered diminishing returns over time.

Ultimately, you can't make something look more realistic than real.

I don't know what the future holds, but the answer to the question "can LLMs be more realistic than real" will determine much about whether or not you think the curve will level off soon.

the8472 2 days ago | |

The equivalent bar in this domain would be human intelligence, and we already have growing lists of tasks where machines outperform humans. We even know of natural systems that outperform humans on some metrics, e.g. bird brains have higher neuron density than ours because evolution had to optimize more for weight.

kilpikaarna 1 day ago | |

In 3D graphics there's diminishing returns on investment for the technology itself, but the real limit is one of economics. How to create all those assets required by the rendering tech, and make your money back. Preferably while also keeping your customers interested long term, by not becoming risk averse.

In the same fashion, LLMs have to pay for themselves to keep the trendlines going. In a whole-systems sense, mind, not "$2000/month is cheaper than hiring a developer" while the rest of the economy collapses.

inglor_cz 2 days ago |

Hmmm, this is quite an interesting take by Scott.

Lindy's Law is not actually a law and many exact minds will be provoked by the very name; it also fails spectacularly in certain contexts (e.g. lifetime of a single organism, though not necessarily existence of entire species).

But at the same time, I am willing to take its invocation in the context of AI somewhat seriously. There is an international arms race with China, which has less compute, but more engineers and scientists. This sort of intellectual arms race does not exhaust itself easily.

A similar space race in the 1950s and 1960s progressed from the first unmanned spaceflight to a moonwalk in a mere 12 years, which is probably less than what it takes to approve a bicycle lane in Chicago now.

krupan 2 days ago | |

"There is an international arms race with China"

I keep seeing this. Where did it come from? Has China said that they intend to attack other countries using AI? Have other countries declared that they intend to attack China with AI?

Also, why does anyone believe that AI could actually be that dangerous, given its inherently unpredictable and unreliable performance? I would be terrified to rely on AI in a life or death situation.

inglor_cz 2 days ago | | |

It was a metaphor. I meant, and later clarified, an intellectual arms race.

BTW your handle is an actual Czech word, minus a diacritic sign ("křupan"), and a somewhat amusing one. It basically means hillbilly. Not that it matters, just FYI.

Anyway: AI will be used in military context, and it probably already is. Both for target acquisition and maybe even driving the weapon itself. As of now, the Ukrainians are almost certainly operating some AI-enabled killer drones.

aspenmartin 2 days ago | | |

AI in war is like Palantir's whole business model. You have a system that can effectively deal with ambiguity and has superhuman performance on reasoning plus superhuman physical abilities via embodiment…

Inherently unpredictable and unreliable performance is quite the feature of human beings as well.

dmbche 2 days ago | | |

https://www.forbes.com/sites/greatspeculations/2025/11/25/wh...

mitthrowaway2 2 days ago | |

It's not a law per se, but there are rules for reasoning under uncertainty to get the most out of what limited knowledge you have, and Lindy's law arises from that. To do better than Lindy's law requires having additional information about the problem beyond just the one data point.

OscarCunningham 2 days ago |

John D Cook gives more technical details here: "Trying to fit a logistic curve" https://www.johndcook.com/blog/2025/12/20/fit-logistic-curve...

philipallstar 2 days ago |

But they do explain the improvement of AI driving 2017-2021 vs 2022-2026.

dwa3592 1 day ago |

I am not disagreeing with the conclusion of the article, but the way they reached the conclusion is problematic (and wrong), and here is my nuanced explanation of that: there is no attention paid to the origin (0,0) in any of the plots in the article. To see a meaningful trend you have to be able to zoom out and zoom in, which means your origin is not a fixed point, ever. This also means that plots with fixed-point origins don't tell us anything important at a societal level.

For example: the flight airspeed plot starts at ~1900. Of course it should start there, because we did not have planes before that. But let's change the plot heading from airspeed to human speed, a.k.a. the speed at which humans could move. Now you can change the origin meaningfully, as we had chariots, horses, bicycles, and ships before that.

If you instead create a plot for the last 5000 years you would see that the speed at which humans are able to move is rising exponentially, from walking on foot in a radius of 1000-2000 feet in a thick jungle 5000 years ago to reentering Earth at 25000 miles/hour in 1969 (yeah, read that again). Even for AI, if you zoom out the plot to the last 70 years it will look exponential; if you zoom in to the last 2 months it will look absolutely flat. The point is that the whole sigmoid/exponential argument is a function of the origin (0,0).

janalsncm 2 days ago |

> What if you don’t fully understand the process? AI forecasters know some things (like how data centers work and how much it costs to build them). But they’re unsure about other things (researchers keep inventing new paradigms of data generation that get over data walls, but for how long?), and other things are entirely opaque (What is intelligence really? Why do scaling laws work? Might they just stop working at some point?) Is there anything you can do here?

This is the crux of the article. To a large extent continued progress depends on a stable increase in compute, an increase in training data, and an increase in good ideas to squeeze more out of both of them.

One calculation you could do is a survival function: for each of the above, how long before it is disrupted? For example, China could crack down on AI or invade Taiwan. Or data centers become politically unpopular in the US. Or, we could run out of great ideas. Very hard to predict.
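A minimal sketch of that survival calculation, under strong simplifying assumptions: each input fails independently at a constant annual hazard rate, so the chance that all of them are still intact after t years is a product of exponentials. The hazard values below are placeholders invented for illustration, not estimates of anything, and the function names are hypothetical.

```python
import math

# Placeholder annual hazard rates (NOT estimates): the chance per year that
# each input to continued progress gets disrupted.
hazards = {
    "compute growth disrupted": 0.05,
    "training data growth stalls": 0.03,
    "good ideas run out": 0.02,
}

def survival(t_years, rates):
    """Probability that no input has been disrupted after t_years,
    assuming independent, constant hazards."""
    return math.exp(-sum(rates.values()) * t_years)

def median_lifetime(rates):
    """Time at which the survival probability drops to one half."""
    return math.log(2.0) / sum(rates.values())

for t in (1, 5, 10):
    print(t, round(survival(t, hazards), 3))
print("median years:", round(median_lifetime(hazards), 2))
```

The structure is the useful part: hazards add, so the trend is only as durable as the sum of its risks. The hard problem, as the comment says, is that nobody knows the real rates.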

nathan_compton 2 days ago |

A lot of words to say "The initial part of a sigmoidal curve is not very informative about the parameters of the sigmoid function in question."
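That one-sentence summary can be checked numerically. The sketch below builds two logistic curves whose ceilings differ by 10x and shows that they are nearly indistinguishable early on. All parameter values and names (`K_SMALL`, `K_LARGE`, `MID_LARGE`, and so on) are assumptions picked for illustration.

```python
import math

def logistic(t, ceiling, midpoint, rate=1.0):
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

# Two hypothetical futures: the trend tops out at 100 or at 1000.
K_SMALL, MID_SMALL = 100.0, 10.0
K_LARGE = 1000.0
# Shift the larger curve's midpoint so both share the same early exponential:
# early on, logistic(t) is approximately ceiling * exp(t - midpoint).
MID_LARGE = MID_SMALL + math.log(K_LARGE / K_SMALL)

def relative_gap(t):
    small = logistic(t, K_SMALL, MID_SMALL)
    large = logistic(t, K_LARGE, MID_LARGE)
    return abs(large - small) / large

early_gap = max(relative_gap(t) for t in range(0, 6))   # observations at t = 0..5
late_ratio = logistic(20, K_LARGE, MID_LARGE) / logistic(20, K_SMALL, MID_SMALL)

print(f"largest early gap: {early_gap:.2%}, late ratio: {late_ratio:.1f}x")
```

Over the first six observations the two curves differ by less than one percent, yet they end up about a factor of ten apart. Any realistic measurement noise would swamp the early difference, which is why early data says so little about the ceiling.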

inglor_cz 2 days ago | |

That is true, but I generally enjoy reading a lot of words from Scott, who has a talent for writing.

The entire plot of the Lord of the Rings could probably be compressed into less than 10 kB of text too.

Edit: this seems to be a controversial comment, but IMHO a blog of Scott Alexander's type is an art form, not just a communication channel.

jeffreyrogers 2 days ago | | |

I find him more interesting when he talks about non-AI topics. Lots of other interesting people are like this too. I'd rather get my knowledge on AI from people who have unique insights into it. Scott has a lot of unique perspectives of his own, but his views on AI are bog-standard for his social group.

Staross 1 day ago |

You should really do a Bayesian fit for such predictions and give confidence intervals, it would probably show that the uncertainty is very high in these cases.

niemandhier 1 day ago |

If you look at problems that can be solved by reasoning in text form or maybe even images, I am more than willing to accept that we simply cannot know when the curve will level.

The situation is drastically different for problems that require interaction with the physical world to determine success.

As soon as you add a powerful simulator for physical problems to the self learning experience of the AI, you are extremely hampered by the large amount of needed computation.

zkmon 2 days ago |

The curve is a smoothed step curve (y=1 if x>1 otherwise 0). Nature doesn't allow any change to happen instantly at any order of rate of change. The curve is just a manifestation of a change with exponential smoothing of the sharp corners.

For example, when a car starts, its speed and acceleration become more than zero. But what about the rate of change at higher orders? It doesn't suddenly change from zero acceleration to non-zero. That means the car has a non-zero derivative at all orders. In other words, the movement is exponential. The same thing happens in reverse when the car reaches a constant speed.

jsmcgd 2 days ago |

> It’s true that birth rates must eventually flatten out and become sigmoid

All positive growth eventually flattens out and becomes sigmoid, but a lot of phenomena experience negative growth and nose dive. No gentle curve, but a hard kink and perfect flat line at zero. Forever. I think it would be a stretch to categorize that pattern as sigmoid. Predicting a sigmoid pattern for negative growth implies some sort of a soft landing (depending on your definition of soft).

We can think of many populations that are no longer with us. So just a caution about over applying this reasoning in the negative case.

Qem 2 days ago | |

> All positive growth eventually flattens out and becomes sigmoid, but a lot of phenomena experience negative growth and nose dive.

https://en.wikipedia.org/wiki/Seneca_effect

andai 2 days ago |

Well, curve shape aside, the high watermark might be lower than where it tapers off.

https://news.ycombinator.com/item?id=46199723

pron 2 days ago |

1. Scott Alexander is famous for writing about topics he knows little about. I'm glad to see he's found a subject he knows little about but so does everyone else.

2. What's even worse than predicting that some growth curve flattens before X happens is predicting it will flatten before X happens but after Y happens, which is what we see when it comes to AI in software development. Too many people predict that AI will be able to effectively write most software, replacing software engineers, yet not be able to replace the people who originate the ideas for the software or the people who use them. I see no reason why AI capability growth should stop after the point it's able to write air-traffic control or medical diagnosis software yet before the point where it's able to replace air traffic controllers and doctors.

3. While we don't know much about AI (or, indeed, intelligence in general), we do know something about computational complexity. Some predictions about "scary things" happening (the ones I'm guessing Alexander is alluding to, though I can't be certain) do hit known computational complexity limits. Most systems affecting people are nonlinear (from weather to the economy). Predicting them requires not intelligence but computational resources. Controlling them, similarly, requires not intelligence but either computational resources or other resources. It's possible that people choose to give control over resources to computers (although probably not enough to answer many tough, important questions), although given how some countries choose to give control to people with below-average intelligence (looking at you, America), I don't see why super-human intelligence (if such a thing even exists) would be, in itself, exceptionally risky.

ryeights 2 days ago | |

>1. Scott Alexander is famous for writing about topics he knows little about. I'm glad to see he's found a subject he knows little about but so does everyone else.

This is kinda laughable. Scott has been thinking and writing about AI for a long time

pron 1 day ago | | |

That's what I'm saying. I'm glad to see that he's found a subject where his lack of knowledge isn't a glaring handicap. When I come across his posts I usually feel uncomfortable because they read like a bright 4-year-old child trying hard to explain how a car works, only not cute (yeah, I know that trying to see what conclusions you can come to, Aristotle-like, from a basis of ignorance and without careful study is the whole point, but I never found this Memento-style game appealing).

dsign 2 days ago |

We did hit the sigmoid's plateau on airplane speed, but the applications of airplane speed are still coming (how fast can a Chinese company airship the PCB you ordered three minutes ago?). I expect the same will happen with LLMs, though I also happen to believe things are just getting started on end capabilities.

leoc 2 days ago |

Hmm. What’s the general belief about Toby Ord’s “Are the Costs of AI Agents Also Rising Exponentially?” https://www.tobyord.com/writing/hourly-costs-for-ai-agents among those who are well-equipped to judge? Is it seen as wrong or disproven or unlikely? Because if not—if indeed recent LLM capability advances have likely relied on increases in inference cost per run which can’t be much further sustained—then it seems remiss not to mention that if you point to those advances to claim that the exponential trend remains on track.

plomme 2 days ago |

I’m not saying he’s wrong about the core thesis here, but using Claude Opus 4.6 as a “mic drop” with a chart showing it being twice as good as the last model feels in my experience way off.

amelius 1 day ago |

The exponentials won't save you either. In fact it's more likely that a sigmoid takes over at some point.

devmor 2 days ago |

"Exponentials all tend to become sigmoids but you can't predict exactly when" is a true statement, but I'm not sure it needed an article.

This doesn't say much, and the author fights their own points a couple times, suggesting that they maybe didn't think through what they wanted to write until they were in the middle of writing it and started realizing their assumptions didn't match what they expected the data to say.

I really don't get the point of what I just read.

aspenmartin 2 days ago | |

The point is that the tiring argument from AI skeptics, “things are flattening, they have to,” is technically correct but says nothing, because no one knows when that will happen and we see no mechanism for it yet. Lindy’s law as a reasonable prediction under total uncertainty is interesting and insightful, and a lot of people don’t know about it or why it holds. I did enjoy the reference to this!

devmor 2 days ago | | |

But those skeptics are initially responding to the constant AI hype claims that we are exponentially growing to AGI. So this article is in fact just a (very poorly thought through) attempt at saying “nuh uh, the hype might be true, you can’t prove it’s not yet!”

solid_fuel 2 days ago | | |

Nah this is making a category error. You're assuming that AI skeptics agree that models are demonstrating intelligence along the same axis as humans and that with further improvement they will become equivalent to humans. I am an AI skeptic, and I disagree with this assessment.

Model reasoning is on an s-curve, which is improving.

Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.

See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted. Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs. Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans on the intelligence axis to replace all labor.

AvAn12 1 day ago |

Forecasts are a thought exercise, not the revelation of something foretold. Best thing to do is think of the outcome you wish for and then try to take whatever actions you can to help make it so. Like with climate change for example.

torginus 1 day ago | |

It can be both. In the field of high-technology (like semiconductor) manufacturing, the techniques for making chips with feature sizes many generations ahead of current SOTA exist in various states of completion, so extrapolation can be made from that.

But what's going on in here is not that - it's reading tea leaves for maximum dramatic effect.

mentalgear 1 day ago |

> Why do scaling laws work?

Strictly speaking, the original paradigm of scaling laws doesn't work any more. The assumption that we could achieve better performance simply through "vertical scaling" ie infusing models with exponentially more parameters and pre-training data, is no longer the driving force of AI progress.

Instead, the industry has pivoted toward inference-time scaling. Rather than relying solely on a massive, static neural network, modern architectures allocate more compute during the actual generation process, allowing the model to "think" and verify its logic dynamically.

Furthermore, the latest state-of-the-art models are no longer pure LLMs; they are compound neuro-symbolic systems that integrate external tools like REPLs, databases, and structured skill documentation to achieve things pure LLM vertical parameter scaling was not able to do.

yorwba 1 day ago | |

The "law" part of scaling laws is about predicting validation cross-entropy loss from the training configuration, analogous to physical laws allowing one to predict one quantity based on the measurement of another. Most scaling laws take the form of an irreducible error plus additional terms that asymptotically decay to zero. So the existence of a wall you can approach but not cross (the irreducible error) is an integral part of the scaling law paradigm. That it isn't economical to keep increasing model size to squeeze out a few more drops of cross-entropy doesn't mean scaling laws stopped working.

Strictly speaking, "Why do scaling laws work?" is a question about the theoretical reasons the asymptotic decay takes the particular mathematical shape that it does.
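
A minimal sketch of the functional form described above: an irreducible error plus power-law terms that decay to zero. The function name `scaling_law_loss` and every constant in it are illustrative assumptions, not fitted values from any paper or model, and the 20-tokens-per-parameter ratio in the loop is likewise just a placeholder.

```python
def scaling_law_loss(n_params, n_tokens,
                     irreducible=1.5, a=400.0, alpha=0.3, b=400.0, beta=0.3):
    """Hypothetical loss curve: a floor plus two terms that shrink with scale.

    All constants are made up for illustration; real scaling-law fits
    estimate them from training runs.
    """
    return irreducible + a / n_params ** alpha + b / n_tokens ** beta


if __name__ == "__main__":
    # Loss keeps falling as scale grows, but only ever approaches the floor.
    for n_params in (1e8, 1e9, 1e10, 1e11, 1e12):
        loss = scaling_law_loss(n_params, 20 * n_params)
        print(f"params={n_params:.0e}  loss={loss:.4f}  gap_to_floor={loss - 1.5:.4f}")
```

Under these assumed constants the gap to the floor shrinks with every tenfold increase in scale but never reaches zero, which is the "wall you can approach but not cross" in the comment.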

baxtr 2 days ago |

> The moral of the story is that, even though all exponentials eventually become sigmoids, this doesn’t necessarily happen at the exact moment you’re doing your analysis. Sometimes they stay exponential for much longer than that!

All exponentials eventually become sigmoids? Don’t think this can be true without qualifiers.

jvanderbot 2 days ago | |

All models are wrong, of course, but this is kind of "common sense" so it's not hard to accept as true in a natural system. How can something continue on exponential growth forever without reaching a new blocker that causes slowdown, or encountering pushback that makes it an oscillator? A pendulum looks exponential when it is at its peak and accelerating down.

The issue is that the exponential-looking part of the sigmoid might contain all of human history, sure, but most folks who espouse this theory probably agree that over time everything reaches a steady-enough state to be considered non-exponential, or become oscillatory.
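
To make the "exponential-looking part of the sigmoid" concrete, here is a small sketch comparing a pure exponential with a logistic curve that starts at the same value. The ceiling of 1000 and the growth rate of 1.0 are arbitrary assumptions chosen only to show the shape; the function names are hypothetical.

```python
import math


def exponential(t, rate=1.0):
    """Pure exponential growth starting at 1 when t = 0."""
    return math.exp(rate * t)


def logistic(t, rate=1.0, ceiling=1000.0):
    """Logistic (sigmoid) growth starting at 1 when t = 0, saturating at `ceiling`."""
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-rate * t))


def relative_gap(t, rate=1.0, ceiling=1000.0):
    """Fraction by which the exponential exceeds the logistic at time t."""
    exp_value = exponential(t, rate)
    return (exp_value - logistic(t, rate, ceiling)) / exp_value


if __name__ == "__main__":
    for t in (0, 1, 2, 4, 6, 8, 10):
        print(f"t={t:2d}  exp={exponential(t):10.1f}  "
              f"logistic={logistic(t):8.1f}  gap={relative_gap(t):.3f}")
```

Early on the two curves are nearly indistinguishable, so data from that region alone cannot tell you where the ceiling is. The gap only becomes obvious once the logistic curve is already bending.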

izucken 1 day ago |

Births is, sans miscalculation, a number that tracks exact events.

Is the "capability" number on these LLM strength graphs as tangible?

I think it would be interesting to visit a reality that obeys arbitrary abstractions, but I would personally never go there.

fph 1 day ago |

As a wise man once said, "anyone who believes exponential growth can go on forever in a finite world is either a madman or an economist".

jrflowers 2 days ago |

I like this article about how we should assume, at any given point, that we are exactly halfway through a phenomenon which relies on a single data point on a graph (that apparently doesn’t need its relevance or importance explained) to illustrate that this is obviously true for AI in particular

Smaug123 1 day ago | |

I think you may have read a different article from me. The thesis of the article is summarised at the end:

> But if someone claims that the trend toward [X] will never reach some particular scary level, then the burden is on them to explain either:

> If they’re not treating [X] as a black box, and claim to be modeling the dynamics explicitly, then what is their model? Have they calculated the obvious things…

> If they are treating [X] as a black box, why isn’t their default expectation based on Lindy’s Law?

Like, the whole point is that in real life we do actually know things about situations and can model them; we fall back to Lindy's law when we know nothing at all. Further, arguments have justification to deviate from Lindy only when they give specifics about the situation they're modelling.
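
One common way to motivate the "know nothing at all" fallback is a simulation. It is sketched here under a loudly stated assumption: that the moment you observe a phenomenon is uniformly distributed over its total lifetime. That assumption is mine for illustration, not something the article or this comment specifies, and the function name is hypothetical.

```python
import random
import statistics


def median_remaining_over_elapsed(trials=100_000, seed=0):
    """Median of (time remaining / time elapsed) under a uniform observation moment.

    Assumption: we look at a phenomenon at a random point in its lifetime,
    with the elapsed fraction drawn uniformly (endpoints trimmed to avoid
    dividing by zero).
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        fraction_elapsed = rng.uniform(0.001, 0.999)
        ratios.append((1.0 - fraction_elapsed) / fraction_elapsed)
    return statistics.median(ratios)


if __name__ == "__main__":
    print(f"median remaining/elapsed: {median_remaining_over_elapsed():.3f}")
```

Under that assumption the median ratio comes out close to 1, i.e. "expect it to last about as long again as it already has". Any specific knowledge about the situation is a reason to replace the uniform assumption with an actual model, which is the comment's point about when deviating from Lindy is justified.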

krupan 2 days ago |

News flash: predicting the future is hard

energy123 2 days ago | |

The individual who is the best at predicting the future is predicting ASI and full labor automation by 2040:

https://xcancel.com/peterwildeford/status/202963666232244661...

Aurornis 2 days ago | | |

> The individual who is the best at predicting the future

Going to need a big citation for that claim

dsign 2 days ago | | |

My own bet is end of that decade: somewhere between 2045 and 2050.

Ofc "full labor automation" has a certain spread of meaning. A sliver of the population will always find ways to hold on to a job or run one or many businesses. But there will be "enough" labor automation for it to be a social ticking bomb. That, in fact, does not depend on better models nor better AI than we have today. By 2045 there will be a couple of generations that have been outsourcing their thinking to AI for most of their adult lives. Some of them may still work as legal flesh of sorts, but many won't get to be middlemen and will find no job.

Also, if you could replace your senator today by an untainted version of a frontier model (of today), would you do it? Would it be a better ruler? What are the odds of you not wanting to push that button in the next twenty years, after a few more batches of incompetent and self-serving politicians?

solid_fuel 2 days ago | | |

> The individual who is the best at predicting the future

Yeah well my prophet says he can beat up your prophet in a fight.

---

Here in reality, I'm not accustomed to taking random predictions without backing evidence as if they were truth.

gerikson 2 days ago | | |

Past results are no guarantee of future performance.

Alex_L_Wood 1 day ago | | |

I can’t take anyone who puts probability on events like that seriously. How would you even explain the calculations behind that with a straight face?

layer8 2 days ago | | |

Predicting who will predict the future best is hard.

margalabargala 2 days ago | | |

> The individual who is the best at predicting the future

Lol

FergusArgyll 1 day ago | | |

Wait till you read his piece about Fordow

mordae 1 day ago |

I am 100% certain it's a sigmoid. Speed of light is finite. :-)

patrickmay 2 days ago |

Stein's Law: "If something cannot go on forever, it will stop."

skybrian 2 days ago | |

Yes, but figuring out when is the hard part.

lovich 2 days ago |

I wonder what the graph would look like if cost and/or profitability were taken into account.

I could probably make increasingly larger fires for years if I was willing to burn the entire world.

mgraczyk 2 days ago |

Don't bet on the sigmoid flattening out any time soon. I don't know when it will flatten, but it definitely won't be within the next 2 years

Permit 2 days ago |

> But if someone claims that the trend toward increasing AI capabilities will never reach some particular scary level, then the burden is on them to explain either:…

This is not the context in which I hear about sigmoids vs exponentials. I hear it in regards to “the singularity”, not that AI won’t reach some pre-specified level. You may get AGI, you aren’t getting a singularity.

kubb 2 days ago |

If the scary AI is so inevitable, why do you feel such an overwhelming need to convince people about that? Surely you can just wait a bit, and they'll see for themselves.

mitthrowaway2 2 days ago | |

By that reasoning, why even warn people about anything? Why do road construction crews put up signs saying "ROAD CLOSED AHEAD" when you can just drive on and see for yourself?

kubb 2 days ago | | |

Indeed, why warn people about real things that exist in the world? That is EXACTLY the same as inciting fear about something imaginary (not even projected).

throwawayk7h 2 days ago | |

Yeah! And if climate change is so inevitable, why do the people who want to prevent it from happening seem hell-bent on convincing people that climate change is real?

adleyjulian 2 days ago | |

1. It's not inevitable. 2. Those that see AI as an existential risk don't generally think it's a guarantee, but if it's say a 5% chance then that's worth addressing/mitigating. 3. That's not what this article was even about.

kubb 2 days ago | | |

Sounds like the burden is on you to explain either

  1. If you're not treating my claim as a black box, explain explicitly what is your model of what the article was about? Are you aware, for example of the last paragraph of the article? I think that WAS what the article was about. Do you have specific opinions on e.g. how I went wrong and where my model differs?
  2. If you are treating it as a black box, what's your default expectation based on the law of Nothing Ever Happens?

Just kidding, you don't need to explain anything. A"I" fearmongers should though.

graphememes 2 days ago |

line can go up, line can go sideways, line can go up sideways, line can go up sideways up, line go where line go

overgard 2 days ago |

This feels like a really verbose way of saying "things have been growing fast for a while so they should continue to grow as fast for just as long", and then he places the burden on people to prove him wrong. Um, no, the burden of proof is shared, "this will just keep going" requires just as much proof as "this is going to level off" if you're just looking at trend lines.

It's better to look at the underlying factors. Money sources are drying up, nobody is making a profit outside of Nvidia, most Blackwell GPUs are likely not even installed yet and will probably be two generations behind by the time they are finally in use, data centers are hitting all sorts of obstacles getting built and powered and are going up slowly, most AI researchers seem to think that LLMs are a dead end, the newer models seem to be getting more expensive and sometimes worse, or are even potentially showing signs of model collapse (goblins...), the supposed productivity gains are not materializing, and AI has worse public sentiment than Congress. I could keep going. Some obscure "law" seems to pale in comparison to the hard evidence that the status quo is utterly unsustainable and that none of these companies seem to have a realistic plan other than trying to become too big to fail.

I like some of this guy's writing on other topics, but to me this is a prime example of what happens when you get public "intellectuals" talking about subjects far outside of their area of expertise. It's not as bad as Richard Dawkins's latest fall into psychosis but it's basically the same phenomenon.

BoredPositron 2 days ago |

If you use the log scale you'll see that the time horizon of opus 4.6 was as expected...

afthonos 2 days ago | |

As expected by the exponential. The Wharton study was predicting when the exponential would turn into a sigmoid.

ReptileMan 2 days ago | |

Everything is linear on a log log scale with a fat marker.

jgalt212 16 hours ago |

There are practitioners and pontificators, and never the twain shall meet.

pyrale 2 days ago |

Such a long article to say that neither side has a fucking idea about what will happen next.

While we're at it, the "exponentials are actually sigmoids" meme is not necessarily true. While exponentials are never exponentials, sigmoids are not guaranteed. Overshoot-and-collapse examples also happen in tech, e.g. the dotcom bubble, or the successive AI winters.

andrewflnr 2 days ago | |

It's really not that long, and is quite clear that its main point is about how to reason when you realize no one actually knows what's going on.

tim-projects 1 day ago |

Does hype follow a sigmoid curve?

ngruhn 2 days ago |

> all exponentials eventually become sigmoids

Except innovation. When one sigmoid tapers off we keep finding new ones to keep the climb going.

itkovian_ 2 days ago |

The other thing people don’t understand is that exponential curves are self-similar. The start of an exponential looks like an exponential. People always look at it and think ‘well, that’s it, it’s exponential now, I’ve missed it, it can’t be sustained’. Nope.

A good example of this is the number of submissions to NeurIPS/ICML/ICLR. In 2017 that curve was exponential.
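
The self-similarity claim can be checked in a few lines: for a pure exponential, the growth factor over a fixed-length window is the same wherever the window starts. This is a sketch with an arbitrary assumed rate of 0.5 and a hypothetical function name, not a model of any real submission data.

```python
import math


def growth_ratio(start, window, rate=0.5):
    """Factor by which exp(rate * t) grows between `start` and `start + window`."""
    return math.exp(rate * (start + window)) / math.exp(rate * start)


if __name__ == "__main__":
    # The ratio depends only on the window length, never on where you stand.
    for start in (0, 5, 10, 20):
        print(f"start={start:2d}  growth over next 2 units = {growth_ratio(start, 2):.6f}")
```

Because every stretch of the curve looks like a rescaled copy of every other stretch, "it has already gone vertical" is not, on its own, evidence that the curve is near its end.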

FrustratedMonky 1 day ago |

Black Swan Events probably look like nothing just before they happen.

addaon 2 days ago |

https://xkcd.com/605/

morpheos137 1 day ago |

The Yudkowskian 'Bayesians' are a sci-fi cult. Intelligence as it exists in nature is neither artificial nor general nor 'super.' All human technology is natural. It certainly is not supernatural.

dnnddidiej 2 days ago |

TL;DR: you can't accurately predict the future of complex systems. Corollary: you can't accurately predict the future of complex systems using a sigmoid.

"Attention Is All You Need" took us by surprise, and we don't know how big the wave is, let alone whether there are other waves behind it.

bedobi 2 days ago |

[flagged]

tomhow 2 days ago | |

Please don't post like this on HN. The guidelines explicitly ask us not to sneer or be curmudgeonly - https://news.ycombinator.com/newsguidelines.html.

As for the basis of your objection, this smacks of intellectual gatekeeping. Plenty of good writing is by people who are not academically qualified or a recognized expert in the topic they're writing about. Indeed, very often, this kind of writing is better than writing by experts. Experts often write for other experts, and this can be exclusionary to lay readers. When a non-expert learns about a topic then writes about it for a general audience, they tend to be just a step ahead of the audience, and so the reader is able to learn about the topic by following the process of discovery and reasoning that the author just experienced. Sure, they often get some details or concepts wrong, but the discussion on a site like HN can draw other perspectives, and – very often – contributions from experts, which leads to further expansion in everyone's understanding of the topic.

HN's very ethos is to gratify intellectual curiosity, and this kind of writing is highly compatible with that.

ngriffiths 2 days ago | |

I think there are many ways someone with his lack of expertise can still be valuable, including:

- Making connections to other subjects that an expert would miss. The hall of fame of sigmoid predictions is just excellent, I already know I'm going to be reminded of it some time in the future. Very entertaining way to get the point across.

- Writing about tricky concepts in a very accessible and elegant way, which experts are notoriously bad at doing themselves - they are often optimizing for other specialists.

- Being able to write with an air of speculation and experimentation with ideas that experts and institutions often can't afford. Experts have to maintain their track record; Scott Alexander can say "lol just double the timeline"

bedobi 2 days ago | | |

[flagged]

simianparrot 2 days ago | |

Because HN is YCombinator which has invested in probably hundreds of «AI» firms by now. Including OpenAI.

Allowing slop articles like this literally prints them valuation money.

t43562 2 days ago | | |

Yes, this is not the place to express skepticism of any kind.