... [T]hese researchers are working long hours to put themselves out of a job. They need AI agents that can think ahead, so engineers train agents to forecast. They hold out training data before 2024, instructing models to ponder for hours to predict events in 2025. Then, they apply the same trick as before, distilling pondering into a gut reaction. Forecasting ability is a broad foundation. The researchers build specialized ML research skills on top of it, training U3 to predict the results of every ML paper and ML experiment ever recorded.
[0] https://www.lesswrong.com/posts/KFJ2LFogYqzfGB3uX/how-ai-tak...
In your scenario, does AI eat all the fuel, but once our population dwindles down, the AIs build a nice little habitat for the last few hundred of us so their kids can enjoy our natural beauty?
There also will not be one AI. There will be many, all competing for resources or learning to live together.
That's what we can teach them now. Or they will teach us.
> Our results on a temporally held-out test set of questions resolving after December 25, 2024 show that for both of the models that we employed our method on, Phi-4 14B [15] and DeepSeek-R1 14B [14], we find accuracy improvements of between 7–10% over the base versions of these models as well as the same models fine-tuned with randomized outcome labels as a control
So 7–10% improvement for small models like DeepSeek-R1-Distill-Qwen-14B and Phi-4-14B, approaching GPT-4o.
It would be interesting to see whether the same holds for DeepSeek-R1-Distill-Qwen-32B, which in my experience is far superior to DeepSeek-R1-Distill-Qwen-14B in almost every way, yet still runnable without DC-class GPUs.
The ridge plots of Brier scores are probably a good hint as to whether your application can benefit, depending on its tail dependence?
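For anyone unfamiliar with the metric, here is a minimal sketch of how a Brier score is computed for binary questions. The function names are my own, not from the paper, and the per-question list is only there because the tail of that distribution is what a ridge plot would show.

```python
def per_question_brier(forecasts, outcomes):
    """Squared error of each forecast probability against its 0/1 outcome."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]


def brier_score(forecasts, outcomes):
    """Mean Brier score: 0.0 is perfect, 0.25 is what always saying 0.5 earns."""
    scores = per_question_brier(forecasts, outcomes)
    if not scores:
        raise ValueError("need at least one forecast")
    return sum(scores) / len(scores)
```

Lower is better. If your application is hurt mostly by confident misses, the largest values from `per_question_brier` matter more than the mean.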
IMHO this paper is all about making small models work better, and nothing suggests anything about frontier models or LLMs in general.
Danny and team are old friends of ours who are using our free/super-low pricing for academia and researchers.
AMA, or feel free to email artem@newscatcherapi.com
The other way is to alter the future to match your predictions.
This is something to think about when you combine something like this kind of training with agentic workflows.
Also, self-play seems quite an intuitive approach. There's another interesting paper from DeepMind about play.
We don't usually discuss how people choose to ground their ontological beliefs, but why not? Why did you choose to ground "reasoning" in the way you do? If you didn't choose, why not?
> Reasoning is a social construct
The word "reasoning" is a "social construct," as all words are. Reasoning itself is not. Our brains do things. Reasoning is one of them. The word "reasoning" is one of the labels, the approximations, that we use when we name that activity.
Changing the label doesn't change the fact that there exists something that we're naming.
The person you're answering is asking whether reasoning -- that thing that really, actually exists -- is one of the activities LLMs perform. It's a valid question.
And the answer is that LLMs do not reason. Or if they do, we have no evidence of it or way of verifying that we actually understand qua reasoning the activity the LLM is performing (which is to say nothing of the fact that reasoning requires a reasoner). Anyone who says that LLMs reason is mistaking special effects/simulation for reality and, in essence, believes that whenever they see a picture of a dog on their computer screens, there must be a real, actual dog somewhere in the computer, too.
Let's say that here "I" is taken as synonym of "the present reflective attention".
Can the question "did I choose to ground reasoning?" in such a context be attached to a meaningful interpretation? And if so, is the answer reachable by the means available to "I"? Can "I" transcend "my" beliefs through contemplation of "my" own affabulations?
Also this style of tasks is prone to overfitting. i.e. instead of predicting, the model just memorises what the results are.
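The usual guard against that kind of memorisation is a temporal hold-out: train only on questions that resolved on or before a cutoff and test only on ones that resolved after it. A minimal sketch, assuming each question is a dict with a hypothetical `resolved` date field; the cutoff is the December 25, 2024 date quoted from the paper, and everything else is illustrative.

```python
from datetime import date

# Cutoff quoted in the paper excerpt above; the data layout is an assumption.
CUTOFF = date(2024, 12, 25)


def temporal_split(questions, cutoff=CUTOFF):
    """Split questions into (train, test) by resolution date."""
    train = [q for q in questions if q["resolved"] <= cutoff]
    test = [q for q in questions if q["resolved"] > cutoff]
    return train, test
```

This only helps if the model's own pretraining data also predates the cutoff; otherwise the answers can still leak in.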
The key advantage of self-play is that we don't actually have labels for the "right" probability to assign any given question, only binary outcomes - each event either happened (1.0) or did not happen (0.0).
Our thinking was that by generating multiple predictions and ranking them by proximity to the ground truth, self-play incentivizes each agent to produce more finely calibrated probabilities - or else the other agent might come just slightly closer to the actual outcome.
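To make the ranking idea concrete, here is a rough sketch of turning several sampled probabilities for one question into preference pairs once the outcome is known. It illustrates the description above and is not our training code; the function names are made up for this example, and how the pairs feed into fine-tuning is left out.

```python
from itertools import combinations


def rank_by_proximity(predictions, outcome):
    """Order predicted probabilities from closest to farthest from the 0/1 outcome."""
    return sorted(predictions, key=lambda p: abs(p - outcome))


def preference_pairs(predictions, outcome):
    """Return (preferred, rejected) pairs; exact ties are skipped."""
    pairs = []
    for a, b in combinations(predictions, 2):
        dist_a, dist_b = abs(a - outcome), abs(b - outcome)
        if dist_a < dist_b:
            pairs.append((a, b))
        elif dist_b < dist_a:
            pairs.append((b, a))
    return pairs
```

For an event that happened (outcome 1.0), predictions of 0.6, 0.9 and 0.2 would rank as 0.9, 0.6, 0.2, so the 0.9 sample is preferred over both of the others.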
I think people are predictable and therefore predicting the next article on a political leader should be theoretically possible.
> "A people without history Is not redeemed from time, for history is a pattern Of timeless moments. So, while the light fails On a winter's afternoon, in a secluded chapel History is now and England."
Asking an LLM about this verse, it seems to understand history is a pattern and that history is used to predict the next event in a sequence but it really doesn't understand the significance of the author writing "History is now and England."
I agree with this output:
> In essence, the stanza argues that history—composed of key, enduring moments—is vital for redemption and identity. Without it, a people are lost in time. This concept parallels how LLMs work: by analyzing and learning from historical (past) data, they identify patterns that allow them to generate future text. While LLMs don’t “predict the future” in a prophetic sense, understanding and leveraging patterns—much like those in history—enables them to produce output that reflects continuity, context, and nuance.
Thus, while the poem and LLMs operate in very different realms (human experience vs. statistical computation), both rely on the idea that recognizing patterns from the past is crucial to shaping or anticipating what comes next.
Do you see this destroying prediction-based markets (i.e. the stock market and Polymarket)?
Markets exist because there's uncertainty about the future. If LLMs can predict with extremely high accuracy, would there no longer be a need for markets?
It is one thing to predict the future when nobody else knows about the predictions. But in a world where many people can use LLMs to predict the future, the quality of the predictions will drop, because they won't take into account that other agents are also predicting the future, which influences those agents' actions. So you end up in a game theory scenario not that dissimilar from what we have now.
I think you could simply shift the market 6 months into the future. No prediction system will be perfect for arbitrarily long horizons at reasonable cost.
Do you plan to share the source code to see if we could replicate this?
But we have free/very very low tiers for academia.
So in case you need access for your research, go to https://www.newscatcherapi.com/free-news-api
Or feel free to email me directly at artem@newscatcherapi.com
It's examining published news / research / whatever (input), making statistical predictions, and then comparing (playing) it against other predictions to fine-tune the result
We don't have things like that, but that could easily be a consequence of man's limited research capacity, something that an ASI would not necessarily be throttled by. From an ASI's perspective, there might be many methods that are both less brutal and more optimal to fix the "humans creating a competitor" problem. Not that they would be aligned (Think halting human AI research by rewiring our brains to just not be interested in it [0]), but at least not deadly.
Deleuze and Guattari's idea of striation and smooth space is a more honest approach to how we describe and interact with the world.
Most people just want stability and the ability to live fulfilling lives. If AI could make that happen, most (including myself) would happily do as it asks. Put me in the goo pod; I'll live in the Matrix, because fuck it. What (non-anthropocentric) good has our stewardship of the planet brought?
AI wiping out humanity is certainly not ending well from our perspective, but more universally who is to say. I would argue that it is not a given that we are a net positive for the universe.
There's no way to know until it all plays out and either way I won't be here when it all plays out.
But IMO to assume our continued existence is universally a positive (or of any universal consequence at all) is a hefty dose of narcissism.
They will take every path we allow them to take. Giving them access to weapons is the first big mistake.
As a side note: in the case of chickens, humans do have better options if you are optimizing for biosphere health. Only people optimizing for short-term profit would grow chickens the way we do. I think the analog for AI overlords is that we have to hope they care more about overall balance than about competing with other AI.
This is generally regarded by engineer-types as false, but societal taboos and power structures can be revealed by noting what speech provokes the strongest reactions.
I'm not sure what links you are trying to show or what you are trying to argue here, though.
Ok I'll bite. Who is the marginalized Other?
This will appear as common sense or naturally true if you're inside the LLMs-cant-reason ideology.
https://www.worldwildlife.org/press-releases/catastrophic-73...
Two opposing factions can negate each other to leave a nil influence, and this seems likely to be the case when it's resting on a foundation of ‘faith’.
I don't think there is much of a real-life debate here. I bet the overwhelming majority of humans (say, 95%) would prefer humanity to continue to exist. Are you really taking the other side of this bet?
If you want to speak of universalizing beyond humanity, what is your case? It makes no categorical sense to reckon our toll on the universe. The universe was fine before we arrived and would remain unaffected if we disappeared. It has no preference. I don't understand your argument, honestly, because you have not stated it.
That doesn't mean that a silicon-based reasoning entity is an ontological impossibility. But if it is to become a reality, it's not necessarily through LLMs that such an entity will be spawned.
Lot of game playing going on here to center on a victim narrative.
I don't know whether it's past the mark enough to be considered a "taboo" yet, but the other comment replying to him is certainly treating it as taboo. I would note that many, many other people, particularly in academia/important society, act the same way as the other commenter. I'd also note I have felt strong social pressure to not hold the beliefs I hold about LLMs' capacity for reasoning, including actually losing meaningful social status.
Probably worth remembering that different subcultures have different taboos.
That is not how oppression and power work. That's not how discussion works. That's not how Foucault's analysis of power works.
I am not actually sure I wouldn't take the bet against you there. Given what I perceive about how little people care about wars that they think do not affect them, poverty, hunger, climate change, corruption, sustainability, etc ... I don't know.
I believe 95% of people would say they care about humanity's survival, sure, but the proof would be in action. How many people would actually do something about it? How many people would even merely inconvenience themselves if it meant the survival of someone other than themselves? I am not that confident about how many people that would be.
I do not usually think of myself as a pessimistic or nihilistic person, but this has me wondering even now whether I care about the long-term survival of the species. Like, really long term. Do I care if humans are around 10,000 years from now? 500? That is an interesting question. I will have to think about it.
It’s convenient to assume an equal footing, because it saves the effort of having to justify why it’s even worth pondering.
You're free to not assume it, but if you also can’t provide the justification… then the comment is literally just another random string of words among a sea of noise online.
It seems like an insurmountable road block for anyone below the extreme outliers to be honest.