Big LLMs weights are a piece of history (antirez.com)
> oh HELL YEAH they will be. future historians are gonna have a fucking field day with us.
> imagine some poor academic in 2147 booting up "vintage llm.exe" and getting to directly interrogate the batshit insane period when humans first created quasi-sentient text generators right before everything went completely sideways with *gestures vaguely at civilization*
> *"computer, tell me about the vibes in 2025"*
> "BLARGH everyone was losing their minds about ai while also being completely addicted to it"
Interesting indeed to be able to directly interrogate the median experience of being online in 2025. (Also, my apologies for slop-posting; I slapped so much custom prompting on it that I hope you'll find the output amusing enough.)
From a random web search, it seems the sizes above Large are: Extra Large, Jumbo, Extra Jumbo, Giant, Colossal, Super Colossal, Mammoth, Super Mammoth, Atlas.
You mean the EU, right? The UK isn't covered by the AI act.
/s
We could take a page from Trump’s book and call them “Beautiful” LLMs. Then we’d have “Big Beautiful LLMs” or just “BBLs” for short.
Surely that wouldn’t cause any confusion when Googling.
I’ve seen corporate slogans fired off from the shoulders of viral creatives. Synergy-beams glittering in the darkness of org charts. Thought leadership gone rogue… All these moments will be lost to NDAs and non-disparagement clauses, like engagement metrics in a sea of pivot decks.
Time to leverage.
XXLLM: ~1T (GPT4/4.5, Claude Opus, Gemini Pro)
XLLM: 300~500B (4o, o1, Sonnet)
LLM: 20~200B (4o, GPT3, Claude, Llama 3 70B, Gemma 27B)
~~zone of emergence~~
MLM: 7~14B (4o-mini, Claude Haiku, T5, LLaMA, MPT)
SLM: 1~3B (GPT2, Replit, Phi, Dall-E)
~~zone of generality~~
XSLM: <1B (Stable Diffusion, BERT)
4XSLM: <100M (TinyStories)
teensy 4B to 29B
smol 30B to 59B
mid 60B to 99B
biggg 100B to 299B
yuuge 300B+
Gotta leave room for future expansion.
I really appreciated the way they managed to come up with a new naming scheme each time, usually used exactly once.
But the quality of Apple Intelligence shows us what happens when you use a tiny ultra-low-wattage LLM. There’s a whole subreddit dedicated to its notable fails: https://www.reddit.com/r/AppleIntelligenceFail/top/?t=all
One example of this is “Sorry I was very drunk and went home and crashed straight into bed” being summarized by Apple Intelligence as “Drunk and crashed”.
I actually applied to YC in ~2014 or so for this:
JotPlot: I wanted a historical timeline of comms between me and others, with a Sankey-ish diagram showing when, with whom, and via what method I spoke with folks, where each node was the message, call, text, meta links...
I think it's still viable, but my thought process is currently too chaotic to pull it off.
Basically, you look at a timeline of your comms and thoughts and expand it into links of thought. Now with LLMs you could have a Throw Tag of some sort, whereby the bot does research expanding on certain things and puts up a site for that idea on localhost (i.e. your phone), so that you can pull up data relevant to the convo, all in a timeline of thought/stream of consciousness.
hopefully you can visualize it...
It's like saying Automated ATM. Whoever wrote it barely knows what the acronym means.
This whole article feels like written by someone who doesn't understand the subject matter at all
I.e. when pretty much every tool or script I used before doesn't work anymore, and I need a special tool (gsutil, bq, dask, slurm), it's a mind shift.
Amen. There is an active effort to create an Internet Archive based in Europe, just… in case.
;)
(less socratic: I have a fraction of a fraction of jart's experience, but have enough experience via maintaining a cross-platform llama.cpp wrapper to know there's a ton of ways to interpret that bag o' floats and you need a lot of ancillary information.)
Then again, CPUs will be fast enough that you'd probably just emulate amd64 and run it as CPU-only.
If I want to read a post, a book, a forum, I want to read exactly that, not a simulacrum built by arcane mathematical algorithms.
Tangent: I was thinking the other day: these are not AI in the sense that they are not primarily intelligence. I still don't see much evidence of that. What they do give me is superhuman memory. The main thing I use them for is search, research, and a "rubber duck" that talks back, and it's like having an intern who has memorized the library and the entire Internet. They occasionally hallucinate or make mistakes -- compression artifacts -- but it's there.
So it's more AM -- artificial memory.
Edit: as a reply pointed out: this is Vannevar Bush's Memex, kind of.
It has always been like that, in the past people wrote on paper, and most of it was never archived. At some point it was just lost.
I inherited many boxes of notes, books and documents from my grandparents. Most of it was just meaningless to me. I had to throw away a lot of it and only kept a few thousand pages of various documents. The other stuff is just lost forever. And that’s probably fine.
Archives are very important, but nowadays the most difficult part is to select what to archive. There is so much content added to the internet every second, only a fraction of it can be archived.
I don't think the big scientific publishers (now, in our time) will ever fail, they are RICH!
It might be possible to create an LLM that can write a custom vintage game or program on demand in machine code and simultaneously generate assets like sprites. Especially if you use the latest reinforcement learning techniques.
Are there any search experiences that allow me to search like it's 1999? I'd love to be able to re-create the experience of finding random passion project blogs that give a small snapshot of things people and business were using the web for back then.
Just with a pre-LLM knowledge
Personally, I'd like all knowledge and information (K & I) to be readily available and accessible (pretty sure most people share the same sentiment), despite the consistent business decisions by copyright holders to hoard their K & I by putting everything behind paywalls and/or registration (I'm looking at you, Apple and X/Twitter). As much as some people hate Google for organizing the world's information while feeding and thriving on advertisements, in the long run the information does get organized and more or less preserved in many Internet data formats, lossy or not. After all, it was Google that originally designed the transformer that enabled the LLM weights that are now apparently a piece of history.
I feel like the more people use GenAI, the less intelligent they become. Like the rest of this society, they seem designed to suck the life force out of humans and return useless crap instead.
https://vancouversun.com/news/local-news/the-internet-archiv...
(Edited: apparently just a new HQ and not THE HQ)
[1] https://www.nbcnews.com/politics/donald-trump/trump-quest-co...
The physical assets are stored in the blast radius of an oil refinery. They don't have air conditioning. Take the tour and they tell you the site runs slower on hot days. Great mission, but atrociously managed.
Under attack for a number of reasons, mostly absurd. But a few are painfully valid.
EDIT: asking Claude:
Based on historical data, major refinery explosions in developed countries might occur at a rate of approximately 1 in 1,000 to 1 in 2,000 refinery-years of operation. Using this very rough estimate, a single refinery might have approximately a 50% chance of experiencing a significant explosion somewhere between 700-1,400 years of continuous operation.
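The arithmetic in that estimate can be checked with a few lines of Python. This is only a sketch: it assumes explosions arrive at a constant annual rate (an exponential waiting-time model), which the quoted answer does not state, and the function name is made up for illustration.

```python
import math


def years_to_probability(annual_rate: float, probability: float = 0.5) -> float:
    """Years of operation until P(at least one event) reaches `probability`.

    Assumes a constant annual hazard rate, so that
    P(event within t years) = 1 - exp(-annual_rate * t).
    """
    return -math.log(1.0 - probability) / annual_rate


for refinery_years_per_event in (1_000, 2_000):
    t = years_to_probability(1.0 / refinery_years_per_event)
    print(f"1 in {refinery_years_per_event}: about {t:.0f} years")
```

Under that assumption the 50% points are ln(2) × 1,000 ≈ 693 years and ln(2) × 2,000 ≈ 1,386 years, which is consistent with the quoted 700-1,400 range.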
Of course they vary widely in quality.
And I'm sure they have or will have the ability to influence the responses so you only see what they want you to see.
“Vannevar Bush's 1945 article "As We May Think". Bush envisioned the memex as a device in which individuals would compress and store all of their books, records, and communications, "mechanized so that it may be consulted with exceeding speed and flexibility".”
Correction: you occasionally notice when they hallucinate or make mistakes.
To me intelligence describes something much more capable than what I see in these things, even the bleeding edge ones. At least so far.
https://lcamtuf.coredump.cx/lossifizer/
I think a fun experiment could be to see at what setting the average human can no longer decipher the text.
I think "it's just compression" and "it's just parroting" are flawed metaphors. Especially when the model was trained with RLHF and RL/reasoning. Maybe a better metaphor is "LLM is like a piano, I play the keyboard and it makes 'music'". Or maybe it's a bicycle, I push the pedals and it takes me where I point it.
Yes!
> artificial memory
Well, "yes", kind of.
> Memex
After a flood?! Not really. Vannevar Bush - As we may think - http://web.mit.edu/STS.035/www/PDFs/think.pdf
First, there is no objective dividing line. It is a matter of degree relative to something else. Any language that suggests otherwise should be refined or ejected from our culture and language. Language’s evolution doesn’t have to be a nosedive.
Second, there are many definitions of intelligence; some are more useful than others. Along with many, I like Stuart Russell’s definition: the degree to which an agent can accomplish a task. This definition requires being clear about the agent and the task. I mention this so often I feel like a permalink is needed. It isn’t “my” idea at all; it is simply the result of smart people decomplecting the idea so we’re not mired in needless confusion.
I rant about word meanings often because deep thinking people need to lay claim to words and shape culture accordingly. I say this often: don’t cede the battle of meaning to the least common denominators of apathy, ignorance, confusion, or marketing.
Some might call this kind of thinking elitist. No. This is what taking responsibility looks like. We could never have built modern science (or most rigorous fields of knowledge) with imprecise thinking.
I’m so done with sloppy mainstream phrasing of “intelligence”. Shit is getting real (so to speak), companies are changing the world, governments are racing to stay in the game, jobs will be created and lost, and humanity might transcend, improve, stagnate, or die.
If humans, meanwhile, can’t be bothered to talk about intelligence in a meaningful way, then, frankly, I think we’re … abdicating responsibility, tempting fate, or asking to be in the next Mike Judge movie.
Followed by another company introducing their "Plus Ultra" model.
LLM 3.0, LLM 3.1 Gen 1, LLM 3.2 Gen 1, LLM 3.1, LLM 3.1 Gen 2, LLM 3.2 Gen 2, LLM 3.2, LLM 3.2 Gen 2x2, LLM 4, etc...
Alternatively, just make sure you keep things consensual, and keep yourself safe, no judgement or labels from me :)
Brewster is giving a speech on Tuesday March 18 at the University of Leiden. Not sure if you're in Europe, or in the Netherlands, but we're here.
https://apen4ej.medium.com/11-years-ago-the-chevron-refinery...
Keep in mind that Brewster bought the building because it looked like the icon, not vice versa. Not exactly the amount of thought that might be expected of an archival institution.
- 1989 explosion and fire
- 1999 explosion and fire
- 2012 fire
The 2012 incident sent 15,000 people to the hospital.
Also until recently their whole model was storing physical material (on an active fault line next to an oil refinery) then allowing digital access to it. Courts ruled that illegal for modern works.
(Obviously ∞ is for the actual singularity, and ℵ₁ is the thing after that).
https://en.m.wikipedia.org/wiki/Continuum_hypothesis
;-)
A big chunk was outsourced to Sun at one point. And that name alone should tell you how current the information is. https://en.wikipedia.org/wiki/Sun_Modular_Datacenter
In 2020 at least one public filing shows expenses of $19.9MM with $9.2MM classified as wages. So no more than $900k/month in 2020 and maybe double that now. Recent data is messy due to Covid donations and lawsuits.
https://ncua.gov/newsroom/press-release/2016/internet-archiv...
Imagine scrolling through a comment section that feels tailor-made to your tastes, seamlessly guiding you to an ice-cold Coca-Cola. You see people reminiscing about their best summer memories—each one featuring a Coke in hand. Others are debating the superior refreshment of Coke over other drinks, complete with "real" testimonials and nostalgic stories.
And just when you're feeling thirsty, a perfectly timed comment appears: "Nothing beats the crisp, refreshing taste of an ice-cold Coke on a hot day."
Algorithmic engagement isn’t just the future—it’s already here, and it’s making sure the next thing you crave is Coca-Cola. Open Happiness.
Imagine waking up like I do every morning. Refreshed and full of energy. I’ve tried many mattresses and the only one that has this property is my Slumber Sleep Hygiene mattress.
The best part is my partner can customize their side using nothing more than a simple app on their smartphone. It tracks our sleep over time and uses AI to generate a daily sleep report showing me exactly how good of a night sleep I got. Why rely on my gut feelings when the report can tell me exactly how good or bad of a night sleep I got.
I highly recommend Slumber Sleep Hygiene mattresses. There is a reason it’s the number one brand recommended on HN.
Look at the people who want to control this, they do not want to sell you Coke.
Now we can mass-produce it!
Copyright issues aside (let's avoid that mess) I was referring to basic technical issues with the site. Design is atrocious, search doesn't work, you can click 50 captures of a site before you find one that actually loads, obvious data corruption, invented their own schema instead of using a standard one and don't enforce it, API is insane and usually broken, uploader doesn't work reliably, don't honor DMCA requests, ask for photo id and passports then leak them ...
It's the worst possible implementation of the best possible idea.
And my experience isn't unique in any way here, and it's really hard not to see it as pervasive throughout our culture.
They're not merely real values, they're also rational.
I should say though, that's the only place I've seen this particular localization.
So in that sense, maybe people would prefer a private alternative.
(FYI, I was a designer at FB, and while it was luxurious I still hated what I saw in Zuck's eyes every morn when I passed him.)
Super diff from Andy Grove at intel where for whateveer reason we were in the sam oee schekdule
(That was me typing with eues ckised as a test (to myself, typos abound
- Extremely Low Frequency (ELF)
- Super Low Frequency (SLF)
- Ultra Low Frequency (ULF)
- Very Low Frequency (VLF)
- Low Frequency (LF)
- Medium Frequency (MF)
- High Frequency (HF)
- Very High Frequency (VHF)
- Ultra High Frequency (UHF)
- Super High Frequency (SHF)
- Extremely High Frequency (EHF)
- Tremendously High Frequency (THF)
Maybe one day some very smart people will make Tremendously Large Language Models. They will be very large and need a lot of computer. And then you'll have the Extremely Small Language Model. They are like nothing.
https://en.wikipedia.org/wiki/Radio_frequency?#Frequency_ban...
https://en.m.wikipedia.org/wiki/Overwhelmingly_Large_Telesco...
Horrendous being based on the Latin root for "trembling with fear", tremendous on another Latin root meaning "shaking from excitement" and terrible deriving from a Greek root for, again, "trembling with fear".
* assuming someone else already spent tremendous effort to develop an emulator for your binary's target that is 100% accurate...
They didn't even ask for donations until they accidentally set fire to their building annex. People offered to help (SF was apparently booming that year) and of course they promptly cranked out the necessary PHP to accept donations.
Now it's become part of the mythology. But throwing petty cash at a plane in a death spiral doesn't change gravity. They need to rehabilitate their reputation and partner with organizations who can help them achieve their mission over the long term. I personally think they need to focus on legal, long-term preservation and archival before sticking their neck out any further. If this means no more Frogger in the browser, so be it.
I certainly don't begrudge anyone who donates, but asking for $17 on the same page as copyrighted game ROMs and glitchy scans of comic books isn't a long-term strategy.
Personally, I think the summaries of alerts are incredibly useful. But my expectation of accuracy for a 20 word summary of multiple 20-30 word summaries is tempered by the reality that there’s gonna be issues given the lack of context. The point of the summary is to help me determine if I should read the alerts.
LLMs break down when we try to make them independent agents instead of advanced power tools. A lot of people enjoy navel gazing and hand waving about ethics, “safety” and bias… then proceed to do things with obvious issues in those areas.
- Passed out drunk
- Crashed in bed
- Slacking because drunk
...
The issue isn't a lack of context; it's that even the available context was handled poorly.
> Let's think and communicate more clearly regarding intelligence. Stuart Russell offers a nice definition: an agent's ability to do a defined task.
Maybe something about my comment got you riled up? What was it?
You wrote:
> What you're doing, with this whole "let's make a bullshit word logical" is more similar to medieval scholasticism, which was a vain attempt at verbal precision.
Again, I'm not quite sure what to say. You suggest my comment is like a medieval scholar trying to reconcile dogma with philosophy? Wow. That's an uncharitable reading of my comment.
I have five points in response. First, the word intelligence need not be a "bullshit word", though I'm not sure what you mean by the term. One of my favorite definitions of bullshitting comes from "On Bullshit" by Harry Frankfurt:
> Frankfurt determines that bullshit is speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care whether what they say is true or false. - Wikipedia
Second, I'm trying to clarify the term intelligence by breaking it into parts. I wouldn't say I'm trying to make it "logical" (in the sense of being about logic or deduction). Maybe you mean "formal"?
Third, regarding the "what you're doing" part... this isn't just me. Many people both clarify the concept of intelligence and explain why doing so is important.
Fourth, are you saying it is impossible to clarify the meaning of intelligence? Why? Not worth the trouble?
Fifth, have you thought about a definition of intelligence that you think is sensible? Does your definition steer people away from confusion?
You also wrote:
> We never would have been able to create science, if it weren't for focusing on the kinds of thinking that can be made logical.
I think you mean _testable_, not _logical_. Yes, we agree, scientists should run experiments on things that can be tested.
Russell's definition of intelligence is testable by defining a task and a quality metric. This is already a big step up from an unexamined view of intelligence, which often has some arbitrary threshold.* It allows us to see a continuum from, say, how a bacterium finds food, to how ants collaborate, to how people both build and use tools to solve problems. It also teases out sentience and moral worth so we're not mixing them up with intelligence. These are simple, doable, and worthwhile clarifications.
Finally, I read your quote from Dijkstra. In my reading, Dijkstra's main point is that natural language is a poor programming interface due to its ambiguity. Ok, fair. But what is the connection to this thread? Does it undercut any of my arguments? How?
* A common problem when discussing intelligence involves moving the goal post. Whatever quality bar is implied has a tendency to creep upwards over time.*
I reacted negatively to the idea earlier that agency should be considered an aspect of intelligence. I think separating the concepts helps me better understand people, their unique strengths, and puzzles like why sometimes people who aren't geniuses who know everything and can rotate complex shapes are sometimes very successful, but most importantly, why LLMs continue to feel like they're lacking something, compared to people, even though they're so outrageously intelligent. It's one thing to be smart, another thing entirely to be useful.
In the hopes of clarifying any misunderstandings of what I mean... I said "agent" in Russell's sense -- a system with goals that has sensors and actuators in some environment. This is a common definition in CS and robotics. (I tend to shy away from using the word "agency" because sometimes it brings along meaning I'm not intending. For example, to many, the word "agency" suggests free will combined with the ability to do something with it.)
I recommend Russell to anyone willing to give him a try. I selected part of his writing that explains why his definition is important to his goals. From page 2 of https://people.eecs.berkeley.edu/~russell/papers/aij-cnt.pdf
> My own motivation for studying AI is to create and understand intelligence as a general property of systems, rather than as a specific attribute of humans. I believe this to be an appropriate goal for the field as a whole...
To say my point a different way, intelligence is contextual. I'm not using "contextual" as some sort of vague excuse to avoid getting into the details. I'm not saying that intelligence cannot be quantified at all. Quite the opposite. Intelligence can be quantified fairly well (in the statistical sense) once a person specifies what they are talking about. Like Russell, I'm saying intelligence is multifaceted and depends on the agent (what sensors it has, what actuators it has), the environment, and the goal.
So what language would I use instead? Rather than speaking about "intelligence" as one thing that people understand and agree on, I would point to task- and goal-specific metrics. How well does a particular LLM do on the GRE? The LSAT?
Sooner or later, people will want to generalize over the specifics. This is where statistical reasoning comes in. With enough evaluations, we can start to discuss generalizations in a way that can be backed up with data. For example, one might say things like "LLM X demonstrates high competence on text summarization tasks, provided that it has been pretrained on the relevant concepts" or "LLM Y struggles to discuss normative philosophical issues without falling into sycophancy, unless extensive prompt engineering protocols are used".
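To make the "task-specific metrics plus statistics" idea concrete, here is a rough sketch. Everything in it is hypothetical: the `Setting` type, the `summarize` helper, and the scores are invented for illustration, and the 95% interval uses a plain normal approximation rather than anything Russell prescribes.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    """An (agent, environment, goal) triple; all fields are free-form labels."""
    agent: str
    environment: str
    goal: str


def summarize(scores: list[float]) -> tuple[float, float]:
    """Return (mean, approximate 95% half-width) via a normal approximation."""
    mean = statistics.fmean(scores)
    if len(scores) < 2:
        return mean, float("nan")  # no spread estimate from a single trial
    stderr = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, 1.96 * stderr


# Hypothetical per-trial scores in [0, 1]; the numbers are made up.
results = {
    Setting("LLM X", "news articles", "text summarization"):
        [0.82, 0.79, 0.88, 0.85, 0.80],
    Setting("LLM Y", "ethics prompts", "non-sycophantic discussion"):
        [0.40, 0.55, 0.35, 0.60, 0.45],
}

for setting, scores in results.items():
    mean, half_width = summarize(scores)
    print(f"{setting.agent} | {setting.goal}: {mean:.2f} +/- {half_width:.2f}")
```

The point of keying results by the full triple is that a claim like "LLM X is competent" only ever appears attached to a specific environment and goal; generalizing means aggregating over many such entries.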
I think it helps to remember this: if someone asks "Is X intelligent?", one has the option to reframe the question. One can use it as an opportunity to clarify and teach and get into a substantive conversation. The alternative is suboptimal. But alas, some people demand short answers to poorly framed questions. Unfortunately, the answers they get won't help them.
The closest thing we have to a definition for intelligence is probably the LLMs themselves. They're very good at predicting words that attract people. So clearly we've figured it out. It's just such a shame that this definition for intelligence is a bunch of opaque tensors that we can't fully explain.
LLMs don't just defy human reasoning and understanding. They also challenge the purpose of intelligence itself. Why study and devise systems, when gradient descent can figure it out for you? Why be cleverer when you can just buy more compute?
I don't know what's going to make the magical black pill of machine learning more closely align with our values. But I'm glad we have them. For example, I think it's good that people still hold objectivity as a virtue and try to create well-defined benchmarks that let us rank the merits of LLMs using numbers. I'm just skeptical about how well our efforts to date have predicted the organic processes that ultimately decide these things.
Interesting. I wonder why you make this connection. Do you know?
Your choice of definition seems to be what I would call "perception of intelligence". But why add that extra layer of indirection; why require an observer? I claim this extra level of indirection is not necessary. I eschew definitions with unnecessary complexity (a.k.a "accidental complexity" in the phrasing of Rich Hickey).
Here are some examples that might reveal problems with the definition above:
- DeepBlue (decisively beating Kasparov in 1997) showed a high level of intelligence in the game of chess. The notion of "being good at the game" is simpler (conceptually) than the notion of "being attractive to people who like the game of chess". See what I mean?
- A group of Somali pirates working together may show impressive tactical abilities, including the ability to raid larger ships, which I would be willing to call a form of tactical intelligence to achieve their goals. I grant the intelligent behavior even though I don't find it "attractive", nor do I think the pirates need any level of "gravitas" to do it. Sure, the pirates might use leadership, persuasion, and coordination, but these are a means to an end: accomplishing the goal. They are not necessary traits. Since intelligent behavior can be defined without using those concepts, why include them? Why pin them to the definition?
- The human brain is widely regarded as an intelligent organ in a wide variety of contexts relating to human survival. Whether or not I find it "attractive" is irrelevant w.r.t. intelligence, I say. If the neighboring tribe wants to kill me and my tribe (using their tribally-oriented brains), I would hardly call their brains attractive or their methods nuanced enough to involve "gravitas".
My claim is then: Intelligence should be defined by functional capability which leads to effectiveness at achieving goals, not by how we feel about the intelligence or those displaying it.
You don't like pirates? You're either in the Navy or grandstanding. People love pirates and even killers. But only if they're successful. Otherwise One Piece wouldn't be the most popular manga of all time.
Achieving goals? Why not define it as making predictions? What makes science science? The ability to make predictions. What do the brain and neural networks do? They model the world to make predictions. So there you have it.
This whole conversation has been about reducing intelligence to its defining component. So I propose this answer to your question. Take all the things you consider intelligent, and order them topologically. Then define intelligence as whatever thing comes out on top. Achieving goals depends on the ability to make predictions. Therefore it's a better candidate for defining intelligence.
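The "order them topologically" proposal can be sketched with Python's standard-library `graphlib`. This is only an illustration: the first dependency edge comes from the comment above, while the "tactical planning" entry is a made-up addition so the ordering has more than two items.

```python
from graphlib import TopologicalSorter

# Each key depends on the capabilities in its set.
# Only the first edge is taken from the comment; the second is hypothetical.
depends_on = {
    "achieving goals": {"making predictions"},
    "tactical planning": {"achieving goals"},
}

order = list(TopologicalSorter(depends_on).static_order())
print(order)     # dependencies come before the things that need them
print(order[0])  # the capability that depends on nothing else
```

With these edges, "making predictions" sorts first because everything else is built on it, which is the sense in which the comment treats it as the more fundamental candidate.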
Because "achieving goals" subsumes "making predictions". Remember, Russell's goal is to find a definition of intelligence that is broader than humans -- and even broader than sentient beings. But using the "achieving goals" definition, one can include any system that accomplishes goals, even if we can't find any way to verify it is making predictions. For example, even a purely reactive agent (e.g. operating on instincts) can display intelligent behavior if its actions serve its purposes.
If you are seeking one clear point of view about the nature of intelligence, I highly recommend Russell's writing. You don't have to "agree" with his definition, especially not at first, but if you give it a fair reading, you'll probably find it to be coherent and useful for the purposes he lays out.
Russell has been thinking about and teaching these topics for probably 40+ years in depth. So it is sensible to give his ideas serious consideration. Also, there are scholars who disagree with Russell's definition or accentuate different aspects. Wherever a person lands, these various scholars provide a clear foundation that is all too often lacking in everyday conversation.
Not really, but I can see why you might say this. Neither Russell nor I are attempting to define "the one component" of intelligence -- we're saying that there is no single kind of intelligence. Only when one defines a particular (agent, environment, goal) triple can one start to analyze it statistically and tease apart the related factors. You and I agree that the result will be multifaceted.
I wouldn't say I'm trying to "reduce" anything. I would say I've been attempting to explain a general definition of intelligence that works for a wide variety of types of intelligence. The goal is to reduce unnecessary confusion about it. It simply requires taking some extra time to spell out the (agent, environment, goal).
Once people get specific about a particular triple, then we have a foundation and can start to talk about patterns across different triples. If one is so inclined, we can try to generalize across all intelligent behavior, but frankly, only a tiny fraction of people have put in the requisite thought to do this rigorously. Instead, many people latch onto one particular form of intelligence (e.g. abstract problem solving or "creativity" or whatever) and hoist these preferred qualities into their definition. This is the tail wagging the dog in my opinion. But this is another topic.