Cosmologically Unique IDs(jasonfantl.com) |
If you have an infinite multiverse of infinite universes, and perhaps layers on that, with different physics, etc., you can’t have identity outside of all existence.
In Judaism, one/the name of God is translated as “I am”. I believe this is because God’s existence is all, transcending whatever concepts you have of existence or of IDs. That ID is the only ID.
So, the cosmic solution to IDs is the name of God.
(Gotta say here that I love HN. It's one of the very few places where a comment that geeky and pedantic can nonetheless be on point. :-)
It was an interesting couple of days before we figured it out.
As the universe expands the gap between galaxies widens until they start "disappearing" as no information can travel anymore between them. Therefore, if we assume that intelligent lifeforms exist out there, it is likely that these will slowly converge to the place in the universe with the highest mass density for survival. IIRC we even know approximately where this is.
This means a sort of "grand meeting of alien advanced cultures" before the heat death. Which in turn also means that previously uncollided UUIDs may start to collide.
Those damned Vogons thrashing all our stats with their gazillion documents. Why do they have a UUID for each xml tag??
We do see light from galaxies that are receding away from us faster than c. At first the photons going in our direction are moving away from us but as the universe expands over time at some point they find themselves in a region of space that is no longer receding faster than c, and they start approaching.
A galaxy has enough resources to be self-reliant, there’s no need for a species to escape one that is getting too far away from another one.
From now until protons decay and matter does not exist anymore is only 10^56 nanoseconds.
Conservation of mass and energy is an empirical observation, there is no theoretical basis for it. We just don't know any process we can implement that violates it, but that doesn't mean it doesn't exist.
Pedantry ftw.
If it takes at least Npb particles to store one bit of information, then the number of addressable things would decrease with the number of bits of the address.
So let's call Nthg the number of addressable things, and assume the average number of bits per address grows with Nb = f(Nthg).
Then the maximum number of addressable things is the number that satisfies Nthg = Np/(Npb*f(Nthg)), where Np is the total number of particles.
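As a rough numerical sketch of that fixed-point condition: the comment leaves f unspecified, so this assumes f(N) = log2(N) (the shortest fixed-width address that distinguishes N things), and the values Np = 10^80 and Npb = 1 are illustrative placeholders, not figures from the thread.

```python
import math


def max_addressable(total_particles: float, particles_per_bit: float) -> float:
    """Solve N * log2(N) = total_particles / particles_per_bit by bisection.

    Assumption: f(N) = log2(N) bits per address.
    """
    budget = total_particles / particles_per_bit  # total bits that can be stored
    lo, hi = 2.0, budget  # N * log2(N) is increasing on this interval
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * math.log2(mid) > budget:
            hi = mid
        else:
            lo = mid
    return lo


# Placeholder inputs: roughly 10^80 particles, one particle per stored bit.
n = max_addressable(1e80, 1.0)
print(f"{n:.3e} addressable things at {math.log2(n):.1f} bits each")
```

The point of the sketch is only that the address overhead costs a couple of orders of magnitude, not that these particular inputs are right.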
My argument was that 2^256 actually approaches the number of atoms in the observable universe (within 1 to 3 orders of magnitude), and that collisions are so unlikely that we'll have millions of datacenter meltdowns first (all assuming we have a good source of random numbers, of course). In the end I convinced everybody that even 128 bits are good enough, without any collision checking required.
I thought my argument was clever, but this is so much better. :)
The total amount of computer data across all of humanity is less than 1 yottabyte. We're expected to reach 1 yottabyte within the next decade, and will probably do so before 2030. That's all data, everywhere, including nation-states.
The birthday paradox says that you'll reach a 50% chance of at least one collision (as a conservative first order approximation) at the square root of the domain size. sqrt(2^256) is 2^128.
Now, a 256 bit identifier takes up 32 bytes of storage. 2^128 * 32 bytes = 10^16 yottabytes. That's 10 quadrillion yottabytes just to store the keys. And it's even odds whether you'll have a collision or not.
And if the 50% number scares them, well, you'll have a 1% chance of a collision at around... 2^128 * 0.1. Yeah, so you don't reach a 1% chance over the whole life of the system until you get to a quadrillion yottabytes.
Because you're never getting anywhere near the square root of the domain size, the chances of any collision occurring are astronomically small.
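The arithmetic above can be checked with a few lines of Python. This is a sketch using the standard birthday approximation n ~ sqrt(2 * N * ln(1 / (1 - p))), which is slightly more precise than the plain square-root rule of thumb; the function names are made up for illustration.

```python
import math

YOTTABYTE = 1e24  # bytes, decimal definition


def ids_for_collision_probability(bits: int, p: float) -> float:
    """Approximate count of random IDs at which the collision chance reaches p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


def key_storage_yottabytes(bits: int, p: float) -> float:
    """Storage needed just to hold that many IDs, in yottabytes."""
    return ids_for_collision_probability(bits, p) * (bits // 8) / YOTTABYTE


for p in (0.5, 0.01):
    n = ids_for_collision_probability(256, p)
    print(f"p={p}: {n / 2**128:.3f} * 2^128 IDs, "
          f"{key_storage_yottabytes(256, p):.2e} YB of keys")
```

The results land in the same ballpark as the figures quoted above: on the order of 10^16 yottabytes of keys for even odds, and roughly a tenth of that for a 1% chance.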
If it's not distributed you can just have a counter
If it's distributed but coordinated by a single party (say, it's your servers), you can do sharding on incremented counters. Like, every server is assigned a range of IDs.
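A minimal sketch of that sharded-counter idea, assuming a coordinator has already handed each server a shard index. The class name and the block size of 2^32 are hypothetical choices for illustration.

```python
import itertools


class ShardedCounter:
    """Hands out IDs from a private, non-overlapping range for one server."""

    def __init__(self, shard_index: int, shard_size: int):
        self._start = shard_index * shard_size
        self._end = self._start + shard_size  # exclusive upper bound
        self._next = itertools.count(self._start)

    def next_id(self) -> int:
        value = next(self._next)
        if value >= self._end:
            # A real system would ask the coordinator for a fresh range here.
            raise OverflowError("shard exhausted; request a new range")
        return value


SHARD_SIZE = 2 ** 32  # hypothetical block size
server_a = ShardedCounter(0, SHARD_SIZE)
server_b = ShardedCounter(1, SHARD_SIZE)
ids = [server_a.next_id(), server_a.next_id(), server_b.next_id()]
print(ids)
```

No randomness is needed because uniqueness comes entirely from the ranges not overlapping.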
I built a whole database around the idea of using the smallest plausible random identifiers, because that seems to be the only "golden disk" we have for universal communication, except for maybe some convergence property of latent spaces with large enough embodied foundation models.
It's weird that they are really underappreciated in the scientific data management and library science community, and many issues that require large organisations at the moment could just have been solved with better identifiers.
To me the ship of Theseus question is about extrinsic (random / named) identifiers vs. intrinsic (hash / embedding) identifiers.
https://triblespace.github.io/triblespace-rs/deep-dive/ident...
https://triblespace.github.io/triblespace-rs/deep-dive/tribl...
Received Message
Encryption: 0
From: GC Transit Authority --- Gora System (path: 487-45411-479-4)
To: Ooli Oht Ouloo (path: 5787-598-66)
Subject: URGENT UPDATE
Man I love the series. Looks like this multispecies universe has a centrally agreed-upon path addressing system.
This doesn't take into account that you will inevitably want to assign unique IDs to various groups of atoms (e.g. this microchip, that car, etc.). And don't even get me started on assigning unique IDs to each subatomic particle.
Timestamp + random seems like it could be a good tradeoff to reduce the ID sizes and still get reasonable characteristics, I'm surprised the article didn't explore there (but then again "timestamps" are a lot more nebulous at universal scale I suppose). Just spitballing here but I wonder if it would be worthwhile to reclaim ten bits of the Snowflake timestamp and use the low 32 bits for a random number. Four billion IDs for each second.
There's a Tom Scott video [2] that describes Youtube video IDs as 11-digit base-64 random numbers, but I don't see any official documentation about that. At the end he says how many IDs are available but I don't think he considers collisions via the birthday paradox.
Apparently with the birthday paradox 32 bit random IDs only allow some tens of thousands per second before collision chance passes 50%. Maybe that's acceptable?
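Here is a sketch of the timestamp + random layout floated above, plus a check of the birthday figure for 32 random bits. The field widths are assumptions read from the comment (a 41-bit Snowflake timestamp minus ten reclaimed bits, counted in seconds, followed by 32 random bits), and using the Unix epoch instead of a custom one is a simplification.

```python
import math
import secrets
import time

TIMESTAMP_BITS = 31  # assumed: 41-bit Snowflake timestamp minus ten reclaimed bits
RANDOM_BITS = 32     # low 32 bits are random


def make_id(now_seconds=None) -> int:
    """Build a 63-bit ID: seconds timestamp in the high bits, random low bits."""
    if now_seconds is None:
        now_seconds = int(time.time())  # Unix epoch; a custom epoch would last longer
    timestamp = now_seconds & ((1 << TIMESTAMP_BITS) - 1)
    return (timestamp << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)


def ids_for_half_collision(bits: int) -> float:
    """Random IDs drawn from 2^bits values before a collision becomes more likely than not."""
    return math.sqrt(2 * 2 ** bits * math.log(2))


print(hex(make_id()))
print(f"~{ids_for_half_collision(RANDOM_BITS):,.0f} IDs per second for 50% collision odds")
```

The second number comes out in the tens of thousands, consistent with the estimate above, so the scheme only suits workloads well below that rate per second unless collisions are detected and retried.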
One upside of the deterministic schemes is they include provenance/lineage. Can literally "trace up" the path the history back to the original ID giver.
Kinda has me curious about how much information is required to represent any arbitrary provenance tree/graph on a network of N-nodes/objects (entirely via the self-described ID)?
(thinking in the comment: I guess in the worst case, a linear chain, if you assume that the full provenance should be accessible from the ID, that scales as O(N x id_size), so it's quite bad. But assuming the "best case" (any node is expected to be log(N) steps from the root, i.e. a depth of log(N)), it feels like global_id_size = log(N) x local_id_size is roughly the optimal limit? So effectively the size of the global_id grows as log(N)^2? Would that mean that, starting from the 399-bit number, a lower limit for a global_id_size with lineage would be something like (400 bit)^2 ~= 20 kB, because it carries the ordered local-ID provenance information rather than relying on local shared knowledge?)
Provenance is a DAG, so you get a partial order for free by topological sort. That can be extended to a compatible total order. Then provenance for a node is just its ordering. This kind of mapping from objects to the first N consecutive naturals is also a minimal perfect hash function, which have n log n overhead. We can't navigate the tree to track ancestry, but equality implies identical ancestry.
Alternatively, we could track the whole history in somewhat more bits with a succinct encoding, 2N if it's a binary tree.
In practice, deterministic IDs usually accept a 2^-N collision risk to get log n.
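To make the "topological sort as ID" idea concrete, here is a small sketch using Python's standard `graphlib`. The provenance graph and node names are invented for illustration, and this only shows the ordering step, not a minimal perfect hash construction.

```python
from graphlib import TopologicalSorter


def assign_order_ids(parents: dict) -> dict:
    """Map each node to a consecutive natural number compatible with provenance.

    `parents` maps a node to the set of nodes it was derived from, so every
    node receives a larger number than all of its ancestors.
    """
    order = TopologicalSorter(parents).static_order()
    return {node: index for index, node in enumerate(order)}


# Hypothetical provenance DAG: "c" was derived from both "a" and "b".
provenance = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a", "b"},
}
ids = assign_order_ids(provenance)
print(ids)
```

The resulting numbers need only about log2(N) bits each, but, as noted above, the ancestry cannot be recovered from the number alone.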
Each “post” has a CID, which is a cryptographic hash of the data. To “prove” ownership of the post, there’s a witness hash that is sent that can be proved all the way up the tree to the repo root hash, which is signed with the root key.
Neat way of having data say “here’s the data, and if you care to verify it, here’s an MST”.
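The "here's the data, and here's a proof up to the root" pattern can be sketched with a plain binary Merkle tree. This is a simplified stand-in, not the Merkle Search Tree that the comment refers to, and it omits signing the root; all function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list) -> list:
    """Return every level of the tree, from leaf hashes up to the single root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2:
            current.append(current[-1])  # pad odd levels by repeating the last hash
        levels.append([h(current[i] + current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels


def prove(levels: list, index: int) -> list:
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


posts = [b"post-1", b"post-2", b"post-3"]
levels = build_levels(posts)
root = levels[-1][0]  # in the scheme described above, this is what gets signed
print(verify(posts[2], prove(levels, 2), root))
```

A verifier only needs the post, the short list of sibling hashes, and a trusted root, which is what makes the proof cheap to ship alongside the data.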
Also, network routing requires objects that have multiple addresses.
Physics side of whole thing is funny too, afaik quantum particles require fungibility, i.e. by doxxing atoms you unavoidably change the behavior of the system.
There's nothing stopping an entity from requesting multiple IDs from one of the "devices"!
And the best feature: anyone can generate a random ID in this representation by getting a deck of cards and shuffling it properly. Playing cards are ubiquitous. I can see a camera "reading" the decks after they've been splayed on a table after a shuffle. This might even make for better random number seeds.
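As a sketch of how a shuffled deck could map to a number, one option is to encode the card order as its Lehmer code (a mixed-radix integer in the range [0, 52!)). The encoding choice and function names are assumptions, since the comment does not say how the deck would be turned into an ID, and the software shuffle merely stands in for a physical one.

```python
import math
import secrets

DECK_SIZE = 52


def shuffle_deck() -> list:
    """Fisher-Yates shuffle driven by the OS CSPRNG (stand-in for a physical shuffle)."""
    deck = list(range(DECK_SIZE))
    for i in range(DECK_SIZE - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        deck[i], deck[j] = deck[j], deck[i]
    return deck


def deck_to_id(deck: list) -> int:
    """Encode a permutation of 0..n-1 as an integer in [0, n!) via its Lehmer code."""
    remaining = list(range(len(deck)))
    value = 0
    for card in deck:
        index = remaining.index(card)  # rank of this card among those not yet seen
        value = value * len(remaining) + index
        remaining.pop(index)
    return value


deck_id = deck_to_id(shuffle_deck())
print(f"{deck_id:x}")
print(f"{math.log2(math.factorial(DECK_SIZE)):.1f} bits per well-shuffled deck")
```

A properly shuffled 52-card deck carries a bit under 226 bits, so one deck comfortably exceeds a 128-bit UUID but falls short of 256 bits.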
You're not sure if there is any demand for this sort of stuff? Look at dicekeys:
I think you glossed over the big weakness in the idea.
I guess I'm wondering if there is a way to construct a universal coordinate frame for the whole universe? If so, then it's possible to trivially assign local time + x + y + z + salt to make unique IDs.
CSPRNGs make prediction of the next number difficult (cracking-AES difficulty) but do not add entropy and must be seeded uniquely otherwise they will output the same numbers. Unless the author is proposing having the same machine generate a single universe-scale list in one run.
Also “banning” IDs that are all 1s or 0s is silly; they are just as valid and unique as any other number if you’re generating them properly. Although I might suggest purchasing a lottery ticket if you get a UUID with all settable bits as 1.
Imagine if example.com was freely available for anyone to register, think of all the email they could get.
On the contrary, having the right to assign IDs is powerful; on balance, to my mind the right thing to do is some sort of ZK-verifiable random function, e.g. sunspot-based transformations combined with some proof of ‘fair’ random choice. In that case, I think the 800-bit number seems like plenty. You could also do some sort of epoch-based variable length, where for the next billion years or so, we use 1/256 of the ID space (first byte forced to 0), and so on.
Tumblers are modeled using transfinite numbers which makes me wonder: what are the similarities and differences between transfinite numbers and Elias omega encoding? I'm not well versed in either, so I expect it's either a question from ignorance or I may have a lot to learn. :)
1. https://www.artandpopularculture.com/Tumbler_%28Project_Xana...
I could split this object into 10^500 or 10^50^500^5000 etc., with imagination being the limit.
These values Id'd at whatever imaginable resolution are far from practically useful but at a cosmic scale, there is no telling what is a useful value?
So this framework seems to be more limiting because we define a resolution ?
Minor correction: Satellites don't go in every direction; they orbit. Probes or spaceships are more appropriate terms.
That way you can route ships or data or whatever to a specific system in a logical way. Each system decides how to allocate addresses. Since most systems won’t have anything or anyone to care, something like NASA or registrars would just allocate a block to the system and give large things like planets an address.
I’m also going to devise a standard that arbitrarily breaks it into groups of hexadecimal digits of arbitrary length in the spirit of UUIDs, and reserve a prefix space for Planck-unit timestamps (computronium-ID-7) so that you can lexicographically sort your COMPID7s.
Man I got to get out in front of this.
- Infinity: in school, we learn that our universe is infinite.
- We often do calculations with an upper limit like this one: 10^240. This is a big number butttttt it's not infinite, you know. 10^240+1, 10^240+2...
So:
1. If it's infinite, why do upper-limit calculations?
2. If it's limited, what is there outside that limit?
Extremely paradoxical
But practically it's finite because we are only in causal contact with things up to 13.7b ly from us, and given space appears to be expanding at an accelerating rate, we probably will never get into causal contact with (almost all of) the part of the infinite universe outside of our light cone, even though things ought to exist over the "horizon". So only a tiny infinitesimal sliver of the infinite universe is knowable by us.
10-20 bits: version/epoch
10-20 bits: cosmic region
40 bits: galaxy ID
40 bits: stellar/planetary address
64 bits: local timestamp
This avoids the potentially pathological long chain of provenance, and also encodes coordinates into it.
Every billion years or so it probably makes sense to re-partition.
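The hierarchical layout above could be packed into a single integer along these lines. The comment gives ranges of 10-20 bits for the first two fields, so the widths below pick 20 bits for each as an assumption (184 bits total), and the field names and sample values are invented.

```python
# (name, width in bits), most significant field first.
# Assumption: the two "10-20 bits" fields are given the full 20 bits each.
FIELDS = [
    ("version", 20),
    ("region", 20),
    ("galaxy", 40),
    ("address", 40),
    ("timestamp", 64),
]


def pack(values: dict) -> int:
    """Concatenate the fields into one integer, checking each fits its width."""
    result = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        result = (result << width) | value
    return result


def unpack(packed: int) -> dict:
    """Split a packed integer back into its named fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = packed & ((1 << width) - 1)
        packed >>= width
    return values


sample = {"version": 1, "region": 42, "galaxy": 7, "address": 123456, "timestamp": 2**40}
packed = pack(sample)
print(hex(packed), unpack(packed) == sample)
```

Because the version field sits in the most significant bits, a later re-partition can change every other width without colliding with IDs from an earlier epoch.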
00 04: Version + Flags
04 08: Timestamp (uint64)
12 16: Node/Agent Hash
28 16: Namespace Hash
44 32: Random Entropy
76 20: Extra / Extension
96 32: Integrity Hash
Total: 128 bytes
You only need one ID for each type of particle, since the laws of physics dictate that the particles themselves are indistinguishable from each other.
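The 128-byte layout listed above (version and flags, timestamp, node and namespace hashes, entropy, extension, integrity hash) could be assembled roughly as follows. The hash choices are assumptions: truncated SHA-256 for the two 16-byte hashes, and SHA-256 over the first 96 bytes for the integrity field. The extension area is simply left zeroed.

```python
import hashlib
import secrets
import struct
import time


def build_id(node: bytes, namespace: bytes, version_flags: int = 1) -> bytes:
    """Assemble a 128-byte identifier following the offsets in the table."""
    header = struct.pack(">IQ", version_flags, time.time_ns())  # 4 + 8 bytes
    node_hash = hashlib.sha256(node).digest()[:16]              # offset 12
    namespace_hash = hashlib.sha256(namespace).digest()[:16]    # offset 28
    entropy = secrets.token_bytes(32)                           # offset 44
    extension = bytes(20)                                       # offset 76, unused here
    body = header + node_hash + namespace_hash + entropy + extension  # 96 bytes
    integrity = hashlib.sha256(body).digest()                   # offset 96, 32 bytes
    return body + integrity


def verify(identifier: bytes) -> bool:
    """Check the length and recompute the integrity hash over the first 96 bytes."""
    return (len(identifier) == 128
            and hashlib.sha256(identifier[:96]).digest() == identifier[96:])


identifier = build_id(b"agent-17", b"example-namespace")
print(len(identifier), verify(identifier))
```

Note that the integrity hash only detects accidental corruption; anyone can recompute it, so it says nothing about who issued the ID.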
In other words, the act of 'assignment' presupposes some mechanism of assignment, and at a certain level of granularity the information needed for that mechanism to function is greater than the information the system can store.
It would be like assigning each byte in a stick of ram a 32 bit random access ID, and trying to store the assignments in the same memory space. Memory addressing only works because we assume a linear, unchanging order.
Sure it does. Those are not going to add up to a single extra bit.
And this isn't even counting sets that include multiples of the same item; once you get into that territory, there really is no upper bound.
If a neutrino oscillates between flavors, does it get 3 IDs? Or does it get a new ID with each oscillation?
Thankfully, we only need one electron ID at all.
And with group IDs, timestamp, etc. - 1024 bit long?
Isn't this just the same scheme as version 1 UUID, except with half the bits? I guess they didn't want to dedicate 128 bits to their IDs.
It's a different story entirely for matter. Causal and reachable are two different things.
Regardless, such extreme redshifting would make communication virtually impossible - but maybe the folks at Blargon 5 have that figured out.
But where is the Greenwich meridian for the Milky Way?
Particles are how quantized fields present themselves when probed by localized interactions. In general, they're also observer-dependent.
The idea of assigning an "ID" to an object reflects a macro-level notion of re-identifiable objects persisting through time. But at the quantum level, that kind of classical individuality - object identity - doesn't exist.
I think for this to work, either life would have to be plentiful near the end, or you’d need FTL travel.
These are only namespaces. Many worlds can have all the same (many) random numbers and they will never conflict with each other!
Given that constant change to the available combinations of sets, it would seem that a truly capable system would need to be practically infinite, no?
Definitely no multiples. What would that even mean? Also, you would need unbounded space for multiples of just two atoms.
I have a list and I want to assign a unique ID to each list item. Each list item itself contains one or more items:
1. My umbrella [ID "a"] and my sunglasses [ID "b"]
2. My umbrella ["a"], my sunglasses ["b"], and my umbrella ["a"]
List item 2 contains two references to my umbrella [ID "a"].
If a CPU takes 4 cycles to generate a UUID and the CPU runs at 4 GHz, it churns out one every nanosecond.
https://www.sciencedaily.com/releases/2026/02/260215225537.h...
Perhaps some Adamsesque (as in Douglas Adams) creature whose sole purpose is to collect all unique UUIDs and give them names.
Two quarks inside the proton interact via a massive messenger particle. This exchange flips their identity, turning the proton into a positron and a neutral pion. The pion then immediately converts into gamma rays.
Proton decayed!
The standard model is almost certainly an effective field theory and a low-energy approximation of a more comprehensive framework. In any ultraviolet completion, such as a GUT, quarks and leptons inhabit the same multiplets. At these scales, the distinction between matter types blurs, and the heavy gauge bosons provide the exact mediation mechanism described to bypass the baryon barrier.
Furthermore, the existence of the universe is an empirical mandate for baryon-violation. If baryon number were a strict, immutable law, the Sakharov conditions could not be met, and the primordial matter-antimatter symmetry would have resulted in a total annihilation. Our existence is proof that baryon number is not conserved. Even within the current framework, non-perturbative effects like sphalerons demonstrate that the Standard Model vacuum itself does not strictly forbid the destruction of baryons.
However, the computing efficiency could be greatly increased by employing reversible operations whenever possible and there are chances that this will be done in the future, but the efficiency will remain far from infinite.
This is the only case of a null sum for these quantities, where no antiparticles are involved. The sum is also null for 2 particles, where one is the antiparticle of the other, allowing their generation or annihilation, and it is also null for the 4 particles that take part in any weak interaction, like the decay of a neutron into a proton, which involves a u quark, a d antiquark, an electron and an antineutrino, and this is what allows the transmutations between elementary particles that cannot happen just through generation and annihilation of particle-antiparticle pairs.
Thus generation and annihilation of groups of such 8 particles are not forbidden by the known laws. The Big Bang model is based on equal quantities of these 8 particles at the beginning, which is consistent with their simultaneous generation at the origin.
On the other hand, the annihilation of such a group of 8 particles, which would lead to the disappearance of some matter, appears as an extraordinarily improbable event.
For annihilation, all 8 particles would have to come simultaneously at a distance from each other much smaller than the diameter of an atomic nucleus, inside which quarks move at very high speeds, not much less than the speed of light, so they are never close to each other.
The probability of a proton colliding simultaneously with a neutron, with an electron and with a neutrino, while at the same time the 6 quarks composing the nucleons would also be gathering at the same internal spot seems so low that such an event is extremely unlikely to ever have happened in the entire Universe, since its beginning.
The Planck time is a limit on a measurement process, not the smallest unit of time.
We don't actually know that. They might be. Planck units are what happens when GR meets QM and we just don't know yet what happens there.
But as a heuristic, they probably put pretty good bounds on what we can reasonably expect to be technologically achievable before humans go extinct.
The Planck mass is just the square root of the product of the natural units of angular momentum and velocity, divided by the Newtonian constant of gravitation.
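Numerically, that definition works out as below. This sketch assumes the "natural units of angular momentum and velocity" are the reduced Planck constant and the speed of light, and uses CODATA 2018 values; note that G is by far the least precisely known of the three.

```python
import math

HBAR = 1.054571817e-34  # J*s, reduced Planck constant
C = 299_792_458.0       # m/s, speed of light (exact by definition)
G = 6.67430e-11         # m^3 kg^-1 s^-2, measured; relative uncertainty ~2e-5

# Planck mass = sqrt(hbar * c / G)
planck_mass = math.sqrt(HBAR * C / G)
print(f"Planck mass ~ {planck_mass:.4e} kg")
```

The result is a few hundredths of a milligram, and its precision is limited entirely by how well G can be measured.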
This Planck mass expresses a constant related to the conversion of the Newtonian constant of gravitation from the conventional system of units to a natural system of units, which is why it appears instead of the classic Newtonian constant inside a much more complex expression that computes the Chandrasekhar limit for white dwarfs.
The Planck mass has absolutely no physical meaning (other than expressing in a different system of units a constant equivalent to the Newtonian constant of gravitation), unlike some other true universal constants, like the so-called fine-structure constant (or Sommerfeld constant), which is the ratio between the speed of an electron revolving around a nucleus of infinite mass in the state with the lowest total energy, and the speed of light (i.e. that electron speed measured in natural units). The fine-structure constant is a measure of the intensity of the electromagnetic interaction, just as the Planck mass or the Newtonian constant of gravitation are measures of the intensity of the gravitational interaction.
The so-called "Planck units" have weird values because they are derived from the Newtonian constant of gravitation, which is extremely small. Planck proposed them in 1899, immediately after computing for the first time what is now called Planck's constant.
He realized that Planck's constant provides an additional value that would be suitable for a system of natural fundamental units, but his proposal was a complete failure because he did not understand the requirements for a system of fundamental units. He started from the proposals made by Maxwell a quarter of a century before him, but of the 2 alternatives proposed by Maxwell for defining a unit of mass, Planck chose the bad one, that of using the Newtonian constant of gravitation.
Any system of fundamental units where the Newtonian constant of gravitation is chosen by convention, instead of being measured, is impossible to use in practice. The reason is that this constant can be measured only with great uncertainties. Saying by law that it has a certain value does not make the uncertainties disappear, but it moves them into the values of almost all other physical quantities. In the Planck system of units, no absolute value is known with a precision good enough for modern technology. The only accurate values are relative, i.e. the ratios between 2 physical quantities of the same kind.
The Planck system of units is only good for showing how a system of fundamental units MUST NOT be defined.
Because the Planck units of length and time happen by chance to be very small, beyond the range of any experiments that have ever been done in the most powerful accelerators, absolutely nobody knows what can happen if a physical system could be that small, so claims that some particle could be that small and would collapse into a black hole are more ridiculous than claiming to have seen the Loch Ness Monster.
The Einsteinian theory of gravitation is based on averaging the distribution of matter, so we can be pretty sure that it cannot be valid in the same form at elementary particle level, where you must deal with instantaneous particle positions, not with their mass averaged over a great region of empty space.
It became possible to use Planck's constant in a system of fundamental units only much later than 1899, i.e. after 1961, when the quantization of magnetic flux was measured experimentally. However, the next year, in 1962, an even better method was discovered, with the prediction of the Josephson effect. The Josephson effect would have been sufficient to make the standard kilogram unnecessary, but metrology has been further simplified by the discovery of the von Klitzing effect in 1980. Despite the fact that this would have been possible much earlier, only since 2019 has the legal system of fundamental units depended on Planck's constant, but in a good way, not in the way proposed by Planck.