HEALPixels stand for 'Hierarchical Equal-Area Iso-latitudinal Pixels'. It is a scheme that was developed to analyze signals that cover the entire sky, but with variable density.
Like HTM or Hilbert curves, this can be used to organize spatial data.
The tesselation looks kind of funny but has many good features - it doesn't have discontinuities at poles, and is always equal area. And with the "nested" healpixel formulation, pixels are identified by integers. Pixel IDs are hierarchical based on leading bits - so, for example, pixel 106 (=0110 1010) contains pixel 1709 (=0110 1010 1101). This lets you do some marvelous optimizations in queries if you structure your data appropriately. Nearest neighbor searches can be extremely quick if things are HEALPix-indexed - and so can radius searches, and arbitrary polygon searches.
HEALPixels are used today for more than just their original intent. LSST will use them for storing all-sky data and point source catalogs, for example.
More here:
- Original NASA/JPL site: https://healpix.jpl.nasa.gov/
- Popular Python implementation: https://healpy.readthedocs.io/en/latest/
- Good PDF primer: https://healpix.jpl.nasa.gov/pdf/intro.pdf
And an experimental database being built on healpix for extremely large data volumes (certainly many TB, maybe single-digit PB): https://github.com/astronomy-commons/hipscat
By having both high level (cell) and low level (cell id) geometries it was a very powerful library which allowed projection from the hilbert space into a Postgres spatial index (spgist) including various trees, like noted in this article. It appears to be still quite active in development.
Interesting that you mention the use of multiple hilbert curves as well. We also experimented with two Hilbert Curves, rotated by 90 degrees. This helps to get around what we've dubbed the "Hilbert Equator" problem where two objects are quite far on the curve because they are placed close to one of the major fault lines in the fractal (for lack of a better word)
Some research from the 1980s showed it is only possible to mitigate bounded categories of edge case, thereby improving generality, by indexing on complex higher-dimensionality embeddings. Mitigating more categories requires more dimensions and more complexity. However, no one could figure out how to construct these embeddings for even basic cases or deal with more practical curse of dimensionality issues, so that is largely forgotten (the researchers themselves made comments to the effect that they didn't think a tractable solution was possible). I've never even been able to find that literature in electronic form, unfortunately.
Like AI, it is an interesting open-ended problem space. You can prove that an elegant optimal solution is not tractable so it ends up being a search for asymptotically optimal algorithms that become exponentially more complex the closer you get to optimal.
Hexagons are cool, but they are not necessarily the bestagon for a spherical geometry since you cannot break a hexagon into smaller hexagons, whereas an S2 cell is a "cube" with spherical sides or HEALPix uses a rhombic dodecahedron [0] both of which can be split into smaller divisions of themselves.
Not to discourage you from your experimentation, it's all trade offs and you might find a good one. Good luck!
But points with a large difference in their single curve coordinate can be either far apart or close together. E.g. on this 16 point Hilbert curve
__. .__
__| |__
| __ |
|__| |__|
the '.' marked points at 1/16th and 15/16th along the curve are adjacent.BOTH cartesian coordinates / EITHER Hilbert coordinate
And that being far apart in 2D space corresponds to being far apart on
EITHER cartesian coordinate / BOTH Hilbert coordinates
But if we consider the two commas below which are close together in 2D space, we see they are far apart in any rotation of this Hilbert curve?!
__ __ __ __
|__| __| |__ |__|
__ |__ __| __
| |__ __| ;__ __| |
|__ __,__ __ __|
__| |__ __| |__
| __ | | __ |
|__| |__| |__| |__|See just for a start: https://arxiv.org/abs/1504.05140
The paper basically deals with the most adverse case but the approach should work in lower dimensions and in data with a non-uniform density distribution as well.
In ClickHouse, we have almost every mentioned technique: H3 and S2, geohashes, and indexing by space-filling curves.
On the other hand, spatial data can be easily partitioned, rendering the aforementioned method less useful. Also, I've never seen anyone needed more than a single instance of postgres to index all the spatial data, unless you are dealing with spatial temporal data, which can grow infinitely large. Then again, spatial temporal data can be easily partitioned.
I know of at least one, GeoMesa, which seems like it could at least provide the building blocks to achieve what they are trying to do.
Actually you can.
1. Think of hexagons as six equilateral triangles sharing a center point.
2. Place one smaller vertically flipped equilateral triangle, in each original triangle.
3. Each original hexagon center point is now the center of a smaller (1/2 linear dimension, 1/4 area) hexagon.
4. New small hexagons replace each of the six original hexagon's edges. Since edges are shared, this is an increase a 3x increase in number of hexagons.
So each new hexagon has 1/4 the area of the original ones (and 1/2 the linear dimensions). This results in 2x the linear dimension resolution, 4x the area resolution.
Grids could also be increased in scale the same way. By retaining a half-sized (in linear terms) square at the center of each original square, and turning each original edge and corner into new squares. With the same 1/2 and 1/4 ratios of linear and area scaling.