Content is still in production and may contain factual errors.

Build

Over a billion stars. Each placed in three-dimensional space.

Not a simulation. Not a model. Real distance measurements from ESA's Gaia spacecraft, combined with data from the historic Hipparcos survey, processed into a single navigable dataset. Every step is open — the catalogues, the pipeline code, the spatial index, and the viewer.

Step one

Two surveys, one star map

Gaia DR3 measured over a billion stars with sub-milliarcsecond precision — positions, parallaxes, proper motions, colours. But Gaia's detectors saturate on the brightest stars: Sirius, Vega, Betelgeuse. These are exactly the stars people recognise.

The Hipparcos mission (ESA, 1989–1993) was purpose-built for bright-star astrometry. Its 118,000 stars anchor the naked-eye sky that Gaia cannot reliably reach.

About 100,000 stars appear in both catalogues. The pipeline produces exactly one canonical row per physical star — choosing the better measurement for each, handling binary systems, and manually curating the handful of cases that defeat automation entirely (including the Sun, which does not appear in either catalogue).
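As a minimal sketch of the "one canonical row per star" idea, the snippet below picks between a Gaia and a Hipparcos measurement of the same star by comparing fractional parallax error. The `Measurement` class, the `pick_canonical` function, and the selection rule itself are assumptions made for illustration; the real pipeline may weigh other quality indicators, and it also handles binaries and manual overrides, which this sketch does not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """One catalogue's astrometry for a star (hypothetical structure)."""
    source: str                 # "gaia" or "hipparcos"
    parallax_mas: float         # parallax in milliarcseconds
    parallax_error_mas: float   # 1-sigma parallax uncertainty


def fractional_error(m: Measurement) -> float:
    """Relative parallax uncertainty; non-positive parallaxes rank last."""
    if m.parallax_mas <= 0:
        return float("inf")
    return m.parallax_error_mas / m.parallax_mas


def pick_canonical(gaia: Optional[Measurement],
                   hipparcos: Optional[Measurement]) -> Measurement:
    """Return the single measurement to keep for this star.

    Assumed rule: keep whichever catalogue has the smaller fractional
    parallax error. On a tie, the Gaia row wins because it is listed first.
    """
    candidates = [m for m in (gaia, hipparcos) if m is not None]
    if not candidates:
        raise ValueError("star has no measurement in either catalogue")
    return min(candidates, key=fractional_error)
```

A star present only in Hipparcos (for example one too bright for Gaia) simply passes through, because it is the only candidate.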

How the merge works →

Step two

Cleaned, corrected, and positioned

Each star gets a distance: from the catalogue parallax where reliable, from Bailer-Jones probabilistic estimates where not, from photometric modelling where those fail, and from a conservative prior as a last resort. Each tier is flagged.
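The tiered fallback described above could look roughly like the following. This is a sketch under stated assumptions: the names `DistanceTier` and `choose_distance`, the 20% fractional-error cut-off, and the 1000 pc prior are all hypothetical placeholders, not values taken from the pipeline. The only fixed fact used is that a parallax of p milliarcseconds corresponds to a distance of 1000 / p parsecs.

```python
from enum import IntEnum
from typing import Optional, Tuple


class DistanceTier(IntEnum):
    """Which fallback produced the distance (hypothetical flag values)."""
    PARALLAX = 0
    BAILER_JONES = 1
    PHOTOMETRIC = 2
    PRIOR = 3


MAX_FRACTIONAL_ERROR = 0.2    # assumed reliability cut-off, not from the source
PRIOR_DISTANCE_PC = 1000.0    # assumed last-resort distance, not from the source


def choose_distance(parallax_mas: Optional[float],
                    parallax_error_mas: Optional[float],
                    bailer_jones_pc: Optional[float],
                    photometric_pc: Optional[float]) -> Tuple[float, DistanceTier]:
    """Walk the tiers in order and return (distance in parsecs, tier used)."""
    parallax_ok = (
        parallax_mas is not None
        and parallax_error_mas is not None
        and parallax_mas > 0
        and parallax_error_mas / parallax_mas <= MAX_FRACTIONAL_ERROR
    )
    if parallax_ok:
        # Direct inversion: distance [pc] = 1000 / parallax [mas].
        return 1000.0 / parallax_mas, DistanceTier.PARALLAX
    if bailer_jones_pc is not None:
        return bailer_jones_pc, DistanceTier.BAILER_JONES
    if photometric_pc is not None:
        return photometric_pc, DistanceTier.PHOTOMETRIC
    return PRIOR_DISTANCE_PC, DistanceTier.PRIOR
```

Returning the tier alongside the distance is what lets downstream consumers filter or style stars by how trustworthy their placement is.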

Positions are propagated to a common epoch (J2016.0, the Gaia DR3 reference epoch) using proper motions, and converted to Sun-centred Cartesian coordinates in parsecs. Temperature comes from spectroscopic measurements where available, falling back through a cascade of colour-based estimates to a default value.
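To make the two geometric steps concrete, here is a simplified sketch of epoch propagation followed by the spherical-to-Cartesian conversion. It assumes a first-order (linear) proper-motion model that ignores radial velocity and breaks down near the celestial poles, and it assumes Cartesian axes aligned with the equatorial (ICRS) frame; the pipeline's actual method and axis convention may differ. Function names are illustrative.

```python
import math

MAS_PER_DEGREE = 3_600_000.0  # 3600 arcsec per degree * 1000 mas per arcsec


def propagate_position(ra_deg, dec_deg, pmra_mas_yr, pmdec_mas_yr,
                       from_epoch, to_epoch=2016.0):
    """Shift a sky position along its proper motion (linear approximation).

    pmra_mas_yr is assumed to already include the cos(dec) factor, as in
    the Gaia and Hipparcos catalogues, so it is divided out again here.
    """
    dt_years = to_epoch - from_epoch
    cos_dec = math.cos(math.radians(dec_deg))
    ra = ra_deg + (pmra_mas_yr * dt_years / MAS_PER_DEGREE) / cos_dec
    dec = dec_deg + pmdec_mas_yr * dt_years / MAS_PER_DEGREE
    return ra % 360.0, dec


def to_cartesian_pc(ra_deg, dec_deg, distance_pc):
    """Convert (RA, Dec, distance) to Sun-centred x, y, z in parsecs."""
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    x = distance_pc * math.cos(dec) * math.cos(ra)
    y = distance_pc * math.cos(dec) * math.sin(ra)
    z = distance_pc * math.sin(dec)
    return x, y, z
```

Hipparcos positions are referenced to epoch J1991.25, so a Hipparcos-only star would be propagated with `from_epoch=1991.25`, a step of 24.75 years. For a fast-moving nearby star that shift is large enough to matter.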

The result is a HEALPix-partitioned Parquet table: one row per star, a fixed schema, quality flags packed into each record.
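One common way to pack quality flags into a record is a single integer column used as a bit field. The sketch below illustrates that idea only; the flag names and bit positions are hypothetical and do not describe the project's actual schema.

```python
from enum import IntFlag


class QualityFlag(IntFlag):
    """Hypothetical per-star quality bits, one bit per condition."""
    HAS_GAIA = 1 << 0
    HAS_HIPPARCOS = 1 << 1
    BINARY_COMPONENT = 1 << 2
    TEMPERATURE_DEFAULTED = 1 << 3
    MANUALLY_CURATED = 1 << 4


def pack_flags(*flags: QualityFlag) -> int:
    """Combine any number of flags into one integer for storage."""
    packed = QualityFlag(0)
    for flag in flags:
        packed |= flag
    return int(packed)


def has_flag(packed: int, flag: QualityFlag) -> bool:
    """Test whether a stored integer has the given flag set."""
    return bool(QualityFlag(packed) & flag)
```

Storing the flags as one small integer keeps the schema fixed even when new conditions are added, as long as spare bits remain.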

Step three

Indexed for the browser

A browser cannot download a billion-row table. The merged data is encoded into a spatial octree — a tree structure that divides 3D space into nested cells across fourteen levels. Each star is placed into a level based on its brightness: the brightest stars sit in the shallowest, largest cells, while fainter stars go into progressively deeper, smaller ones. The threshold at each level is tuned to how the human eye perceives starlight.
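The placement rule can be sketched as two small functions: one maps a star's brightness to an octree level, the other maps its position to a cell at that level. Everything numeric here is an assumption for illustration: the use of absolute magnitude as the brightness measure, the evenly spaced magnitude limits, and the root cube size are placeholders, not the perceptually tuned thresholds the project uses.

```python
import bisect
import math

NUM_LEVELS = 14                    # from the text: fourteen levels
ROOT_HALF_SIZE_PC = 100_000.0      # assumed half-width of the root cube

# Assumed limits: level i holds stars with absolute magnitude up to
# LEVEL_MAG_LIMITS[i]; anything fainter than the last limit goes in the
# deepest level. Smaller magnitude means brighter.
LEVEL_MAG_LIMITS = [-6.0 + 1.5 * i for i in range(NUM_LEVELS - 1)]


def level_for_magnitude(abs_mag: float) -> int:
    """Return the octree level (0 = shallowest) for a star's brightness."""
    return bisect.bisect_left(LEVEL_MAG_LIMITS, abs_mag)


def cell_index(x: float, y: float, z: float, level: int):
    """Return integer (i, j, k) cell coordinates at the given level.

    Each level halves the cell width, so level n has 2**n cells per axis.
    """
    cells_per_axis = 1 << level
    cell_size = 2.0 * ROOT_HALF_SIZE_PC / cells_per_axis

    def axis(v: float) -> int:
        i = math.floor((v + ROOT_HALF_SIZE_PC) / cell_size)
        return min(max(i, 0), cells_per_axis - 1)  # clamp to the root cube

    return axis(x), axis(y), axis(z)
```

With these placeholder limits a Sun-like star (absolute magnitude about 4.83) would land at level 8, while a luminous supergiant would sit at level 0 in the single root cell.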

When you navigate the viewer, it loads cells level by level. Bright-star cells have large visibility radii and load from anywhere in the scene. Faint-star cells only load when you fly close enough that those stars would actually be visible. Everything else stays on the server until you need it.
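The loading rule follows from the distance modulus: a star of absolute magnitude M appears at apparent magnitude m when seen from a distance of 10^((m - M + 5) / 5) parsecs. The sketch below uses that relation to derive a visibility radius and decide whether a cell should load. The limiting magnitude of 6.5 and the function names are assumptions, and the viewer's real test may differ.

```python
import math

LIMITING_APPARENT_MAG = 6.5  # assumed faintest visible magnitude


def visibility_radius_pc(abs_mag: float,
                         limiting_mag: float = LIMITING_APPARENT_MAG) -> float:
    """Distance in parsecs at which a star fades to the limiting magnitude."""
    return 10.0 ** ((limiting_mag - abs_mag + 5.0) / 5.0)


def distance_to_cell(camera, centre, half_size: float) -> float:
    """Shortest distance from the camera to an axis-aligned cubic cell."""
    return math.sqrt(sum(
        max(abs(c - k) - half_size, 0.0) ** 2
        for c, k in zip(camera, centre)
    ))


def should_load(camera, centre, half_size: float,
                brightest_abs_mag: float) -> bool:
    """Load a cell only if its brightest star could be visible from here."""
    reach = visibility_radius_pc(brightest_abs_mag)
    return distance_to_cell(camera, centre, half_size) <= reach
```

Because the radius grows tenfold for every five magnitudes of brightness, cells holding luminous stars are reachable from almost anywhere, while cells of faint dwarfs only qualify within a few tens of parsecs.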

Open source

The pipeline is on GitHub

The data pipeline code — catalogues, astrometry, photometry, merging, and the project configuration format — is published under the MIT licence at github.com/Found-in-Space/pipeline. Requires Python ≥ 3.13 and uv.