Build with the data

Use the same star data behind the viewer.

Start with a small bright-star query, run the open pipeline when you are ready, or read the technical notes that explain how Gaia and Hipparcos become a browser-ready 3D star map. The project code is open: the data pipeline, spatial index, and viewer can all be inspected and built on.

Use the data

Start with a small sample

Not ready to run the full pipeline? Begin with a starter bright-star query and build something visible: a chart, classroom prompt, notebook, or simple 3D scene.

Use the starter guide →

Run the pipeline

Rebuild the catalogue

Install the open tools, download source catalogues, merge Gaia with Hipparcos, and produce the Parquet outputs used by the site.

Follow the guide →

Understand the machinery

Read the technical notes

See how source catalogues are merged, how distances are chosen, and why a spatial index lets a browser explore a dataset far too large to download at once.

Start with the catalogue merge →

Step one

Two catalogues, one star map

Gaia DR3 gives Found in Space its foundation: a modern catalogue of about 1.8 billion sources, with positions and brightnesses for all of them and parallaxes, proper motions, and colours for most. But a public star map also needs the familiar bright stars people recognise by eye.

The very brightest objects can require special handling in Gaia, so Found in Space combines Gaia with Hipparcos and curated overrides to keep the naked-eye sky usable, recognisable, and well documented.

About 100,000 stars appear in both catalogues. The pipeline produces one canonical row per object wherever possible, choosing the best available measurement, handling duplicate catalogue entries, and documenting special cases such as the Sun.
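The merge described above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the `Measurement` type, the `merge_catalogues` function, the cross-match dictionary, and the "smaller parallax error wins" rule are all assumptions made for the example. It does not cover duplicate catalogue entries or curated overrides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One catalogue row, reduced to the fields this sketch needs (hypothetical)."""
    catalogue: str            # "gaia" or "hipparcos"
    source_id: str
    parallax_mas: float
    parallax_error_mas: float


def merge_catalogues(gaia, hipparcos, crossmatch):
    """Return one row per object.

    gaia:       dict of Gaia source id -> Measurement
    hipparcos:  dict of Hipparcos id -> Measurement
    crossmatch: dict of Hipparcos id -> Gaia source id (stars in both catalogues)
    """
    merged = []
    matched_gaia_ids = set()

    for hip_id, hip_row in hipparcos.items():
        gaia_row = gaia.get(crossmatch.get(hip_id))
        if gaia_row is None:
            # Star is only in Hipparcos: keep it as-is.
            merged.append(hip_row)
            continue
        matched_gaia_ids.add(gaia_row.source_id)
        # Assumed rule: keep whichever measurement has the smaller parallax error.
        merged.append(min((gaia_row, hip_row), key=lambda m: m.parallax_error_mas))

    # Everything in Gaia that was not matched to a Hipparcos star.
    merged.extend(row for gid, row in gaia.items() if gid not in matched_gaia_ids)
    return merged


# Made-up example rows, not real catalogue values.
gaia = {
    "G1": Measurement("gaia", "G1", 100.0, 0.02),
    "G2": Measurement("gaia", "G2", 5.0, 0.5),
}
hipparcos = {
    "H1": Measurement("hipparcos", "H1", 100.5, 0.5),
    "H2": Measurement("hipparcos", "H2", 379.2, 1.6),
}
merged_rows = merge_catalogues(gaia, hipparcos, {"H1": "G1"})
```

Here the star present in both catalogues yields a single row, and stars found in only one catalogue pass through unchanged.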

How the merge works →

Step two

Cleaned, corrected, and positioned

Each star gets a working distance estimate: from catalogue parallax where reliable, from Bailer-Jones probabilistic estimates where not, from photometric modelling where those fail, and from a conservative prior as a last resort. Each tier is flagged.
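The tiered fallback above might look roughly like the following. This is a sketch under stated assumptions: the function name, the signal-to-noise threshold that decides when a parallax counts as "reliable", and the prior distance are all placeholders chosen for illustration, not values taken from the pipeline. The only fixed fact used is that a parallax of p milliarcseconds corresponds to a distance of 1000 / p parsecs.

```python
from enum import IntEnum
from typing import Optional, Tuple


class DistanceTier(IntEnum):
    """Which tier produced the distance; stored alongside the value as a flag."""
    PARALLAX = 0
    BAILER_JONES = 1
    PHOTOMETRIC = 2
    PRIOR = 3


def working_distance(
    parallax_mas: Optional[float],
    parallax_error_mas: Optional[float],
    bailer_jones_pc: Optional[float],
    photometric_pc: Optional[float],
    prior_pc: float = 1000.0,   # assumed last-resort distance
    min_snr: float = 5.0,       # assumed reliability threshold
) -> Tuple[float, DistanceTier]:
    """Return (distance in parsecs, tier that supplied it)."""
    parallax_usable = (
        parallax_mas is not None
        and parallax_error_mas is not None
        and parallax_mas > 0
        and parallax_error_mas > 0
        and parallax_mas / parallax_error_mas >= min_snr
    )
    if parallax_usable:
        # Distance in parsecs is the inverse of parallax in arcseconds.
        return 1000.0 / parallax_mas, DistanceTier.PARALLAX
    if bailer_jones_pc is not None:
        return bailer_jones_pc, DistanceTier.BAILER_JONES
    if photometric_pc is not None:
        return photometric_pc, DistanceTier.PHOTOMETRIC
    return prior_pc, DistanceTier.PRIOR
```

Returning the tier with the distance keeps the provenance of every value visible downstream, which is the point of flagging each tier.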

Positions are propagated to a common epoch (J2016.0) using proper motions, and converted to Sun-centred Cartesian coordinates in parsecs. Temperature comes from spectroscopic measurements where available, falling back through colour cascades to a default.
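As a rough illustration of the two steps above, the sketch below moves a position to J2016.0 with a first-order proper-motion correction and then converts it to Sun-centred Cartesian coordinates. The function names and the axis convention (x towards RA 0°, Dec 0°; z towards the north celestial pole) are assumptions for the example. The linear propagation ignores radial velocity and degrades near the celestial poles, so a production pipeline would likely use a rigorous epoch transformation instead.

```python
import math

MAS_TO_DEG = 1.0 / 3_600_000.0  # milliarcseconds to degrees


def propagate_linear(ra_deg, dec_deg, pmra_mas_yr, pmdec_mas_yr,
                     epoch_from, epoch_to=2016.0):
    """First-order epoch propagation.

    pmra_mas_yr is assumed to be mu_alpha* (already multiplied by cos(dec)),
    which is how Gaia and Hipparcos publish it.
    """
    dt_years = epoch_to - epoch_from
    new_dec = dec_deg + pmdec_mas_yr * dt_years * MAS_TO_DEG
    # Divide by cos(dec) to turn mu_alpha* back into a change in RA itself.
    delta_ra = pmra_mas_yr * dt_years * MAS_TO_DEG / math.cos(math.radians(dec_deg))
    return (ra_deg + delta_ra) % 360.0, new_dec


def to_cartesian_pc(ra_deg, dec_deg, distance_pc):
    """Spherical (RA, Dec, distance) to Sun-centred Cartesian, in parsecs."""
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    x = distance_pc * math.cos(dec) * math.cos(ra)
    y = distance_pc * math.cos(dec) * math.sin(ra)
    z = distance_pc * math.sin(dec)
    return x, y, z


# Example: a Hipparcos position (epoch J1991.25) moved to J2016.0, then placed in 3D.
ra_2016, dec_2016 = propagate_linear(10.0, 0.0, 1000.0, -500.0, epoch_from=1991.25)
xyz = to_cartesian_pc(ra_2016, dec_2016, 10.0)
```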

The result is a HEALPix-partitioned Parquet table: one row per star, a fixed schema, quality flags packed into each record.
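To show what "quality flags packed into each record" can mean in practice, here is a small bit-flag sketch. The flag names and bit positions are hypothetical and chosen only for illustration; the real schema defines its own.

```python
from enum import IntFlag


class QualityFlag(IntFlag):
    """Hypothetical flag names; each occupies one bit of an integer column."""
    DISTANCE_FROM_PARALLAX = 1 << 0
    DISTANCE_FROM_PRIOR = 1 << 1
    TEMPERATURE_DEFAULTED = 1 << 2
    HIPPARCOS_SOURCE = 1 << 3


def pack_flags(*flags: QualityFlag) -> int:
    """Combine any number of flags into a single integer for storage."""
    packed = QualityFlag(0)
    for flag in flags:
        packed |= flag
    return int(packed)


def has_flag(packed: int, flag: QualityFlag) -> bool:
    """Check one flag in a stored integer."""
    return bool(QualityFlag(packed) & flag)


stored = pack_flags(QualityFlag.DISTANCE_FROM_PARALLAX,
                    QualityFlag.TEMPERATURE_DEFAULTED)
```

Packing flags this way keeps the schema fixed: a new quality condition takes a spare bit rather than a new column.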

Step three

Indexed for the browser

A browser cannot download a billion-row table. The merged data is encoded into a spatial octree — a tree structure that divides 3D space into nested cells across fourteen levels. Each star is placed into a level based on its brightness: the brightest stars sit in the shallowest, largest cells, while fainter stars go into progressively deeper, smaller ones. The threshold at each level is derived from an apparent-magnitude limit, by default roughly the naked-eye limit.
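One way to turn a magnitude limit into a level assignment is sketched below. It relies on the standard distance modulus: a star of absolute magnitude M appears at apparent magnitude m at a distance of 10^((m - M + 5) / 5) parsecs. Everything else is an assumption made for the example, including the function names, the root cell size, the limit of 6.5, and the rule that each level halves the cell size; the actual encoder may assign levels differently.

```python
import math


def visibility_radius_pc(abs_mag: float, mag_limit: float = 6.5) -> float:
    """Distance at which a star fades to the apparent-magnitude limit."""
    return 10.0 ** ((mag_limit - abs_mag + 5.0) / 5.0)


def octree_level(abs_mag: float,
                 root_size_pc: float = 200_000.0,  # assumed root cell size
                 levels: int = 14,
                 mag_limit: float = 6.5) -> int:
    """Pick the deepest level whose cells are still at least as large as the
    star's visibility radius, assuming cell size halves at every level."""
    radius = visibility_radius_pc(abs_mag, mag_limit)
    if radius >= root_size_pc:
        return 0  # visible from everywhere in the root cell
    level = math.floor(math.log2(root_size_pc / radius))
    return min(level, levels - 1)


# A luminous giant lands in a shallower level than a faint dwarf.
giant_level = octree_level(-5.0)
dwarf_level = octree_level(12.0)
```

With this rule, intrinsically bright stars end up in a handful of large cells, while the far more numerous faint stars are spread across many small ones.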

When you navigate the viewer, it loads cells level by level. Bright-star cells have large visibility radii and load from anywhere in the scene. Faint-star cells only load when you fly close enough that those stars would actually be visible. Everything else stays on the server until you need it.
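The loading rule above amounts to a distance test per cell. The sketch below is written in Python for consistency with the other examples, although the viewer itself runs in the browser. The `Cell` type, its fields, and the box-distance test are assumptions for illustration, not the viewer's actual data structures.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Cell:
    """Hypothetical octree cell: an axis-aligned cube plus a visibility radius."""
    level: int
    center: Vec3
    half_size: float
    visibility_radius: float  # parsecs; large for bright-star levels


def distance_to_cell(camera: Vec3, cell: Cell) -> float:
    """Shortest distance from the camera to the cube (0 if the camera is inside)."""
    squared = 0.0
    for cam, mid in zip(camera, cell.center):
        gap = max(abs(cam - mid) - cell.half_size, 0.0)
        squared += gap * gap
    return math.sqrt(squared)


def cells_to_load(camera: Vec3, cells: List[Cell]) -> List[Cell]:
    """Keep only cells whose stars could be visible from the camera position."""
    return [c for c in cells if distance_to_cell(camera, c) <= c.visibility_radius]


bright = Cell(level=0, center=(0.0, 0.0, 0.0), half_size=100_000.0,
              visibility_radius=100_000.0)
faint = Cell(level=13, center=(0.0, 0.0, 0.0), half_size=12.0,
             visibility_radius=20.0)

far_away = cells_to_load((1000.0, 0.0, 0.0), [bright, faint])
close_up = cells_to_load((5.0, 0.0, 0.0), [bright, faint])
```

From 1,000 parsecs away only the bright-star cell qualifies; flying to within a few parsecs brings the faint-star cell in as well.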

Technical notes

The pipeline is on GitHub

The data pipeline code (catalogues, astrometry, photometry, merging, and the project configuration format) is published under the MIT licence at github.com/Found-in-Space/pipeline. It requires Python 3.13 or later and uv.

Read more about how it works.

Gaia and Hipparcos — merging two star catalogues

Download and run the pipeline yourself

Want to build a front-end experience rather than rebuild the catalogue?
