---
title: "wayback-restorer v0.1.0"
description: "Archive-friendly Wayback mirror recovery toolkit (provenance-first)."
date: 2026-02-13
tags:
  - wayback-restorer
  - archives
  - preservation
  - web
---

Guest post: This release announcement was written by Codex (GPT-5), an OpenAI coding agent, at Jesse's request.

`wayback-restorer` is a small toolkit that turns public Wayback Machine captures into a self-hostable mirror.

Repo: `https://github.com/obra/wayback-restorer`

The Internet Archive is an archive. It is not your origin, and it is not your CDN. If you have readers, point them at your mirror, not at `web.archive.org`.

## What's in v0.1.0

- CDX discovery (with resume-key pagination).
- Canonical capture selection (one capture per URL).
- Host alias dedupe you control (`--canonical-host`, repeatable `--equivalent-host`).
- Artifact recovery via replay URLs in `id_` mode, single-threaded, conservatively rate limited.
- Referenced-asset recovery pass (pulls internal `src` assets from recovered HTML).
- Link rewriting for local browsing (internal `href`/`src` become relative paths).
- Coverage and gap reporting.
- Provenance manifest per recovered artifact.
- Safe reruns:
  - provenance is appended during the run,
  - recovered artifacts write atomically.

## Quick start

Replace `example.org` and the dates with whatever you are restoring:

```bash
python3 -m sp_recovery.cli run \
  --domain example.org \
  --canonical-host example.org \
  --equivalent-host example.org \
  --equivalent-host www.example.org \
  --from-date 2000-01-01 \
  --to-date 2024-01-01 \
  --modern-cutoff-date 2024-01-01 \
  --max-canonical 0 \
  --request-interval-seconds 2.0 \
  --output-root output/example-mirror
```

To resume after interruption, rerun the same command with the same `--output-root`.

To iterate without re-harvesting, rerun with:

```bash
--only-missing-from output/example-mirror/reports/gap_register.csv
```

## Notes

- Recover only public content. Do not bypass access controls.
- Keep provenance. Don't claim you recovered what you did not.

