Mirroring lore.kernel.org

The mail archiving system at lore.kernel.org uses public-inbox, which relies on git as the mechanism to store messages. This makes the entire archive collection very easy to replicate using grokmirror — the same tool we use to mirror git.kernel.org repositories across multiple worldwide frontends.

Setting up

It doesn't take a lot to get started. First, install grokmirror either from pip:

pip install grokmirror

or from your distro repositories:

dnf install python3-grokmirror

Next, you will need a config file and a location to store your copy (keep in mind that, at the time of writing, the archives take up upwards of 20GB):

[lore.kernel.org]
# Use the erol mirror instead of lore directly
site = https://erol.kernel.org
manifest = https://erol.kernel.org/manifest.js.gz
toplevel = /path/to/your/local/archive
mymanifest = %(toplevel)s/manifest.js.gz
log = %(toplevel)s/pull.log
pull_threads = 2

Save this file as lore.conf and run:

grok-pull -v -c lore.conf
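
The %(toplevel)s references in lore.conf follow the same interpolation style as Python's configparser module. As a rough sketch, assuming grokmirror reads the file in a compatible way, you can check how mymanifest and log resolve before the first run. The config text is inlined so the example is self-contained, and load_settings is a hypothetical helper, not part of grokmirror.

```python
import configparser

# Same settings as lore.conf, inlined so the sketch runs on its own.
LORE_CONF = """
[lore.kernel.org]
site = https://erol.kernel.org
manifest = https://erol.kernel.org/manifest.js.gz
toplevel = /path/to/your/local/archive
mymanifest = %(toplevel)s/manifest.js.gz
log = %(toplevel)s/pull.log
pull_threads = 2
"""


def load_settings(text, section="lore.kernel.org"):
    """Parse the config text and return one section with %(name)s expanded.

    Hypothetical helper for illustration; grokmirror has its own loader.
    """
    parser = configparser.ConfigParser()  # basic %(name)s interpolation by default
    parser.read_string(text)
    return dict(parser[section])


settings = load_settings(LORE_CONF)
print(settings["mymanifest"])  # /path/to/your/local/archive/manifest.js.gz
print(settings["log"])         # /path/to/your/local/archive/pull.log
```

Changing toplevel in one place moves both the local manifest and the log, which is why the sample config spells those paths relative to it.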

The initial clone is going to take a long time, but once it is complete, subsequent runs of grok-pull will only update those repositories that have changed. If new repositories are added, they will be automatically cloned and added to your mirror of the archive.
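
To illustrate how an incremental run can decide what to fetch, here is a minimal sketch that compares a remote manifest against a local one. It is not grokmirror's actual implementation. It assumes the manifest is gzip-compressed JSON keyed by repository path, and that each entry carries a "fingerprint" field that changes when the repository changes. The repository paths and fingerprint values are made up for the example.

```python
import gzip
import json


def load_manifest(path):
    """Read a gzip-compressed JSON manifest into a dict keyed by repo path."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def plan_pull(remote, local):
    """Return (to_clone, to_update) for a remote and a local manifest.

    A repository is cloned if it is missing locally, and updated if its
    fingerprint differs from the local copy (assumed field name).
    """
    to_clone = sorted(repo for repo in remote if repo not in local)
    to_update = sorted(
        repo
        for repo, info in remote.items()
        if repo in local
        and info.get("fingerprint") != local[repo].get("fingerprint")
    )
    return to_clone, to_update


# Hypothetical manifest contents, for illustration only.
remote = {
    "/lkml/git/0.git": {"fingerprint": "aaa"},
    "/lkml/git/1.git": {"fingerprint": "bbb"},
    "/netdev/git/0.git": {"fingerprint": "ccc"},
}
local = {
    "/lkml/git/0.git": {"fingerprint": "aaa"},
    "/lkml/git/1.git": {"fingerprint": "old"},
}

to_clone, to_update = plan_pull(remote, local)
print("clone:", to_clone)
print("update:", to_update)
```

In this example only one repository would be cloned and one updated, while the unchanged repository is skipped entirely, which is what keeps later runs fast.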

Note: this by itself is not enough to run public-inbox on your local system, because there's a lot more to public-inbox than just git archives of all messages. For starters, the archives would need to be indexed into a series of sqlite3 and xapian databases, and the end result would take up a LOT more than 20GB.

Future work

We are hoping to fund the development of a set of tools around public-inbox archives that would allow you to do cool stuff with submitted patches without needing to subscribe to LKML or any other list archived by lore.kernel.org. We expect this would be a nice feature that various CI bots could use to automatically discover and test patches without having to deal with SMTP and incoming mail processing. If you would like to participate, please feel free to join the public-inbox development list.