lore+lei: part 1, getting started

I am going to post a series of articles about public inbox's new lei tool (stands for “local email interface”, but is clearly a “lorelei” joke :)). In addition to being posted on the blog, it is also available on the workflows mailing list, so if you want to reply with a follow up, see this link:

What's the problem?

One of kernel developers' perennial complaints is that they just get Too Much Damn Email. Nobody in their right mind subscribes to “the LKML” (linux-kernel@vger.kernel.org) because it acts as a dumping ground for all email and the resulting firehose of patches and rants is completely impossible for a sane human being to follow.

For this reason, actual Linux development tends to happen on separate mailing lists dedicated to each particular subsystem. In turn, this has several negative side-effects:

  1. Developers working across multiple subsystems end up needing to subscribe to many different mailing lists in order to stay aware of what is happening in each area of the kernel.

  2. Contributors submitting patches find it increasingly difficult to know where to send their work, especially if their patches touch many different subsystems.

The get_maintainer.pl script is an attempt to solve the problem #2, and will look at the diff contents in order to suggest the list of recipients for each submitted patch. However, the submitter needs to be both aware of this script and know how to properly configure it in order to correctly use it with git-send-email.

Further complicating the matter is the fact that get_maintainer.pl relies on the entries in the MAINTAINERS file. Any edits to that file must go through the regular patch submission and review process and it may take days or weeks before the updates find their way to individual contributors.

Wouldn't it be nice if contributors could just send their patches to one place, and developers could just filter out the stuff that is relevant to their subsystem and ignore the rest?

lore meets lei

Public-inbox started out as a distributed mailing list archival framework with powerful search capabilities. We were happy to adopt it for our needs when we needed a proper home for kernel mailing list archives — thus, lore.kernel.org came online.

Even though it started out as merely a list archival service, it quickly became obvious that lore could be used for a lot more. Many developers ended up using its search features to quickly locate emails of interest, which in turn raised a simple question — what if there was a way to “save a search” and have it deliver all new incoming mail matching certain parameters straight to the developers' inbox?

You can now do this with lei.

lore's search syntax

Public-inbox uses Xapian behind the scenes, which allows to narrowly tailor the keyword database to very specific needs.

For example, did you know that you can search lore.kernel.org for patches that touch specific files? Here's every patch that touched the MAINTAINERS file:

How about every patch that modifies a function that starts with floppy_:

Say you're the floppy driver maintainer and wanted to find all mail that touches drivers/block/floppy.c and modifies any function that starts with floppy_ or has “floppy” in the subject and maybe any other mail that mentions “floppy” and has the words “bug” or “regression”? And maybe limit the results to just the past month.

Here's the query:

    (dfn:drivers/block/floppy.c OR dfhh:floppy_* OR s:floppy
     OR ((nq:bug OR nq:regression) AND nq:floppy))
    AND rt:1.month.ago..

And here are the results:

Now, how about getting that straight into your mailbox, so you don't have to subscribe to the (very busy) linux-block list, if you are the floppy maintainer?

Installing lei

Lei is very new and probably isn't yet available as part of your distribution, but I hope that it will change quickly once everyone realizes how awesome it is.

I'm working on packaging lei for Fedora, so depending on when you're reading this, try dnf install lei — maybe it's already there. If it's not in Fedora proper yet, you can get it from my copr:

    dnf copr enable icon/b4
    dnf install lei

If you're not a Fedora user, just consult the INSTALL file:

Maildir or IMAP?

Lei can deliver search results either into a local maildir, or to a remote IMAP folder (or both). We'll do local maildir first and look at IMAP in a future follow-up, as it requires some preparatory work.

Getting going with lei-q

Let's take the exact query we used for the floppy drive above, and get lei to deliver entire matching threads into a local maildir folder that we can read with mutt:

    lei q -I https://lore.kernel.org/all/ -o ~/Mail/floppy \
      --threads --dedupe=mid \
      '(dfn:drivers/block/floppy.c OR dfhh:floppy_* OR s:floppy \
      OR ((nq:bug OR nq:regression) AND nq:floppy)) \
      AND rt:1.month.ago..'

Before you run it, let's understand what it's going to do:

As always, backslashes and newlines are there just for readability — you don't need to use them.

After the command completes, you should get something similar to what is below:

    # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=(omitted)
    # /home/user/.local/share/lei/store 0/0
    # https://lore.kernel.org/all/ 122/?
    # https://lore.kernel.org/all/ 227/227
    # 150 written to /home/user/Mail/floppy/ (227 matches)

A few things to notice here:

  1. The command actually executes a curl call and retrieves the results as an mbox file.
  2. Lei will automatically convert 1.month.ago into a precise timestamp
  3. The command wrote 150 messages into the maildir we specified

We can now view these results with mutt (or neomutt):

    neomutt -f ~/Mail/floppy

It is safe to delete mail from this folder — it will not get re-added during lei up runs, as lei keeps track of seen messages on its own.

Updating with lei-up

By default, lei -q will save your search and start keeping track of it. To see your saved searches, run:

    $ lei ls-search
    /home/user/Mail/floppy

To fetch the newest messages:

    lei up ~/Mail/floppy

You will notice that the first line of output will say that lei automatically limited the results to only those that arrived since the last time lei was invoked for this particular saved search, so you will most likely get no new messages.

As you add more queries in the future, you can update them all at once using:

    lei up --all

Editing and discarding saved searches

To edit your saved search, just run lei edit-search. This will bring up your $EDITOR with the configuration file lei uses internally:

    ; to refresh with new results, run: lei up /home/user/Mail/floppy
    ; `maxuid' and `lastresult' lines are maintained by "lei up" for optimization
    [lei]
        q = (dfn:drivers/block/floppy.c OR dfhh:floppy_* OR s:floppy OR \
            ((nq:bug OR nq:regression) AND nq:floppy)) AND rt:1.month.ago..
    [lei "q"]
        include = https://lore.kernel.org/all/
        external = 1
        local = 1
        remote = 1
        threads = 1
        dedupe = mid
        output = maildir:/home/user/Mail/floppy
    [external "/home/user/.local/share/lei/store"]
        maxuid = 4821
    [external "https://lore.kernel.org/all/"]
        lastresult = 1636129583

This lets you edit the query parameters if you want to add/remove specific keywords. I suggest you test them on lore.kernel.org first before putting them into the configuration file, just to make sure you don't end up retrieving tens of thousands of messages by mistake.

To delete a saved search, run:

    lei forget-search ~/Mail/floppy

This doesn't delete anything from ~/Mail/floppy, it just makes it impossible to run lei up to update it.

Subscribing to entire mailing lists

To subscribe to entire mailing lists, you can query based on the list-id header. For example, if you wanted to replace your individual subscriptions to linux-block and linux-scsi with a single lei command, do:

    lei q -I https://lore.kernel.org/all/ -o ~/Mail/lists --dedupe=mid \
      '(l:linux-block.vger.kernel.org OR l:linux-scsi.vger.kernel.org) AND rt:1.week.ago..'

You can always edit this to add more lists at any time.

Coming next

In the next series installment, I'll talk about how to deliver these results straight to a remote IMAP folder and how to set up a systemd timer to get newest mail automatically (if that's your thing — I prefer to run lei up manually and only when I'm ready for it).