metan's blog

Tales from kernel testing

What is wrong with sleep() then?

First of all, this is something I have had to fight off a lot and still have to from time to time. In most cases sleep() is misused to avoid the need for proper synchronization, which is wrong for at least two reasons.

The first is that it may, and eventually will, introduce very rare test failures, which means somebody has to spend time looking into them, which is wasted effort. I'm also pretty sure that nobody likes tests that fail rarely for no good reason. Even worse, you cannot run such tests under a background load to ensure that everything works correctly on a busy system, because the load increases the likelihood of a failure.

The second is that it wastes resources and slows down a test run. If you think that adding a sleep to a test is not a big deal, let me put things into perspective. There are about 1600 syscall tests in the Linux Test Project (LTP); if 7.5% of them slept for just one second, we would end up with two minutes of wasted time per test run. In practice most of the tests I've seen waited much longer, just to be sure that things would work even on slower hardware. With sleeps between 2 and 5 seconds that puts us somewhere between 4 and 10 minutes, which is between 13% and 33% of the syscall runtime on my dated ThinkPad, where the run finishes in a bit less than half an hour. It's even worse on newer hardware, because the slowdown stays the same no matter how fast your machine is, which is maybe the reason why this was acceptable twenty years ago, but it is not now.
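To spell out the arithmetic behind those numbers:

    1600 tests × 7.5% = 120 sleeping tests
    120 × 1 s = 120 s = 2 min wasted per run
    120 × 2 s = 240 s = 4 min,  120 × 5 s = 600 s = 10 min
    4 min of a 30 min run ≈ 13%,  10 min of a 30 min run ≈ 33%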

When is sleep() acceptable then?

So far, in my ten years of test development, I have met only a few cases where sleep() in test code was appropriate. Off the top of my head I remember:

  • Filesystem tests for file timestamps, atime, mtime, etc. (see the sketch after this list)
  • Timer related tests where we sample a timer in a loop
  • alarm() and timer_create() tests where we wait for the timer to fire
  • Leap second tests
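To make the first case concrete, here is a minimal sketch in plain C (a hypothetical standalone program, not an actual LTP test) where the sleep() is genuinely needed, because the two mtime samples have to differ by at least the timestamp granularity:

    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat before, after;
        int fd = open("testfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);

        if (fd < 0) {
            perror("open");
            return 1;
        }

        write(fd, "a", 1);
        fstat(fd, &before);

        /* sleep() is legitimate here: st_mtime granularity can be
         * as coarse as one second, so real time has to pass */
        sleep(1);

        write(fd, "b", 1);
        fstat(fd, &after);

        printf("mtime changed: %s\n",
               after.st_mtime > before.st_mtime ? "yes" : "no");

        close(fd);
        return 0;
    }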

How to fix the problem?

Unfortunately there is no silver bullet, since there are plenty of different reasons for a race condition to happen and each class has to be dealt with differently.

Fortunately there are a few very common classes that can be dealt with quite easily. So in LTP we wrote a few synchronization primitives and helper functions that can be used by a test, so there is no longer any excuse to use sleep() instead.

The most common case is a need to synchronize between parent and child processes. There are actually two different cases that needed to be solved. The first is where the child has to execute a certain piece of code before the parent can continue. For that the LTP library implements checkpoints: simple wait and wake functions based on futexes on a piece of shared memory set up by the test library. The second case is where the child has to sleep in a syscall before the parent can continue, for which we have a helper that polls /proc/$PID/stat. Also, sometimes tests can be fixed just by adding a waitpid() in the parent, which ensures that the child has finished before the parent continues.
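For the first case, a minimal sketch using the checkpoint API from the LTP test library looks roughly like this (tst_test.h, SAFE_FORK() and the TST_CHECKPOINT_* macros are the real LTP interfaces; the test body is made up):

    #include <stdlib.h>
    #include "tst_test.h"

    static void run(void)
    {
        pid_t pid = SAFE_FORK();

        if (!pid) {
            /* child: do whatever the parent depends on ... */
            TST_CHECKPOINT_WAKE(0);
            exit(0);
        }

        /* parent blocks here until the child reaches the
         * checkpoint, no sleep() and no race */
        TST_CHECKPOINT_WAIT(0);

        tst_res(TPASS, "child reached the checkpoint");
    }

    static struct tst_test test = {
        .forks_child = 1,
        .needs_checkpoints = 1,
        .test_all = run,
    };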

There are other and even more complex cases where a particular action is done asynchronously, or a kernel resource deallocation is deferred to a later time. In such cases quite often the best we can do is to poll. In LTP we ended up with a macro that polls by calling a piece of code in a loop with exponentially increasing sleeps between retries. That means that instead of sleeping for the maximal time the event could possibly take, the total sleep is capped at roughly twice the optimal sleeping time, while we still avoid polling too aggressively.
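The idea behind the macro can be sketched as follows; the helper name and constants here are hypothetical (in LTP the real interface is the TST_RETRY_FUNC() macro family), but the backoff logic is the same:

    #include <unistd.h>

    /* Hypothetical sketch: poll cond() with exponentially
     * growing sleeps between the retries. */
    static int wait_for(int (*cond)(void), unsigned int max_delay_us)
    {
        unsigned int delay_us = 1000;    /* start with 1 ms */

        while (!cond()) {
            if (delay_us > max_delay_us)
                return -1;               /* give up */
            usleep(delay_us);
            delay_us *= 2;               /* double the sleep each retry */
        }

        return 0;
    }

Because the delays double, the total time spent sleeping is at most about twice the last delay, i.e. roughly twice the time the event actually needed, which is where the factor-of-two bound above comes from.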

We are trying to resurrect the automated testing track at FOSDEM, along with Anders and Matt from Linaro. That will not happen, and does not even make sense to happen, without you. I can talk and drink beer in the evening with Matt and Anders just fine; we do not need a devroom for that.

We also have a few people prepared to give talks on various topics, but not nearly enough to fill and justify a room for a whole day. So if you are interested in attending, or even better in giving a talk or driving a discussion, please let us know.

The problem

More than ten years ago even consumer grade hardware started to have two and more CPU cores. These days you can easily buy a PC with eight cores even in local computer shops. The hardware has evolved quite fast during the last decade, but the software that does kernel testing hasn't. These days we still run LTP testcases sequentially, even on beefy servers with 64+ cores and terabytes of RAM. That means that a syscalls test run takes 30+ minutes when it could be reduced to less than 10 minutes, which would significantly shorten the feedback loop for any continuous integration.

The naive solution

The naive solution is obviously to run $NCPU+1 tests in parallel. But there is a catch: some of the tests utilize global system resources or state, and running two such tests in parallel would lead to false negatives. If you think this situation is rare, you are mistaken; there are plenty of tests that need a block device, change the system wall clock, sample kernel timers in a loop, play with SysV IPC, networking, etc. In short, we would end up with many hard-to-reproduce race conditions and mostly useless results.
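Just to make the naive approach concrete, a toy runner might look like the sketch below (hypothetical code that ignores exit statuses and, more importantly, all of the problems just described):

    #include <unistd.h>
    #include <stdlib.h>
    #include <sys/wait.h>

    int main(int argc, char *argv[])
    {
        /* treat the arguments as test binaries to execute */
        int max_jobs = sysconf(_SC_NPROCESSORS_ONLN) + 1;
        int running = 0;

        for (int i = 1; i < argc; i++) {
            if (running >= max_jobs) {
                wait(NULL);          /* block until a slot frees up */
                running--;
            }

            pid_t pid = fork();
            if (pid == 0) {
                execl(argv[i], argv[i], (char *)NULL);
                _exit(127);          /* exec failed */
            }
            if (pid > 0)
                running++;
        }

        while (wait(NULL) > 0)
            ;                        /* collect the remaining children */

        return 0;
    }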

The proper solution

The proper solution would require:

  1. Make tests declare which resources they utilize
  2. Export this data to the testrunner
  3. Make the testrunner use that information when running tests
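The third step then boils down to a simple conflict check in the scheduler. A hypothetical sketch (the resource flags are made up for illustration):

    #include <stdint.h>

    /* hypothetical exclusive-resource flags exported by each test */
    #define RES_BLOCK_DEV   (1u << 0)
    #define RES_WALLCLOCK   (1u << 1)
    #define RES_SYSV_IPC    (1u << 2)

    /* a test may start only if its resources do not clash with
     * anything that is currently running */
    static int can_start(uint32_t test_res, uint32_t running_res)
    {
        return !(test_res & running_res);
    }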

Happily the first part is mostly done for LTP: we have switched to a new test library that I wrote a few years ago. The new library defines a “driver model” for a testcase. At this point about half of the tests have been converted to the new library and the progress is steady.

In the new library the resources a test needs are listed in a declarative way as a C structure, so for most cases we already have this information in place. If you want to know more, have a look at our documentation.
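For instance, a test that has to run as root, needs a scratch block device and modifies the system wall clock declares that roughly as follows (the flags shown exist in the current tst_test.h, but take the exact set as an illustration):

    #include "tst_test.h"

    static void run(void)
    {
        /* the test body; the library has prepared the resources */
        tst_res(TPASS, "declared resources were set up by the library");
    }

    static struct tst_test test = {
        .needs_root = 1,        /* must run as root */
        .needs_device = 1,      /* library provides a block device */
        .restore_wallclock = 1, /* test modifies the system time */
        .test_all = run,
    };

A testrunner that can read these flags knows, for example, that two tests touching the wall clock must never run at the same time.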

At this point I'm trying to solve the second part, which is making the data available to the testrunner; that mostly works now. Once this part is finished, the missing piece will be writing a scheduler that takes this information into account.

Unfortunately the way LTP runs tests is also dated: there is an old runltp script which I would call an ugly hack at best. Adding a feature such as parallel test runs to that code is close to impossible, which is the reason I've started to work on a new LTP testrunner. That, I guess, could be a subject for a second blog post.