SRCU state double scan
The SRCU flavor of RCU uses per-cpu counters to detect that every CPU has passed through a quiescent state for a particular SRCU lock instance (
There's are total of 4 counters per-cpu. One pair for locks, and another for unlocks. You can think of the SRCU instance to be split into 2 parts. The readers sample
srcu_idx and decided which part to use. Each part corresponds to one pair of lock and unlock counters. A reader increments a part's lock counter during locking and likewise for unlock.
During an update, the updater flips
srcu_idx (thus attempting to force new readers to use the other part) and waits for the lock/unlock counters on the previous value of
srcu_idx to match.
Once the sum of the lock counters of all CPUs match that of unlock, the system knows all pre-existing read-side critical sections have completed.
Things are not that simple, however. It is possible that a reader samples the
srcu_idx, but before it can increment the lock counter corresponding to it, it undergoes a long delay. We thus we end up in a situation where there are readers in both
srcu_idx = 0 and
srcu_idx = 1.
To prevent such a situation, a writer has to wait for readers corresponding to both
srcu_idx = 0 and
srcu_idx = 1 to complete. This depicted with 'A MUST' in the below pseudo-code:
reader 1 writer reader 2 ------------------------------------------------------- // read_lock // enter Read: idx = 0; <long delay> // write_lock // enter wait_for lock==unlock idx = 1; /* flip */ wait_for lock==unlock done. Read: idx = 1; lock++; lock++; // write_lock // return // read_lock // return /**** NOW BOTH lock and lock are non-zero!! ****/ // write_lock // enter wait_for lock==unlock <- A MUST! idx = 0; /* flip */ wait_for lock==unlock <- A MUST!
NOTE: QRCU has a similar issue. However it overcomes such a race in the reader by retrying the sampling of its 'srcu_idx' equivalent.
Q: If you have to wait for readers of both
srcu_idx = 0, and
1, then why not just have a single counter and do away with the “flipping” logic?
Ans: Because of updater forward progress. If we had a single counter, then it is possible that new readers would constantly increment the lock counter, thus updaters would be waiting all the time. By using the 'flip' logic, we are able to drain pre-existing readers using the inactive part of
srcu_idx to be drained in a bounded time. The number of readers of a 'flipped' part would only monotonically decrease since new readers go to its counterpart.