c++: std::condition_variable::wait_until holds the lock?
Recently, I was debugging some problem where it appeared that std::condition_variable::wait_until would hold the lock and never release it while waiting, in direct contradiction to its documentation:
Atomically releases lock, blocks the current executing thread, and adds it to the list of threads waiting on *this. The thread will be unblocked when notify_all() or notify_one() is executed, or when the absolute time point timeout_time is reached. It may also be unblocked spuriously. When unblocked, regardless of the reason, lock is reacquired and wait_until exits.
You can actually observe this with the following reproducer, tested on Debian Linux “bookworm” with g++ 12.3 and libstdc++ 3.4.32:
#include <chrono>
#include <condition_variable>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <thread>
constexpr auto forever = std::chrono::steady_clock::time_point::max();
int main() {
  for (int i = 0; i < 1000; i++) {
    std::cerr << "============== " << std::endl;
    std::cerr << "Iteration " << i << std::endl;
    std::cerr << "============== " << std::endl;
    std::mutex m;
    std::condition_variable cv;
    int counter = 0;
    std::thread t([&]() {
      std::cerr << "Acquiring lock (BG)" << std::endl;
      std::unique_lock<std::mutex> lock(m);
      if (counter > 0) {
        std::cerr << "Already notified" << std::endl;
        return;
      }
      std::cerr << "Waiting for notification" << std::endl;
      cv.wait_until(lock, forever, [&] { return counter > 0; });
      std::cerr << "Got notification" << std::endl;
    });
    {
      std::cerr << "Acquiring lock" << std::endl;
      std::unique_lock<std::mutex> lock(m);
      std::cerr << "Incrementing counter" << std::endl;
      counter++;
      std::cerr << "Notifying CV" << std::endl;
      cv.notify_all();
      std::cerr << "Waiting for thread" << std::endl;
    }
    t.join();
  }
  return EXIT_SUCCESS;
}
It will print “Waiting for notification” and get stuck. (Not every iteration hits this, since the main thread may grab the lock and bump the counter before the background thread even starts waiting, hence the loop; sooner or later one iteration loses that race.) This shouldn’t happen, since wait_until should release the lock, and the block at the bottom should acquire the lock, increment the counter, and notify the condition variable. So what was happening?
The reason is as follows:
- wait_until converts our time_point, which uses the steady clock, to its own local clock. Here’s the source from my copy of the condition_variable header:
template<typename _Clock, typename _Duration>
  cv_status
  wait_until(unique_lock<mutex>& __lock,
             const chrono::time_point<_Clock, _Duration>& __atime)
  {
#if __cplusplus > 201703L
    static_assert(chrono::is_clock_v<_Clock>);
#endif
    const typename _Clock::time_point __c_entry = _Clock::now();
    const __clock_t::time_point __s_entry = __clock_t::now();
    const auto __delta = __atime - __c_entry;
    const auto __s_atime = __s_entry + __delta;
    if (__wait_until_impl(__lock, __s_atime) == cv_status::no_timeout)
      return cv_status::no_timeout;
    // We got a timeout when measured against __clock_t but
    // we need to check against the caller-supplied clock
    // to tell whether we should return a timeout.
    if (_Clock::now() < __atime)
      return cv_status::no_timeout;
    return cv_status::timeout;
  }
If we look at this in a debugger, some variables have been optimized out, but we can observe that there was an overflow/underflow:
(gdb) f 2
#2  0x00007ffff50b0aa4 in std::condition_variable::wait_until<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > (this=0x7fffffffbe18, __lock=..., __atime=...)
    at /home/lidavidm/miniforge3/envs/conda-dev/x86_64-conda-linux-gnu/include/c++/10.3.0/condition_variable:141
141	    if (__wait_until_impl(__lock, __s_atime) == cv_status::no_timeout)
(gdb) p __atime
$11 = (const std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1, 1000000000> > > &) @0x7fff92fc57b8: {__d = {__r = 9223372036854775807}}
(gdb) p __c_entry
$12 = {__d = {__r = 967774403226439}}
(gdb) p __s_entry
$13 = <optimized out>
(gdb) p __delta
$14 = <optimized out>
(gdb) p __s_atime
$15 = {__d = {__r = -7515780632097620207}}
We can see that the current time (__c_entry) is subtracted from the deadline we provide (__atime) to get a delta, which is then added to the current time of the clock the condition variable actually uses. And since our deadline is steady_clock::time_point::max(), that addition overflowed and gave us a bogus negative value. (There’s a standalone sketch of this arithmetic after the list.)
- wait_until passes that bogus value to a helper:
template<typename _Dur>
  cv_status
  __wait_until_impl(unique_lock<mutex>& __lock,
                    const chrono::time_point<system_clock, _Dur>& __atime)
  {
    auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
    auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);

    __gthread_time_t __ts =
      {
        static_cast<std::time_t>(__s.time_since_epoch().count()),
        static_cast<long>(__ns.count())
      };

    __gthread_cond_timedwait(&_M_cond, __lock.mutex()->native_handle(),
                             &__ts);

    return (system_clock::now() < __atime
            ? cv_status::no_timeout : cv_status::timeout);
  }
};
It constructs a timespec (a __gthread_time_t). If we look at that in the debugger, it also has crazy bogus values as a result:
(gdb) f 1
#1  std::condition_variable::__wait_until_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l> > > (this=this@entry=0x7fffffffbe18, __lock=..., __atime=...)
    at /home/lidavidm/miniforge3/envs/conda-dev/x86_64-conda-linux-gnu/include/c++/10.3.0/condition_variable:232
232	      __gthread_cond_timedwait(&_M_cond, __lock.mutex()->native_handle(),
(gdb) p __atime
$16 = (const std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long, std::ratio<1, 1000000000> > > &) @0x7fff92fc5778: {__d = {__r = -7515780632097620207}}
(gdb) p __s
$17 = <optimized out>
(gdb) p __ns
$18 = <optimized out>
(gdb) p __ts
$19 = {tv_sec = -7515780632, tv_nsec = -97620207}
- That timespec gets passed to pthreads. I wasn’t immediately sure where I should go look for the source of my particular pthreads, but if we look at glibc, the implementation of pthread_cond_timedwait generally starts with something like this:
int
__pthread_cond_timedwait (cond, mutex, abstime)
     pthread_cond_t *cond;
     pthread_mutex_t *mutex;
     const struct timespec *abstime;
{
  struct _pthread_cleanup_buffer buffer;
  struct _condvar_cleanup_buffer cbuffer;
  int result = 0;

  /* Catch invalid parameters.  */
  if (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000)
    return EINVAL;

  /* snip ... */
That is…it sees the negative nanosecond value in our bogus timespec, and immediately returns EINVAL, without doing anything like, oh, I don’t know, releasing the lock. (A small standalone demonstration of this follows the list.)
- So pthread_cond_timedwait instantly returns without doing anything, and if we go to the wait_until implementation that takes a predicate, we see that it just spins forever, waiting for the predicate to evaluate true or the deadline to pass…neither of which can happen, since we’re holding the lock the entire time! (There’s a sketch of that loop after the list, too.)
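Here’s the promised standalone sketch of the arithmetic from the first step. This is not the library code, just the same calculation redone by hand; it assumes libstdc++’s representation (64-bit signed nanosecond counts for both clocks), so the exact numbers will differ elsewhere:
#include <chrono>
#include <iostream>

int main() {
  using namespace std::chrono;

  // The deadline we passed in: steady_clock::time_point::max(), i.e.
  // INT64_MAX ticks (nanoseconds on libstdc++).
  const auto atime   = steady_clock::time_point::max();
  const auto c_entry = steady_clock::now();  // plays the role of _Clock::now()
  const auto s_entry = system_clock::now();  // plays the role of __clock_t::now()

  // __delta = __atime - __c_entry: still enormous, since c_entry
  // (roughly "time since boot") is tiny compared to INT64_MAX.
  const auto delta = atime - c_entry;

  // __s_atime = __s_entry + __delta: the system clock is ~54 years past its
  // epoch, so the sum exceeds INT64_MAX and wraps to a negative count
  // (formally, signed overflow).
  const auto s_atime = s_entry.time_since_epoch() + delta;

  std::cout << "delta   = " << delta.count() << " ticks\n"
            << "s_atime = " << s_atime.count() << " ticks\n";
  return 0;
}
With 64-bit nanosecond counts, s_atime comes out negative, matching the __s_atime the debugger showed.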
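And here is the small demonstration of the pthreads step in isolation: hand pthread_cond_timedwait a timespec like the one above and it fails up front. The specific numbers are just the ones from the debugger session, and the immediate EINVAL relies on glibc’s parameter check quoted earlier (and a 64-bit time_t):
#include <cerrno>
#include <cstdio>
#include <ctime>
#include <pthread.h>

int main() {
  pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
  pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

  pthread_mutex_lock(&m);

  // The bogus deadline from above: negative tv_sec and tv_nsec.
  timespec ts{};
  ts.tv_sec = -7515780632;
  ts.tv_nsec = -97620207;

  // glibc rejects the negative tv_nsec before doing anything else, so this
  // returns EINVAL immediately: it never sleeps and never releases the mutex.
  int rc = pthread_cond_timedwait(&cv, &m, &ts);
  std::printf("rc = %d, EINVAL = %d\n", rc, EINVAL);

  pthread_mutex_unlock(&m);
  return 0;
}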
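As for why the last step spins instead of reporting a timeout, the predicate-taking wait_until is essentially the following loop. This is my paraphrase of the same libstdc++ header (names changed, written as a free function so it stands alone), not the verbatim library code:
#include <chrono>
#include <condition_variable>
#include <mutex>

// Roughly what the predicate overload of wait_until boils down to.
template <typename Clock, typename Duration, typename Predicate>
bool wait_until_sketch(std::condition_variable& cv,
                       std::unique_lock<std::mutex>& lock,
                       const std::chrono::time_point<Clock, Duration>& atime,
                       Predicate pred) {
  while (!pred()) {
    // With our bogus converted deadline, this returns almost instantly: the
    // pthreads call fails with EINVAL, but the final check against the
    // caller's clock (steady_clock::now() < time_point::max()) still says
    // "no timeout", so the timeout branch below is never taken.
    if (cv.wait_until(lock, atime) == std::cv_status::timeout)
      return pred();
  }
  return true;
}
So the loop re-tests the predicate, calls the broken timed wait again, and repeats: a busy loop with the mutex held, which is why the main thread can never get in to increment the counter.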
And there we have it: deadlock.
The solution is to either use a slightly less forever time for forever, or to loop yourself and wait for short periods of time while checking for the predicate or deadline to pass.
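Here is a sketch of both options; the helper names and the one-year horizon are mine, purely for illustration:
#include <chrono>
#include <condition_variable>
#include <mutex>

// A "less forever" forever: far enough out to never matter in practice, but
// small enough that converting it between clocks cannot overflow.
inline std::chrono::steady_clock::time_point almost_forever() {
  return std::chrono::steady_clock::now() + std::chrono::hours(24 * 365);
}

// Or: wait in short slices and re-check the predicate and your own deadline.
template <typename Predicate>
bool wait_in_slices(std::condition_variable& cv,
                    std::unique_lock<std::mutex>& lock,
                    std::chrono::steady_clock::time_point deadline,
                    Predicate pred) {
  while (!pred()) {
    if (std::chrono::steady_clock::now() >= deadline) return pred();
    // Wake up periodically instead of trusting one long timed wait.
    cv.wait_for(lock, std::chrono::milliseconds(100));
  }
  return true;
}
The polling version trades a little wakeup latency for never handing the library a deadline it can mangle.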