Prevent abort in situations with recursive checkpoints

When multiple checkpoints are queued and the first one causes the
thread to suspend itself again, RunCheckpointFunction hits a
LOG(FATAL) and aborts. This happens because the recursive checkpoint
handling drains the checkpoint backlog, so when the first
RunCheckpointFunction invocation continues its loop it unexpectedly
finds no more checkpoints to run and aborts.

To fix this, and simplify the code at the same time, we have changed
the RunCheckpointFunction method to (as its name suggests) run only a
single checkpoint function. It takes the next function off the list of
pending checkpoints and runs it, relying on the caller to ensure that
all pending checkpoints are handled. This is fine because other
threads can request new checkpoints at any time, so the caller must
already call RunCheckpointFunction in a loop to ensure it does not
advance until all checkpoints have been handled.
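
The following is a minimal, single-threaded sketch of that contract,
not the ART implementation. FakeThread, HandlePendingCheckpoints and
RunDemo are hypothetical names used only for illustration, and the
thread_suspend_count_lock_ locking and atomic flag handling are
omitted. It shows a first checkpoint that recursively drains the
backlog, after which the outer caller loop simply stops instead of
aborting.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for art::Thread; locking and atomics are omitted.
class FakeThread {
 public:
  using Checkpoint = std::function<void(FakeThread*)>;

  // The first request fills the primary slot, later ones go to the overflow list.
  void RequestCheckpoint(Checkpoint cp) {
    if (!checkpoint_function_) {
      checkpoint_function_ = std::move(cp);
    } else {
      checkpoint_overflow_.push_back(std::move(cp));
    }
    checkpoint_request_ = true;
  }

  // Runs exactly one pending checkpoint, mirroring the shape of the new code.
  void RunCheckpointFunction() {
    Checkpoint checkpoint = std::move(checkpoint_function_);
    if (!checkpoint_overflow_.empty()) {
      // Promote the next overflow entry into the primary slot.
      checkpoint_function_ = std::move(checkpoint_overflow_.front());
      checkpoint_overflow_.pop_front();
    } else {
      // Nothing else is pending, so clear the request flag.
      checkpoint_function_ = nullptr;
      checkpoint_request_ = false;
    }
    assert(checkpoint && "Checkpoint flag set without pending checkpoint");
    checkpoint(this);
  }

  // Caller-side loop: keep going until the request flag is clear.
  void HandlePendingCheckpoints() {
    while (checkpoint_request_) {
      RunCheckpointFunction();
    }
  }

  std::vector<std::string> log;

 private:
  Checkpoint checkpoint_function_;
  std::deque<Checkpoint> checkpoint_overflow_;
  bool checkpoint_request_ = false;
};

// Queues three checkpoints; the first one handles the rest recursively,
// standing in for a checkpoint that makes the thread suspend itself.
std::vector<std::string> RunDemo() {
  FakeThread t;
  t.RequestCheckpoint([](FakeThread* self) {
    self->log.push_back("first");
    self->HandlePendingCheckpoints();  // Nested drain of the backlog.
  });
  t.RequestCheckpoint([](FakeThread* self) { self->log.push_back("second"); });
  t.RequestCheckpoint([](FakeThread* self) { self->log.push_back("third"); });
  t.HandlePendingCheckpoints();
  return t.log;
}
```

In this sketch the nested drain clears the request flag, so the outer
loop re-checks the flag and exits rather than asking for a checkpoint
that no longer exists.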

We add test 203-multi-checkpoints, which verifies that the checkpoint
system does not abort when there are multiple pending checkpoints,
some of which can suspend.

Bug: 67838964
Test: ./test.py --host -j50
Change-Id: Ib6a3e083e6069d4839647d194bee6849d973633e
diff --git a/runtime/thread.cc b/runtime/thread.cc
index 47ffb4e..83ebe99 100644
--- a/runtime/thread.cc
+++ b/runtime/thread.cc
@@ -1346,36 +1346,26 @@
 }
 
 void Thread::RunCheckpointFunction() {
-  bool done = false;
-  do {
-    // Grab the suspend_count lock and copy the checkpoints one by one. When the last checkpoint is
-    // copied, clear the list and the flag. The RequestCheckpoint function will also grab this lock
-    // to prevent a race between setting the kCheckpointRequest flag and clearing it.
-    Closure* checkpoint = nullptr;
-    {
-      MutexLock mu(this, *Locks::thread_suspend_count_lock_);
-      if (tlsPtr_.checkpoint_function != nullptr) {
-        checkpoint = tlsPtr_.checkpoint_function;
-        if (!checkpoint_overflow_.empty()) {
-          // Overflow list not empty, copy the first one out and continue.
-          tlsPtr_.checkpoint_function = checkpoint_overflow_.front();
-          checkpoint_overflow_.pop_front();
-        } else {
-          // No overflow checkpoints, this means that we are on the last pending checkpoint.
-          tlsPtr_.checkpoint_function = nullptr;
-          AtomicClearFlag(kCheckpointRequest);
-          done = true;
-        }
-      } else {
-        LOG(FATAL) << "Checkpoint flag set without pending checkpoint";
-      }
+  // Grab the suspend_count lock, get the next checkpoint and update all the checkpoint fields. If
+  // there are no more checkpoints we will also clear the kCheckpointRequest flag.
+  Closure* checkpoint;
+  {
+    MutexLock mu(this, *Locks::thread_suspend_count_lock_);
+    checkpoint = tlsPtr_.checkpoint_function;
+    if (!checkpoint_overflow_.empty()) {
+      // Overflow list not empty, copy the first one out and continue.
+      tlsPtr_.checkpoint_function = checkpoint_overflow_.front();
+      checkpoint_overflow_.pop_front();
+    } else {
+      // No overflow checkpoints. Clear the kCheckpointRequest flag.
+      tlsPtr_.checkpoint_function = nullptr;
+      AtomicClearFlag(kCheckpointRequest);
     }
-
-    // Outside the lock, run the checkpoint functions that we collected.
-    ScopedTrace trace("Run checkpoint function");
-    DCHECK(checkpoint != nullptr);
-    checkpoint->Run(this);
-  } while (!done);
+  }
+  // Outside the lock, run the checkpoint function.
+  ScopedTrace trace("Run checkpoint function");
+  CHECK(checkpoint != nullptr) << "Checkpoint flag set without pending checkpoint";
+  checkpoint->Run(this);
 }
 
 void Thread::RunEmptyCheckpoint() {