Thread-local mark stacks for the CC collector.

Thread-local mark stacks are assigned to mutators where they push
references in read barriers to reduce the (CAS) synchronization cost
in a global mark stack/queue.

We step through three mark stack modes (thread-local, shared,
GC-exclusive) and use per-thread flags to disable/enable system weak
accesses (only for the CC collector) instead of the existing global
one to safely perform the marking phase. The reasons are 1)
thread-local mark stacks for mutators need to be revoked using a
checkpoint to avoid races (incorrectly leaving a reference on mark
stacks) when terminating marking, and 2) we can’t use a checkpoint
while system weak accesses are disabled (or a deadlock would
happen). More details are described in the code comments.

Performance improvements in Ritzperf EAAC: a ~2.8% improvement
(13290->12918) in run time and a ~23% improvement (51.6s->39.8s) in
the total GC time on N5.

Bug: 12687968
Change-Id: I5d234d7e48bf115cd773d38bdb62ad24ce9116c7
diff --git a/runtime/java_vm_ext.cc b/runtime/java_vm_ext.cc
index f1deacf..36adbea 100644
--- a/runtime/java_vm_ext.cc
+++ b/runtime/java_vm_ext.cc
@@ -473,7 +473,8 @@
     return nullptr;
   }
   MutexLock mu(self, weak_globals_lock_);
-  while (UNLIKELY(!allow_new_weak_globals_)) {
+  while (UNLIKELY((!kUseReadBarrier && !allow_new_weak_globals_) ||
+                  (kUseReadBarrier && !self->GetWeakRefAccessEnabled()))) {
     weak_globals_add_condition_.WaitHoldingLocks(self);
   }
   IndirectRef ref = weak_globals_.Add(IRT_FIRST_SEGMENT, obj);
@@ -559,6 +560,13 @@
   CHECK(!allow_new_weak_globals_);
 }
 
+void JavaVMExt::BroadcastForNewWeakGlobals() {
+  CHECK(kUseReadBarrier);
+  Thread* self = Thread::Current();
+  MutexLock mu(self, weak_globals_lock_);
+  weak_globals_add_condition_.Broadcast(self);
+}
+
 mirror::Object* JavaVMExt::DecodeGlobal(Thread* self, IndirectRef ref) {
   return globals_.SynchronizedGet(self, &globals_lock_, ref);
 }
@@ -570,7 +578,8 @@
 
 mirror::Object* JavaVMExt::DecodeWeakGlobal(Thread* self, IndirectRef ref) {
   MutexLock mu(self, weak_globals_lock_);
-  while (UNLIKELY(!allow_new_weak_globals_)) {
+  while (UNLIKELY((!kUseReadBarrier && !allow_new_weak_globals_) ||
+                  (kUseReadBarrier && !self->GetWeakRefAccessEnabled()))) {
     weak_globals_add_condition_.WaitHoldingLocks(self);
   }
   return weak_globals_.Get(ref);