Minor improvements for the CC collector.

- Split Mark() and inline its first part.
- Force inlining of some other routines (e.g. AtomicSetReadBarrierPointer()).
- Add some UNLIKELY() branch hints.
- Use VisitConcurrentRoots().
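
The split-and-inline pattern from the first bullets can be sketched roughly
as below. This is a minimal illustration only, not the actual ART code: the
Obj struct, MarkSlowPath(), the counter, and the SKETCH_* macros are
hypothetical stand-ins, and the macros assume a GCC/Clang-style compiler.

```cpp
// Hypothetical stand-ins for ART's UNLIKELY / ALWAYS_INLINE / NO_INLINE macros
// (assumption: GCC- or Clang-compatible builtins and attributes).
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#define SKETCH_NO_INLINE __attribute__((noinline))

// Hypothetical object model: a flag saying whether the object still needs
// marking work, and the address of its copy (if any).
struct Obj {
  bool in_from_space;
  Obj* forward;
};

// Counts slow-path entries so the split can be observed from outside.
int g_slow_path_calls = 0;

// Second part of the split: the rarely taken, out-of-line work.
SKETCH_NO_INLINE Obj* MarkSlowPath(Obj* ref) {
  ++g_slow_path_calls;
  return ref->forward != nullptr ? ref->forward : ref;
}

// First part of the split: a small check that is cheap to inline at every
// call site; the branch hint tells the compiler the slow path is uncommon.
SKETCH_ALWAYS_INLINE Obj* Mark(Obj* ref) {
  if (ref == nullptr) {
    return nullptr;
  }
  if (SKETCH_UNLIKELY(ref->in_from_space)) {
    return MarkSlowPath(ref);
  }
  return ref;
}
```

The idea is that callers pay only for the inlined check in the common case,
while the out-of-line call is reserved for objects that need real work.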

Ritz EAAC GC time decreased from 28.9s to 27.6s (-4.5%) on N5.

Bug: 12687968

Change-Id: I7bd13f162e7daa2a5853000fb22c5fefc318994f
diff --git a/runtime/mirror/object.h b/runtime/mirror/object.h
index f75b8ae..4364d94 100644
--- a/runtime/mirror/object.h
+++ b/runtime/mirror/object.h
@@ -99,7 +99,7 @@
 #ifndef USE_BAKER_OR_BROOKS_READ_BARRIER
   NO_RETURN
 #endif
-  bool AtomicSetReadBarrierPointer(Object* expected_rb_ptr, Object* rb_ptr)
+  ALWAYS_INLINE bool AtomicSetReadBarrierPointer(Object* expected_rb_ptr, Object* rb_ptr)
       SHARED_REQUIRES(Locks::mutator_lock_);
   void AssertReadBarrierPointer() const SHARED_REQUIRES(Locks::mutator_lock_);