Improve READ_BARRIER_MARK_REG for arm32

Use blocked register IP as scratch, avoid pushing in fast path.

Clean up slow path to not have simpler logic and one less memory
write.

Add simple fast path handling for region space TLAB object
allocation.

Test: test-art-target, N6P booting with CC baker

Bug: 30162165

Change-Id: I6594e42d3d6277ffe7bb79df09df8be6bee85eb5
diff --git a/runtime/arch/arm64/entrypoints_init_arm64.cc b/runtime/arch/arm64/entrypoints_init_arm64.cc
index cc5bf29..55b09c3 100644
--- a/runtime/arch/arm64/entrypoints_init_arm64.cc
+++ b/runtime/arch/arm64/entrypoints_init_arm64.cc
@@ -149,7 +149,7 @@
   qpoints->pReadBarrierMarkReg13 = art_quick_read_barrier_mark_reg13;
   qpoints->pReadBarrierMarkReg14 = art_quick_read_barrier_mark_reg14;
   qpoints->pReadBarrierMarkReg15 = art_quick_read_barrier_mark_reg15;
-  qpoints->pReadBarrierMarkReg16 = art_quick_read_barrier_mark_reg16;
+  qpoints->pReadBarrierMarkReg16 = nullptr;  // IP0 is used as a temp by the asm stub.
   qpoints->pReadBarrierMarkReg17 = art_quick_read_barrier_mark_reg17;
   qpoints->pReadBarrierMarkReg18 = art_quick_read_barrier_mark_reg18;
   qpoints->pReadBarrierMarkReg19 = art_quick_read_barrier_mark_reg19;