MIPS: Use sltiu instead of LoadConst32() + sltu

Bltu expands to sltu + bnez, so when the constant fits in the
16-bit immediate we can emit sltiu + bnez directly and skip
loading the constant into a register.
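
The equivalence can be sketched in plain C++ as below. This is only a
model of the instruction semantics, not ART's assembler API: the helper
names are hypothetical, registers are modeled as 32-bit for brevity, and
the constant 0xF0 is an assumed stand-in for the real operand. Note that
sltiu sign-extends its 16-bit immediate before the unsigned compare, so
the constants that qualify are 0..0x7FFF and 0xFFFF8000..0xFFFFFFFF.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modeling MIPS semantics on 32-bit values.

// sltu rd, rs, rt: rd = 1 if rs < rt (unsigned), else 0.
inline uint32_t Sltu(uint32_t rs, uint32_t rt) { return rs < rt ? 1u : 0u; }

// sltiu rd, rs, imm16: the immediate is sign-extended, then compared unsigned.
inline uint32_t Sltiu(uint32_t rs, int16_t imm) {
  uint32_t extended = static_cast<uint32_t>(static_cast<int32_t>(imm));
  return rs < extended ? 1u : 0u;
}

// True if sign-extending the low 16 bits of `value` reproduces `value`,
// i.e. sltiu can stand in for "load constant" + sltu.
inline bool FitsSltiuImmediate(uint32_t value) {
  return value <= 0x7FFFu || value >= 0xFFFF8000u;
}

// Reinterprets the low 16 bits of `value` as a signed immediate.
inline int16_t LowHalfAsImm(uint32_t value) {
  uint32_t low = value & 0xFFFFu;
  return low < 0x8000u ? static_cast<int16_t>(low)
                       : static_cast<int16_t>(static_cast<int32_t>(low) - 0x10000);
}

// Checks that both sequences agree for a few qualifying constants.
inline void CheckSltiuEquivalence() {
  const uint32_t constants[] = {0u, 0xF0u, 0x7FFFu, 0xFFFF8000u, 0xFFFFFFFFu};
  const uint32_t inputs[] = {0u, 1u, 0xEFu, 0xF0u, 0xF1u, 0x8000u, 0xFFFFFFFFu};
  for (uint32_t c : constants) {
    assert(FitsSltiuImmediate(c));
    for (uint32_t rs : inputs) {
      assert(Sltiu(rs, LowHalfAsImm(c)) == Sltu(rs, c));
    }
  }
}
```

A constant such as 0x8000 does not qualify, because its low half
sign-extends to 0xFFFF8000; for those values the constant still has to
be loaded into a register first.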

Additionally, in VisitInvokeInterface() the LoadConst32() is moved
before Jalr(T9) so that the load can be placed in the delay slot.

Test: ./testrunner.py --target --optimizing in QEMU

Change-Id: Ic19f251aeba015be38b7d3690e78b2fe59e7c5ae
diff --git a/compiler/optimizing/code_generator_mips64.cc b/compiler/optimizing/code_generator_mips64.cc
index e3529f1..985ac2c 100644
--- a/compiler/optimizing/code_generator_mips64.cc
+++ b/compiler/optimizing/code_generator_mips64.cc
@@ -1773,8 +1773,8 @@
       enum_cast<uint32_t>(ClassStatus::kInitialized) << (status_lsb_position % kBitsPerByte);
 
   __ LoadFromOffset(kLoadUnsignedByte, TMP, class_reg, status_byte_offset);
-  __ LoadConst32(AT, shifted_initialized_value);
-  __ Bltuc(TMP, AT, slow_path->GetEntryLabel());
+  __ Sltiu(TMP, TMP, shifted_initialized_value);
+  __ Bnezc(TMP, slow_path->GetEntryLabel());
   // Even if the initialized flag is set, we need to ensure consistent memory ordering.
   __ Sync(0);
   __ Bind(slow_path->GetExitLabel());