[AMDGPU] Extend the SI Load/Store optimizer to combine more things.

I've extended the load/store optimizer to be able to produce dwordx3
loads and stores. This change allows many more loads and stores to be
combined, and results in considerably better code for our hardware.

Differential Revision: https://reviews.llvm.org/D54042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348937 91177308-0d34-0410-b5e6-96231b3b80d8
13 files changed