[AMDGPU] Construct memory clauses before RA

Memory clauses are formed into bundles in presence of xnack.
Their source operands are marked as early-clobber.

This allows to allocate distinct source and destination registers
within a clause and prevent breaking the clause with s_nop in the
hazard recognizer.

Clauses are undone before post-RA scheduler to allow some rescheduling,
which will not break the clause since artificial edges are created in
the dag to keep memory operations together. Yet this allows a better
ILP in some cases.

Differential Revision: https://reviews.llvm.org/D47511

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333691 91177308-0d34-0410-b5e6-96231b3b80d8
7 files changed