[AMDGPU][Waitcnt] As of gfx7, VMEM operations no longer increment the export counter, and their input registers can be reused by the next instruction; update the waitcnt pass to take this into account.
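
The rule can be sketched as follows. This is a simplified, self-contained model rather than the code in the pass: the names Generation, Counters, vmemWriteNeedsExpWait and recordVmemStore are hypothetical, and the sketch assumes the affected operations are VMEM stores whose data registers were previously guarded by the export counter.

```cpp
#include <cstdint>

// Hypothetical hardware generations, oldest first (illustrative names only).
enum class Generation : std::uint8_t { Gfx6, Gfx7, Gfx8, Gfx9 };

// Hypothetical model of the outstanding-operation counters tracked per wave.
struct Counters {
  unsigned VmCnt = 0;   // outstanding vector-memory operations
  unsigned ExpCnt = 0;  // outstanding exports (and, before gfx7, VMEM data reads)
};

// Assumption: only generations before gfx7 keep a VMEM store's input
// registers busy until the export counter drains.
constexpr bool vmemWriteNeedsExpWait(Generation Gen) {
  return Gen < Generation::Gfx7;
}

// Record one VMEM store. The vector-memory counter always advances; the
// export counter advances only on generations that still need the wait.
inline Counters recordVmemStore(Counters C, Generation Gen) {
  ++C.VmCnt;
  if (vmemWriteNeedsExpWait(Gen))
    ++C.ExpCnt;
  return C;
}
```

In this model, a pass that inserts waits before overwriting a store's source registers has no pending export-counter event to wait on for gfx7 and later, so no wait is emitted for that case.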

Differential Revision: https://reviews.llvm.org/D46067

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330954 91177308-0d34-0410-b5e6-96231b3b80d8
7 files changed