AMDGPU/SI: Optimize adjacent s_nop instructions

Merge adjacent s_nop instructions into a single s_nop, using its
operand to encode how long to wait. This is somewhat
distasteful, since it would be better to just emit s_nop
with the right argument in the first place. This would require
changing TII::insertNoop to emit an s_nop with operand N, which would be easy.
Slightly more problematic is that the post-RA scheduler and hazard
recognizer represent nops as a single null node, so this would require
inventing another way of representing N nops.
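
The merging described above can be sketched roughly as follows. This is
a standalone illustration, not the code in this change: Inst and
mergeAdjacentNops are hypothetical names, and the sketch assumes that
s_nop with operand N waits N + 1 states and that N can be at most 7.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction; not LLVM's MachineInstr.
struct Inst {
  std::string Opcode;
  uint8_t Imm; // Only meaningful when Opcode is "s_nop".
};

// Assumption: s_nop N waits N + 1 states, and N is limited to 0..7.
constexpr unsigned MaxNopImm = 7;

// Fold each s_nop into the s_nop directly before it when the combined
// wait still fits in one operand. Otherwise keep both instructions.
std::vector<Inst> mergeAdjacentNops(const std::vector<Inst> &In) {
  std::vector<Inst> Out;
  for (const Inst &I : In) {
    if (I.Opcode == "s_nop") {
      assert(I.Imm <= MaxNopImm && "s_nop operand out of assumed range");
      if (!Out.empty() && Out.back().Opcode == "s_nop") {
        // Total wait states of the two nops taken together.
        unsigned Total = (Out.back().Imm + 1u) + (I.Imm + 1u);
        if (Total <= MaxNopImm + 1) {
          Out.back().Imm = static_cast<uint8_t>(Total - 1);
          continue; // Folded into the previous s_nop.
        }
      }
    }
    Out.push_back(I);
  }
  return Out;
}
```

Under these assumptions, three consecutive "s_nop 0" become one
"s_nop 2", while "s_nop 7" followed by "s_nop 0" is left as two
instructions because the combined wait does not fit in one operand.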

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267456 91177308-0d34-0410-b5e6-96231b3b80d8
1 file changed