[AMDGPU] Remove non-temporal flag from argument loads

Kernel arguments are likely to be read by all workitems, so their loads
should not bypass the cache. Fixes a performance hit in sub-dword
argument loads.

Differential Revision: https://reviews.llvm.org/D43249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325146 91177308-0d34-0410-b5e6-96231b3b80d8
5 files changed