[X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882)

The costs for one/two-operand general shuffles were overly cautious: VSHUFPD doesn't have to replicate the same shuffle in both 128-bit lanes like VSHUFPS does.
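
For illustration only, a minimal scalar sketch of the two 256-bit instructions, assuming the immediate encodings of the AVX forms of VSHUFPD and VSHUFPS. The helper names (vshufpd256, vshufps256) and the array types are hypothetical and are not part of this patch or of the cost tables. It shows that VSHUFPD has a separate immediate bit for each result element, while VSHUFPS applies the same four 2-bit selectors to both lanes.

```cpp
#include <array>
#include <cstdint>

using V4F64 = std::array<double, 4>;
using V8F32 = std::array<float, 8>;

// Hypothetical scalar model of 256-bit VSHUFPD.
// Immediate bit i picks the low (0) or high (1) element of the 128-bit lane
// for result element i, so each lane can use a different selection.
V4F64 vshufpd256(const V4F64 &a, const V4F64 &b, uint8_t imm) {
  V4F64 r{};
  for (int i = 0; i < 4; ++i) {
    const V4F64 &src = (i % 2 == 0) ? a : b; // even results from a, odd from b
    int laneBase = (i / 2) * 2;              // first element of this lane
    r[i] = src[laneBase + ((imm >> i) & 1)];
  }
  return r;
}

// Hypothetical scalar model of 256-bit VSHUFPS.
// The immediate holds four 2-bit selectors that are reused unchanged for
// both 128-bit lanes, so the per-lane pattern is always identical.
V8F32 vshufps256(const V8F32 &a, const V8F32 &b, uint8_t imm) {
  V8F32 r{};
  for (int i = 0; i < 8; ++i) {
    int pos = i % 4;                         // position within the lane
    const V8F32 &src = (pos < 2) ? a : b;    // low half from a, high half from b
    int laneBase = (i / 4) * 4;              // first element of this lane
    r[i] = src[laneBase + ((imm >> (2 * pos)) & 3)];
  }
  return r;
}
```

With imm = 0b0001, vshufpd256 takes the high element of the first operand in the lower lane but the low element in the upper lane, a per-lane difference that vshufps256 cannot express with a single immediate.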

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335216 91177308-0d34-0410-b5e6-96231b3b80d8
6 files changed