[X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones that can be done with a insertf128

The early out for AVX2 in lowerV2X128VectorShuffle is positioned in a weird spot below some shuffle mask equivalency checks.

But I think we want to allow VPERMQ for any unary shuffle.

Differential Revision: https://reviews.llvm.org/D37893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313373 91177308-0d34-0410-b5e6-96231b3b80d8
4 files changed