[X86][AVX2] Broadcast subvectors
AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector its advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception being when the source type is 4f64 or 4i64 which can directly use the immediate shuffle VPERMPD/VPERMQ directly.
Differential Revision: http://reviews.llvm.org/D16050
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258081 91177308-0d34-0410-b5e6-96231b3b80d8
8 files changed