af7c00f21360208a4f87a1bcd01f5f268037e1b7 - platform_external_llvm80

commit	af7c00f21360208a4f87a1bcd01f5f268037e1b7
author	Sanjay Patel <spatel@rotateright.com>	Tue Aug 25 16:29:21 2015 +0000
committer	Sanjay Patel <spatel@rotateright.com>	Tue Aug 25 16:29:21 2015 +0000
tree	9250a7ef18d71836a7119ec74f454b01583db9fd
parent	b6566b103fac04e33fbf12b95ca9fb49295da791

make fast unaligned memory accesses implicit with SSE4.2 or SSE4a

This is a follow-on from the discussion in http://reviews.llvm.org/D12154.

This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.

A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx

This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449

Before this patch, we used different store types depending on whether the example could be
lowered as a memset or not.
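
As a rough illustration of the kind of source that can expose this difference, here is a
minimal sketch of a hypothetical foo.c. It is not the test case from PR24449; the function
names and the 8-float buffer size are assumptions chosen so the zeroed region is 32 bytes,
the width of one AVX register. Both functions write the same bytes, but only the first is
written as a memset call; whether the compiler also treats the second as a memset depends
on the optimization passes that run.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical element count: 8 floats = 32 bytes. */
enum { NUM_FLOATS = 8 };

/* Written as a memset call, so it goes through memset lowering. */
void zero_with_memset(float *dst) {
    memset(dst, 0, NUM_FLOATS * sizeof(float));
}

/* Same result written as individual stores; the compiler may or may not
   recognize this as a memset. */
void zero_with_stores(float *dst) {
    for (size_t i = 0; i < NUM_FLOATS; ++i)
        dst[i] = 0.0f;
}
```

One way to compare the generated stores is to emit assembly with the invocation above:
$ clang -O1 foo.c -mavx -S -o -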

Differential Revision: http://reviews.llvm.org/D12288



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245950 91177308-0d34-0410-b5e6-96231b3b80d8

2 files changed
