Add NEON swap opts and use opts in SkSwizzler

All RGBA, RGBX, BGRA, BGRX routines in SkSwizzler now use fast
options (with the exception of conversions to 565).

Swizzle Time for swap_rb
0.94x Nexus 9
0.81x Nexus 6P

Unpremul Decode Time for RGBA PNGs***
ZeroInit 0.93x Nexus 9
Regular  0.94x Nexus 9
ZeroInit 0.97x Nexus 6P
ZeroInit 0.95x Nexus 6P

***Two Notes:
The improvements here are actually due to taking advantage of
memcpy() (no need to swap, the bytes are already in the proper
order).
ZeroInit skips writing zeros to zero initialized memory.  This
is a memory use opt in Android.

BMP decodes should also benefit from these improvements.

I am relying on Gold to help test all possible cases.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1581933006
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1581933006
2 files changed