Specialize Sk2d for ARM64

The implementation is nearly identical to Sk2f, with these changes:
  - float32x2_t -> float64x2_t
  - vfoo -> vfooq (the 128-bit "q" forms of the NEON intrinsics)
  - one extra Newton's method step in sqrt().
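
A rough sketch of why sqrt() wants the extra step, not the code in this CL. The hardware reciprocal-sqrt estimate is only good to roughly 8 bits, and each Newton step about doubles that, so doubles need one more step than floats. Vec2d, sqrt2d, coarse_rsqrt and rsqrt_step are hypothetical names for illustration, and the choice of 3 steps total is an assumption. The non-ARM64 branch is a scalar stand-in so the sketch builds anywhere.

```cpp
#include <cassert>
#include <cmath>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical stand-in for a two-lane double vector; not Skia's actual Sk2d.
struct Vec2d { double lane[2]; };

#if !defined(__aarch64__)
// Portable fallback: mimic a low-precision hardware estimate of 1/sqrt(x)
// by keeping only about 8 bits of the mantissa.
static double coarse_rsqrt(double x) {
    int exp = 0;
    double m = std::frexp(1.0 / std::sqrt(x), &exp);
    return std::ldexp(std::floor(m * 256.0) / 256.0, exp);
}

// Scalar equivalent of the vrsqrts step: (3 - a*b) / 2.
static double rsqrt_step(double a, double b) { return (3.0 - a * b) * 0.5; }
#endif

// sqrt(x) computed as x * (1/sqrt(x)), refining the estimate with Newton steps.
// Inputs must be positive: for x == 0 this form would compute 0 * inf.
Vec2d sqrt2d(Vec2d v) {
    assert(v.lane[0] > 0.0 && v.lane[1] > 0.0);
    Vec2d out;
#if defined(__aarch64__)
    float64x2_t x   = vld1q_f64(v.lane);
    float64x2_t est = vrsqrteq_f64(x);  // coarse estimate of 1/sqrt(x)
    for (int step = 0; step < 3; ++step) {
        // est *= (3 - x*est*est) / 2
        est = vmulq_f64(vrsqrtsq_f64(x, vmulq_f64(est, est)), est);
    }
    vst1q_f64(out.lane, vmulq_f64(x, est));
#else
    for (int i = 0; i < 2; ++i) {
        double x   = v.lane[i];
        double est = coarse_rsqrt(x);
        for (int step = 0; step < 3; ++step) {
            est = rsqrt_step(x, est * est) * est;
        }
        out.lane[i] = x * est;
    }
#endif
    return out;
}
```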

Also, fix NEON detection generally so that it checks defined(SK_ARM_HAS_NEON).
SK_ARM_HAS_NEON is not currently being set on the ARM64 bots (nor does the
compiler seem to define __ARM_NEON__ there), so this CL fixes that up too.
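
For illustration only, one plausible shape for the detection, not necessarily what this CL does. It assumes SK_ARM_HAS_NEON can be derived from compiler-defined macros: __ARM_NEON__ on older 32-bit toolchains, __ARM_NEON as named by ACLE, and __aarch64__ since NEON is part of the baseline there. kHasNeon is a hypothetical name.

```cpp
// Assumption: derive SK_ARM_HAS_NEON from compiler macros when the build
// system has not already defined it.
#if !defined(SK_ARM_HAS_NEON) && \
    (defined(__ARM_NEON__) || defined(__ARM_NEON) || defined(__aarch64__))
    #define SK_ARM_HAS_NEON
#endif

// Code then keys off the single macro instead of compiler-specific ones.
#if defined(SK_ARM_HAS_NEON)
constexpr bool kHasNeon = true;   // hypothetical flag, for illustration
#else
constexpr bool kHasNeon = false;
#endif
```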

BUG=skia:

Review URL: https://codereview.chromium.org/1020963002
5 files changed