add a scaled uint32x4_t divided by uint32_t to SkVx
This extracts the divide used in SkImageBlurFilter.cpp, and
encapsulates it into ScaledDividerU32. It generates results that
are with in +/- 1 of the rounded answer generated by doubles.
I have added hand coded implementations for sse and for neon to
hopefully to avoid code generation problems.
Bug: skia:12522
Change-Id: Ia7372d45895c799f69f8c0fd9fdea5efac321139
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/458216
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Herb Derby <herb@google.com>
2 files changed