Optimize cond_to_mask in non-scalar configurations

Conditions are ALREADY masks in all of our common (SIMD) builds, so the
conversion just injects several extra instructions for no reason. Only
in JUMPER_IS_SCALAR builds do we need to turn conditions (single-bit
bools) into masks.
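
As a rough illustration of the distinction (a minimal sketch, not the
code in this change): the namespaces, type aliases, and helper names
below are hypothetical, and the SIMD half assumes GCC/Clang vector
extensions, where a lane-wise comparison already produces -1 or 0 in
each lane.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; none of these names are taken
// from the actual source file.
namespace scalar_sketch {
    using F32 = float;
    using I32 = int32_t;

    // A scalar comparison yields a one-bit bool, so it has to be widened
    // to an all-ones / all-zeros mask before bitwise blending can use it.
    inline I32 cond_to_mask(bool cond) { return cond ? ~0 : 0; }

    inline I32 less_than_mask(F32 a, F32 b) { return cond_to_mask(a < b); }
}

namespace simd_sketch {
    // Assumption: GCC/Clang vector extensions, four 32-bit lanes each.
    typedef float   F32 __attribute__((vector_size(16)));
    typedef int32_t I32 __attribute__((vector_size(16)));

    // A vector comparison already yields -1 (all ones) or 0 per lane,
    // so the conversion is the identity and costs nothing.
    inline I32 cond_to_mask(I32 cond) { return cond; }

    inline I32 less_than_mask(F32 a, F32 b) { return cond_to_mask(a < b); }
}

// Small self-check: both paths agree on what a mask looks like.
inline void check_masks() {
    assert(scalar_sketch::less_than_mask(1.0f, 2.0f) == ~0);
    assert(scalar_sketch::less_than_mask(2.0f, 1.0f) == 0);

    simd_sketch::F32 a = {1.0f, 5.0f, 3.0f, 0.0f};
    simd_sketch::F32 b = {2.0f, 4.0f, 3.0f, 1.0f};
    simd_sketch::I32 m = simd_sketch::less_than_mask(a, b);
    assert(m[0] == -1 && m[1] == 0 && m[2] == 0 && m[3] == -1);
}
```

Call sites look the same in both configurations; only the scalar
overload does any real work.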

For many uses, clang is able to see through the problem and optimize the
extra work away, but at least one use in the 2pt conical gradient
stages ends up two instructions shorter with this change.

More important than the speedup is the reduced mental burden -- I had to
read the old code quite a few times before I understood why it was
written at all. The new version gives the reader a much better clue
about what's happening.

Change-Id: I1b913431ffac4a9822f377a056a1773674b380d1
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/573878
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
1 file changed