821f5e8dfe66c5770d3ef0bf1ddb4aa6e98b5b90 - platform_external_skia

commit	821f5e8dfe66c5770d3ef0bf1ddb4aa6e98b5b90
author	Mike Klein <mtklein@google.com>	Thu Jun 13 10:56:51 2019 -0500
committer	Skia Commit-Bot <skia-commit-bot@chromium.org>	Thu Jun 13 18:21:44 2019 +0000
tree	468de6e524e881d224d7745fe1f7877eb1f91100
parent	ed8d13089d0ece161e97dc891924b33247c21dc9

remove mul_unorm8/mad_unorm8

I just kind of remembered that if we're doing (xy+x)/256
and x is a destination channel and y is 255-sa, then you
can get the +x for free by multiplying by 256-sa instead.

  (d * (255-sa) + d)
  (d * (255-sa + 1))
  (d * (256-sa)    )

Duh.  This is a trick we play in a lot of legacy code and
I've just now realized it's exactly equivalent to the trick
I want to play here... sigh.
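
To make the identity above concrete, here is a minimal sketch that checks
the two forms agree for every 8-bit destination channel d and source alpha
sa. The helper names (mul_then_add, mul_folded, folded_matches_everywhere)
are made up for illustration and are not the names used in Skia; the /256
is written as a right shift by 8, which is an assumption about how the
division is carried out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the (x*y + x)/256 form, with x = d and y = 255 - sa.
// This costs a multiply followed by an add.
static inline uint32_t mul_then_add(uint32_t d, uint32_t sa) {
    uint32_t y = 255 - sa;
    return (d * y + d) >> 8;
}

// Hypothetical helper: the folded form. Multiplying by 256 - sa
// absorbs the "+ d" term, leaving a single multiply.
static inline uint32_t mul_folded(uint32_t d, uint32_t sa) {
    return (d * (256 - sa)) >> 8;
}

// Exhaustively compare both forms over all 8-bit inputs.
static bool folded_matches_everywhere() {
    for (uint32_t d = 0; d < 256; d++) {
        for (uint32_t sa = 0; sa < 256; sa++) {
            if (mul_then_add(d, sa) != mul_folded(d, sa)) {
                return false;
            }
        }
    }
    return true;
}
```

Since d*(255-sa) + d and d*(256-sa) are the same integer before the shift,
the two forms match bit for bit; the folded version just drops the add.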

Folding this math in kind of makes mul/mad_unorm8 moot.

Speed's getting good:

  I32_SWAR: 0.3  ns/px
  I32     : 0.55 ns/px
  F32     : 0.8  ns/px
  RP      : 0.8  ns/px
  Opts    : 0.2  ns/px

Change-Id: I4d10db51ea80a3258c36e97b6b334ad253804613
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/220708
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>

5 files changed
