Improve performance of unclipped save layers.

Instead of allocating a separate renderTarget and switching between
render targets on each draw, the new implementation follows the same
pattern that the old HWUI renderer used. The area of the layer is
copied to a buffer on the GPU and then cleared, the layer content is
rendered as normal, and finally the copied texture is redrawn using
dst_over blending.
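
For reviewers unfamiliar with the pattern, the copy, clear, render, and
dst_over redraw sequence described above can be sketched on the CPU as
follows. This is only an illustration, not the code in this CL: Pixel,
Rect, Surface, and every function name are hypothetical stand-ins, and
pixels are assumed to be premultiplied RGBA.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical premultiplied RGBA pixel standing in for a GPU texel.
struct Pixel { float r, g, b, a; };

// Hypothetical integer device rect; right and bottom are exclusive.
struct Rect { int left, top, right, bottom; };

// Hypothetical single render target backed by CPU memory.
struct Surface {
    int width, height;
    std::vector<Pixel> pixels;
    Surface(int w, int h, Pixel fill)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}
    Pixel& at(int x, int y) { return pixels[static_cast<std::size_t>(y) * width + x]; }
};

bool nearlyEqual(const Pixel& p, const Pixel& q) {
    const float eps = 1e-6f;
    return std::fabs(p.r - q.r) < eps && std::fabs(p.g - q.g) < eps &&
           std::fabs(p.b - q.b) < eps && std::fabs(p.a - q.a) < eps;
}

// Step 1: copy the layer area into a side buffer (row-major order).
std::vector<Pixel> copyArea(Surface& s, const Rect& r) {
    std::vector<Pixel> saved;
    for (int y = r.top; y < r.bottom; ++y)
        for (int x = r.left; x < r.right; ++x)
            saved.push_back(s.at(x, y));
    return saved;
}

// Step 2: clear the layer area to transparent black.
void clearArea(Surface& s, const Rect& r) {
    for (int y = r.top; y < r.bottom; ++y)
        for (int x = r.left; x < r.right; ++x)
            s.at(x, y) = Pixel{0.f, 0.f, 0.f, 0.f};
}

// Step 3: render layer content as normal (src_over):
// out = src + (1 - src.a) * dst.
void drawSrcOver(Surface& s, const Rect& r, Pixel src) {
    for (int y = r.top; y < r.bottom; ++y)
        for (int x = r.left; x < r.right; ++x) {
            Pixel& d = s.at(x, y);
            const float k = 1.f - src.a;
            d = Pixel{src.r + k * d.r, src.g + k * d.g, src.b + k * d.b, src.a + k * d.a};
        }
}

// Step 4: redraw the saved copy with dst_over:
// out = dst + (1 - dst.a) * src, where src is the saved pixel.
void redrawDstOver(Surface& s, const Rect& r, const std::vector<Pixel>& saved) {
    std::size_t i = 0;
    for (int y = r.top; y < r.bottom; ++y)
        for (int x = r.left; x < r.right; ++x) {
            Pixel& d = s.at(x, y);
            const Pixel& src = saved[i++];
            const float k = 1.f - d.a;
            d = Pixel{d.r + k * src.r, d.g + k * src.g, d.b + k * src.b, d.a + k * src.a};
        }
}

// Runs the whole sequence on one surface; no second target is involved.
Surface runUnclippedLayerDemo() {
    Surface target(4, 4, Pixel{0.f, 0.f, 1.f, 1.f});  // opaque blue background
    const Rect layer{1, 1, 3, 3};
    std::vector<Pixel> saved = copyArea(target, layer);
    clearArea(target, layer);
    // Layer content: 50% red covering only one pixel of the layer.
    drawSrcOver(target, Rect{1, 1, 2, 2}, Pixel{0.5f, 0.f, 0.f, 0.5f});
    redrawDstOver(target, layer, saved);
    return target;
}
```

In this sketch the pixel that received layer content ends up as the new
content composited over the original background, while untouched pixels
inside the layer get their original values back, so the result matches
drawing the layer into a separate target and compositing it afterwards.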

This results in no render target switches and is considerably faster
on most hardware.

This CL also addresses initial bugs where the fading edge effect was
impacting neighboring pixels when the matrix contained fractional
values.

Bug: 129117085
Test: skia unit tests and test cases described in the bug
Change-Id: I9d898faf12fadc2a99d57de513d6a96d42733cdb
8 files changed