Add SSE optimization of FIR float filter

Add an x86 SSE implementation of the FIR filter (float version only),
using the ARM implementation as a template. This improves performance
by a factor of 2 to 2.5 on the Silvermont architecture.

Change-Id: I503ce2bf4cbf10355f5eec3e9d73b364fa701241
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
3 files changed