Add delay, dummy mode, and new options to memcpy-perf benchmark

Add a delay option to purposely hurt bandwidth.
Convert several hardcoded parameters into command-line options (keeping
the existing values as defaults):
    - start/end sizes
    - number of samples to be collected
    - dummy mode
Add a dummy mode that approximately replicates the CPU load without
hitting memory by working out of L2 instead. Possibly useful for
comparison.
All added command-line options are optional, so existing behavior is
preserved when none are given.
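
For illustration only, a minimal sketch of how a per-iteration delay and
a cache-resident dummy mode could wrap the copy loop. The names
CopyOptions, kDummyWindow and run_copies are hypothetical, and both the
64 KiB window and the use of sleep_for for the delay are assumptions
rather than the benchmark's actual code:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical options struct; names and defaults are illustrative only.
struct CopyOptions {
    std::chrono::nanoseconds delay{0};  // 0 means no delay (previous behavior)
    bool dummy = false;                 // true: reuse one small window of the buffers
};

// Assumed window size for dummy mode, picked to fit in a typical L2 cache.
constexpr std::size_t kDummyWindow = 64 * 1024;

// Copies `size` bytes per iteration for `iterations` rounds and returns the
// total number of bytes passed to memcpy.
std::size_t run_copies(std::size_t size, int iterations, const CopyOptions& opt) {
    assert(size > 0 && iterations >= 0);
    std::vector<char> src(size, 1), dst(size, 0);
    const std::size_t window =
        (opt.dummy && size > kDummyWindow) ? kDummyWindow : size;
    std::size_t copied = 0;
    for (int i = 0; i < iterations; ++i) {
        // In dummy mode the same small window is copied repeatedly, so the
        // byte count per iteration is unchanged while the working set is
        // small enough that it likely stays in cache.
        for (std::size_t done = 0; done < size; done += window) {
            const std::size_t n = (size - done < window) ? size - done : window;
            const std::size_t off = opt.dummy ? 0 : done;
            std::memcpy(dst.data() + off, src.data() + off, n);
            copied += n;
        }
        // Optional delay between iterations to lower the achieved bandwidth.
        if (opt.delay.count() > 0) std::this_thread::sleep_for(opt.delay);
    }
    return copied;
}
```

With a default-constructed CopyOptions the loop is a plain memcpy per
iteration, which mirrors the "optional arguments keep existing behavior"
intent above.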

Add a new calculation to the benchmark output that shows the time spent
doing the memory operation vs. the time spent in the delay (if present)
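
As a rough sketch of that calculation, assuming the delay is a fixed
duration applied once per iteration (TimeSplit and split_time are
hypothetical names, not identifiers from the benchmark):

```cpp
#include <cassert>
#include <chrono>

// Hypothetical result record; field names are illustrative only.
struct TimeSplit {
    double copy_fraction;   // share of elapsed time spent in the memory operation
    double delay_fraction;  // share of elapsed time spent in the injected delay
};

// Splits the total elapsed time into memory-operation time and delay time.
// Assumption: the delay is a fixed duration applied once per iteration.
TimeSplit split_time(std::chrono::nanoseconds total,
                     std::chrono::nanoseconds delay_per_iter,
                     long iterations) {
    assert(total.count() > 0 && iterations >= 0);
    const double total_ns = static_cast<double>(total.count());
    double delay_ns = static_cast<double>(delay_per_iter.count()) *
                      static_cast<double>(iterations);
    // Clamp so timer jitter cannot produce a negative copy share.
    if (delay_ns > total_ns) delay_ns = total_ns;
    return {(total_ns - delay_ns) / total_ns, delay_ns / total_ns};
}
```

With no delay configured, delay_per_iter is zero and the whole run is
attributed to the memory operation.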

Change-Id: I11d1f600afa23f49b922417babe6c15cfa058b5d
1 file changed