simpleperf: use multithreading to speed up line annotation.

When converting addresses to line numbers, each binary is converted
separately (by calling llvm-symbolizer). The conversions are independent
of each other, so we can run them in multiple threads to speed up this
step.
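
A minimal sketch of the idea, not the code in this change: the names
addrs_to_lines, convert_all and jobs are hypothetical, and the
llvm-symbolizer invocation (--obj plus addresses as arguments) is one
assumed way to call it. Threads are enough here because each worker
mostly waits on an external process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def addrs_to_lines(binary_path, addrs, symbolizer='llvm-symbolizer'):
    """Hypothetical helper: resolve all addrs of one binary in one run."""
    args = [symbolizer, '--obj=' + binary_path]
    args += ['0x%x' % addr for addr in addrs]
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


def convert_all(addrs_by_binary, convert=addrs_to_lines, jobs=4):
    """Run one conversion per binary, up to `jobs` at a time.

    addrs_by_binary maps a binary path to the list of addresses to
    resolve in it. Returns a dict from binary path to whatever
    `convert` returned for that binary.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Submit every binary first so the conversions overlap.
        futures = {
            path: pool.submit(convert, path, addrs)
            for path, addrs in addrs_by_binary.items()
        }
        # result() re-raises any exception from the worker thread.
        return {path: future.result() for path, future in futures.items()}
```

Passing the converter as a parameter keeps the threading logic separate
from the symbolizer call, so it can be exercised with a stub.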

The speed-up varies from case to case. In an experiment reporting
kernel line annotation, this step went from 5.2s to 2.8s.

Also fix two small errors in do_test.py.

Bug: 187540905
Test: run scripts/test/test.py --only-host-test.
Change-Id: I51ede3c0cbb9ce242cc2305d801ccfb42c130a75
7 files changed