simpleperf: add --per-core option to stat cmd.

When --per-core is used, open a perf event file on each cpu for each
monitored target, and report the event count for each cpu separately.
When reporting, add a cpu entry to each row. Rows are sorted by
(event_count_for_a_thread, event_count_for_a_thread_on_a_cpu) in
decreasing order.
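
The sort order described above can be sketched as follows. This is only
an illustration: the Row struct and SortRows function are hypothetical
names, not simpleperf's actual types, and the tie-break on tid (to keep
one thread's rows together when two threads have equal totals) is an
assumption, not something this change specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical row: one event count for one thread on one cpu.
struct Row {
  int tid;
  int cpu;
  uint64_t count;
};

// Sort by (total count of the thread, count of the thread on this cpu),
// both in decreasing order.
std::vector<Row> SortRows(std::vector<Row> rows) {
  // Sum the per-cpu counts of each thread to get its total.
  std::map<int, uint64_t> thread_total;
  for (const Row& r : rows) thread_total[r.tid] += r.count;

  std::stable_sort(rows.begin(), rows.end(), [&](const Row& a, const Row& b) {
    // Every tid seen in rows was added to thread_total above.
    assert(thread_total.count(a.tid) == 1 && thread_total.count(b.tid) == 1);
    uint64_t ta = thread_total[a.tid];
    uint64_t tb = thread_total[b.tid];
    if (ta != tb) return ta > tb;            // busier thread first
    if (a.tid != b.tid) return a.tid < b.tid;  // assumed tie-break
    return a.count > b.count;                // busier cpu first
  });
  return rows;
}
```

For example, with thread 1 counting 10 on cpu 0 and 30 on cpu 1, and
thread 2 counting 100 on cpu 0 and 5 on cpu 1, thread 2's rows come
first (total 105 vs 40), and within each thread the cpu with the larger
count comes first.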

Also add messages explaining what the percentages in the comment mean
in different situations.
Also add a suggestion when EMFILE happens, which is likely when
`-a --per-thread --per-core` is used.
Also add documentation.

Bug: 148302668
Test: run simpleperf_unit_test.

Change-Id: I9f9adcc26f1308fc5e5cd47611d0fad926792da8
4 files changed