simpleperf: add record read thread.

This change reduces the sample loss rate when recording
DWARF-based call graphs.
It includes the following changes:
1. Add RecordBuffer class to store record data.
2. Add RecordReadThread, which creates a separate high-priority
   thread that reads records from the kernel buffer into a
   RecordBuffer.
3. Truncate stack data in sample records when free space in the
   record buffer is below the low level.
4. Drop sample records when free space in the record buffer is
   below the critical level.
5. Use different record buffer sizes for system-wide profiling
   and non-system-wide profiling.
6. Refactor the code that replaces regs and stack data with
   callchains in SampleRecord.
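
The two free-space thresholds in items 3 and 4 can be sketched roughly
as below. This is only an illustration of the policy, not the code in
this change: SampleAction, BufferLevels and DecideSampleAction are
hypothetical names, and the sketch assumes the critical level is no
higher than the low level.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not the actual simpleperf API.
enum class SampleAction {
  kKeepFull,  // enough free space: store the sample unchanged
  kCutStack,  // free space below low level: store it with stack data cut
  kDrop,      // free space below critical level: discard the sample
};

// Free-space thresholds in bytes (assumed units).
struct BufferLevels {
  size_t low_level;
  size_t critical_level;
};

// Decide what to do with an incoming sample record given the free
// space currently left in the record buffer.
inline SampleAction DecideSampleAction(size_t free_bytes,
                                       const BufferLevels& levels) {
  // Assumption of this sketch: critical_level <= low_level.
  assert(levels.critical_level <= levels.low_level);
  if (free_bytes < levels.critical_level) {
    return SampleAction::kDrop;
  }
  if (free_bytes < levels.low_level) {
    return SampleAction::kCutStack;
  }
  return SampleAction::kKeepFull;
}
```

Checking the critical level first means that a nearly full buffer
drops samples instead of spending time copying truncated ones.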

Results on walleye, with the cpu percentage for profiling set to 50:
$ ./old_simpleperf record -a -g --duration 30 --log debug
simpleperf I cmd_record.cpp:545] Samples recorded: 80524. Samples lost: 22993.
$ ./new_simpleperf record -a -g --duration 30 --log debug
simpleperf I cmd_record.cpp:555] Samples recorded: 99776. Samples lost: 0.

Bug: 110174247
Test: run simpleperf_unit_test.
Test: run simpleperf manually.

Change-Id: I10c8a090abc36e9feb712357cbb20a20b205af14
15 files changed