Merge "simpleperf: update simpleperf prebuilts to build 4445499."
diff --git a/simpleperf/demo/README.md b/simpleperf/demo/README.md
index 2c5f2b3..2a293c5 100644
--- a/simpleperf/demo/README.md
+++ b/simpleperf/demo/README.md
@@ -10,7 +10,7 @@
## Introduction
Simpleperf is a native profiler used on Android platform. It can be used to profile Android
-applications. It's document is at [here](https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md).
+applications. Its documentation is [here](https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md).
Instructions of preparing your Android application for profiling are [here](https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md#Android-application-profiling).
This directory is to show examples of using simpleperf to profile Android applications. The
meaning of each directory is as below:
@@ -22,20 +22,30 @@
It can be downloaded as below:
- $ git clone https://android.googlesource.com/platform/system/extras
- $ cd extras/simpleperf/demo
+```sh
+$ git clone https://android.googlesource.com/platform/system/extras
+$ cd extras/simpleperf/demo
+```
-## Profiling Java application
+The testing environment:
- Android Studio project: SimpleExamplePureJava
- test device: Android O (Google Pixel XL)
- test device: Android N (Google Nexus 5X)
+```
+Android Studio 3.0
+test device: Android O (Google Pixel 2)
+test device: Android N (Google Nexus 6P)
+Please make sure your device runs Android N or later.
+```
+
+## Profile a Java application
+
+Android Studio project: SimpleExamplePureJava
steps:
-1. Build and install app:
-```
+1. Build and install the application:
+
+```sh
# Open SimpleperfExamplesPureJava project with Android Studio,
-# and build this project sucessfully, otherwise the `./gradlew` command below will fail.
+# and build this project successfully, otherwise the `./gradlew` command below will fail.
$ cd SimpleperfExamplePureJava
# On windows, use "gradlew" instead.
@@ -44,32 +54,28 @@
```
2. Record profiling data:
-```
+
+```sh
$ cd ../../scripts/
+# app_profiler.py writes profiling data to perf.data and copies binaries from the device into binary_cache/.
$ python app_profiler.py -p com.example.simpleperf.simpleperfexamplepurejava
```
3. Show profiling data:
-```
-a. show call graph in txt mode
- $ python report.py -g | more
-b. show call graph in gui mode
- $ python report.py -g --gui
-c. show samples in source code
- $ python annotate.py -s ../demo/SimpleperfExamplePureJava
- $ find annotated_files -name "MainActivity.java"
- check the annoated source file MainActivity.java.
+
+```sh
+# report_html.py writes the profiling report to report.html.
+$ python report_html.py --add_source_code --source_dirs ../demo --add_disassembly
```
-## Profiling Java/C++ application
+## Profile a Java/C++ application
- Android Studio project: SimpleExampleWithNative
- test device: Android O (Google Pixel XL)
- test device: Android N (Google Nexus 5X)
+Android Studio project: SimpleExampleWithNative
steps:
-1. Build and install app:
-```
+1. Build and install the application:
+
+```sh
# Open SimpleperfExamplesWithNative project with Android Studio,
# and build this project successfully, otherwise the `./gradlew` command below will fail.
$ cd SimpleperfExampleWithNative
@@ -80,33 +86,28 @@
```
2. Record profiling data:
-```
+
+```sh
$ cd ../../scripts/
+# app_profiler.py writes profiling data to perf.data and copies binaries from the device into binary_cache/.
$ python app_profiler.py -p com.example.simpleperf.simpleperfexamplewithnative
- It runs the application and collects profiling data in perf.data, binaries on device in binary_cache/.
```
3. Show profiling data:
-```
-a. show call graph in txt mode
- $ python report.py -g | more
-b. show call graph in gui mode
- $ python report.py -g --gui
-c. show samples in source code
- $ python annotate.py -s ../demo/SimpleperfExampleWithNative
- $ find annotated_files -name "native-lib.cpp"
- check the annoated source file native-lib.cpp.
+
+```sh
+# report_html.py writes the profiling report to report.html.
+$ python report_html.py --add_source_code --source_dirs ../demo --add_disassembly
```
-## Profiling Kotlin application
+## Profile a Kotlin application
- Android Studio project: SimpleExampleOfKotlin
- test device: Android O (Google Pixel XL)
- test device: Android N (Google Nexus 5X)
+Android Studio project: SimpleExampleOfKotlin
steps:
-1. Build and install app:
-```
+1. Build and install the application:
+
+```sh
# Open SimpleperfExamplesOfKotlin project with Android Studio,
# and build this project successfully, otherwise the `./gradlew` command below will fail.
$ cd SimpleperfExampleOfKotlin
@@ -117,19 +118,16 @@
```
2. Record profiling data:
-```
+
+```sh
$ cd ../../scripts/
+# app_profiler.py writes profiling data to perf.data and copies binaries from the device into binary_cache/.
$ python app_profiler.py -p com.example.simpleperf.simpleperfexampleofkotlin
- It runs the application and collects profiling data in perf.data, binaries on device in binary_cache/.
```
3. Show profiling data:
-```
-a. show call graph in txt mode
- $ python report.py -g | more
-b. show call graph in gui mode
- $ python report.py -g --gui
-c. show samples in source code
- $ python annotate.py -s ../demo/SimpleperfExampleOfKotlin
- $ find . -name "MainActivity.kt"
+
+```sh
+# report_html.py generates profiling result in report.html.
+$ python report_html.py --add_source_code --source_dirs ../demo --add_disassembly
```
diff --git a/simpleperf/doc/README.md b/simpleperf/doc/README.md
index db58485..a82eecc 100644
--- a/simpleperf/doc/README.md
+++ b/simpleperf/doc/README.md
@@ -15,12 +15,6 @@
- [Simpleperf introduction](#simpleperf-introduction)
- [Why simpleperf](#why-simpleperf)
- [Tools in simpleperf](#tools-in-simpleperf)
- - [Simpleperf's profiling principle](#simpleperfs-profiling-principle)
- - [Main simpleperf commands](#main-simpleperf-commands)
- - [Simpleperf list](#simpleperf-list)
- - [Simpleperf stat](#simpleperf-stat)
- - [Simpleperf record](#simpleperf-record)
- - [Simpleperf report](#simpleperf-report)
- [Android application profiling](#android-application-profiling)
- [Prepare an Android application](#prepare-an-android-application)
- [Record and report profiling data (using command-lines)](#record-and-report-profiling-data-using-commandlines)
@@ -30,6 +24,13 @@
- [Annotate source code](#annotate-source-code)
- [Trace offcpu time](#trace-offcpu-time)
- [Profile from launch of an application](#profile-from-launch-of-an-application)
+- [Executable commands reference](#executable-commands-reference)
+ - [Simpleperf's profiling principle](#simpleperfs-profiling-principle)
+ - [Main simpleperf commands](#main-simpleperf-commands)
+ - [Simpleperf list](#simpleperf-list)
+ - [Simpleperf stat](#simpleperf-stat)
+ - [Simpleperf record](#simpleperf-record)
+ - [Simpleperf report](#simpleperf-report)
- [Answers to common issues](#answers-to-common-issues)
- [Why we suggest profiling on android >= N devices](#why-we-suggest-profiling-on-android-n-devices)
@@ -110,361 +111,6 @@
`simpleperf_report_lib.py` provides a python interface for parsing profiling data.
-### Simpleperf's profiling principle
-
-Modern CPUs have a hardware component called the performance monitoring unit
-(PMU). The PMU has several hardware counters, counting events like how many cpu
-cycles have happened, how many instructions have executed, or how many cache
-misses have happened.
-
-The Linux kernel wraps these hardware counters into hardware perf events. In
-addition, the Linux kernel also provides hardware independent software events
-and tracepoint events. The Linux kernel exposes all this to userspace via the
-perf_event_open system call, which simpleperf uses.
-
-Simpleperf has three main functions: stat, record and report.
-
-The stat command gives a summary of how many events have happened in the
-profiled processes in a time period. Here’s how it works:
-1. Given user options, simpleperf enables profiling by making a system call to
-linux kernel.
-2. Linux kernel enables counters while scheduling on the profiled processes.
-3. After profiling, simpleperf reads counters from linux kernel, and reports a
-counter summary.
-
-The record command records samples of the profiled process in a time period.
-Here’s how it works:
-1. Given user options, simpleperf enables profiling by making a system call to
-linux kernel.
-2. Simpleperf creates mapped buffers between simpleperf and linux kernel.
-3. Linux kernel enable counters while scheduling on the profiled processes.
-4. Each time a given number of events happen, linux kernel dumps a sample to a
-mapped buffer.
-5. Simpleperf reads samples from the mapped buffers and generates perf.data.
-
-The report command reads a "perf.data" file and any shared libraries used by
-the profiled processes, and outputs a report showing where the time was spent.
-
-
-### Main simpleperf commands
-
-Simpleperf supports several subcommands, including list, stat, record and report.
-Each subcommand supports different options. This section only covers the most
-important subcommands and options. To see all subcommands and options,
-use --help.
-
- # List all subcommands.
- $ simpleperf --help
-
- # Print help message for record subcommand.
- $ simpleperf record --help
-
-
-#### Simpleperf list
-
-simpleperf list is used to list all events available on the device. Different
-devices may support different events because of differences in hardware and
-kernel.
-
- $ simpleperf list
- List of hw-cache events:
- branch-loads
- ...
- List of hardware events:
- cpu-cycles
- instructions
- ...
- List of software events:
- cpu-clock
- task-clock
- ...
-
-
-#### Simpleperf stat
-
-simpleperf stat is used to get a raw event counter information of the profiled program
-or system-wide. By passing options, we can select which events to use, which
-processes/threads to monitor, how long to monitor and the print interval.
-Below is an example.
-
- # Stat using default events (cpu-cycles,instructions,...), and monitor
- # process 7394 for 10 seconds.
- $ simpleperf stat -p 7394 --duration 10
- Performance counter statistics:
-
- 1,320,496,145 cpu-cycles # 0.131736 GHz (100%)
- 510,426,028 instructions # 2.587047 cycles per instruction (100%)
- 4,692,338 branch-misses # 468.118 K/sec (100%)
- 886.008130(ms) task-clock # 0.088390 cpus used (100%)
- 753 context-switches # 75.121 /sec (100%)
- 870 page-faults # 86.793 /sec (100%)
-
- Total test time: 10.023829 seconds.
-
-**Select events**
-We can select which events to use via -e option. Below are examples:
-
- # Stat event cpu-cycles.
- $ simpleperf stat -e cpu-cycles -p 11904 --duration 10
-
- # Stat event cache-references and cache-misses.
- $ simpleperf stat -e cache-references,cache-misses -p 11904 --duration 10
-
-When running the stat command, if the number of hardware events is larger than
-the number of hardware counters available in the PMU, the kernel shares hardware
-counters between events, so each event is only monitored for part of the total
-time. In the example below, there is a percentage at the end of each row,
-showing the percentage of the total time that each event was actually monitored.
-
- # Stat using event cache-references, cache-references:u,....
- $ simpleperf stat -p 7394 -e cache-references,cache-references:u,cache-references:k,cache-misses,cache-misses:u,cache-misses:k,instructions --duration 1
- Performance counter statistics:
-
- 4,331,018 cache-references # 4.861 M/sec (87%)
- 3,064,089 cache-references:u # 3.439 M/sec (87%)
- 1,364,959 cache-references:k # 1.532 M/sec (87%)
- 91,721 cache-misses # 102.918 K/sec (87%)
- 45,735 cache-misses:u # 51.327 K/sec (87%)
- 38,447 cache-misses:k # 43.131 K/sec (87%)
- 9,688,515 instructions # 10.561 M/sec (89%)
-
- Total test time: 1.026802 seconds.
-
-In the example above, each event is monitored about 87% of the total time. But
-there is no guarantee that any pair of events are always monitored at the same
-time. If we want to have some events monitored at the same time, we can use
---group option. Below is an example.
-
- # Stat using event cache-references, cache-references:u,....
- $ simpleperf stat -p 7394 --group cache-references,cache-misses --group cache-references:u,cache-misses:u --group cache-references:k,cache-misses:k -e instructions --duration 1
- Performance counter statistics:
-
- 3,638,900 cache-references # 4.786 M/sec (74%)
- 65,171 cache-misses # 1.790953% miss rate (74%)
- 2,390,433 cache-references:u # 3.153 M/sec (74%)
- 32,280 cache-misses:u # 1.350383% miss rate (74%)
- 879,035 cache-references:k # 1.251 M/sec (68%)
- 30,303 cache-misses:k # 3.447303% miss rate (68%)
- 8,921,161 instructions # 10.070 M/sec (86%)
-
- Total test time: 1.029843 seconds.
-
-**Select target to monitor**
-We can select which processes or threads to monitor via -p option or -t option.
-Monitoring a process is the same as monitoring all threads in the process.
-Simpleperf can also fork a child process to run the new command and then monitor
-the child process. Below are examples.
-
- # Stat process 11904 and 11905.
- $ simpleperf stat -p 11904,11905 --duration 10
-
- # Stat thread 11904 and 11905.
- $ simpleperf stat -t 11904,11905 --duration 10
-
- # Start a child process running `ls`, and stat it.
- $ simpleperf stat ls
-
-**Decide how long to monitor**
-When monitoring existing threads, we can use --duration option to decide how long
-to monitor. When monitoring a child process running a new command, simpleperf
-monitors until the child process ends. In this case, we can use Ctrl-C to stop monitoring
-at any time. Below are examples.
-
- # Stat process 11904 for 10 seconds.
- $ simpleperf stat -p 11904 --duration 10
-
- # Stat until the child process running `ls` finishes.
- $ simpleperf stat ls
-
- # Stop monitoring using Ctrl-C.
- $ simpleperf stat -p 11904 --duration 10
- ^C
-
-**Decide the print interval**
-When monitoring perf counters, we can also use --interval option to decide the print
-interval. Below are examples.
-
- # Print stat for process 11904 every 300ms.
- $ simpleperf stat -p 11904 --duration 10 --interval 300
-
- # Print system wide stat at interval of 300ms for 10 seconds (rooted device only).
- # system wide profiling needs root privilege
- $ su 0 simpleperf stat -a --duration 10 --interval 300
-
-**Display counters in systrace**
-simpleperf can also work with systrace to dump counters in the collected trace.
-Below is an example to do a system wide stat
-
- # capture instructions (kernel only) and cache misses with interval of 300 milliseconds for 15 seconds
- $ su 0 simpleperf stat -e instructions:k,cache-misses -a --interval 300 --duration 15
- # on host launch systrace to collect trace for 10 seconds
- (HOST)$ external/chromium-trace/systrace.py --time=10 -o new.html sched gfx view
- # open the collected new.html in browser and perf counters will be shown up
-
-
-#### Simpleperf record
-
-simpleperf record is used to dump records of the profiled program. By passing
-options, we can select which events to use, which processes/threads to monitor,
-what frequency to dump records, how long to monitor, and where to store records.
-
- # Record on process 7394 for 10 seconds, using default event (cpu-cycles),
- # using default sample frequency (4000 samples per second), writing records
- # to perf.data.
- $ simpleperf record -p 7394 --duration 10
- simpleperf I 07-11 21:44:11 17522 17522 cmd_record.cpp:316] Samples recorded: 21430. Samples lost: 0.
-
-**Select events**
-In most cases, the cpu-cycles event is used to evaluate consumed cpu time.
-As a hardware event, it is both accurate and efficient. We can also use other
-events via -e option. Below is an example.
-
- # Record using event instructions.
- $ simpleperf record -e instructions -p 11904 --duration 10
-
-**Select target to monitor**
-The way to select target in record command is similar to that in stat command.
-Below are examples.
-
- # Record process 11904 and 11905.
- $ simpleperf record -p 11904,11905 --duration 10
-
- # Record thread 11904 and 11905.
- $ simpleperf record -t 11904,11905 --duration 10
-
- # Record a child process running `ls`.
- $ simpleperf record ls
-
-**Set the frequency to record**
-We can set the frequency to dump records via the -f or -c options. For example,
--f 4000 means dumping approximately 4000 records every second when the monitored
-thread runs. If a monitored thread runs 0.2s in one second (it can be preempted
-or blocked in other times), simpleperf dumps about 4000 * 0.2 / 1.0 = 800
-records every second. Another way is using -c option. For example, -c 10000
-means dumping one record whenever 10000 events happen. Below are examples.
-
- # Record with sample frequency 1000: sample 1000 times every second running.
- $ simpleperf record -f 1000 -p 11904,11905 --duration 10
-
- # Record with sample period 100000: sample 1 time every 100000 events.
- $ simpleperf record -c 100000 -t 11904,11905 --duration 10
-
-**Decide how long to monitor**
-The way to decide how long to monitor in record command is similar to that in
-stat command. Below are examples.
-
- # Record process 11904 for 10 seconds.
- $ simpleperf record -p 11904 --duration 10
-
- # Record until the child process running `ls` finishes.
- $ simpleperf record ls
-
- # Stop monitoring using Ctrl-C.
- $ simpleperf record -p 11904 --duration 10
- ^C
-
-**Set the path to store records**
-By default, simpleperf stores records in perf.data in current directory. We can
-use -o option to set the path to store records. Below is an example.
-
- # Write records to data/perf2.data.
- $ simpleperf record -p 11904 -o data/perf2.data --duration 10
-
-
-#### Simpleperf report
-
-simpleperf report is used to report based on perf.data generated by simpleperf
-record command. Report command groups records into different sample entries,
-sorts sample entries based on how many events each sample entry contains, and
-prints out each sample entry. By passing options, we can select where to find
-perf.data and executable binaries used by the monitored program, filter out
-uninteresting records, and decide how to group records.
-
-Below is an example. Records are grouped into 4 sample entries, each entry is
-a row. There are several columns, each column shows piece of information
-belonging to a sample entry. The first column is Overhead, which shows the
-percentage of events inside current sample entry in total events. As the
-perf event is cpu-cycles, the overhead can be seen as the percentage of cpu
-time used in each function.
-
- # Reports perf.data, using only records sampled in libsudo-game-jni.so,
- # grouping records using thread name(comm), process id(pid), thread id(tid),
- # function name(symbol), and showing sample count for each row.
- $ simpleperf report --dsos /data/app/com.example.sudogame-2/lib/arm64/libsudo-game-jni.so --sort comm,pid,tid,symbol -n
- Cmdline: /data/data/com.example.sudogame/simpleperf record -p 7394 --duration 10
- Arch: arm64
- Event: cpu-cycles (type 0, config 0)
- Samples: 28235
- Event count: 546356211
-
- Overhead Sample Command Pid Tid Symbol
- 59.25% 16680 sudogame 7394 7394 checkValid(Board const&, int, int)
- 20.42% 5620 sudogame 7394 7394 canFindSolution_r(Board&, int, int)
- 13.82% 4088 sudogame 7394 7394 randomBlock_r(Board&, int, int, int, int, int)
- 6.24% 1756 sudogame 7394 7394 @plt
-
-**Set the path to read records**
-By default, simpleperf reads perf.data in current directory. We can use -i
-option to select another file to read records.
-
- $ simpleperf report -i data/perf2.data
-
-**Set the path to find executable binaries**
-If reporting function symbols, simpleperf needs to read executable binaries
-used by the monitored processes to get symbol table and debug information. By
-default, the paths are the executable binaries used by monitored processes while
-recording. However, these binaries may not exist when reporting or not contain
-symbol table and debug information. So we can use --symfs to redirect the paths.
-Below is an example.
-
- $ simpleperf report
- # In this case, when simpleperf wants to read executable binary /A/b,
- # it reads file in /A/b.
-
- $ simpleperf report --symfs /debug_dir
- # In this case, when simpleperf wants to read executable binary /A/b,
- # it prefers file in /debug_dir/A/b to file in /A/b.
-
-**Filter records**
-When reporting, it happens that not all records are of interest. Simpleperf
-supports five filters to select records of interest. Below are examples.
-
- # Report records in threads having name sudogame.
- $ simpleperf report --comms sudogame
-
- # Report records in process 7394 or 7395
- $ simpleperf report --pids 7394,7395
-
- # Report records in thread 7394 or 7395.
- $ simpleperf report --tids 7394,7395
-
- # Report records in libsudo-game-jni.so.
- $ simpleperf report --dsos /data/app/com.example.sudogame-2/lib/arm64/libsudo-game-jni.so
-
- # Report records in function checkValid or canFindSolution_r.
- $ simpleperf report --symbols "checkValid(Board const&, int, int);canFindSolution_r(Board&, int, int)"
-
-**Decide how to group records into sample entries**
-Simpleperf uses --sort option to decide how to group sample entries. Below are
-examples.
-
- # Group records based on their process id: records having the same process
- # id are in the same sample entry.
- $ simpleperf report --sort pid
-
- # Group records based on their thread id and thread comm: records having
- # the same thread id and thread name are in the same sample entry.
- $ simpleperf report --sort tid,comm
-
- # Group records based on their binary and function: records in the same
- # binary and function are in the same sample entry.
- $ simpleperf report --sort dso,symbol
-
- # Default option: --sort comm,pid,tid,dso,symbol. Group records in the same
- # thread, and belong to the same function in the same binary.
- $ simpleperf report
-
-
## Android application profiling
This section shows how to profile an Android application.
@@ -889,6 +535,362 @@
-a .MainActivity --arch arm64 -r "-g -e cpu-cycles:u --duration 1" \
--profile_from_launch
+## Executable commands reference
+
+### Simpleperf's profiling principle
+
+Modern CPUs have a hardware component called the performance monitoring unit
+(PMU). The PMU has several hardware counters, counting events like how many cpu
+cycles have happened, how many instructions have executed, or how many cache
+misses have happened.
+
+The Linux kernel wraps these hardware counters into hardware perf events. In
+addition, the Linux kernel also provides hardware independent software events
+and tracepoint events. The Linux kernel exposes all this to userspace via the
+perf_event_open system call, which simpleperf uses.
+
+Simpleperf has three main functions: stat, record and report.
+
+The stat command gives a summary of how many events have happened in the
+profiled processes in a time period. Here’s how it works:
+1. Given user options, simpleperf enables profiling by making a system call to
+the Linux kernel.
+2. The Linux kernel enables counters while the profiled processes are scheduled.
+3. After profiling, simpleperf reads the counters from the Linux kernel and reports a
+counter summary.
+
+The record command records samples of the profiled process in a time period.
+Here’s how it works:
+1. Given user options, simpleperf enables profiling by making a system call to
+the Linux kernel.
+2. Simpleperf creates mapped buffers between simpleperf and the Linux kernel.
+3. The Linux kernel enables counters while the profiled processes are scheduled.
+4. Each time a given number of events happen, the Linux kernel dumps a sample to a
+mapped buffer.
+5. Simpleperf reads samples from the mapped buffers and generates perf.data.
+
+The report command reads a "perf.data" file and any shared libraries used by
+the profiled processes, and outputs a report showing where the time was spent.
+
+
+### Main simpleperf commands
+
+Simpleperf supports several subcommands, including list, stat, record and report.
+Each subcommand supports different options. This section only covers the most
+important subcommands and options. To see all subcommands and options,
+use --help.
+
+ # List all subcommands.
+ $ simpleperf --help
+
+ # Print help message for record subcommand.
+ $ simpleperf record --help
+
+
+#### Simpleperf list
+
+simpleperf list is used to list all events available on the device. Different
+devices may support different events because of differences in hardware and
+kernel.
+
+ $ simpleperf list
+ List of hw-cache events:
+ branch-loads
+ ...
+ List of hardware events:
+ cpu-cycles
+ instructions
+ ...
+ List of software events:
+ cpu-clock
+ task-clock
+ ...
+
+
+#### Simpleperf stat
+
+simpleperf stat is used to get raw event counter information for the profiled program
+or system-wide. By passing options, we can select which events to use, which
+processes/threads to monitor, how long to monitor and the print interval.
+Below is an example.
+
+ # Stat using default events (cpu-cycles,instructions,...), and monitor
+ # process 7394 for 10 seconds.
+ $ simpleperf stat -p 7394 --duration 10
+ Performance counter statistics:
+
+ 1,320,496,145 cpu-cycles # 0.131736 GHz (100%)
+ 510,426,028 instructions # 2.587047 cycles per instruction (100%)
+ 4,692,338 branch-misses # 468.118 K/sec (100%)
+ 886.008130(ms) task-clock # 0.088390 cpus used (100%)
+ 753 context-switches # 75.121 /sec (100%)
+ 870 page-faults # 86.793 /sec (100%)
+
+ Total test time: 10.023829 seconds.
+
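The derived figures in the comment column can be reproduced from the raw counters. Below is a minimal Python sketch, not simpleperf's own code. It assumes, based only on the sample output above, that GHz is cpu-cycles divided by the total test time, that cycles per instruction is cpu-cycles divided by instructions, and that cpus used is task-clock divided by the total test time. The function name `derive_stat_metrics` is hypothetical.

```python
def derive_stat_metrics(cpu_cycles, instructions, task_clock_ms, total_time_s):
    """Recompute the comment-column values of a `simpleperf stat` summary.

    The formulas are assumptions inferred from the sample output, not taken
    from simpleperf's source code.
    """
    return {
        # Average clock rate over the whole monitoring window.
        "ghz": cpu_cycles / total_time_s / 1e9,
        # Cycles spent per retired instruction.
        "cycles_per_instruction": cpu_cycles / instructions,
        # Fraction of one CPU kept busy by the monitored process.
        "cpus_used": (task_clock_ms / 1000.0) / total_time_s,
    }


# Raw counters copied from the sample output above.
metrics = derive_stat_metrics(
    cpu_cycles=1_320_496_145,
    instructions=510_426_028,
    task_clock_ms=886.008130,
    total_time_s=10.023829,
)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

Under these assumptions the three values agree with the sample output to six decimal places, which suggests the process was running on a CPU for roughly 9% of the 10 second window.
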
+**Select events**
+We can select which events to use via -e option. Below are examples:
+
+ # Stat event cpu-cycles.
+ $ simpleperf stat -e cpu-cycles -p 11904 --duration 10
+
+ # Stat event cache-references and cache-misses.
+ $ simpleperf stat -e cache-references,cache-misses -p 11904 --duration 10
+
+When running the stat command, if the number of hardware events is larger than
+the number of hardware counters available in the PMU, the kernel shares hardware
+counters between events, so each event is only monitored for part of the total
+time. In the example below, there is a percentage at the end of each row,
+showing the percentage of the total time that each event was actually monitored.
+
+ # Stat using event cache-references, cache-references:u,....
+ $ simpleperf stat -p 7394 -e cache-references,cache-references:u,cache-references:k,cache-misses,cache-misses:u,cache-misses:k,instructions --duration 1
+ Performance counter statistics:
+
+ 4,331,018 cache-references # 4.861 M/sec (87%)
+ 3,064,089 cache-references:u # 3.439 M/sec (87%)
+ 1,364,959 cache-references:k # 1.532 M/sec (87%)
+ 91,721 cache-misses # 102.918 K/sec (87%)
+ 45,735 cache-misses:u # 51.327 K/sec (87%)
+ 38,447 cache-misses:k # 43.131 K/sec (87%)
+ 9,688,515 instructions # 10.561 M/sec (89%)
+
+ Total test time: 1.026802 seconds.
+
+In the example above, each event is monitored about 87% of the total time. But
+there is no guarantee that any pair of events is always monitored at the same
+time. If we want some events to be monitored at the same time, we can use the
+--group option. Below is an example.
+
+ # Stat using event cache-references, cache-references:u,....
+ $ simpleperf stat -p 7394 --group cache-references,cache-misses --group cache-references:u,cache-misses:u --group cache-references:k,cache-misses:k -e instructions --duration 1
+ Performance counter statistics:
+
+ 3,638,900 cache-references # 4.786 M/sec (74%)
+ 65,171 cache-misses # 1.790953% miss rate (74%)
+ 2,390,433 cache-references:u # 3.153 M/sec (74%)
+ 32,280 cache-misses:u # 1.350383% miss rate (74%)
+ 879,035 cache-references:k # 1.251 M/sec (68%)
+ 30,303 cache-misses:k # 3.447303% miss rate (68%)
+ 8,921,161 instructions # 10.070 M/sec (86%)
+
+ Total test time: 1.029843 seconds.
+
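Events in the same group are counted over the same time windows, so their ratio is meaningful. The miss rate printed next to each cache-misses row appears to be cache-misses divided by the cache-references of the same group. This is an assumption inferred from the numbers above, and the helper name `miss_rate_percent` is hypothetical. A short Python check:

```python
def miss_rate_percent(cache_misses, cache_references):
    """Return cache misses as a percentage of cache references."""
    if cache_references == 0:
        raise ValueError("cache_references must be non-zero")
    return 100.0 * cache_misses / cache_references


# (cache-misses, cache-references) pairs copied from the grouped output above.
groups = {
    "all": (65_171, 3_638_900),
    "user (:u)": (32_280, 2_390_433),
    "kernel (:k)": (30_303, 879_035),
}
for label, (misses, refs) in groups.items():
    print(f"{label}: {miss_rate_percent(misses, refs):.6f}% miss rate")
```

The same division applied to the ungrouped example would be less reliable, because there the two counters are not guaranteed to have been active at the same time.
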
+**Select target to monitor**
+We can select which processes or threads to monitor via -p option or -t option.
+Monitoring a process is the same as monitoring all threads in the process.
+Simpleperf can also fork a child process to run the new command and then monitor
+the child process. Below are examples.
+
+ # Stat process 11904 and 11905.
+ $ simpleperf stat -p 11904,11905 --duration 10
+
+ # Stat thread 11904 and 11905.
+ $ simpleperf stat -t 11904,11905 --duration 10
+
+ # Start a child process running `ls`, and stat it.
+ $ simpleperf stat ls
+
+**Decide how long to monitor**
+When monitoring existing threads, we can use --duration option to decide how long
+to monitor. When monitoring a child process running a new command, simpleperf
+monitors until the child process ends. In this case, we can use Ctrl-C to stop monitoring
+at any time. Below are examples.
+
+ # Stat process 11904 for 10 seconds.
+ $ simpleperf stat -p 11904 --duration 10
+
+ # Stat until the child process running `ls` finishes.
+ $ simpleperf stat ls
+
+ # Stop monitoring using Ctrl-C.
+ $ simpleperf stat -p 11904 --duration 10
+ ^C
+
+**Decide the print interval**
+When monitoring perf counters, we can also use --interval option to decide the print
+interval. Below are examples.
+
+ # Print stat for process 11904 every 300ms.
+ $ simpleperf stat -p 11904 --duration 10 --interval 300
+
+ # Print system wide stat at interval of 300ms for 10 seconds (rooted device only).
+ # system wide profiling needs root privilege
+ $ su 0 simpleperf stat -a --duration 10 --interval 300
+
+**Display counters in systrace**
+simpleperf can also work with systrace to dump counters in the collected trace.
+Below is an example of a system-wide stat.
+
+ # capture instructions (kernel only) and cache misses with interval of 300 milliseconds for 15 seconds
+ $ su 0 simpleperf stat -e instructions:k,cache-misses -a --interval 300 --duration 15
+ # on host launch systrace to collect trace for 10 seconds
+ (HOST)$ external/chromium-trace/systrace.py --time=10 -o new.html sched gfx view
+ # open the collected new.html in a browser; the perf counters are shown in the trace
+
+
+#### Simpleperf record
+
+simpleperf record is used to dump records of the profiled program. By passing
+options, we can select which events to use, which processes/threads to monitor,
+the frequency at which to dump records, how long to monitor, and where to store records.
+
+ # Record on process 7394 for 10 seconds, using default event (cpu-cycles),
+ # using default sample frequency (4000 samples per second), writing records
+ # to perf.data.
+ $ simpleperf record -p 7394 --duration 10
+ simpleperf I 07-11 21:44:11 17522 17522 cmd_record.cpp:316] Samples recorded: 21430. Samples lost: 0.
+
+**Select events**
+In most cases, the cpu-cycles event is used to evaluate consumed cpu time.
+As a hardware event, it is both accurate and efficient. We can also use other
+events via -e option. Below is an example.
+
+ # Record using event instructions.
+ $ simpleperf record -e instructions -p 11904 --duration 10
+
+**Select target to monitor**
+The way to select target in record command is similar to that in stat command.
+Below are examples.
+
+ # Record process 11904 and 11905.
+ $ simpleperf record -p 11904,11905 --duration 10
+
+ # Record thread 11904 and 11905.
+ $ simpleperf record -t 11904,11905 --duration 10
+
+ # Record a child process running `ls`.
+ $ simpleperf record ls
+
+**Set the frequency to record**
+We can set the frequency to dump records via the -f or -c options. For example,
+-f 4000 means dumping approximately 4000 records every second while the monitored
+thread runs. If a monitored thread runs for 0.2s of each second (it can be preempted
+or blocked at other times), simpleperf dumps about 4000 * 0.2 / 1.0 = 800
+records every second. Another way is to use the -c option. For example, -c 10000
+means dumping one record whenever 10000 events happen. Below are examples.
+
+ # Record with sample frequency 1000: sample 1000 times every second running.
+ $ simpleperf record -f 1000 -p 11904,11905 --duration 10
+
+ # Record with sample period 100000: sample 1 time every 100000 events.
+ $ simpleperf record -c 100000 -t 11904,11905 --duration 10
+
+**Decide how long to monitor**
+The monitoring duration is set in the record command in the same way as in the
+stat command. Below are examples.
+
+ # Record process 11904 for 10 seconds.
+ $ simpleperf record -p 11904 --duration 10
+
+ # Record until the child process running `ls` finishes.
+ $ simpleperf record ls
+
+ # Stop monitoring early using Ctrl-C, before the 10 seconds elapse.
+ $ simpleperf record -p 11904 --duration 10
+ ^C
+
+**Set the path to store records**
+By default, simpleperf stores records in perf.data in the current directory. We
+can use the -o option to set the path to store records. Below is an example.
+
+ # Write records to data/perf2.data.
+ $ simpleperf record -p 11904 -o data/perf2.data --duration 10
+
+
+#### Simpleperf report
+
+The simpleperf report command produces a report from the perf.data file
+generated by the simpleperf record command. It groups records into sample
+entries, sorts the sample entries by how many events each one contains, and
+prints out each sample entry. By passing options, we can select where to find
+perf.data and the executable binaries used by the monitored program, filter out
+uninteresting records, and decide how to group records.
+
+Below is an example. Records are grouped into 4 sample entries, and each entry
+is a row. There are several columns, and each column shows a piece of
+information belonging to a sample entry. The first column is Overhead, which
+shows the events in the current sample entry as a percentage of total events.
+As the perf event is cpu-cycles, the overhead can be seen as the percentage of
+CPU time used in each function.
+
+ # Report perf.data, using only records sampled in libsudo-game-jni.so,
+ # grouping records by thread name (comm), process id (pid), thread id (tid),
+ # and function name (symbol), and showing the sample count for each row.
+ $ simpleperf report --dsos /data/app/com.example.sudogame-2/lib/arm64/libsudo-game-jni.so --sort comm,pid,tid,symbol -n
+ Cmdline: /data/data/com.example.sudogame/simpleperf record -p 7394 --duration 10
+ Arch: arm64
+ Event: cpu-cycles (type 0, config 0)
+ Samples: 28235
+ Event count: 546356211
+
+ Overhead Sample Command Pid Tid Symbol
+ 59.25% 16680 sudogame 7394 7394 checkValid(Board const&, int, int)
+ 20.42% 5620 sudogame 7394 7394 canFindSolution_r(Board&, int, int)
+ 13.82% 4088 sudogame 7394 7394 randomBlock_r(Board&, int, int, int, int, int)
+ 6.24% 1756 sudogame 7394 7394 @plt
+
+**Set the path to read records**
+By default, simpleperf reads perf.data in the current directory. We can use the
+-i option to select another file to read records from.
+
+ $ simpleperf report -i data/perf2.data
+
+**Set the path to find executable binaries**
+To report function symbols, simpleperf needs to read the executable binaries
+used by the monitored processes to get their symbol tables and debug
+information. By default, it uses the paths of the executable binaries that the
+monitored processes used while recording. However, these binaries may not exist
+when reporting, or may not contain symbol tables and debug information. In that
+case, we can use --symfs to redirect the paths. Below is an example.
+
+ $ simpleperf report
+ # In this case, when simpleperf wants to read executable binary /A/b,
+ # it reads the file at /A/b.
+
+ $ simpleperf report --symfs /debug_dir
+ # In this case, when simpleperf wants to read executable binary /A/b,
+ # it prefers the file at /debug_dir/A/b to the file at /A/b.
+
+**Filter records**
+When reporting, not all records may be of interest. Simpleperf supports five
+filters to select the records of interest. Below are examples.
+
+ # Report records in threads having name sudogame.
+ $ simpleperf report --comms sudogame
+
+ # Report records in process 7394 or 7395.
+ $ simpleperf report --pids 7394,7395
+
+ # Report records in thread 7394 or 7395.
+ $ simpleperf report --tids 7394,7395
+
+ # Report records in libsudo-game-jni.so.
+ $ simpleperf report --dsos /data/app/com.example.sudogame-2/lib/arm64/libsudo-game-jni.so
+
+ # Report records in function checkValid or canFindSolution_r.
+ $ simpleperf report --symbols "checkValid(Board const&, int, int);canFindSolution_r(Board&, int, int)"
+
+**Decide how to group records into sample entries**
+Simpleperf uses the --sort option to decide how to group records into sample
+entries. Below are examples.
+
+ # Group records based on their process id: records having the same process
+ # id are in the same sample entry.
+ $ simpleperf report --sort pid
+
+ # Group records based on their thread id and thread name (comm): records
+ # having the same thread id and thread name are in the same sample entry.
+ $ simpleperf report --sort tid,comm
+
+ # Group records based on their binary and function: records in the same
+ # binary and function are in the same sample entry.
+ $ simpleperf report --sort dso,symbol
+
+ # Default option: --sort comm,pid,tid,dso,symbol. Group records that are in
+ # the same thread and belong to the same function in the same binary.
+ $ simpleperf report
+
## Answers to common issues