===========================
LLVM Branch Weight Metadata
===========================

.. contents::
   :local:

Introduction
============

Branch Weight Metadata represents branch weights as the likelihood of each
branch being taken (see :doc:`BlockFrequencyTerminology`). The metadata is
assigned to an ``Instruction`` that is a terminator, as an ``MDNode`` of the
``MD_prof`` kind. The first operand is always an ``MDString`` node with the
string "branch_weights". The number of operands depends on the terminator type.

Branch weights might be fetched from a profiling file, or generated based on
the `__builtin_expect`_ instruction.

All weights are represented as unsigned 32-bit values, where a higher value
indicates a greater chance of being taken.

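Because weights are relative, one way to read them is as proportions of their
sum. The sketch below illustrates that reading only: ``edgeProbabilities`` is a
hypothetical helper rather than an LLVM API, and the uniform fallback for
all-zero weights is an assumption made for the example.

.. code-block:: c++

  #include <cstdint>
  #include <numeric>
  #include <vector>

  // Hypothetical helper, not an LLVM API: turns the weight operands of one
  // terminator into probabilities by dividing each weight by their sum.
  std::vector<double> edgeProbabilities(const std::vector<uint32_t> &Weights) {
    // Accumulate in 64 bits so that several large 32-bit weights cannot
    // overflow the sum.
    uint64_t Sum = std::accumulate(Weights.begin(), Weights.end(), uint64_t{0});
    std::vector<double> Probs;
    Probs.reserve(Weights.size());
    for (uint32_t W : Weights) {
      // Assumption for this sketch: all-zero weights carry no information,
      // so every edge is treated as equally likely.
      Probs.push_back(Sum == 0 ? 1.0 / Weights.size()
                               : static_cast<double>(W) / Sum);
    }
    return Probs;
  }

With the weights ``{3, 1}`` on a conditional branch, this reading gives the
true edge three quarters of the probability and the false edge one quarter.
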
Supported Instructions
======================

``BranchInst``
^^^^^^^^^^^^^^

Metadata is assigned only to conditional branches. There are two extra
operands: the weight of the true branch and the weight of the false branch.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <TRUE_BRANCH_WEIGHT>,
    i32 <FALSE_BRANCH_WEIGHT>
  }

``SwitchInst``
^^^^^^^^^^^^^^

Branch weights are assigned to every case (including the ``default`` case,
which is always case #0).

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <DEFAULT_BRANCH_WEIGHT>
    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
  }

``IndirectBrInst``
^^^^^^^^^^^^^^^^^^

Branch weights are assigned to every destination.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <LABEL_BRANCH_WEIGHT>
    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
  }

``CallInst``
^^^^^^^^^^^^

Calls may have branch weight metadata containing the execution count of
the call. It is currently used in SamplePGO mode only, to augment the
block and entry counts, which may not be accurate with sampling.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <CALL_BRANCH_WEIGHT>
  }

Other
^^^^^

Other terminator instructions are not allowed to contain Branch Weight Metadata.

.. _\__builtin_expect:

Built-in ``expect`` Instructions
================================

The ``__builtin_expect(long exp, long c)`` instruction provides branch
prediction information. The return value is the value of ``exp``.

It is especially useful in conditional statements. Currently Clang supports two
conditional statements:

``if`` statement
^^^^^^^^^^^^^^^^

The ``exp`` parameter is the condition. The ``c`` parameter is the expected
comparison value. If it is equal to 1 (true), the condition is likely to be
true; otherwise the condition is likely to be false. For example:

.. code-block:: c++

  if (__builtin_expect(x > 0, 1)) {
    // This block is likely to be taken.
  }

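The opposite hint, together with the pass-through return value, can be sketched
as follows. The function names are hypothetical, and the code assumes a
compiler that provides ``__builtin_expect``, such as Clang or GCC.

.. code-block:: c++

  #include <cstddef>

  // Hypothetical example: sums an array, treating a null pointer as the rare
  // error path.
  long sumOrZero(const int *Data, std::size_t Size) {
    // The second argument 0 states that "Data == nullptr" is expected to be
    // false, so the early return is the unlikely branch.
    if (__builtin_expect(Data == nullptr, 0))
      return 0;

    long Sum = 0;
    for (std::size_t I = 0; I != Size; ++I)
      Sum += Data[I];
    return Sum;
  }

  // __builtin_expect returns its first argument unchanged; the hint does not
  // alter the value seen by the surrounding expression.
  long passThrough(long Value) { return __builtin_expect(Value, 1); }

The hint only describes which outcome is expected. Both branches still behave
the same way when the expectation turns out to be wrong.
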
``switch`` statement
^^^^^^^^^^^^^^^^^^^^

The ``exp`` parameter is the value. The ``c`` parameter is the expected
value. If the expected value does not appear in the list of cases, the
``default`` case is assumed to be the likely one.

.. code-block:: c++

  switch (__builtin_expect(x, 5)) {
  default: break;
  case 0:  // ...
  case 3:  // ...
  case 5:  // This case is likely to be taken.
  }

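The ``default`` rule can be illustrated with an expected value that matches no
case label. This is a minimal sketch with a hypothetical function name, again
assuming a compiler that provides ``__builtin_expect``.

.. code-block:: c++

  // Hypothetical example: 7 is not one of the case labels, so the hint marks
  // the default case as the likely one.
  int classify(long X) {
    switch (__builtin_expect(X, 7)) {
    case 0:
      return 10;
    case 3:
      return 20;
    default:
      return 30;
    }
  }

Any value other than 0 or 3 reaches the ``default`` case, not only the expected
value 7.
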
CFG Modifications
=================

Branch Weight Metadata is not proof against CFG changes. If the operands of a
terminator are changed, some action should be taken to keep the metadata
consistent. Otherwise, misoptimizations may occur due to incorrect branch
prediction information.

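One such action is to permute the weights together with the successors. The
sketch below models this for a conditional branch whose targets are swapped.
``CondBranchModel`` and ``swapSuccessors`` are hypothetical stand-ins written
for this example, not LLVM classes or functions.

.. code-block:: c++

  #include <cstdint>
  #include <string>
  #include <utility>

  // Simplified stand-in for a conditional branch and its two weight operands.
  // These types are hypothetical and are not LLVM classes.
  struct CondBranchModel {
    std::string TrueTarget;
    std::string FalseTarget;
    uint32_t TrueWeight;
    uint32_t FalseWeight;
  };

  // Swaps the two successors, as a pass might do after inverting the
  // condition. The weights are swapped in the same step so that each weight
  // keeps describing the same destination block.
  void swapSuccessors(CondBranchModel &Br) {
    std::swap(Br.TrueTarget, Br.FalseTarget);
    std::swap(Br.TrueWeight, Br.FalseWeight);
  }

If only the targets were swapped, the large weight would end up attached to the
rarely taken block, which is the kind of stale information the paragraph above
warns about.
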
Function Entry Counts
=====================

To allow comparing different functions during inter-procedural analysis and
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
The first operand is a string indicating the name of the associated counter.

Currently, one counter is supported: "function_entry_count". The second operand
is a 64-bit counter that indicates the number of times that this function was
invoked (in the case of instrumentation-based profiles). In the case of
sampling-based profiles, this operand is an approximation of how many times
the function was invoked.

For example, in the code below, the instrumentation for function foo()
indicates that it was called 2,590 times at runtime.

.. code-block:: llvm

  define i32 @foo() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590}

If "function_entry_count" has more than 2 operands, the later operands are
the GUIDs of the functions that need to be imported by ThinLTO. These operands
are only set by sampling-based profiles. They are needed because the
sampling-based profile was collected on a binary that had already imported and
inlined these functions, and we need to ensure the IR matches in the ThinLTO
backends for profile annotation. We cannot annotate this on the call site
because that can only go down 1 level in the call chain. For a chain such as
foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc(), we need to go down 2 levels
in the call chain to import both bar_in_b_cc and baz_in_c_cc.