Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 1 | =========================== |
| 2 | LLVM Branch Weight Metadata |
| 3 | =========================== |
| 4 | |
| 5 | .. contents:: |
| 6 | :local: |
| 7 | |
| 8 | Introduction |
| 9 | ============ |
| 10 | |
Duncan P. N. Exon Smith | 23a6033 | 2014-04-11 23:21:07 +0000 | [diff] [blame] | 11 | Branch Weight Metadata represents branch weights as its likeliness to be taken |
Chandler Carruth | d95ef31 | 2018-10-18 07:40:24 +0000 | [diff] [blame] | 12 | (see :doc:`BlockFrequencyTerminology`). Metadata is assigned to an |
| 13 | ``Instruction`` that is a terminator as a ``MDNode`` of the ``MD_prof`` kind. |
| 14 | The first operator is always a ``MDString`` node with the string |
| 15 | "branch_weights". Number of operators depends on the terminator type. |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 16 | |
| 17 | Branch weights might be fetch from the profiling file, or generated based on |
| 18 | `__builtin_expect`_ instruction. |
| 19 | |
| 20 | All weights are represented as an unsigned 32-bit values, where higher value |
| 21 | indicates greater chance to be taken. |
| 22 | |
| 23 | Supported Instructions |
| 24 | ====================== |
| 25 | |
| 26 | ``BranchInst`` |
| 27 | ^^^^^^^^^^^^^^ |
| 28 | |
John Criswell | 288bf1c | 2012-12-07 19:21:10 +0000 | [diff] [blame] | 29 | Metadata is only assigned to the conditional branches. There are two extra |
Bruce Mitchener | 767c34a | 2015-09-12 01:17:08 +0000 | [diff] [blame] | 30 | operands for the true and the false branch. |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 31 | |
Aaron Ballman | 9871423 | 2016-07-19 23:50:11 +0000 | [diff] [blame] | 32 | .. code-block:: none |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 33 | |
| 34 | !0 = metadata !{ |
| 35 | metadata !"branch_weights", |
| 36 | i32 <TRUE_BRANCH_WEIGHT>, |
| 37 | i32 <FALSE_BRANCH_WEIGHT> |
| 38 | } |
| 39 | |
| 40 | ``SwitchInst`` |
| 41 | ^^^^^^^^^^^^^^ |
| 42 | |
John Criswell | 288bf1c | 2012-12-07 19:21:10 +0000 | [diff] [blame] | 43 | Branch weights are assigned to every case (including the ``default`` case which |
| 44 | is always case #0). |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 45 | |
Aaron Ballman | 9871423 | 2016-07-19 23:50:11 +0000 | [diff] [blame] | 46 | .. code-block:: none |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 47 | |
| 48 | !0 = metadata !{ |
| 49 | metadata !"branch_weights", |
| 50 | i32 <DEFAULT_BRANCH_WEIGHT> |
| 51 | [ , i32 <CASE_BRANCH_WEIGHT> ... ] |
| 52 | } |
| 53 | |
| 54 | ``IndirectBrInst`` |
| 55 | ^^^^^^^^^^^^^^^^^^ |
| 56 | |
John Criswell | 288bf1c | 2012-12-07 19:21:10 +0000 | [diff] [blame] | 57 | Branch weights are assigned to every destination. |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 58 | |
Aaron Ballman | 9871423 | 2016-07-19 23:50:11 +0000 | [diff] [blame] | 59 | .. code-block:: none |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 60 | |
| 61 | !0 = metadata !{ |
| 62 | metadata !"branch_weights", |
| 63 | i32 <LABEL_BRANCH_WEIGHT> |
| 64 | [ , i32 <LABEL_BRANCH_WEIGHT> ... ] |
| 65 | } |
| 66 | |
Teresa Johnson | e396018 | 2017-06-15 15:57:12 +0000 | [diff] [blame] | 67 | ``CallInst`` |
| 68 | ^^^^^^^^^^^^^^^^^^ |
| 69 | |
| 70 | Calls may have branch weight metadata, containing the execution count of |
| 71 | the call. It is currently used in SamplePGO mode only, to augment the |
| 72 | block and entry counts which may not be accurate with sampling. |
| 73 | |
| 74 | .. code-block:: none |
| 75 | |
| 76 | !0 = metadata !{ |
| 77 | metadata !"branch_weights", |
| 78 | i32 <CALL_BRANCH_WEIGHT> |
| 79 | } |
| 80 | |
Bill Wendling | 707f2fd | 2012-06-20 10:17:46 +0000 | [diff] [blame] | 81 | Other |
| 82 | ^^^^^ |
| 83 | |
| 84 | Other terminator instructions are not allowed to contain Branch Weight Metadata. |
| 85 | |
| 86 | .. _\__builtin_expect: |
| 87 | |
| 88 | Built-in ``expect`` Instructions |
| 89 | ================================ |
| 90 | |
| 91 | ``__builtin_expect(long exp, long c)`` instruction provides branch prediction |
| 92 | information. The return value is the value of ``exp``. |
| 93 | |
| 94 | It is especially useful in conditional statements. Currently Clang supports two |
| 95 | conditional statements: |
| 96 | |
| 97 | ``if`` statement |
| 98 | ^^^^^^^^^^^^^^^^ |
| 99 | |
| 100 | The ``exp`` parameter is the condition. The ``c`` parameter is the expected |
| 101 | comparison value. If it is equal to 1 (true), the condition is likely to be |
| 102 | true, in other case condition is likely to be false. For example: |
| 103 | |
| 104 | .. code-block:: c++ |
| 105 | |
| 106 | if (__builtin_expect(x > 0, 1)) { |
| 107 | // This block is likely to be taken. |
| 108 | } |
| 109 | |
| 110 | ``switch`` statement |
| 111 | ^^^^^^^^^^^^^^^^^^^^ |
| 112 | |
| 113 | The ``exp`` parameter is the value. The ``c`` parameter is the expected |
| 114 | value. If the expected value doesn't show on the cases list, the ``default`` |
| 115 | case is assumed to be likely taken. |
| 116 | |
| 117 | .. code-block:: c++ |
| 118 | |
| 119 | switch (__builtin_expect(x, 5)) { |
| 120 | default: break; |
| 121 | case 0: // ... |
| 122 | case 3: // ... |
| 123 | case 5: // This case is likely to be taken. |
| 124 | } |
| 125 | |
| 126 | CFG Modifications |
| 127 | ================= |
| 128 | |
| 129 | Branch Weight Metatada is not proof against CFG changes. If terminator operands' |
| 130 | are changed some action should be taken. In other case some misoptimizations may |
Bruce Mitchener | 767c34a | 2015-09-12 01:17:08 +0000 | [diff] [blame] | 131 | occur due to incorrect branch prediction information. |
Diego Novillo | a3bccce | 2015-05-13 15:13:45 +0000 | [diff] [blame] | 132 | |
| 133 | Function Entry Counts |
| 134 | ===================== |
| 135 | |
Bruce Mitchener | 767c34a | 2015-09-12 01:17:08 +0000 | [diff] [blame] | 136 | To allow comparing different functions during inter-procedural analysis and |
Diego Novillo | a3bccce | 2015-05-13 15:13:45 +0000 | [diff] [blame] | 137 | optimization, ``MD_prof`` nodes can also be assigned to a function definition. |
| 138 | The first operand is a string indicating the name of the associated counter. |
| 139 | |
Dehao Chen | e26c421 | 2017-02-28 18:09:44 +0000 | [diff] [blame] | 140 | Currently, one counter is supported: "function_entry_count". The second operand |
| 141 | is a 64-bit counter that indicates the number of times that this function was |
| 142 | invoked (in the case of instrumentation-based profiles). In the case of |
| 143 | sampling-based profiles, this operand is an approximation of how many times |
| 144 | the function was invoked. |
Diego Novillo | a3bccce | 2015-05-13 15:13:45 +0000 | [diff] [blame] | 145 | |
| 146 | For example, in the code below, the instrumentation for function foo() |
| 147 | indicates that it was called 2,590 times at runtime. |
| 148 | |
| 149 | .. code-block:: llvm |
| 150 | |
| 151 | define i32 @foo() !prof !1 { |
| 152 | ret i32 0 |
| 153 | } |
| 154 | !1 = !{!"function_entry_count", i64 2590} |
Dehao Chen | e26c421 | 2017-02-28 18:09:44 +0000 | [diff] [blame] | 155 | |
| 156 | If "function_entry_count" has more than 2 operands, the later operands are |
| 157 | the GUID of the functions that needs to be imported by ThinLTO. This is only |
| 158 | set by sampling based profile. It is needed because the sampling based profile |
| 159 | was collected on a binary that had already imported and inlined these functions, |
| 160 | and we need to ensure the IR matches in the ThinLTO backends for profile |
| 161 | annotation. The reason why we cannot annotate this on the callsite is that it |
| 162 | can only goes down 1 level in the call chain. For the cases where |
| 163 | foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc(), we will need to go down 2 levels |
| 164 | in the call chain to import both bar_in_b_cc and baz_in_c_cc. |