=====================================
Garbage Collection Safepoints in LLVM
=====================================

.. contents::
   :local:
   :depth: 2

Status
=======

This document describes a set of extensions to LLVM to support garbage
collection. By now, these mechanisms are well proven: a commercial Java
implementation with a fully relocating collector has shipped using them.
There are a couple of places where bugs might still linger; these are called
out below.

These mechanisms are still listed as "experimental" to indicate that no
forward or backward compatibility guarantees are offered across versions. If
your use case is such that you need some form of forward compatibility
guarantee, please raise the issue on the llvm-dev mailing list.

LLVM still supports an alternate mechanism for conservative garbage collection
support using the ``gcroot`` intrinsic. The ``gcroot`` mechanism is mostly of
historical interest at this point with one exception: its implementation of
shadow stacks has been used successfully by a number of language frontends and
is still supported.

Overview & Core Concepts
========================

To collect dead objects, garbage collectors must be able to identify
any references to objects contained within executing code, and,
depending on the collector, potentially update them. The collector
does not need this information at all points in code - that would make
the problem much harder - but only at well-defined points in the
execution known as 'safepoints'. For most collectors, it is sufficient
to track at least one copy of each unique pointer value. However, for
a collector which wishes to relocate objects directly reachable from
running code, a higher standard is required.

One additional challenge is that the compiler may compute intermediate
results ("derived pointers") which point outside of the allocation or
even into the middle of another allocation. The eventual use of this
intermediate value must yield an address within the bounds of the
allocation, but such "exterior derived pointers" may be visible to the
collector. Given this, a garbage collector cannot safely rely on the
runtime value of an address to indicate the object it is associated
with. If the garbage collector wishes to move any object, the
compiler must provide a mapping, for each pointer, to an indication of
its allocation.

To simplify the interaction between a collector and the compiled code,
most garbage collectors are organized in terms of three abstractions:
load barriers, store barriers, and safepoints.

#. A load barrier is a bit of code executed immediately after the
   machine load instruction, but before any use of the value loaded.
   Depending on the collector, such a barrier may be needed for all
   loads, merely loads of a particular type (in the original source
   language), or none at all.

#. Analogously, a store barrier is a code fragment that runs
   immediately before the machine store instruction, but after the
   computation of the value stored. The most common use of a store
   barrier is to update a 'card table' in a generational garbage
   collector.

#. A safepoint is a location at which pointers visible to the compiled
   code (i.e. currently in registers or on the stack) are allowed to
   change. After the safepoint completes, the actual pointer value
   may differ, but the 'object' (as seen by the source language)
   pointed to will not.

   Note that the term 'safepoint' is somewhat overloaded. It refers to
   both the location at which the machine state is parsable and the
   coordination protocol involved in bringing application threads to a
   point at which the collector can safely use that information. The
   term "statepoint" as used in this document refers exclusively to the
   former.

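As an illustration of the store barrier described above, here is a minimal
C++ sketch of a card-marking barrier. The ``Heap`` type, the
``store_with_barrier`` helper, and the 512 byte card size are assumptions made
for this example only; they are not part of LLVM or of any particular
collector.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Assumption for this sketch: one card byte covers 512 bytes of heap.
  constexpr std::size_t kCardShift = 9;

  struct Heap {
    std::vector<std::uintptr_t> words;  // heap storage, one pointer-sized slot each
    std::vector<std::uint8_t> cards;    // 1 = some slot on this card was written

    explicit Heap(std::size_t num_words)
        : words(num_words, 0),
          cards(((num_words * sizeof(std::uintptr_t)) >> kCardShift) + 1, 0) {}

    // Store barrier followed by the store itself: the card covering the
    // destination slot is marked dirty, then the value is written.
    void store_with_barrier(std::size_t slot, std::uintptr_t value) {
      std::size_t byte_offset = slot * sizeof(std::uintptr_t);
      cards[byte_offset >> kCardShift] = 1;  // barrier: runs before the store
      words[slot] = value;                   // the actual store
    }
  };

A generational collector built along these lines would typically scan only the
dirty cards when looking for references into the young generation, rather than
the whole heap.
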
This document focuses on the last item - compiler support for
safepoints in generated code. We will assume that an outside
mechanism has decided where to place safepoints. From our
perspective, all safepoints will be function calls. To support
relocation of objects directly reachable from values in compiled code,
the collector must be able to:

#. identify every copy of a pointer (including copies introduced by
   the compiler itself) at the safepoint,
#. identify which object each pointer relates to, and
#. potentially update each of those copies.

This document describes the mechanism by which an LLVM based compiler
can provide this information to a language runtime/collector, and
ensure that all pointers can be read and updated if desired.

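The three requirements listed above can be sketched with a toy frame. In the
C++ below, ``relocate_roots``, the list of live slot indices (standing in for
the information a stack map would provide), and the forwarding table are all
hypothetical names chosen for illustration; a real collector and runtime would
look quite different.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <unordered_map>
  #include <vector>

  using Address = std::uintptr_t;

  // Update every live pointer copy in one frame after objects have moved.
  //   slots      - raw frame contents (registers and stack slots, flattened)
  //   live_slots - indices of the slots holding GC pointers at this safepoint
  //   forwarding - old object address -> new object address, for moved objects
  void relocate_roots(std::vector<Address>& slots,
                      const std::vector<std::size_t>& live_slots,
                      const std::unordered_map<Address, Address>& forwarding) {
    for (std::size_t index : live_slots) {
      auto moved = forwarding.find(slots[index]);
      if (moved != forwarding.end())
        slots[index] = moved->second;  // rewrite this copy in place
    }
  }

Only the slots named in ``live_slots`` are rewritten. A slot that merely
happens to hold an integer equal to an object address is left alone, which is
why the collector needs a precise description of the frame rather than a guess
based on runtime values.
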
Abstract Machine Model
^^^^^^^^^^^^^^^^^^^^^^^

At a high level, LLVM has been extended to support compiling to an abstract
machine which extends the actual target with a non-integral pointer type
suitable for representing a garbage collected reference to an object. In
particular, such non-integral pointer types have no defined mapping to an
integer representation. This semantic quirk allows the runtime to pick an
integer mapping for each point in the program, allowing relocations of objects
without visible effects.

This high level abstract machine model is used for most of the optimizer. As
a result, transform passes do not need to be extended to look through explicit
relocation sequences. Before starting code generation, we switch
representations to an explicit form. The exact location chosen for lowering
is an implementation detail.

Note that most of the value of the abstract machine model is for collectors
which need to model potentially relocatable objects. For a compiler which
supports only a non-relocating collector, you may wish to consider starting
with the fully explicit form.

Warning: There is one currently known semantic hole in the definition of
non-integral pointers which has not been addressed upstream. To work around
this, you need to disable speculation of loads unless the memory type
(non-integral pointer vs anything else) is known to be unchanged. That is, it
is not safe to speculate a load if doing so causes a non-integral pointer value
to be loaded as any other type or vice versa. In practice, this restriction is
well isolated to ``isSafeToSpeculativelyExecute`` in ValueTracking.cpp.

Explicit Representation
^^^^^^^^^^^^^^^^^^^^^^^

A frontend could directly generate this low level explicit form, but
doing so may inhibit optimization. Instead, it is recommended that
compilers with relocating collectors target the abstract machine model just
described.

The heart of the explicit approach is to construct (or rewrite) the IR in a
manner where the possible updates performed by the garbage collector are
explicitly visible in the IR. Doing so requires that we:

#. create a new SSA value for each potentially relocated pointer, and
   ensure that no use of the original (non-relocated) value is
   reachable after the safepoint,
#. specify the relocation in a way which is opaque to the compiler to
   ensure that the optimizer cannot introduce new uses of an
   unrelocated value after a statepoint, which prevents the optimizer
   from performing unsound optimizations, and
#. record a mapping of live pointers (and the allocation they're
   associated with) for each statepoint.

At the most abstract level, inserting a safepoint can be thought of as
replacing a call instruction with a call to a multiple return value
function which calls the original target of the call, returns
its result, and returns updated values for any live pointers to
garbage collected objects.

Note that the task of identifying all live pointers to garbage
collected values, transforming the IR to expose a pointer giving the
base object for every such live pointer, and inserting all the
intrinsics correctly is explicitly out of scope for this document.
The recommended approach is to use the :ref:`utility passes
<statepoint-utilities>` described below.

This abstract function call is concretely represented by a sequence of
intrinsic calls known collectively as a "statepoint relocation sequence".

Let's consider a simple call in LLVM IR:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    call void ()* @foo()
    ret i8 addrspace(1)* %obj
  }

Depending on our language we may need to allow a safepoint during the execution
of ``foo``. If so, we need to let the collector update local values in the
current frame. If we don't, we'll be accessing a potentially invalid reference
once we eventually return from the call.

In this example, we need to relocate the SSA value ``%obj``. Since we can't
actually change the value in the SSA value ``%obj``, we need to introduce a new
SSA value ``%obj.relocated`` which represents the potentially changed value of
``%obj`` after the safepoint and update any following uses appropriately. The
resulting relocation sequence is:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 7, i32 7)
    ret i8 addrspace(1)* %obj.relocated
  }

Ideally, this sequence would have been represented as an M argument, N
return value function (where M is the number of values being
relocated + the original call arguments and N is the original return
value + each relocated value), but LLVM does not easily support such a
representation.

Instead, the statepoint intrinsic marks the actual site of the
safepoint or statepoint. The statepoint returns a token value (which
exists only at compile time). To get back the original return value
of the call, we use the ``gc.result`` intrinsic. To get the relocation
of each pointer in turn, we use the ``gc.relocate`` intrinsic with the
appropriate index. Note that both the ``gc.relocate`` and ``gc.result`` are
tied to the statepoint. The combination forms a "statepoint relocation
sequence" and represents the entirety of a parseable call or 'statepoint'.

When lowered, this example would generate the following x86 assembly:

.. code-block:: gas

    .globl  test1
    .align  16, 0x90
    pushq   %rax
    callq   foo
  .Ltmp1:
    movq    (%rsp), %rax  # This load is redundant (oops!)
    popq    %rdx
    retq

Each of the potentially relocated values has been spilled to the
stack, and that location has been recorded in the
:ref:`Stack Map section <stackmap-section>`. If the garbage collector
needs to update any of these pointers during the call, it knows
exactly what to change.

The relevant parts of the StackMap section for our example are:

.. code-block:: gas

  # This describes the call site
  # Stack Maps: callsite 2882400000
    .quad   2882400000
    .long   .Ltmp1-test1
    .short  0
  # .. 8 entries skipped ..
  # This entry describes the spill slot which is directly addressable
  # off RSP with offset 0. Given the value was spilled with a pushq,
  # that makes sense.
  # Stack Maps: Loc 8: Direct RSP [encoding: .byte 2, .byte 8, .short 7, .int 0]
    .byte   2
    .byte   8
    .short  7
    .long   0

This example was taken from the tests for the :ref:`RewriteStatepointsForGC`
utility pass. As such, its full StackMap can be easily examined with the
following command.

.. code-block:: bash

  opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps

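To make the location entry in the StackMap excerpt above more concrete, the
following C++ sketch decodes one such 8 byte record. The ``Location`` struct,
``decode_location`` and ``slot_address`` are hypothetical helpers written for
this example. The field layout simply mirrors the four directives shown in the
excerpt, little-endian byte order is assumed (as on x86-64), and the second
byte is kept opaque because its meaning is not stated here.

.. code-block:: cpp

  #include <cassert>
  #include <cstdint>

  // One location record, mirroring `.byte, .byte, .short, .long` above.
  struct Location {
    std::uint8_t kind;         // 2 is annotated as "Direct" in the excerpt
    std::uint8_t second_byte;  // emitted as `.byte 8`; not interpreted here
    std::uint16_t dwarf_reg;   // 7 is annotated as RSP in the excerpt
    std::int32_t offset;       // signed byte offset from the register
  };

  // Decode a record from raw bytes. Multi-byte fields are assembled
  // explicitly as little-endian, independent of the host byte order.
  Location decode_location(const std::uint8_t* p) {
    Location loc;
    loc.kind = p[0];
    loc.second_byte = p[1];
    loc.dwarf_reg = static_cast<std::uint16_t>(p[2] | (p[3] << 8));
    std::uint32_t raw = static_cast<std::uint32_t>(p[4]) |
                        (static_cast<std::uint32_t>(p[5]) << 8) |
                        (static_cast<std::uint32_t>(p[6]) << 16) |
                        (static_cast<std::uint32_t>(p[7]) << 24);
    loc.offset = static_cast<std::int32_t>(raw);
    return loc;
  }

  // Following the excerpt's reading of a "Direct" entry, the spill slot is
  // found at the register value plus the recorded offset.
  std::uint64_t slot_address(const Location& loc, std::uint64_t reg_value) {
    return reg_value + static_cast<std::uint64_t>(
                           static_cast<std::int64_t>(loc.offset));
  }

For the record shown above (bytes ``2, 8, 7, 0, 0, 0, 0, 0``), this yields
kind 2, register 7 and offset 0, so the slot address equals the value of RSP
at the call site.
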
Simplifications for Non-Relocating GCs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some of the complexity in the previous example is unnecessary for a
non-relocating collector. While a non-relocating collector still needs the
information about which locations contain live references, it doesn't need to
represent explicit relocations. As such, the previously described explicit
lowering can be simplified to remove all of the ``gc.relocate`` intrinsic
calls and leave uses in terms of the original reference value.

Here's the explicit lowering for the previous example for a non-relocating
collector:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    ret i8 addrspace(1)* %obj
  }

Recording On Stack Regions
^^^^^^^^^^^^^^^^^^^^^^^^^^

In addition to the explicit relocation form previously described, the
statepoint infrastructure also allows the listing of allocas within the gc
pointer list. Allocas can be listed with or without additional explicit gc
pointer values and relocations.

An alloca in the gc region of the statepoint operand list will cause the
address of the stack region to be listed in the stackmap for the statepoint.

This mechanism can be used to describe explicit spill slots if desired. It
then becomes the generator's responsibility to ensure that values are
spilled/filled to/from the alloca as needed on either side of the safepoint.
Note that there is no way to indicate a corresponding base pointer for such
an explicitly specified spill slot, so usage is restricted to values for
which the associated collector can derive the object base from the pointer
itself.

This mechanism can also be used to describe on-stack objects containing
references, provided that the collector can map from the location on the
stack to a heap map describing the internal layout of the references the
collector needs to process.

WARNING: At the moment, this alternate form is not well exercised. It is
recommended to use this with caution and expect to have to fix a few bugs.
In particular, the RewriteStatepointsForGC utility pass does not do
anything for allocas today.

Base & Derived Pointers
^^^^^^^^^^^^^^^^^^^^^^^

A "base pointer" is one which points to the starting address of an allocation
(object). A "derived pointer" is one which is offset from a base pointer by
some amount. When relocating objects, a garbage collector needs to be able
to relocate each derived pointer associated with an allocation to the same
offset from the new address.

"Interior derived pointers" remain within the bounds of the allocation
they're associated with. As a result, the base object can be found at
runtime provided the bounds of allocations are known to the runtime system.

"Exterior derived pointers" are outside the bounds of the associated object;
they may even fall within *another* allocation's address range. As a result,
there is no way for a garbage collector to determine which allocation they
are associated with at runtime and compiler support is needed.

The ``gc.relocate`` intrinsic supports an explicit operand for describing the
allocation associated with a derived pointer. This operand is frequently
referred to as the base operand. Strictly speaking, it does not have to be
a base pointer, but it does need to lie within the bounds of the associated
allocation. Some collectors may require that the operand be an actual base
pointer rather than merely an interior derived pointer. Note that during
lowering both the base and derived pointer operands are required to be live
over the associated call safepoint even if the base is otherwise unused
afterwards.

If we extend our previous example to include a pointless derived pointer,
we get:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    %gep = getelementptr i8, i8 addrspace(1)* %obj, i64 20000
    %token = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj, i8 addrspace(1)* %gep)
    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 7, i32 7)
    %gep.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 7, i32 8)
    %p = getelementptr i8, i8 addrspace(1)* %gep.relocated, i64 -20000
    ret i8 addrspace(1)* %p
  }

Note that in this example ``%p`` and ``%obj.relocated`` are the same address
and we could replace one with the other, potentially removing the derived
pointer from the live set at the safepoint entirely.

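The arithmetic a relocating collector performs for a derived pointer can be
sketched in a few lines of C++. The names ``relocate_derived``, ``Allocation``
and ``contains`` are hypothetical and exist only for this illustration. The
sketch also shows why an exterior derived pointer cannot be attributed to its
object by address range alone.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  using Address = std::uintptr_t;

  // Relocate a derived pointer given the old and new address of its base.
  // The offset uses wrapping unsigned arithmetic, so the result is also
  // correct for exterior derived pointers, including ones before the base.
  Address relocate_derived(Address old_base, Address new_base,
                           Address old_derived) {
    Address offset = old_derived - old_base;
    return new_base + offset;
  }

  // A toy allocation record, as a runtime might keep for address lookups.
  struct Allocation {
    Address start;
    std::size_t size;
  };

  // True if p lies within [start, start + size).
  bool contains(const Allocation& a, Address p) {
    return p - a.start < a.size;
  }

With a 64 byte object at ``0x10000`` and an unrelated object occupying
``0x14000`` to ``0x15000``, the pointer ``0x10000 + 20000`` lands inside the
second object. A range lookup would pick the wrong allocation, whereas the
explicit base operand lets the collector compute the correct new address.
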
.. _gc_transition_args:

GC Transitions
^^^^^^^^^^^^^^^^^^

As a practical consideration, many garbage-collected systems allow code that is
collector-aware ("managed code") to call code that is not collector-aware
("unmanaged code"). It is common that such calls must also be safepoints, since
it is desirable to allow the collector to run during the execution of
unmanaged code. Furthermore, it is common that coordinating the transition from
managed to unmanaged code requires extra code generation at the call site to
inform the collector of the transition. In order to support these needs, a
statepoint may be marked as a GC transition, and data that is necessary to
perform the transition (if any) may be provided as additional arguments to the
statepoint.

Note that although in many cases statepoints may be inferred to be GC
transitions based on the function symbols involved (e.g. a call from a
function with GC strategy "foo" to a function with GC strategy "bar"),
indirect calls that are also GC transitions must also be supported. This
requirement is the driving force behind the decision to require that GC
transitions are explicitly marked.

Let's revisit the sample given above, this time treating the call to ``@foo``
as a GC transition. Depending on our target, the transition code may need to
access some extra state in order to inform the collector of the transition.
Let's assume a hypothetical GC (somewhat unimaginatively named
"hypothetical-gc") that requires a TLS variable to be written before and after
a call to unmanaged code. The resulting relocation sequence is:

.. code-block:: llvm

  @Flag = thread_local global i32 0, align 4

  define i8 addrspace(1)* @test1(i8 addrspace(1) *%obj)
         gc "hypothetical-gc" {

    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 1, i32 1, i32* @Flag, i32 0, i8 addrspace(1)* %obj)
    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 8, i32 8)
    ret i8 addrspace(1)* %obj.relocated
  }

During lowering, this will result in an instruction selection DAG that looks
something like:

::

  CALLSEQ_START
  ...
  GC_TRANSITION_START (lowered i32 *@Flag), SRCVALUE i32 *Flag
  STATEPOINT
  GC_TRANSITION_END (lowered i32 *@Flag), SRCVALUE i32 *Flag
  ...
  CALLSEQ_END

In order to generate the necessary transition code, the backend for each target
supported by "hypothetical-gc" must be modified to lower ``GC_TRANSITION_START``
and ``GC_TRANSITION_END`` nodes appropriately when the "hypothetical-gc"
strategy is in use for a particular function. Assuming that such lowering has
been added for X86, the generated assembly would be:

.. code-block:: gas

    .globl  test1
    .align  16, 0x90
    pushq   %rax
    movl    $1, %fs:Flag@TPOFF
    callq   foo
    movl    $0, %fs:Flag@TPOFF
  .Ltmp1:
    movq    (%rsp), %rax  # This load is redundant (oops!)
    popq    %rdx
    retq

Note that the design as presented above is not fully implemented: in particular,
strategy-specific lowering is not present, and all GC transitions are emitted as
a single no-op before and after the call instruction. These no-ops are often
removed by the backend during dead machine instruction elimination.


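At the source level, the effect of the hypothetical transition described above
resembles a scoped guard around the unmanaged call. The following C++ sketch
is only an analogy: ``in_unmanaged_code``, ``GCTransition``,
``unmanaged_foo`` and ``call_unmanaged`` are invented names, and the sketch
says nothing about how a backend would actually lower the transition nodes.

.. code-block:: cpp

  #include <cassert>

  // Hypothetical per-thread flag, playing the role of @Flag above:
  // 1 while the thread runs unmanaged code, 0 otherwise.
  thread_local int in_unmanaged_code = 0;

  // Scoped guard: writes the flag on entry and again on exit.
  struct GCTransition {
    GCTransition() { in_unmanaged_code = 1; }
    ~GCTransition() { in_unmanaged_code = 0; }
    GCTransition(const GCTransition&) = delete;
    GCTransition& operator=(const GCTransition&) = delete;
  };

  // Stand-in for an unmanaged callee; reports the flag value it observed.
  int unmanaged_foo() { return in_unmanaged_code; }

  int call_unmanaged() {
    GCTransition guard;      // analogous to GC_TRANSITION_START
    return unmanaged_foo();  // the call; the guard's destructor then acts
                             // like GC_TRANSITION_END
  }

A collector in such a scheme could consult the flag to decide whether a thread
is currently running unmanaged code and therefore already at a safepoint.
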
Intrinsics
===========

'llvm.experimental.gc.statepoint' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       func_type <target>,
                       i32 <#call args>, i32 <flags>,
                       ... (call parameters),
                       i64 <# transition args>, ... (transition parameters),
                       i64 <# deopt args>, ... (deopt parameters),
                       ... (gc parameters))

Overview:
"""""""""

The statepoint intrinsic represents a call which is parseable by the
runtime.

| 460 | Operands: |
| 461 | """"""""" |
| 462 | |
Sanjoy Das | ead2d1f | 2015-05-12 23:52:24 +0000 | [diff] [blame] | 463 | The 'id' operand is a constant integer that is reported as the ID |
| 464 | field in the generated stackmap. LLVM does not interpret this |
| 465 | parameter in any way and its meaning is up to the statepoint user to |
| 466 | decide. Note that LLVM is free to duplicate code containing |
| 467 | statepoint calls, and this may transform IR that had a unique 'id' per |
| 468 | lexical call to statepoint to IR that does not. |
| 469 | |
| 470 | If 'num patch bytes' is non-zero then the call instruction |
| 471 | corresponding to the statepoint is not emitted and LLVM emits 'num |
| 472 | patch bytes' bytes of nops in its place. LLVM will emit code to |
| 473 | prepare the function arguments and retrieve the function return value |
| 474 | in accordance to the calling convention; the former before the nop |
| 475 | sequence and the latter after the nop sequence. It is expected that |
| 476 | the user will patch over the 'num patch bytes' bytes of nops with a |
| 477 | calling sequence specific to their runtime before executing the |
| 478 | generated machine code. There are no guarantees with respect to the |
| 479 | alignment of the nop sequence. Unlike :doc:`StackMaps` statepoints do |
Sanjoy Das | 44d65ea | 2015-07-28 23:50:30 +0000 | [diff] [blame] | 480 | not have a concept of shadow bytes. Note that semantically the |
| 481 | statepoint still represents a call or invoke to 'target', and the nop |
| 482 | sequence after patching is expected to represent an operation |
| 483 | equivalent to a call or invoke to 'target'. |
Sanjoy Das | ead2d1f | 2015-05-12 23:52:24 +0000 | [diff] [blame] | 484 | |
Philip Reames | 8c599f7 | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 485 | The 'target' operand is the function actually being called. The |
| 486 | target can be specified as either a symbolic LLVM function, or as an |
| 487 | arbitrary Value of appropriate function type. Note that the function |
| 488 | type must match the signature of the callee and the types of the 'call |
Sanjoy Das | 44d65ea | 2015-07-28 23:50:30 +0000 | [diff] [blame] | 489 | parameters' arguments. |

The '#call args' operand is the number of arguments to the actual
call. It must exactly match the number of arguments passed in the
'call parameters' variable length section.

The 'flags' operand is used to specify extra information about the
statepoint. This is currently only used to mark certain statepoints
as GC transitions. This operand is a 64-bit integer with the following
layout, where bit 0 is the least significant bit:

  +-------+---------------------------------------------------+
  | Bit # | Usage                                             |
  +=======+===================================================+
  |     0 | Set if the statepoint is a GC transition, cleared |
  |       | otherwise.                                        |
  +-------+---------------------------------------------------+
  |  1-63 | Reserved for future use; must be cleared.         |
  +-------+---------------------------------------------------+

The 'call parameters' arguments are simply the arguments which need to
be passed to the call target. They will be lowered according to the
specified calling convention and otherwise handled like a normal call
instruction. The number of arguments must exactly match what is
specified in '# call args'. The types must match the signature of
'target'.

The 'transition parameters' arguments contain an arbitrary list of
Values which need to be passed to GC transition code. They will be
lowered and passed as operands to the appropriate GC_TRANSITION nodes
in the selection DAG. It is assumed that these arguments must be
available before and after (but not necessarily during) the execution
of the callee. The '# transition args' field indicates how many operands
are to be interpreted as 'transition parameters'.

The 'deopt parameters' arguments contain an arbitrary list of Values
which is meaningful to the runtime. The runtime may read any of these
values, but is assumed not to modify them. If the garbage collector
might need to modify one of these values, it must also be listed in
the 'gc parameters' argument list. The '# deopt args' field indicates
how many operands are to be interpreted as 'deopt parameters'.

The 'gc parameters' arguments contain every pointer to a garbage
collector object which potentially needs to be updated by the garbage
collector. Note that the argument list must explicitly contain a base
pointer for every derived pointer listed. The order of arguments is
unimportant. Unlike the other variable length parameter sets, this
list is not length prefixed.

Semantics:
""""""""""

A statepoint is assumed to read and write all memory. As a result,
memory operations cannot be reordered past a statepoint. It is
illegal to mark a statepoint as being either 'readonly' or 'readnone'.

Note that legal IR cannot perform any memory operation on a 'gc
parameters' argument of the statepoint in a location statically reachable
from the statepoint. Instead, the explicitly relocated value (from a
``gc.relocate``) must be used.
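
As a hedged illustration of the operand layout and of the relocation rule,
the sketch below wraps a call to a hypothetical ``@foo`` in a statepoint.
The function names are invented for this example. The ``i32`` count and flag
operands and the ``p0f_isVoidf`` name suffix follow the
RewriteStatepointsForGC example elsewhere in this document rather than the
abstract syntax above, so treat them as assumptions that may differ between
LLVM versions.

.. code-block:: llvm

  declare void @foo()
  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
  declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

  define i8 @load_after_safepoint(i8 addrspace(1)* %obj) gc "statepoint-example" {
    ; Operand indices: 0 = id, 1 = num patch bytes, 2 = target,
    ; 3 = #call args, 4 = flags, 5 = # transition args,
    ; 6 = # deopt args, 7 = the single gc parameter (%obj).
    %tok = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    ; Memory operations after the statepoint go through the relocated
    ; value, never through %obj itself.
    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 7, i32 7)
    %v = load i8, i8 addrspace(1)* %obj.relocated
    ret i8 %v
  }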

'llvm.experimental.gc.result' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type*
        @llvm.experimental.gc.result(token %statepoint_token)

Overview:
"""""""""

``gc.result`` extracts the result of the original call instruction
which was replaced by the ``gc.statepoint``. The ``gc.result``
intrinsic is actually a family of three intrinsics due to an
implementation limitation. Other than the type of the return value,
the semantics are the same.

Operands:
"""""""""

The first and only argument is the ``gc.statepoint`` which starts
the safepoint sequence of which this ``gc.result`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

Semantics:
""""""""""

The ``gc.result`` represents the return value of the call target of
the ``statepoint``. The type of the ``gc.result`` must exactly match
the return type of the target. If the call target returns void, there
will be no ``gc.result``.

A ``gc.result`` is modeled as a 'readnone' pure function. It has no
side effects since it is just a projection of the return value of the
previous call represented by the ``gc.statepoint``.
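
A minimal sketch of the projection described above, assuming a hypothetical
callee ``@is_ready`` that returns ``i1``. The ``p0f_i1f`` and ``.i1`` name
suffixes and the ``i32`` count operands are assumptions about the overloaded
intrinsic names for this signature and may vary with the LLVM version.

.. code-block:: llvm

  declare i1 @is_ready()
  declare token @llvm.experimental.gc.statepoint.p0f_i1f(i64, i32, i1 ()*, i32, i32, ...)
  declare i1 @llvm.experimental.gc.result.i1(token)

  define i1 @call_is_ready() gc "statepoint-example" {
    ; No call, transition, deopt, or gc parameters are passed here.
    %tok = call token (i64, i32, i1 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_i1f(i64 0, i32 0, i1 ()* @is_ready, i32 0, i32 0, i32 0, i32 0)
    ; Project the i1 return value of @is_ready out of the statepoint token.
    %ready = call i1 @llvm.experimental.gc.result.i1(token %tok)
    ret i1 %ready
  }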

'llvm.experimental.gc.relocate' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <pointer type>
        @llvm.experimental.gc.relocate(token %statepoint_token,
                                       i32 %base_offset,
                                       i32 %pointer_offset)

Overview:
"""""""""

A ``gc.relocate`` returns the potentially relocated value of a pointer
at the safepoint.

Operands:
"""""""""

The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

The second argument is an index into the statepoint's list of arguments
which specifies the allocation for the pointer being relocated.
This index must land within the 'gc parameter' section of the
statepoint's argument list. The associated value must be within the
object with which the pointer being relocated is associated. The optimizer
is free to change *which* interior derived pointer is reported, provided that
it does not replace an actual base pointer with another interior derived
pointer. Collectors are allowed to rely on the base pointer operand
remaining an actual base pointer if so constructed.

The third argument is an index into the statepoint's list of arguments
which specifies the (potentially) derived pointer being relocated. It
is legal for this index to be the same as the second argument
if-and-only-if a base pointer is being relocated. This index must land
within the 'gc parameter' section of the statepoint's argument list.

Semantics:
""""""""""

The return value of ``gc.relocate`` is the potentially relocated value
of the pointer specified by its arguments. It is unspecified how the
value of the returned pointer relates to the argument to the
``gc.statepoint`` other than that a) it points to the same source
language object with the same offset, and b) the 'based-on'
relationship of the newly relocated pointers is a projection of the
unrelocated pointers. In particular, the integer value of the pointer
returned is unspecified.

A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no
side effects since it is just a way to extract information about work
done during the actual call modeled by the ``gc.statepoint``.
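
To make the two index operands concrete, here is a hedged sketch that
relocates both a base pointer and a pointer derived from it. ``@foo`` and
``@relocate_derived`` are hypothetical names, and the intrinsic name suffixes
and ``i32`` count operands mirror the RewriteStatepointsForGC example in this
document; they are assumptions, not a definitive spelling.

.. code-block:: llvm

  declare void @foo()
  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
  declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

  define i8 addrspace(1)* @relocate_derived(i8 addrspace(1)* %base) gc "statepoint-example" {
    %derived = getelementptr i8, i8 addrspace(1)* %base, i64 8
    ; Operands 0-6 are id, num patch bytes, target, and the four
    ; count/flag fields; the gc parameters start at index 7:
    ; 7 = %base, 8 = %derived.
    %tok = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %base, i8 addrspace(1)* %derived)
    ; Base pointer: both indices name the same operand.
    %base.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 7, i32 7)
    ; Derived pointer: second operand names the base, third the derived value.
    %derived.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 7, i32 8)
    ret i8 addrspace(1)* %derived.relocated
  }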

.. _statepoint-stackmap-format:

Stack Map Format
================

Locations for each pointer value which may need to be read and/or updated by
the runtime or collector are provided in a separate section of the
generated object file as specified in the PatchPoint documentation.
This special section is encoded per the
:ref:`Stack Map format <stackmap-format>`.

The general expectation is that a JIT compiler will parse and discard this
format; it is not particularly memory efficient. If you need an alternate
format (e.g. for an ahead of time compiler), see discussion under
:ref:`open work items <OpenWork>` below.

Each statepoint generates the following Locations:

* Constant which describes the calling convention of the call target. This
  constant is a valid :ref:`calling convention identifier <callingconv>` for
  the version of LLVM used to generate the stackmap. No additional compatibility
  guarantees are made for this constant over what LLVM provides elsewhere w.r.t.
  these identifiers.
* Constant which describes the flags passed to the statepoint intrinsic
* Constant which describes number of following deopt *Locations* (not
  operands)
* Variable number of Locations, one for each deopt parameter listed in
  the IR statepoint (same number as described by previous Constant). At
  the moment, only deopt parameters with a bitwidth of 64 bits or less
  are supported. Values of a type larger than 64 bits can be specified
  and reported only if a) the value is constant at the call site, and b)
  the constant can be represented with less than 64 bits (assuming zero
  extension to the original bitwidth).
* Variable number of relocation records, each of which consists of
  exactly two Locations. Relocation records are described in detail
  below.

Each relocation record provides sufficient information for a collector to
relocate one or more derived pointers. Each record consists of a pair of
Locations. The second element in the record represents the pointer (or
pointers) which need to be updated. The first element in the record provides a
pointer to the base of the object with which the pointer(s) being relocated is
associated. This information is required for handling generalized derived
pointers since a pointer may be outside the bounds of the original allocation,
but still needs to be relocated with the allocation. Additionally:

* The base pointer is guaranteed to also appear explicitly as a
  relocation pair if it is used after the statepoint.
* There may be fewer relocation records than gc parameters in the IR
  statepoint. Each *unique* pair will occur at least once; duplicates
  are possible.
* The Locations within each record may either be of pointer size or a
  multiple of pointer size. In the latter case, the record must be
  interpreted as describing a sequence of pointers and their corresponding
  base pointers. If the Location is of size N x sizeof(pointer), then
  there will be N records of one pointer each contained within the Location.
  Both Locations in a pair can be assumed to be of the same size.

Note that the Locations used in each section may describe the same
physical location; e.g., a stack slot may appear as a deopt location,
a gc base pointer, and a gc derived pointer.

The LiveOut section of the StkMapRecord will be empty for a statepoint
record.
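
As a rough sketch of how the Location layout lines up with the IR, consider
a statepoint with two deopt parameters and one gc parameter. The function
and value names are hypothetical, the intrinsic spelling follows the
RewriteStatepointsForGC example in this document, and the comment lists only
the kinds of entries described above; the concrete encoding of each Location
(register, stack slot, constant) is chosen by the backend and is not shown.

.. code-block:: llvm

  declare void @foo()
  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
  declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

  define i8 addrspace(1)* @stackmap_sketch(i32 %a, i32 %b, i8 addrspace(1)* %obj) gc "statepoint-example" {
    ; Operand indices: 6 = # deopt args (2), 7 = %a, 8 = %b, 9 = %obj.
    %tok = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 2, i32 %a, i32 %b, i8 addrspace(1)* %obj)
    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 9, i32 9)
    ret i8 addrspace(1)* %obj.relocated
  }

  ; Locations expected in the stackmap record for this statepoint:
  ;   1. Constant: calling convention identifier of the call to @foo
  ;   2. Constant: the statepoint flags (0 here)
  ;   3. Constant: number of deopt Locations that follow (2 here)
  ;   4. Location for deopt parameter %a
  ;   5. Location for deopt parameter %b
  ;   6. Relocation record: (Location of base %obj, Location of %obj)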

Safepoint Semantics & Verification
==================================

The fundamental correctness property for the compiled code's
correctness w.r.t. the garbage collector is a dynamic one. It must be
the case that there is no dynamic trace such that an operation
involving a potentially relocated pointer is observably-after a
safepoint which could relocate it. 'observably-after' in this usage
means that an outside observer could observe this sequence of events
in a way which precludes the operation being performed before the
safepoint.

To understand why this 'observably-after' property is required,
consider a null comparison performed on the original copy of a
relocated pointer. Assuming that control flow follows the safepoint,
there is no way to observe externally whether the null comparison is
performed before or after the safepoint. (Remember, the original
Value is unmodified by the safepoint.) The compiler is free to make
either scheduling choice.
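
The null comparison case can be sketched as follows. This is only an
illustration under the same assumptions as the other examples in this
document (hypothetical ``@foo``, ``i32`` count operands, ``p0f_isVoidf``
suffix); it is not taken from the LLVM test suite.

.. code-block:: llvm

  declare void @foo()
  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)

  define i1 @null_check_after_safepoint(i8 addrspace(1)* %obj) gc "statepoint-example" {
    %tok = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    ; The comparison uses the original, unrelocated Value. Because %obj is
    ; unchanged by the safepoint, an outside observer cannot tell whether
    ; this icmp was scheduled before or after the statepoint.
    %is_null = icmp eq i8 addrspace(1)* %obj, null
    ret i1 %is_null
  }

A load or store through ``%obj`` at the same position would be different:
it would touch memory that the collector may have moved, so it would have to
use a ``gc.relocate`` result instead.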

The actual correctness property implemented is slightly stronger than
this. We require that there be no *static path* on which a
potentially relocated pointer is 'observably-after' it may have been
relocated. This is slightly stronger than is strictly necessary (and
thus may disallow some otherwise valid programs), but greatly
simplifies reasoning about correctness of the compiled code.

By construction, this property will be upheld by the optimizer if
correctly established in the source IR. This is a key invariant of
the design.

The existing IR Verifier pass has been extended to check most of the
local restrictions on the intrinsics mentioned in their respective
documentation. The current implementation in LLVM does not check the
key relocation invariant, but there is ongoing work on developing such
a verifier. Please ask on llvm-dev if you're interested in
experimenting with the current version.

.. _statepoint-utilities:

Utility Passes for Safepoint Insertion
======================================

.. _RewriteStatepointsForGC:

RewriteStatepointsForGC
^^^^^^^^^^^^^^^^^^^^^^^^

The pass RewriteStatepointsForGC transforms a function's IR to lower from the
abstract machine model described above to the explicit statepoint model of
relocations. To do this, it replaces all calls or invokes of functions which
might contain a safepoint poll with a ``gc.statepoint`` and associated full
relocation sequence, including all required ``gc.relocate`` calls.

Note that by default, this pass only runs for the "statepoint-example" or
"core-clr" gc strategies. You will need to add your custom strategy to this
whitelist or use one of the predefined ones.

As an example, given this code:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    call void @foo()
    ret i8 addrspace(1)* %obj
  }

The pass would produce this IR:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 12, i32 12)
    ret i8 addrspace(1)* %obj.relocated
  }

In the above examples, the addrspace(1) marker on the pointers is the mechanism
that the ``statepoint-example`` GC strategy uses to distinguish references from
non references. The pass assumes that all addrspace(1) pointers are non-integral
pointer types. Address space 1 is not globally reserved for this purpose.

This pass can be used as a utility function by a language frontend that doesn't
want to manually reason about liveness, base pointers, or relocation when
constructing IR. As currently implemented, RewriteStatepointsForGC must be
run after SSA construction (i.e. mem2reg).

RewriteStatepointsForGC will ensure that appropriate base pointers are listed
for every relocation created. It will do so by duplicating code as needed to
propagate the base pointer associated with each pointer being relocated to
the appropriate safepoints. The implementation assumes that the following
IR constructs produce base pointers: loads from the heap, addresses of global
variables, function arguments, function return values. Constant pointers (such
as null) are also assumed to be base pointers. In practice, this constraint
can be relaxed to producing interior derived pointers provided the target
collector can find the associated allocation from an arbitrary interior
derived pointer.

By default RewriteStatepointsForGC passes in ``0xABCDEF00`` as the statepoint
ID and ``0`` as the number of patchable bytes to the newly constructed
``gc.statepoint``. These values can be configured on a per-callsite
basis using the attributes ``"statepoint-id"`` and
``"statepoint-num-patch-bytes"``. If a call site is marked with a
``"statepoint-id"`` function attribute and its value is a positive
integer (represented as a string), then that value is used as the ID
of the newly constructed ``gc.statepoint``. If a call site is marked
with a ``"statepoint-num-patch-bytes"`` function attribute and its
value is a positive integer, then that value is used as the 'num patch
bytes' parameter of the newly constructed ``gc.statepoint``. The
``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes
are not propagated to the ``gc.statepoint`` call or invoke if they
could be successfully parsed.
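
A hedged sketch of input IR that uses these call site attributes is shown
below. The function names and the values ``42`` and ``16`` are arbitrary
choices for illustration.

.. code-block:: llvm

  declare void @foo()

  define void @test_attrs() gc "statepoint-example" {
    ; Request statepoint ID 42 and 16 patchable bytes for this call site.
    ; Both values are written as strings.
    call void @foo() "statepoint-id"="42" "statepoint-num-patch-bytes"="16"
    ret void
  }

Given the description above, the ``gc.statepoint`` produced for this call
site would be expected to start with ``i64 42, i32 16`` instead of the
default ID and a patch size of ``0``.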

In practice, RewriteStatepointsForGC should be run much later in the pass
pipeline, after most optimization is already done. This helps to improve
the quality of the generated code when compiled with garbage collection support.

.. _PlaceSafepoints:

PlaceSafepoints
^^^^^^^^^^^^^^^^

The pass PlaceSafepoints inserts safepoint polls sufficient to ensure running
code checks for a safepoint request in a timely manner. This pass is expected
to be run before RewriteStatepointsForGC and thus does not produce full
relocation sequences.

As an example, given input IR of the following:

.. code-block:: llvm

  define void @test() gc "statepoint-example" {
    call void @foo()
    ret void
  }

  declare void @do_safepoint()
  define void @gc.safepoint_poll() {
    call void @do_safepoint()
    ret void
  }


This pass would produce the following IR:

.. code-block:: llvm

  define void @test() gc "statepoint-example" {
    call void @do_safepoint()
    call void @foo()
    ret void
  }

In this case, we've added an (unconditional) entry safepoint poll. Note that
despite appearances, the entry poll is not necessarily redundant. We'd have to
know that ``foo`` and ``test`` were not mutually recursive for the poll to be
redundant. In practice, you'd probably want your poll definition to contain
a conditional branch of some form.
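
One possible shape for such a conditional poll is sketched below. The global
``@safepoint_requested`` is a hypothetical flag that a runtime would set when
it wants threads to stop; its name, type, and the use of a volatile load are
assumptions made for illustration, not something PlaceSafepoints requires.

.. code-block:: llvm

  ; Hypothetical flag written by the runtime to request a safepoint.
  @safepoint_requested = external global i8

  declare void @do_safepoint()

  define void @gc.safepoint_poll() {
  entry:
    ; Fast path: a single load and compare when no safepoint is pending.
    %flag = load volatile i8, i8* @safepoint_requested
    %needed = icmp ne i8 %flag, 0
    br i1 %needed, label %slow, label %done

  slow:
    ; Slow path: only taken when the runtime has requested a safepoint.
    call void @do_safepoint()
    br label %done

  done:
    ret void
  }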

At the moment, PlaceSafepoints can insert safepoint polls at method entry and
loop backedge locations. Extending this to work with return polls would be
straightforward if desired.

PlaceSafepoints includes a number of optimizations to avoid placing safepoint
polls at particular sites unless needed to ensure timely execution of a poll
under normal conditions. PlaceSafepoints does not attempt to ensure timely
execution of a poll under worst case conditions such as heavy system paging.

The implementation of a safepoint poll action is specified by looking up a
function of the name ``gc.safepoint_poll`` in the containing Module. The body
of this function is inserted at each poll site desired. While calls or invokes
inside this method are transformed into ``gc.statepoint`` calls, recursive poll
insertion is not performed.

This pass is useful for any language frontend which only has to support
garbage collection semantics at safepoints. If you need other abstract
frame information at safepoints (e.g. for deoptimization or introspection),
you can insert safepoint polls in the frontend. If you have the latter case,
please ask on llvm-dev for suggestions. There's been a good amount of work
done on making such a scheme work well in practice which is not yet documented
here.
Philip Reames | 3f2a3f9 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 897 | |
| 898 | |
Supported Architectures
=======================

Support for statepoint generation requires some code for each backend.
Today, only X86_64 is supported.

.. _OpenWork:

Limitations and Half Baked Ideas
================================

Mixing References and Raw Pointers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Languages which allow unmanaged pointers to garbage collected objects (e.g.
passing a pointer to an object to a C routine) are not yet supported in the
abstract machine model.  At the moment, the best idea on how to approach this
involves an intrinsic or opaque function which hides the connection between
the reference value and the raw pointer.  The problem is that having a
``ptrtoint`` or ``inttoptr`` cast (which is common for such use cases) breaks
the rules used for inferring base pointers for arbitrary references when
lowering out of the abstract model to the explicit physical model.  Note
that a frontend which lowers directly to the physical model doesn't have
any problems here.

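
To illustrate the kind of IR that causes trouble, here is a small sketch in
the abstract machine model.  The function names ``@roundtrip`` and ``@foo``
are invented for this example, and it assumes that address space 1 holds GC
references, as with the ``statepoint-example`` strategy.

.. code-block:: llvm

  declare void @foo()

  define i8 addrspace(1)* @roundtrip(i8 addrspace(1)* %obj) gc "statepoint-example" {
    ; The reference is converted to a plain integer ...
    %raw = ptrtoint i8 addrspace(1)* %obj to i64
    ; ... and only that integer is live across the call, which is a
    ; potential safepoint.
    call void @foo()
    ; The link between %ref and %obj is now hidden behind the cast.
    %ref = inttoptr i64 %raw to i8 addrspace(1)*
    ret i8 addrspace(1)* %ref
  }

Because only ``%raw`` is live across the call, there is no reference for a
relocating collector to update there, and nothing in the IR ties ``%ref``
back to ``%obj`` as its base.
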
Objects on the Stack
^^^^^^^^^^^^^^^^^^^^

As noted above, the explicit lowering supports objects allocated on the
stack provided the collector can find a heap map given the stack address.

The missing pieces are a) integration with rewriting (RS4GC) from the
abstract machine model and b) support for optionally decomposing on-stack
objects so as not to require heap maps for them.  The latter is required
for ease of integration with some collectors.

Lowering Quality and Representation Overhead
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The current statepoint lowering is known to be somewhat poor.  In the very
long term, we'd like to integrate statepoints with the register allocator;
in the near term this is unlikely to happen.  We've found the quality of
lowering to be relatively unimportant as hot statepoints are almost always
inliner bugs.

Concerns have been raised that the statepoint representation results in a
large amount of IR being produced for some examples and that this
contributes to higher than expected memory usage and compile times.  There
are no immediate plans to make changes due to this, but alternate models may
be explored in the future.

Relocations Along Exceptional Edges
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Relocations along exceptional paths are currently broken in ToT.  In
particular, there is currently no way to represent a rethrow on a path which
also has relocations.  See `this llvm-dev discussion
<https://groups.google.com/forum/#!topic/llvm-dev/AE417XjgxvI>`_ for more
detail.

Support for Alternate Stackmap Formats
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For some use cases, it is desirable to directly encode a final memory
efficient stackmap format for use by the runtime.  This is particularly
relevant for ahead of time compilers which wish to directly link object
files without the need for post processing of each individual object file.
While not implemented today for statepoints, there is precedent for a
GCStrategy to be able to select a custom GCMetadataPrinter for this purpose.
Patches to enable this functionality upstream are welcome.

Bugs and Enhancements
=====================

Currently known bugs and enhancements under consideration can be
tracked by performing a `bugzilla search
<https://bugs.llvm.org/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_
for [Statepoint] in the summary field. When filing new bugs, please
use this tag so that interested parties see the newly filed bug.  As
with most LLVM features, design discussions take place on `llvm-dev
<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches
should be sent to `llvm-commits
<http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ for review.
