blob: 1669117fcf560e704402a25bb2244183e92ca421 [file] [log] [blame]
George Burgess IV6058aba2016-08-17 00:17:29 +00001=========
2MemorySSA
3=========
4
5.. contents::
6 :local:
7
8Introduction
9============
10
11``MemorySSA`` is an analysis that allows us to cheaply reason about the
12interactions between various memory operations. Its goal is to replace
13``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is because,
14unless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
15result in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't
16have as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get
17better results, too.
18
19At a high level, one of the goals of ``MemorySSA`` is to provide an SSA based
20form for memory, complete with def-use and use-def chains, which
21enables users to quickly find may-def and may-uses of memory operations.
22It can also be thought of as a way to cheaply give versions to the complete
23state of heap memory, and associate memory operations with those versions.
24
25This document goes over how ``MemorySSA`` is structured, and some basic
26intuition on how ``MemorySSA`` works.
27
28A paper on MemorySSA (with notes about how it's implemented in GCC) `can be
29found here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. Though, it's
30relatively out-of-date; the paper references multiple heap partitions, but GCC
31eventually swapped to just using one, like we now have in LLVM. Like
32GCC's, LLVM's MemorySSA is intraprocedural.
33
34
35MemorySSA Structure
36===================
37
38MemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
George Burgess IV7c34aef2016-08-17 23:21:56 +000039structure that maps ``Instruction``\ s to ``MemoryAccess``\ es, which are
40``MemorySSA``'s parallel to LLVM ``Instruction``\ s.
George Burgess IV6058aba2016-08-17 00:17:29 +000041
42Each ``MemoryAccess`` can be one of three types:
43
44- ``MemoryPhi``
45- ``MemoryUse``
46- ``MemoryDef``
47
George Burgess IV7c34aef2016-08-17 23:21:56 +000048``MemoryPhi``\ s are ``PhiNode``\ s, but for memory operations. If at any
49point we have two (or more) ``MemoryDef``\ s that could flow into a
George Burgess IV6058aba2016-08-17 00:17:29 +000050``BasicBlock``, the block's top ``MemoryAccess`` will be a
George Burgess IV7c34aef2016-08-17 23:21:56 +000051``MemoryPhi``. As in LLVM IR, ``MemoryPhi``\ s don't correspond to any
52concrete operation. As such, ``BasicBlock``\ s are mapped to ``MemoryPhi``\ s
53inside ``MemorySSA``, whereas ``Instruction``\ s are mapped to ``MemoryUse``\ s
54and ``MemoryDef``\ s.
George Burgess IV6058aba2016-08-17 00:17:29 +000055
George Burgess IV11eaa162016-08-18 02:56:05 +000056Note also that in SSA, Phi nodes merge must-reach definitions (that is,
57definitions that *must* be new versions of variables). In MemorySSA, PHI nodes
58merge may-reach definitions (that is, until disambiguated, the versions that
59reach a phi node may or may not clobber a given variable).
George Burgess IV6058aba2016-08-17 00:17:29 +000060
George Burgess IV7c34aef2016-08-17 23:21:56 +000061``MemoryUse``\ s are operations which use but don't modify memory. An example of
George Burgess IV6058aba2016-08-17 00:17:29 +000062a ``MemoryUse`` is a ``load``, or a ``readonly`` function call.
63
George Burgess IV7c34aef2016-08-17 23:21:56 +000064``MemoryDef``\ s are operations which may either modify memory, or which
George Burgess IV11eaa162016-08-18 02:56:05 +000065introduce some kind of ordering constraints. Examples of ``MemoryDef``\ s
George Burgess IV7c34aef2016-08-17 23:21:56 +000066include ``store``\ s, function calls, ``load``\ s with ``acquire`` (or higher)
George Burgess IV6058aba2016-08-17 00:17:29 +000067ordering, volatile operations, memory fences, etc.
68
69Every function that exists has a special ``MemoryDef`` called ``liveOnEntry``.
70It dominates every ``MemoryAccess`` in the function that ``MemorySSA`` is being
71run on, and implies that we've hit the top of the function. It's the only
72``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of
73``liveOnEntry`` implies that the memory being used is either undefined or
74defined before the function begins.
75
George Burgess IV7c34aef2016-08-17 23:21:56 +000076An example of all of this overlaid on LLVM IR (obtained by running ``opt
George Burgess IV6058aba2016-08-17 00:17:29 +000077-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When
78viewing this example, it may be helpful to view it in terms of clobbers. The
79operands of a given ``MemoryAccess`` are all (potential) clobbers of said
80MemoryAccess, and the value produced by a ``MemoryAccess`` can act as a clobber
George Burgess IV7c34aef2016-08-17 23:21:56 +000081for other ``MemoryAccess``\ es. Another useful way of looking at it is in
Hiroshi Inoue8040eab2018-01-26 08:15:29 +000082terms of heap versions. In that view, operands of a given
George Burgess IV6058aba2016-08-17 00:17:29 +000083``MemoryAccess`` are the version of the heap before the operation, and
84if the access produces a value, the value is the new version of the heap
85after the operation.
86
87.. code-block:: llvm
88
89 define void @foo() {
90 entry:
91 %p1 = alloca i8
92 %p2 = alloca i8
93 %p3 = alloca i8
94 ; 1 = MemoryDef(liveOnEntry)
95 store i8 0, i8* %p3
96 br label %while.cond
97
98 while.cond:
99 ; 6 = MemoryPhi({%0,1},{if.end,4})
100 br i1 undef, label %if.then, label %if.else
101
102 if.then:
103 ; 2 = MemoryDef(6)
104 store i8 0, i8* %p1
105 br label %if.end
106
107 if.else:
108 ; 3 = MemoryDef(6)
109 store i8 1, i8* %p2
110 br label %if.end
111
112 if.end:
George Burgess IVc4944122016-08-17 01:50:54 +0000113 ; 5 = MemoryPhi({if.then,2},{if.else,3})
George Burgess IV6058aba2016-08-17 00:17:29 +0000114 ; MemoryUse(5)
115 %1 = load i8, i8* %p1
116 ; 4 = MemoryDef(5)
117 store i8 2, i8* %p2
118 ; MemoryUse(1)
119 %2 = load i8, i8* %p3
120 br label %while.cond
121 }
122
George Burgess IVc4944122016-08-17 01:50:54 +0000123The ``MemorySSA`` IR is shown in comments that precede the instructions they map
George Burgess IV6058aba2016-08-17 00:17:29 +0000124to (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)``
125is a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM
126instruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this
127particular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8*
George Burgess IV7c34aef2016-08-17 23:21:56 +0000128%p1`` in LLVM with ``%1``). Again, ``MemoryPhi``\ s don't correspond to any LLVM
George Burgess IV6058aba2016-08-17 00:17:29 +0000129Instruction, so the line directly below a ``MemoryPhi`` isn't special.
130
131Going from the top down:
132
George Burgess IVc4944122016-08-17 01:50:54 +0000133- ``6 = MemoryPhi({entry,1},{if.end,4})`` notes that, when entering
134 ``while.cond``, the reaching definition for it is either ``1`` or ``4``. This
135 ``MemoryPhi`` is referred to in the textual IR by the number ``6``.
George Burgess IV6058aba2016-08-17 00:17:29 +0000136- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
137 and its reaching definition before it is ``6``, or the ``MemoryPhi`` after
George Burgess IV7c34aef2016-08-17 23:21:56 +0000138 ``while.cond``. (See the `Build-time use optimization`_ and `Precision`_
Sylvestre Ledru297f1792016-08-28 20:29:18 +0000139 sections below for why this ``MemoryDef`` isn't linked to a separate,
George Burgess IV7c34aef2016-08-17 23:21:56 +0000140 disambiguated ``MemoryPhi``.)
George Burgess IV6058aba2016-08-17 00:17:29 +0000141- ``3 = MemoryDef(6)`` notes that ``store i8 0, i8* %p2`` is a definition; its
142 reaching definition is also ``6``.
George Burgess IVc4944122016-08-17 01:50:54 +0000143- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before
George Burgess IV6058aba2016-08-17 00:17:29 +0000144 this block could either be ``2`` or ``3``.
145- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that
146 it's clobbered by ``5``.
147- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; it's
148 reaching definition is ``5``.
149- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory,
150 and the last thing that could clobber this use is above ``while.cond`` (e.g.
George Burgess IV7c34aef2016-08-17 23:21:56 +0000151 the store to ``%p3``). In heap versioning parlance, it really only depends on
152 the heap version 1, and is unaffected by the new heap versions generated since
153 then.
George Burgess IV6058aba2016-08-17 00:17:29 +0000154
155As an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not
156meant to interact with LLVM IR.
157
158Design of MemorySSA
159===================
160
161``MemorySSA`` is an analysis that can be built for any arbitrary function. When
162it's built, it does a pass over the function's IR in order to build up its
George Burgess IV7c34aef2016-08-17 23:21:56 +0000163mapping of ``MemoryAccess``\ es. You can then query ``MemorySSA`` for things
164like the dominance relation between ``MemoryAccess``\ es, and get the
165``MemoryAccess`` for any given ``Instruction`` .
George Burgess IV6058aba2016-08-17 00:17:29 +0000166
167When ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker``
168that you can use (see below).
169
170
171The walker
172----------
173
174A structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or
175the walker, for short. The goal of the walker is to provide answers to clobber
George Burgess IV7c34aef2016-08-17 23:21:56 +0000176queries beyond what's represented directly by ``MemoryAccess``\ es. For example,
George Burgess IV6058aba2016-08-17 00:17:29 +0000177given:
178
179.. code-block:: llvm
180
181 define void @foo() {
182 %a = alloca i8
183 %b = alloca i8
184
185 ; 1 = MemoryDef(liveOnEntry)
186 store i8 0, i8* %a
187 ; 2 = MemoryDef(1)
188 store i8 0, i8* %b
189 }
190
191The store to ``%a`` is clearly not a clobber for the store to ``%b``. It would
192be the walker's goal to figure this out, and return ``liveOnEntry`` when queried
193for the clobber of ``MemoryAccess`` ``2``.
194
George Burgess IV7c34aef2016-08-17 23:21:56 +0000195By default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef``\ s
196and ``MemoryUse``\ s by consulting whatever alias analysis stack you happen to
197be using. Walkers were built to be flexible, though, so it's entirely reasonable
198(and expected) to create more specialized walkers (e.g. one that specifically
199queries ``GlobalsAA``, one that always stops at ``MemoryPhi`` nodes, etc).
George Burgess IV6058aba2016-08-17 00:17:29 +0000200
201
202Locating clobbers yourself
203^^^^^^^^^^^^^^^^^^^^^^^^^^
204
205If you choose to make your own walker, you can find the clobber for a
206``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
George Burgess IV7c34aef2016-08-17 23:21:56 +0000207``MemoryAccess``. The structure of ``MemoryDef``\ s makes this relatively simple;
George Burgess IV6058aba2016-08-17 00:17:29 +0000208they ultimately form a linked list of every clobber that dominates the
209``MemoryAccess`` that you're trying to optimize. In other words, the
210``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
211``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.
212
213
George Burgess IV7c34aef2016-08-17 23:21:56 +0000214Build-time use optimization
215---------------------------
George Burgess IV6058aba2016-08-17 00:17:29 +0000216
George Burgess IV7c34aef2016-08-17 23:21:56 +0000217``MemorySSA`` will optimize some ``MemoryAccess``\ es at build-time.
George Burgess IVc4944122016-08-17 01:50:54 +0000218Specifically, we optimize the operand of every ``MemoryUse`` to point to the
George Burgess IV6058aba2016-08-17 00:17:29 +0000219actual clobber of said ``MemoryUse``. This can be seen in the above example; the
220second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
221``MemoryDef`` from the entry block. This is done to make walking,
222value numbering, etc, faster and easier.
George Burgess IV7c34aef2016-08-17 23:21:56 +0000223
George Burgess IV6058aba2016-08-17 00:17:29 +0000224It is not possible to optimize ``MemoryDef`` in the same way, as we
225restrict ``MemorySSA`` to one heap variable and, thus, one Phi node
226per block.
227
228
229Invalidation and updating
230-------------------------
231
232Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever
233the IR is updated. "Update", in this case, includes the addition, deletion, and
George Burgess IVc4944122016-08-17 01:50:54 +0000234motion of ``Instructions``. The update API is being made on an as-needed basis.
George Burgess IV7c34aef2016-08-17 23:21:56 +0000235If you'd like examples, ``GVNHoist`` is a user of ``MemorySSA``\ s update API.
George Burgess IV6058aba2016-08-17 00:17:29 +0000236
237
238Phi placement
239^^^^^^^^^^^^^
240
George Burgess IV7c34aef2016-08-17 23:21:56 +0000241``MemorySSA`` only places ``MemoryPhi``\ s where they're actually
George Burgess IV6058aba2016-08-17 00:17:29 +0000242needed. That is, it is a pruned SSA form, like LLVM's SSA form. For
243example, consider:
244
245.. code-block:: llvm
246
247 define void @foo() {
248 entry:
249 %p1 = alloca i8
250 %p2 = alloca i8
251 %p3 = alloca i8
252 ; 1 = MemoryDef(liveOnEntry)
253 store i8 0, i8* %p3
254 br label %while.cond
255
256 while.cond:
257 ; 3 = MemoryPhi({%0,1},{if.end,2})
258 br i1 undef, label %if.then, label %if.else
259
260 if.then:
261 br label %if.end
262
263 if.else:
264 br label %if.end
265
266 if.end:
267 ; MemoryUse(1)
268 %1 = load i8, i8* %p1
269 ; 2 = MemoryDef(3)
270 store i8 2, i8* %p2
271 ; MemoryUse(1)
272 %2 = load i8, i8* %p3
273 br label %while.cond
274 }
275
276Because we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
277for ``if.end`` would be pointless, so we don't place one. So, if you need to
278place a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll need to also create
279a ``MemoryPhi`` for ``if.end``.
280
George Burgess IV7c34aef2016-08-17 23:21:56 +0000281If it turns out that this is a large burden, we can just place ``MemoryPhi``\ s
George Burgess IV6058aba2016-08-17 00:17:29 +0000282everywhere. Because we have Walkers that are capable of optimizing above said
283phis, doing so shouldn't prohibit optimizations.
284
285
286Non-Goals
287---------
288
289``MemorySSA`` is meant to reason about the relation between memory
290operations, and enable quicker querying.
291It isn't meant to be the single source of truth for all potential memory-related
292optimizations. Specifically, care must be taken when trying to use ``MemorySSA``
293to reason about atomic or volatile operations, as in:
294
295.. code-block:: llvm
296
297 define i8 @foo(i8* %a) {
298 entry:
299 br i1 undef, label %if.then, label %if.end
300
301 if.then:
302 ; 1 = MemoryDef(liveOnEntry)
303 %0 = load volatile i8, i8* %a
304 br label %if.end
305
306 if.end:
307 %av = phi i8 [0, %entry], [%0, %if.then]
308 ret i8 %av
309 }
310
311Going solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may
312seem legal. Because it's a volatile load, though, it's not.
313
314
315Design tradeoffs
316----------------
317
318Precision
319^^^^^^^^^
George Burgess IVc4944122016-08-17 01:50:54 +0000320
George Burgess IV6058aba2016-08-17 00:17:29 +0000321``MemorySSA`` in LLVM deliberately trades off precision for speed.
322Let us think about memory variables as if they were disjoint partitions of the
323heap (that is, if you have one variable, as above, it represents the entire
324heap, and if you have multiple variables, each one represents some
325disjoint portion of the heap)
326
327First, because alias analysis results conflict with each other, and
328each result may be what an analysis wants (IE
329TBAA may say no-alias, and something else may say must-alias), it is
330not possible to partition the heap the way every optimization wants.
331Second, some alias analysis results are not transitive (IE A noalias B,
332and B noalias C, does not mean A noalias C), so it is not possible to
333come up with a precise partitioning in all cases without variables to
334represent every pair of possible aliases. Thus, partitioning
335precisely may require introducing at least N^2 new virtual variables,
336phi nodes, etc.
337
338Each of these variables may be clobbered at multiple def sites.
339
340To give an example, if you were to split up struct fields into
341individual variables, all aliasing operations that may-def multiple struct
342fields, will may-def more than one of them. This is pretty common (calls,
343copies, field stores, etc).
344
345Experience with SSA forms for memory in other compilers has shown that
346it is simply not possible to do this precisely, and in fact, doing it
347precisely is not worth it, because now all the optimizations have to
348walk tons and tons of virtual variables and phi nodes.
349
350So we partition. At the point at which you partition, again,
351experience has shown us there is no point in partitioning to more than
352one variable. It simply generates more IR, and optimizations still
353have to query something to disambiguate further anyway.
354
355As a result, LLVM partitions to one variable.
356
357Use Optimization
358^^^^^^^^^^^^^^^^
359
360Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one
361useful guarantee - all loads are optimized to point at the thing that
362actually clobbers them. This gives some nice properties. For example,
363for a given store, you can find all loads actually clobbered by that
364store by walking the immediate uses of the store.