========================================
Machine IR (MIR) Format Reference Manual
========================================

.. contents::
   :local:

.. warning::
  This is a work in progress.

Introduction
============

This document is a reference manual for the Machine IR (MIR) serialization
format. MIR is a human readable serialization format that is used to represent
LLVM's :ref:`machine specific intermediate representation
<machine code representation>`.

The MIR serialization format is designed to be used for testing the code
generation passes in LLVM.

Overview
========

The MIR serialization format uses a YAML container. YAML is a standard
data serialization language, and the full YAML language spec can be read at
`yaml.org
<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.

A MIR file is split up into a series of `YAML documents`_. The first document
can contain an optional embedded LLVM IR module, and the rest of the documents
contain the serialized machine functions.

.. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132

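As a rough illustration of this layout, the Python sketch below splits the
text of a MIR file into its YAML documents by looking for the ``---``
document start marker. It is not how llc reads MIR (llc uses LLVM's own YAML
parser). The helper name ``split_mir_documents`` and the sample text are made
up for this example, and the ``...`` end marker is simply dropped:

.. code-block:: python

   def split_mir_documents(text):
       """Split MIR text into YAML documents on '---' start markers."""
       documents = []
       current = None
       for line in text.splitlines():
           if line.startswith("---"):
               # A new document starts; flush the previous one, if any.
               if current is not None:
                   documents.append("\n".join(current))
               current = [line]
           elif line.strip() == "...":
               # Explicit document end marker: close the current document.
               if current is not None:
                   documents.append("\n".join(current))
                   current = None
           elif current is not None:
               current.append(line)
       if current is not None:
           documents.append("\n".join(current))
       return documents


   SAMPLE = """\
   --- |
     define i32 @inc(i32* %x) {
     entry:
       ret i32 0
     }
   ...
   ---
   name: inc
   body: |
     bb.0.entry:
       RETQ $eax
   ...
   """

   docs = split_mir_documents(SAMPLE)
   print(len(docs))  # the embedded IR module plus one machine function

In this sample the first document holds the embedded IR module and the second
one holds the machine function ``inc``.
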
MIR Testing Guide
=================

You can use the MIR format for testing in two different ways:

- You can write MIR tests that invoke a single code generation pass using the
  ``-run-pass`` option in llc.

- You can use llc's ``-stop-after`` option with existing or new LLVM assembly
  tests and check the MIR output of a specific code generation pass.

Testing Individual Code Generation Passes
-----------------------------------------

The ``-run-pass`` option in llc allows you to create MIR tests that invoke just
a single code generation pass. When this option is used, llc will parse an
input MIR file, run the specified code generation pass(es), and output the
resulting MIR code.

You can generate an input MIR file for the test by using the ``-stop-after`` or
``-stop-before`` option in llc. For example, if you would like to write a test
for the post register allocation pseudo instruction expansion pass, you can
specify the machine copy propagation pass in the ``-stop-after`` option, as it
runs just before the pass that we are trying to test:

   ``llc -stop-after=machine-cp bug-trigger.ll > test.mir``

If the same pass is run multiple times, a run index can be included
after the name, separated by a comma:

   ``llc -stop-after=dead-mi-elimination,1 bug-trigger.ll > test.mir``

After generating the input MIR file, you'll have to add a run line that uses
the ``-run-pass`` option to it. In order to test the post register allocation
pseudo instruction expansion pass on X86-64, a run line like the one shown
below can be used:

   ``# RUN: llc -o - %s -mtriple=x86_64-- -run-pass=postrapseudos | FileCheck %s``

The MIR files are target dependent, so they have to be placed in the target
specific test directories (``test/CodeGen/TARGETNAME``). They also need to
specify a target triple or a target architecture either in the run line or in
the embedded LLVM IR module.

Simplifying MIR files
^^^^^^^^^^^^^^^^^^^^^

The MIR code coming out of ``-stop-after``/``-stop-before`` is very verbose;
tests are more accessible and future proof when simplified:

- Use the ``-simplify-mir`` option with llc.

- Machine function attributes often have default values or the test works just
  as well with default values. Typical candidates for this are: `alignment:`,
  `exposesReturnsTwice`, `legalized`, `regBankSelected`, `selected`.
  The whole `frameInfo` section is often unnecessary if there is no special
  frame usage in the function. `tracksRegLiveness` on the other hand is often
  necessary for some passes that care about block livein lists.

- The (global) `liveins:` list is typically only interesting for early
  instruction selection passes and can be removed when testing later passes.
  The per-block `liveins:` on the other hand are necessary if
  `tracksRegLiveness` is true.

- Branch probability data in block `successors:` lists can be dropped if the
  test doesn't depend on it. Example:
  `successors: %bb.1(0x40000000), %bb.2(0x40000000)` can be replaced with
  `successors: %bb.1, %bb.2`.

- MIR code contains a whole IR module. This is necessary because there are
  no equivalents in MIR for global variables, references to external functions,
  function attributes, metadata, or debug info. Instead some MIR data references
  the IR constructs. You can often remove them if the test doesn't depend on
  them.

- Alias Analysis is performed on IR values. These are referenced by memory
  operands in MIR. Example: `:: (load 8 from %ir.foobar, !alias.scope !9)`.
  If the test doesn't depend on (good) alias analysis the references can be
  dropped: `:: (load 8)`

- MIR blocks can reference IR blocks for debug printing, profile information
  or debug locations. Example: `bb.42.myblock` in MIR references the IR block
  `myblock`. It is usually possible to drop the `.myblock` reference and simply
  use `bb.42`.

- If there are no memory operands or blocks referencing the IR then the
  IR function can be replaced by a parameterless dummy function like
  `define void @func() { ret void }`.

- It is possible to drop the whole IR section of the MIR file if it only
  contains dummy functions (see above). The .mir loader will create the
  IR functions automatically in this case.

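The branch probability clean-up from the list above can be scripted. The
following Python sketch is one possible way to do it. The helper
``strip_successor_weights`` is a made-up name for this example and is not
part of LLVM, and the regular expression only covers the
``%bb.<id>[.<name>]`` reference forms used in this document:

.. code-block:: python

   import re

   # A block reference followed by a parenthesised probability or weight,
   # e.g. "%bb.1(0x40000000)" or "%bb.2.else(16)".
   _SUCCESSOR_WEIGHT = re.compile(r"(%bb\.[0-9]+(?:\.[-\w$.]+)?)\([^)]*\)")


   def strip_successor_weights(line):
       """Drop the '(...)' annotations from a 'successors:' line."""
       if not line.lstrip().startswith("successors:"):
           # Leave every other line of the MIR body untouched.
           return line
       return _SUCCESSOR_WEIGHT.sub(r"\1", line)


   print(strip_successor_weights(
       "successors: %bb.1(0x40000000), %bb.2(0x40000000)"))

Only do this when the test really does not depend on the probabilities, as
noted above.
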
.. _limitations:

Limitations
-----------

Currently the MIR format has several limitations in terms of which state it
can serialize:

- The target-specific state in the target-specific ``MachineFunctionInfo``
  subclasses isn't serialized at the moment.

- The target-specific ``MachineConstantPoolValue`` subclasses (in the ARM and
  SystemZ backends) aren't serialized at the moment.

- The ``MCSymbol`` machine operands don't support temporary or local symbols.

- A lot of the state in ``MachineModuleInfo`` isn't serialized; only the CFI
  instructions and the variable debug information from MMI are serialized
  right now.

These limitations impose restrictions on what you can test with the MIR format.
For now, tests whose behaviour depends on the state of temporary or local
``MCSymbol`` operands, or on the exception handling state in MMI, can't use the
MIR format. The same applies to tests whose behaviour depends on the state of
the target-specific ``MachineFunctionInfo`` or ``MachineConstantPoolValue``
subclasses.

High Level Structure
====================

.. _embedded-module:

Embedded Module
---------------

When the first YAML document contains a `YAML block literal string`_, the MIR
parser will treat this string as an LLVM assembly language string that
represents an embedded LLVM IR module.
Here is an example of a YAML document that contains an LLVM module:

.. code-block:: llvm

   define i32 @inc(i32* %x) {
   entry:
     %0 = load i32, i32* %x
     %1 = add i32 %0, 1
     store i32 %1, i32* %x
     ret i32 %1
   }

.. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688

Machine Functions
-----------------

The remaining YAML documents contain the machine functions. This is an example
of such a YAML document:

.. code-block:: text

   ---
   name: inc
   tracksRegLiveness: true
   liveins:
     - { reg: '$rdi' }
   body: |
     bb.0.entry:
       liveins: $rdi

       $eax = MOV32rm $rdi, 1, _, 0, _
       $eax = INC32r killed $eax, implicit-def dead $eflags
       MOV32mr killed $rdi, 1, _, 0, _, $eax
       RETQ $eax
   ...

The document above consists of attributes that represent the various
properties and data structures in a machine function.

The attribute ``name`` is required, and its value should be identical to the
name of the function that this machine function is based on.

The attribute ``body`` is a `YAML block literal string`_. Its value represents
the function's machine basic blocks and their machine instructions.

Machine Instructions Format Reference
=====================================

The machine basic blocks and their instructions are represented using a custom,
human readable serialization language. This language is used in the
`YAML block literal string`_ that corresponds to the machine function's body.

A source string that uses this language contains a list of machine basic
blocks, which are described in the section below.

Machine Basic Blocks
--------------------

A machine basic block is defined in a single block definition source construct
that contains the block's ID.
The example below defines two blocks with the IDs zero and one:

.. code-block:: text

   bb.0:
     <instructions>
   bb.1:
     <instructions>

A machine basic block can also have a name. It should be specified after the ID
in the block's definition:

.. code-block:: text

   bb.0.entry:       ; This block's name is "entry"
     <instructions>

The block's name should be identical to the name of the IR block that this
machine block is based on.

.. _block-references:

Block References
^^^^^^^^^^^^^^^^

The machine basic blocks are identified by their ID numbers. Individual
blocks are referenced using the following syntax:

.. code-block:: text

   %bb.<id>

Example:

.. code-block:: llvm

   %bb.0

The following syntax is also supported, but the former syntax is preferred for
block references:

.. code-block:: text

   %bb.<id>[.<name>]

Example:

.. code-block:: llvm

   %bb.1.then

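To make the two reference forms concrete, here is a small Python sketch that
recognises both of them. It is only an illustration of the syntax shown above
and not the MIR parser's implementation; ``parse_block_reference`` is a
hypothetical helper name:

.. code-block:: python

   import re

   # %bb.<id> with an optional .<name> suffix.
   _BLOCK_REF = re.compile(r"%bb\.(?P<id>[0-9]+)(?:\.(?P<name>.+))?")


   def parse_block_reference(token):
       """Return (id, name) for a block reference; name is None if absent."""
       match = _BLOCK_REF.fullmatch(token)
       if match is None:
           raise ValueError(f"not a block reference: {token!r}")
       return int(match.group("id")), match.group("name")


   print(parse_block_reference("%bb.0"))
   print(parse_block_reference("%bb.1.then"))

Because blocks are identified by their ID numbers, a tool that compares
references would normally compare only the first element of the returned
tuple.
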
Successors
^^^^^^^^^^

The machine basic block's successors have to be specified before any of the
instructions:

.. code-block:: text

   bb.0.entry:
     successors: %bb.1.then, %bb.2.else
     <instructions>
   bb.1.then:
     <instructions>
   bb.2.else:
     <instructions>

The branch weights can be specified in brackets after the successor blocks.
The example below defines a block that has two successors with branch weights
of 32 and 16:

.. code-block:: text

   bb.0.entry:
     successors: %bb.1.then(32), %bb.2.else(16)

.. _bb-liveins:

Live In Registers
^^^^^^^^^^^^^^^^^

The machine basic block's live in registers have to be specified before any of
the instructions:

.. code-block:: text

   bb.0.entry:
     liveins: $edi, $esi

The list of live in registers and successors can be empty. The language also
allows multiple live in register and successor lists; the parser combines them
into one list.

Miscellaneous Attributes
^^^^^^^^^^^^^^^^^^^^^^^^

The attributes ``IsAddressTaken``, ``IsLandingPad`` and ``Alignment`` can be
specified in brackets after the block's definition:

.. code-block:: text

   bb.0.entry (address-taken):
     <instructions>
   bb.2.else (align 4):
     <instructions>
   bb.3(landing-pad, align 4):
     <instructions>

.. TODO: Describe the way the reference to an unnamed LLVM IR block can be
   preserved.

Machine Instructions
--------------------

A machine instruction is composed of a name,
:ref:`machine operands <machine-operands>`,
:ref:`instruction flags <instruction-flags>`, and machine memory operands.

The instruction's name is usually specified before the operands. The example
below shows an instance of the X86 ``RETQ`` instruction with a single machine
operand:

.. code-block:: text

   RETQ $eax

However, if the machine instruction has one or more explicitly defined register
operands, the instruction's name has to be specified after them. The example
below shows an instance of the AArch64 ``LDPXpost`` instruction with three
defined register operands:

.. code-block:: text

   $sp, $fp, $lr = LDPXpost $sp, 2

The instruction names are serialized using the exact definitions from the
target's ``*InstrInfo.td`` files, and they are case sensitive. This means that
similar instruction names like ``TSTri`` and ``tSTRi`` represent different
machine instructions.

.. _instruction-flags:

Instruction Flags
^^^^^^^^^^^^^^^^^

The flag ``frame-setup`` or ``frame-destroy`` can be specified before the
instruction's name:

.. code-block:: text

   $fp = frame-setup ADDXri $sp, 0, 0

.. code-block:: text

   $x21, $x20 = frame-destroy LDPXi $sp

Bundled Instructions
^^^^^^^^^^^^^^^^^^^^

The syntax for bundled instructions is the following:

.. code-block:: text

   BUNDLE implicit-def $r0, implicit-def $r1, implicit $r2 {
     $r0 = SOME_OP $r2
     $r1 = ANOTHER_OP internal $r0
   }

The first instruction is often a bundle header. The instructions between ``{``
and ``}`` are bundled with the first instruction.

.. _registers:

Registers
---------

Registers are one of the key primitives in the machine instructions
serialization language. They are primarily used in the
:ref:`register machine operands <register-operands>`,
but they can also be used in a number of other places, like the
:ref:`basic block's live in list <bb-liveins>`.

The physical registers are identified by their name and by the '$' prefix sigil.
They use the following syntax:

.. code-block:: text

   $<name>

The example below shows three X86 physical registers:

.. code-block:: text

   $eax
   $r15
   $eflags

The virtual registers are identified by their ID number and by the '%' sigil.
They use the following syntax:

.. code-block:: text

   %<id>

Example:

.. code-block:: text

   %0

The null registers are represented using an underscore ('``_``'). They can also be
represented using a '``$noreg``' named register, although the former syntax
is preferred.

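The three register spellings can be told apart by their first character. The
Python sketch below classifies a token accordingly. It is a simplified
illustration that only knows the forms described in this section, and
``classify_register`` is a name invented for this example:

.. code-block:: python

   import re


   def classify_register(token):
       """Classify a register token as 'null', 'physical' or 'virtual'."""
       if token in ("_", "$noreg"):
           # Both spellings denote the null register; '_' is preferred.
           return "null"
       if token.startswith("$") and len(token) > 1:
           # '$' sigil followed by the register name.
           return "physical"
       if re.fullmatch(r"%[0-9]+", token):
           # '%' sigil followed by the numeric ID.
           return "virtual"
       raise ValueError(f"not a register: {token!r}")


   for token in ("$eax", "%0", "_", "$noreg"):
       print(token, classify_register(token))

Note that the check for the null register has to come first, because
``$noreg`` would otherwise be classified as a physical register.
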
.. _machine-operands:

Machine Operands
----------------

There are seventeen different kinds of machine operands, and all of them can be
serialized.

Immediate Operands
^^^^^^^^^^^^^^^^^^

The immediate machine operands are untyped, 64-bit signed integers. The
example below shows an instance of the X86 ``MOV32ri`` instruction that has an
immediate machine operand ``-42``:

.. code-block:: text

   $eax = MOV32ri -42

An immediate operand is also used to represent a subregister index when the
machine instruction has one of the following opcodes:

- ``EXTRACT_SUBREG``

- ``INSERT_SUBREG``

- ``REG_SEQUENCE``

- ``SUBREG_TO_REG``

In that case, the machine operand is printed according to the target.

For example, ``AArch64RegisterInfo.td`` contains:

.. code-block:: text

   def sub_32 : SubRegIndex<32>;

If the third operand is an immediate with the value ``15`` (a target-dependent
value), then based on the instruction's opcode and the operand's index the
operand will be printed as ``%subreg.sub_32``:

.. code-block:: text

   %1:gpr64 = SUBREG_TO_REG 0, %0, %subreg.sub_32

For integers wider than 64 bits, we use a special machine operand,
``MO_CImmediate``, which stores the immediate in a ``ConstantInt`` using an
``APInt`` (LLVM's arbitrary precision integers).

.. TODO: Describe the FPIMM immediate operands.

.. _register-operands:

Register Operands
^^^^^^^^^^^^^^^^^

The :ref:`register <registers>` primitive is used to represent the register
machine operands. The register operands can also have optional
:ref:`register flags <register-flags>`,
:ref:`a subregister index <subregister-indices>`,
and a reference to the tied register operand.
The full syntax of a register operand is shown below:

.. code-block:: text

   [<flags>] <register> [ :<subregister-idx-name> ] [ (tied-def <tied-op>) ]

This example shows an instance of the X86 ``XOR32rr`` instruction that has
five register operands with different register flags:

.. code-block:: text

   dead $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, implicit-def $al

.. _register-flags:

Register Flags
~~~~~~~~~~~~~~

The table below shows all of the possible register flags along with the
corresponding internal ``llvm::RegState`` representation:

.. list-table::
   :header-rows: 1

   * - Flag
     - Internal Value

   * - ``implicit``
     - ``RegState::Implicit``

   * - ``implicit-def``
     - ``RegState::ImplicitDefine``

   * - ``def``
     - ``RegState::Define``

   * - ``dead``
     - ``RegState::Dead``

   * - ``killed``
     - ``RegState::Kill``

   * - ``undef``
     - ``RegState::Undef``

   * - ``internal``
     - ``RegState::InternalRead``

   * - ``early-clobber``
     - ``RegState::EarlyClobber``

   * - ``debug-use``
     - ``RegState::Debug``

   * - ``renamable``
     - ``RegState::Renamable``

.. _subregister-indices:

Subregister Indices
~~~~~~~~~~~~~~~~~~~

The register machine operands can reference a portion of a register by using
the subregister indices. The example below shows an instance of the ``COPY``
pseudo instruction that uses the X86 ``sub_8bit`` subregister index to copy the
lower 8 bits from the 32-bit virtual register 0 to the 8-bit virtual register 1:

.. code-block:: text

   %1 = COPY %0:sub_8bit

The names of the subregister indices are target specific, and are typically
defined in the target's ``*RegisterInfo.td`` file.

Constant Pool Indices
^^^^^^^^^^^^^^^^^^^^^

A constant pool index (CPI) operand is printed using its index in the
function's ``MachineConstantPool`` and an offset.

For example, a CPI with the index 1 and offset 8:

.. code-block:: text

   %1:gr64 = MOV64ri %const.1 + 8

For a CPI with the index 0 and offset -12:

.. code-block:: text

   %1:gr64 = MOV64ri %const.0 - 12

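The printing rule behind the two examples above can be sketched in a few lines
of Python. The function name ``format_constant_pool_index`` is hypothetical,
and omitting a zero offset is an assumption of this sketch rather than
something stated in this document:

.. code-block:: python

   def format_constant_pool_index(index, offset=0):
       """Render a CPI operand as '%const.<index>' plus a signed offset."""
       text = f"%const.{index}"
       if offset > 0:
           text += f" + {offset}"
       elif offset < 0:
           # The sign is printed as a separate '-' token.
           text += f" - {-offset}"
       # Assumption: a zero offset is not printed at all.
       return text


   print(format_constant_pool_index(1, 8))
   print(format_constant_pool_index(0, -12))

The two calls reproduce the operands ``%const.1 + 8`` and ``%const.0 - 12``
shown above.
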
A constant pool entry is bound to an LLVM IR ``Constant`` or a target-specific
``MachineConstantPoolValue``. When serializing all the function's constants the
following format is used:

.. code-block:: text

   constants:
     - id:               <index>
       value:            <value>
       alignment:        <alignment>
       isTargetSpecific: <target-specific>

where ``<index>`` is a 32-bit unsigned integer, ``<value>`` is an `LLVM IR Constant
<https://www.llvm.org/docs/LangRef.html#constants>`_, ``<alignment>`` is a 32-bit
unsigned integer, and ``<target-specific>`` is either true or false.

Example:

.. code-block:: text

   constants:
     - id:               0
       value:            'double 3.250000e+00'
       alignment:        8
     - id:               1
       value:            'g-(LPC0+8)'
       alignment:        4
       isTargetSpecific: true

Global Value Operands
^^^^^^^^^^^^^^^^^^^^^

The global value machine operands reference the global values from the
:ref:`embedded LLVM IR module <embedded-module>`.
The example below shows an instance of the X86 ``MOV64rm`` instruction that has
a global value operand named ``G``:

.. code-block:: text

   $rax = MOV64rm $rip, 1, _, @G, _

The named global values are represented using an identifier with the '@' prefix.
If the identifier doesn't match the regular expression
``[-a-zA-Z$._][-a-zA-Z$._0-9]*``, then this identifier must be quoted.

The unnamed global values are represented using an unsigned numeric value with
the '@' prefix, like in the following examples: ``@0``, ``@989``.

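The quoting rule for named global values can be checked mechanically. The
Python sketch below applies the regular expression quoted above to the whole
identifier. ``needs_quoting`` is a hypothetical helper written for this
example, not an LLVM API:

.. code-block:: python

   import re

   # The regular expression from the text; it must match the whole identifier.
   _UNQUOTED_IDENTIFIER = re.compile(r"[-a-zA-Z$._][-a-zA-Z$._0-9]*")


   def needs_quoting(identifier):
       """Return True if a named global value identifier must be quoted."""
       return _UNQUOTED_IDENTIFIER.fullmatch(identifier) is None


   print(needs_quoting("G"))
   print(needs_quoting("my global"))

An identifier such as ``G`` or ``__stack_chk_guard`` passes the check, while
one that contains a space or starts with a digit does not.
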
Target-dependent Index Operands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A target index operand is a target-specific index and an offset. The
target-specific index is printed using target-specific names and a positive or
negative offset.

For example, the ``amdgpu-constdata-start`` is associated with the index ``0``
in the AMDGPU backend. A target index operand with the index 0 and the offset 8
is therefore printed as:

.. code-block:: text

   $sgpr2 = S_ADD_U32 _, target-index(amdgpu-constdata-start) + 8, implicit-def _, implicit-def _

Jump-table Index Operands
^^^^^^^^^^^^^^^^^^^^^^^^^

A jump-table index operand with the index 0 is printed as follows:

.. code-block:: text

   tBR_JTr killed $r0, %jump-table.0

A machine jump-table entry contains a list of ``MachineBasicBlocks``. When
serializing all the function's jump-table entries, the following format is
used:

.. code-block:: text

   jumpTable:
     kind:             <kind>
     entries:
       - id:             <index>
         blocks:         [ <bbreference>, <bbreference>, ... ]

where ``<kind>`` describes how the jump table is represented and emitted
(plain address, relocations, PIC, etc.), each ``<index>`` is a 32-bit unsigned
integer, and ``blocks`` contains a list of
:ref:`machine basic block references <block-references>`.

Example:

.. code-block:: text

   jumpTable:
     kind:             inline
     entries:
       - id:             0
         blocks:         [ '%bb.3', '%bb.9', '%bb.4.d3' ]
       - id:             1
         blocks:         [ '%bb.7', '%bb.7', '%bb.4.d3', '%bb.5' ]

External Symbol Operands
^^^^^^^^^^^^^^^^^^^^^^^^

An external symbol operand is represented using an identifier with the ``&``
prefix. The identifier is surrounded by double quotes and escaped if it
contains any special non-printable characters.

Example:

.. code-block:: text

   CALL64pcrel32 &__stack_chk_fail, csr_64, implicit $rsp, implicit-def $rsp

MCSymbol Operands
^^^^^^^^^^^^^^^^^

An MCSymbol operand holds a pointer to an ``MCSymbol``. For the limitations
of this operand in MIR, see :ref:`limitations <limitations>`.

The syntax is:

.. code-block:: text

   EH_LABEL <mcsymbol Ltmp1>

CFIIndex Operands
^^^^^^^^^^^^^^^^^

A CFI Index operand holds an index into a per-function side-table,
``MachineFunction::getFrameInstructions()``, which references all the frame
instructions in a ``MachineFunction``. A ``CFI_INSTRUCTION`` may look like it
contains multiple operands, but the only operand it contains is the CFI Index.
The other operands are tracked by the ``MCCFIInstruction`` object.

The syntax is:

.. code-block:: text

   CFI_INSTRUCTION offset $w30, -16

which may be emitted later in the MC layer as:

.. code-block:: text

   .cfi_offset w30, -16

IntrinsicID Operands
^^^^^^^^^^^^^^^^^^^^

An Intrinsic ID operand contains a generic intrinsic ID or a target-specific ID.

The syntax for the ``returnaddress`` intrinsic is:

.. code-block:: text

   $x0 = COPY intrinsic(@llvm.returnaddress)

Predicate Operands
^^^^^^^^^^^^^^^^^^

A Predicate operand contains an IR predicate from ``CmpInst::Predicate``, like
``ICMP_EQ``, etc.

For an int eq predicate ``ICMP_EQ``, the syntax is:

.. code-block:: text

   %2:gpr(s32) = G_ICMP intpred(eq), %0, %1

.. TODO: Describe the parser's default behaviour when optional YAML attributes
   are missing.
.. TODO: Describe the syntax for virtual register YAML definitions.
.. TODO: Describe the machine function's YAML flag attributes.
.. TODO: Describe the syntax for the register mask machine operands.
.. TODO: Describe the frame information YAML mapping.
.. TODO: Describe the syntax of the stack object machine operands and their
   YAML definitions.
.. TODO: Describe the syntax of the block address machine operands.
.. TODO: Describe the syntax of the metadata machine operands, and the
   instruction's debug location attribute.
.. TODO: Describe the syntax of the register live out machine operands.
.. TODO: Describe the syntax of the machine memory operands.