Blame - docs/GlobalISel.rst - platform_external_llvm

blob: 8746685491c7c74395067fc0c97e30428b780a9e [file] [log] [blame]

Ahmed Bougacha	bbf0c3a	2016-11-04 17:57:34 +0000	[diff] [blame]	1	============================
				2	Global Instruction Selection
				3	============================
				4
				5	.. contents::
				6	:local:
				7	:depth: 1
				8
				9	.. warning::
				10	This document is a work in progress. It reflects the current state of the
				11	implementation, as well as open design and implementation issues.
				12
				13	Introduction
				14	============
				15
				16	GlobalISel is a framework that provides a set of reusable passes and utilities
				17	for instruction selection --- translation from LLVM IR to target-specific
				18	Machine IR (MIR).
				19
				20	GlobalISel is intended to be a replacement for SelectionDAG and FastISel, to
				21	solve three major problems:
				22
				23	* Performance --- SelectionDAG introduces a dedicated intermediate
				24	representation, which has a compile-time cost.
				25
				26	GlobalISel directly operates on the post-isel representation used by the
				27	rest of the code generator, MIR.
				28	It does require extensions to that representation to support arbitrary
				29	incoming IR: :ref:`gmir`.
				30
				31	* Granularity --- SelectionDAG and FastISel operate on individual basic
				32	blocks, losing some global optimization opportunities.
				33
				34	GlobalISel operates on the whole function.
				35
				36	* Modularity --- SelectionDAG and FastISel are radically different and share
				37	very little code.
				38
				39	GlobalISel is built in a way that enables code reuse. For instance, both the
				40	optimized and fast selectors share the :ref:`pipeline`, and targets can
				41	configure that pipeline to better suit their needs.
				42
				43
				44	.. _gmir:
				45
				46	Generic Machine IR
				47	==================
				48
				49	Machine IR operates on physical registers, register classes, and (mostly)
				50	target-specific instructions.
				51
				52	To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to
				53	Machine IR:
				54
				55	.. contents::
				56	:local:
				57
				58	``NOTE``:
				59	The generic MIR (GMIR) representation still contains references to IR
				60	constructs (such as ``GlobalValue``). Removing those should let us write more
				61	accurate tests, or delete IR after building the initial MIR. However, it is
				62	not part of the GlobalISel effort.
				63
				64	.. _gmir-instructions:
				65
				66	Generic Instructions
				67	--------------------
				68
				69	The main addition is support for pre-isel generic machine instructions (e.g.,
				70	``G_ADD``). Like other target-independent instructions (e.g., ``COPY`` or
				71	``PHI``), these are available on all targets.
				72
				73	``TODO``:
				74	While we're progressively adding instructions, one kind in particular exposes
				75	interesting problems: compares and how to represent condition codes.
				76	Some targets (x86, ARM) have generic comparisons setting multiple flags,
				77	which are then used by predicated variants.
				78	Others (IR) specify the predicate in the comparison and users just get a single
				79	bit. SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to
				80	represent this.
				81
				82	The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides
				83	a convenient way to create these generic instructions.
				84
				85	.. _gmir-gvregs:
				86
				87	Generic Virtual Registers
				88	-------------------------
				89
				90	Generic instructions operate on a new kind of register: "generic" virtual
				91	registers. As opposed to non-generic vregs, they are not assigned a Register
				92	Class. Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned
				93	a :ref:`gmir-regbank`.
				94
				95	``MachineRegisterInfo`` tracks the same information that it does for
				96	non-generic vregs (e.g., use-def chains). Additionally, it also tracks the
				97	:ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``,
				98	its :ref:`gmir-regbank`, if any.
				99
				100	For simplicity, most generic instructions only accept generic vregs:
				101
				102	* instead of immediates, they use a gvreg defined by an instruction
				103	materializing the immediate value (see :ref:`irtranslator-constants`).
				104	* instead of physical register, they use a gvreg defined by a ``COPY``.
				105
				106	``NOTE``:
				107	We started with an alternative representation, where MRI tracks a size for
				108	each gvreg, and instructions have lists of types.
				109	That had two flaws: the type and size are redundant, and there was no generic
				110	way of getting a given operand's type (as there was no 1:1 mapping between
				111	instruction types and operands).
				112	We considered putting the type in some variant of MCInstrDesc instead:
				113	See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
				114	need a type but this increases the memory footprint of the related objects
				115
				116	.. _gmir-regbank:
				117
				118	Register Bank
				119	-------------
				120
				121	A Register Bank is a set of register classes defined by the target.
				122	A bank has a size, which is the maximum store size of all covered classes.
				123
				124	In general, cross-class copies inside a bank are expected to be cheaper than
				125	copies across banks. They are also coalesceable by the register coalescer,
				126	whereas cross-bank copies are not.
				127
				128	Also, equivalent operations can be performed on different banks using different
				129	instructions.
				130
				131	For example, X86 can be seen as having 3 main banks: general-purpose, x87, and
				132	vector (which could be further split into a bank per domain for single vs
				133	double precision instructions).
				134
				135	Register banks are described by a target-provided API,
				136	:ref:`RegisterBankInfo <api-registerbankinfo>`.
				137
				138	.. _gmir-llt:
				139
				140	Low Level Type
				141	--------------
				142
				143	Additionally, every generic virtual register has a type, represented by an
				144	instance of the ``LLT`` class.
				145
				146	Like ``EVT``/``MVT``/``Type``, it has no distinction between unsigned and signed
				147	integer types. Furthermore, it also has no distinction between integer and
				148	floating-point types: it mainly conveys absolutely necessary information, such
				149	as size and number of vector lanes:
				150
				151	* ``sN`` for scalars
				152	* ``pN`` for pointers
				153	* ``<N x sM>`` for vectors
				154	* ``unsized`` for labels, etc..
				155
				156	``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG.
				157
				158	Here are some LLT examples and their ``EVT`` and ``Type`` equivalents:
				159
				160	============= ========= ======================================
				161	LLT EVT IR Type
				162	============= ========= ======================================
				163	``s1`` ``i1`` ``i1``
				164	``s8`` ``i8`` ``i8``
				165	``s32`` ``i32`` ``i32``
				166	``s32`` ``f32`` ``float``
				167	``s17`` ``i17`` ``i17``
				168	``s16`` N/A ``{i8, i8}``
				169	``s32`` N/A ``[4 x i8]``
				170	``p0`` ``iPTR`` ``i8``, ``i32``, ``%opaque*``
				171	``p2`` ``iPTR`` ``i8 addrspace(2)*``
				172	``<4 x s32>`` ``v4f32`` ``<4 x float>``
				173	``s64`` ``v1f64`` ``<1 x double>``
				174	``<3 x s32>`` ``v3i32`` ``<3 x i32>``
				175	``unsized`` ``Other`` ``label``
				176	============= ========= ======================================
				177
				178
				179	Rationale: instructions already encode a specific interpretation of types
				180	(e.g., ``add`` vs. ``fadd``, or ``sdiv`` vs. ``udiv``). Also encoding that
				181	information in the type system requires introducing bitcast with no real
				182	advantage for the selector.
				183
				184	Pointer types are distinguished by address space. This matches IR, as opposed
				185	to SelectionDAG where address space is an attribute on operations.
				186	This representation better supports pointers having different sizes depending
				187	on their addressspace.
				188
				189	``NOTE``:
				190	Currently, LLT requires at least 2 elements in vectors, but some targets have
				191	the concept of a '1-element vector'. Representing them as their underlying
				192	scalar type is a nice simplification.
				193
				194	``TODO``:
				195	Currently, non-generic virtual registers, defined by non-pre-isel-generic
				196	instructions, cannot have a type, and thus cannot be used by a pre-isel generic
				197	instruction. Instead, they are given a type using a COPY. We could relax that
				198	and allow types on all vregs: this would reduce the number of MI required when
				199	emitting target-specific MIR early in the pipeline. This should purely be
				200	a compile-time optimization.
				201
				202	.. _pipeline:
				203
				204	Core Pipeline
				205	=============
				206
				207	There are four required passes, regardless of the optimization mode:
				208
				209	.. contents::
				210	:local:
				211
				212	Additional passes can then be inserted at higher optimization levels or for
				213	specific targets. For example, to match the current SelectionDAG set of
				214	transformations: MachineCSE and a better MachineCombiner between every pass.
				215
				216	``NOTE``:
				217	In theory, not all passes are always necessary.
				218	As an additional compile-time optimization, we could skip some of the passes by
				219	setting the relevant MachineFunction properties. For instance, if the
				220	IRTranslator did not encounter any illegal instruction, it would set the
				221	``legalized`` property to avoid running the :ref:`milegalizer`.
				222	Similarly, we considered specializing the IRTranslator per-target to directly
				223	emit target-specific MI.
				224	However, we instead decided to keep the core pipeline simple, and focus on
				225	minimizing the overhead of the passes in the no-op cases.
				226
				227
				228	.. _irtranslator:
				229
				230	IRTranslator
				231	------------
				232
				233	This pass translates the input LLVM IR ``Function`` to a GMIR
				234	``MachineFunction``.
				235
				236	``TODO``:
				237	This currently doesn't support the more complex instructions, in particular
				238	those involving control flow (``switch``, ``invoke``, ...).
				239	For ``switch`` in particular, we can initially use the ``LowerSwitch`` pass.
				240
				241	.. _api-calllowering:
				242
				243	API: CallLowering
				244	^^^^^^^^^^^^^^^^^
				245
				246	The ``IRTranslator`` (using the ``CallLowering`` target-provided utility) also
				247	implements the ABI's calling convention by lowering calls, returns, and
				248	arguments to the appropriate physical register usage and instruction sequences.
				249
				250	.. _irtranslator-aggregates:
				251
				252	Aggregates
				253	^^^^^^^^^^
				254
				255	Aggregates are lowered to a single scalar vreg.
				256	This differs from SelectionDAG's multiple vregs via ``GetValueVTs``.
				257
				258	``TODO``:
				259	As some of the bits are undef (padding), we should consider augmenting the
				260	representation with additional metadata (in effect, caching computeKnownBits
				261	information on vregs).
				262	See `PR26161 <http://llvm.org/PR26161>`_: [GlobalISel] Value to vreg during
				263	IR to MachineInstr translation for aggregate type
				264
				265	.. _irtranslator-constants:
				266
				267	Constant Lowering
				268	^^^^^^^^^^^^^^^^^
				269
				270	The ``IRTranslator`` lowers ``Constant`` operands into uses of gvregs defined
				271	by ``G_CONSTANT`` or ``G_FCONSTANT`` instructions.
				272	Currently, these instructions are always emitted in the entry basic block.
				273	In a ``MachineFunction``, each ``Constant`` is materialized by a single gvreg.
				274
				275	This is beneficial as it allows us to fold constants into immediate operands
				276	during :ref:`instructionselect`, while still avoiding redundant materializations
				277	for expensive non-foldable constants.
				278	However, this can lead to unnecessary spills and reloads in an -O0 pipeline, as
				279	these vregs can have long live ranges.
				280
				281	``TODO``:
				282	We're investigating better placement of these instructions, in fast and
				283	optimized modes.
				284
				285
				286	.. _milegalizer:
				287
				288	Legalizer
				289	---------
				290
				291	This pass transforms the generic machine instructions such that they are legal.
				292
				293	A legal instruction is defined as:
				294
				295	* selectable --- the target will later be able to select it to a
				296	target-specific (non-generic) instruction.
				297
				298	* operating on vregs that can be loaded and stored -- if necessary, the
				299	target can select a ``G_LOAD``/``G_STORE`` of each gvreg operand.
				300
				301	As opposed to SelectionDAG, there are no legalization phases. In particular,
				302	'type' and 'operation' legalization are not separate.
				303
				304	Legalization is iterative, and all state is contained in GMIR. To maintain the
				305	validity of the intermediate code, instructions are introduced:
				306
Quentin Colombet	ae44772	2017-09-25 22:03:06 +0000	[diff] [blame]	307	* ``G_MERGE_VALUES`` --- concatenate multiple registers of the same
				308	size into a single wider register.
				309
				310	* ``G_UNMERGE_VALUES`` --- extract multiple registers of the same size
				311	from a single wider register.
Ahmed Bougacha	bbf0c3a	2016-11-04 17:57:34 +0000	[diff] [blame]	312
Quentin Colombet	33e9c8a	2017-09-25 22:03:01 +0000	[diff] [blame]	313	* ``G_EXTRACT`` --- extract a simple register (as contiguous sequences of bits)
Ahmed Bougacha	bbf0c3a	2016-11-04 17:57:34 +0000	[diff] [blame]	314	from a single wider register.
				315
				316	As they are expected to be temporary byproducts of the legalization process,
				317	they are combined at the end of the :ref:`milegalizer` pass.
				318	If any remain, they are expected to always be selectable, using loads and stores
				319	if necessary.
				320
				321	.. _api-legalizerinfo:
				322
				323	API: LegalizerInfo
				324	^^^^^^^^^^^^^^^^^^
				325
				326	Currently the API is broadly similar to SelectionDAG/TargetLowering, but
				327	extended in two ways:
				328
				329	* The set of available actions is wider, avoiding the currently very
				330	overloaded ``Expand`` (which can cover everything from libcalls to
				331	scalarization depending on the node's opcode).
				332
				333	* Since there's no separate type legalization, independently varying
				334	types on an instruction can have independent actions. For example a
				335	``G_ICMP`` has 2 independent types: the result and the inputs; we need
				336	to be able to say that comparing 2 s32s is OK, but the s1 result
				337	must be dealt with in another way.
				338
				339	As such, the primary key when deciding what to do is the ``InstrAspect``,
				340	essentially a tuple consisting of ``(Opcode, TypeIdx, Type)`` and mapping to a
				341	suggested course of action.
				342
				343	An example use might be:
				344
				345	.. code-block:: c++
				346
				347	// The CPU can't deal with an s1 result, do something about it.
				348	setAction({G_ICMP, 0, s1}, WidenScalar);
				349	// An s32 input (the second type) is fine though.
				350	setAction({G_ICMP, 1, s32}, Legal);
				351
				352
				353	``TODO``:
				354	An alternative worth investigating is to generalize the API to represent
				355	actions using ``std::function`` that implements the action, instead of explicit
				356	enum tokens (``Legal``, ``WidenScalar``, ...).
				357
				358	``TODO``:
				359	Moreover, we could use TableGen to initially infer legality of operation from
				360	existing patterns (as any pattern we can select is by definition legal).
				361	Expanding that to describe legalization actions is a much larger but
				362	potentially useful project.
				363
Ahmed Bougacha	bbf0c3a	2016-11-04 17:57:34 +0000	[diff] [blame]	364	.. _milegalizer-non-power-of-2:
				365
				366	Non-power of 2 types
				367	^^^^^^^^^^^^^^^^^^^^
				368
				369	``TODO``:
				370	Types which have a size that isn't a power of 2 aren't currently supported.
				371	The setAction API will probably require changes to support them.
				372	Even notionally explicitly specified operations only make suggestions
				373	like "Widen" or "Narrow". The eventual type is still unspecified and a
				374	search is performed by repeated doubling/halving of the type's
				375	size.
				376	This is incorrect for types that aren't a power of 2. It's reasonable to
				377	expect we could construct an efficient set of side-tables for more general
				378	lookups though, encoding a map from the integers (i.e. the size of the current
				379	type) to types (the legal size).
				380
				381	.. _milegalizer-vector:
				382
				383	Vector types
				384	^^^^^^^^^^^^
				385
				386	Vectors first get their element type legalized: ``<A x sB>`` becomes
				387	``<A x sC>`` such that at least one operation is legal with ``sC``.
				388
				389	This is currently specified by the function ``setScalarInVectorAction``, called
				390	for example as:
				391
				392	setScalarInVectorAction(G_ICMP, s1, WidenScalar);
				393
				394	Next the number of elements is chosen so that the entire operation is
				395	legal. This aspect is not controllable at the moment, but probably
				396	should be (you could imagine disagreements on whether a ``<2 x s8>``
				397	operation should be scalarized or extended to ``<8 x s8>``).
				398
				399
				400	.. _regbankselect:
				401
				402	RegBankSelect
				403	-------------
				404
				405	This pass constrains the :ref:`gmir-gvregs` operands of generic
				406	instructions to some :ref:`gmir-regbank`.
				407
				408	It iteratively maps instructions to a set of per-operand bank assignment.
				409	The possible mappings are determined by the target-provided
				410	:ref:`RegisterBankInfo <api-registerbankinfo>`.
				411	The mapping is then applied, possibly introducing ``COPY`` instructions if
				412	necessary.
				413
				414	It traverses the ``MachineFunction`` top down so that all operands are already
				415	mapped when analyzing an instruction.
				416
				417	This pass could also remap target-specific instructions when beneficial.
				418	In the future, this could replace the ExeDepsFix pass, as we can directly
				419	select the best variant for an instruction that's available on multiple banks.
				420
				421	.. _api-registerbankinfo:
				422
				423	API: RegisterBankInfo
				424	^^^^^^^^^^^^^^^^^^^^^
				425
				426	The ``RegisterBankInfo`` class describes multiple aspects of register banks.
				427
				428	* Banks: ``addRegBankCoverage`` --- which register bank covers each
				429	register class.
				430
				431	* Cross-Bank Copies: ``copyCost`` --- the cost of a ``COPY`` from one bank
				432	to another.
				433
				434	* Default Mapping: ``getInstrMapping`` --- the default bank assignments for
				435	a given instruction.
				436
				437	* Alternative Mapping: ``getInstrAlternativeMapping`` --- the other
				438	possible bank assignments for a given instruction.
				439
				440	``TODO``:
				441	All this information should eventually be static and generated by TableGen,
				442	mostly using existing information augmented by bank descriptions.
				443
				444	``TODO``:
				445	``getInstrMapping`` is currently separate from ``getInstrAlternativeMapping``
				446	because the latter is more expensive: as we move to static mapping info,
				447	both methods should be free, and we should merge them.
				448
				449	.. _regbankselect-modes:
				450
				451	RegBankSelect Modes
				452	^^^^^^^^^^^^^^^^^^^
				453
				454	``RegBankSelect`` currently has two modes:
				455
				456	* Fast --- For each instruction, pick a target-provided "default" bank
				457	assignment. This is the default at -O0.
				458
				459	* Greedy --- For each instruction, pick the cheapest of several
				460	target-provided bank assignment alternatives.
				461
				462	We intend to eventually introduce an additional optimizing mode:
				463
				464	* Global --- Across multiple instructions, pick the cheapest combination of
				465	bank assignments.
				466
				467	``NOTE``:
				468	On AArch64, we are considering using the Greedy mode even at -O0 (or perhaps at
				469	backend -O1): because :ref:`gmir-llt` doesn't distinguish floating point from
				470	integer scalars, the default assignment for loads and stores is the integer
				471	bank, introducing cross-bank copies on most floating point operations.
				472
				473
				474	.. _instructionselect:
				475
				476	InstructionSelect
				477	-----------------
				478
				479	This pass transforms generic machine instructions into equivalent
				480	target-specific instructions. It traverses the ``MachineFunction`` bottom-up,
				481	selecting uses before definitions, enabling trivial dead code elimination.
				482
				483	.. _api-instructionselector:
				484
				485	API: InstructionSelector
				486	^^^^^^^^^^^^^^^^^^^^^^^^
				487
				488	The target implements the ``InstructionSelector`` class, containing the
				489	target-specific selection logic proper.
				490
				491	The instance is provided by the subtarget, so that it can specialize the
				492	selector by subtarget feature (with, e.g., a vector selector overriding parts
				493	of a general-purpose common selector).
				494	We might also want to parameterize it by MachineFunction, to enable selector
				495	variants based on function attributes like optsize.
				496
				497	The simple API consists of:
				498
				499	.. code-block:: c++
				500
				501	virtual bool select(MachineInstr &MI)
				502
				503	This target-provided method is responsible for mutating (or replacing) a
				504	possibly-generic MI into a fully target-specific equivalent.
				505	It is also responsible for doing the necessary constraining of gvregs into the
Daniel Sanders	86274cd	2017-10-23 17:18:44 +0000	[diff] [blame]	506	appropriate register classes as well as passing through COPY instructions to
				507	the register allocator.
Ahmed Bougacha	bbf0c3a	2016-11-04 17:57:34 +0000	[diff] [blame]	508
				509	The ``InstructionSelector`` can fold other instructions into the selected MI,
				510	by walking the use-def chain of the vreg operands.
				511	As GlobalISel is Global, this folding can occur across basic blocks.
				512
Daniel Sanders	86274cd	2017-10-23 17:18:44 +0000	[diff] [blame]	513	SelectionDAG Rule Imports
				514	^^^^^^^^^^^^^^^^^^^^^^^^^
				515
				516	TableGen will import SelectionDAG rules and provide the following function to
				517	execute them:
				518
				519	.. code-block:: c++
				520
				521	bool selectImpl(MachineInstr &MI)
				522
				523	The ``--stats`` option can be used to determine what proportion of rules were
				524	successfully imported. The easiest way to use this is to copy the
				525	``-gen-globalisel`` tablegen command from ``ninja -v`` and modify it.
				526
				527	Similarly, the ``--warn-on-skipped-patterns`` option can be used to obtain the
				528	reasons that rules weren't imported. This can be used to focus on the most
				529	important rejection reasons.
				530
				531	PatLeaf Predicates
				532	^^^^^^^^^^^^^^^^^^
				533
				534	PatLeafs cannot be imported because their C++ is implemented in terms of
				535	``SDNode`` objects. PatLeafs that handle immediate predicates should be
				536	replaced by ``ImmLeaf``, ``IntImmLeaf``, or ``FPImmLeaf`` as appropriate.
				537
				538	There's no standard answer for other PatLeafs. Some standard predicates have
				539	been baked into TableGen but this should not generally be done.
				540
				541	Custom SDNodes
				542	^^^^^^^^^^^^^^
				543
				544	Custom SDNodes should be mapped to Target Pseudos using ``GINodeEquiv``. This
				545	will cause the instruction selector to import them but you will also need to
				546	ensure the target pseudo is introduced to the MIR before the instruction
				547	selector. Any preceeding pass is suitable but the legalizer will be a
				548	particularly common choice.
				549
				550	ComplexPatterns
				551	^^^^^^^^^^^^^^^
				552
				553	ComplexPatterns cannot be imported because their C++ is implemented in terms of
				554	``SDNode`` objects. GlobalISel versions should be defined with
				555	``GIComplexOperandMatcher`` and mapped to ComplexPattern with
				556	``GIComplexPatternEquiv``.
				557
				558	The following predicates are useful for porting ComplexPattern:
				559
				560	* isBaseWithConstantOffset() - Check for base+offset structures
				561	* isOperandImmEqual() - Check for a particular constant
				562	* isObviouslySafeToFold() - Check for reasons an instruction can't be sunk and folded into another.
				563
				564	There are some important points for the C++ implementation:
				565
				566	* Don't modify MIR in the predicate
				567	* Renderer lambdas should capture by value to avoid use-after-free. They will be used after the predicate returns.
				568	* Only create instructions in a renderer lambda. GlobalISel won't clean up things you create but don't use.
Ahmed Bougacha	bbf0c3a	2016-11-04 17:57:34 +0000	[diff] [blame]	569
				570
				571	.. _maintainability:
				572
				573	Maintainability
				574	===============
				575
				576	.. _maintainability-iterative:
				577
				578	Iterative Transformations
				579	-------------------------
				580
				581	Passes are split into small, iterative transformations, with all state
				582	represented in the MIR.
				583
				584	This differs from SelectionDAG (in particular, the legalizer) using various
				585	in-memory side-tables.
				586
				587
				588	.. _maintainability-mir:
				589
				590	MIR Serialization
				591	-----------------
				592
				593	.. FIXME: Update the MIRLangRef to include GMI additions.
				594
				595	:ref:`gmir` is serializable (see :doc:`MIRLangRef`).
				596	Combined with :ref:`maintainability-iterative`, this enables much finer-grained
				597	testing, rather than requiring large and fragile IR-to-assembly tests.
				598
				599	The current "stage" in the :ref:`pipeline` is represented by a set of
				600	``MachineFunctionProperties``:
				601
				602	* ``legalized``
				603	* ``regBankSelected``
				604	* ``selected``
				605
				606
				607	.. _maintainability-verifier:
				608
				609	MachineVerifier
				610	---------------
				611
				612	The pass approach lets us use the ``MachineVerifier`` to enforce invariants.
				613	For instance, a ``regBankSelected`` function may not have gvregs without
				614	a bank.
				615
				616	``TODO``:
				617	The ``MachineVerifier`` being monolithic, some of the checks we want to do
				618	can't be integrated to it: GlobalISel is a separate library, so we can't
				619	directly reference it from CodeGen. For instance, legality checks are
				620	currently done in RegBankSelect/InstructionSelect proper. We could #ifdef out
				621	the checks, or we could add some sort of verifier API.
				622
				623
				624	.. _progress:
				625
				626	Progress and Future Work
				627	========================
				628
				629	The initial goal is to replace FastISel on AArch64. The next step will be to
				630	replace SelectionDAG as the optimized ISel.
				631
				632	``NOTE``:
				633	While we iterate on GlobalISel, we strive to avoid affecting the performance of
				634	SelectionDAG, FastISel, or the other MIR passes. For instance, the types of
				635	:ref:`gmir-gvregs` are stored in a separate table in ``MachineRegisterInfo``,
				636	that is destroyed after :ref:`instructionselect`.
				637
				638	.. _progress-fastisel:
				639
				640	FastISel Replacement
				641	--------------------
				642
				643	For the initial FastISel replacement, we intend to fallback to SelectionDAG on
				644	selection failures.
				645
				646	Currently, compile-time of the fast pipeline is within 1.5x of FastISel.
				647	We're optimistic we can get to within 1.1/1.2x, but beating FastISel will be
				648	challenging given the multi-pass approach.
				649	Still, supporting all IR (via a complete legalizer) and avoiding the fallback
				650	to SelectionDAG in the worst case should enable better amortized performance
				651	than SelectionDAG+FastISel.
				652
				653	``NOTE``:
				654	We considered never having a fallback to SelectionDAG, instead deciding early
				655	whether a given function is supported by GlobalISel or not. The decision would
				656	be based on :ref:`milegalizer` queries.
				657	We abandoned that for two reasons:
				658	a) on IR inputs, we'd need to basically simulate the :ref:`irtranslator`;
				659	b) to be robust against unforeseen failures and to enable iterative
				660	improvements.
				661
				662	.. _progress-targets:
				663
				664	Support For Other Targets
				665	-------------------------
				666
				667	In parallel, we're investigating adding support for other - ideally quite
				668	different - targets. For instance, there is some initial AMDGPU support.
				669
				670
				671	.. _porting:
				672
				673	Porting GlobalISel to A New Target
				674	==================================
				675
				676	There are four major classes to implement by the target:
				677
				678	* :ref:`CallLowering <api-calllowering>` --- lower calls, returns, and arguments
				679	according to the ABI.
				680	* :ref:`RegisterBankInfo <api-registerbankinfo>` --- describe
				681	:ref:`gmir-regbank` coverage, cross-bank copy cost, and the mapping of
				682	operands onto banks for each instruction.
				683	* :ref:`LegalizerInfo <api-legalizerinfo>` --- describe what is legal, and how
				684	to legalize what isn't.
				685	* :ref:`InstructionSelector <api-instructionselector>` --- select generic MIR
				686	to target-specific MIR.
				687
				688	Additionally:
				689
				690	* ``TargetPassConfig`` --- create the passes constituting the pipeline,
				691	including additional passes not included in the :ref:`pipeline`.
Daniel Sanders	86274cd	2017-10-23 17:18:44 +0000	[diff] [blame]	692
				693	.. _other_resources:
				694
				695	Resources
				696	=========
				697
				698	* `Global Instruction Selection - A Proposal by Quentin Colombet @LLVMDevMeeting 2015 <https://www.youtube.com/watch?v=F6GGbYtae3g>`_
				699	* `Global Instruction Selection - Status by Quentin Colombet, Ahmed Bougacha, and Tim Northover @LLVMDevMeeting 2016 <https://www.youtube.com/watch?v=6tfb344A7w8>`_
				700	* `GlobalISel - LLVM's Latest Instruction Selection Framework by Diana Picus @FOSDEM17 <https://www.youtube.com/watch?v=d6dF6E4BPeU>`_
				701	* GlobalISel: Past, Present, and Future by Quentin Colombet and Ahmed Bougacha @LLVMDevMeeting 2017
				702	* Head First into GlobalISel by Daniel Sanders, Aditya Nandakumar, and Justin Bogner @LLVMDevMeeting 2017