==========================================
Design and Usage of the InAlloca Attribute
==========================================

Introduction
============

The :ref:`inalloca <attr_inalloca>` attribute is designed to allow
taking the address of an aggregate argument that is being passed by
value through memory. Primarily, this feature is required for
compatibility with the Microsoft C++ ABI. Under that ABI, class
instances that are passed by value are constructed directly into
argument stack memory. Prior to the addition of inalloca, calls in LLVM
were indivisible instructions. There was no way to perform intermediate
work, such as object construction, between the first stack adjustment
and the final control transfer. With inalloca, all arguments passed in
memory are modelled as a single alloca, which can be stored to prior to
the call. Unfortunately, this complicated feature comes with a large
set of restrictions designed to bound the lifetime of the argument
memory around the call.

For now, it is recommended that frontends and optimizers avoid producing
this construct, primarily because it forces the use of a base pointer.
This feature may grow in the future to allow general mid-level
optimization, but for now, it should be regarded as less efficient than
passing by value with a copy.

Intended Usage
==============

The example below is the intended LLVM IR lowering for some C++ code
that passes two default-constructed ``Foo`` objects to ``g`` in the
32-bit Microsoft C++ ABI.

.. code-block:: c++

  // Foo is non-trivial.
  struct Foo { int a, b; Foo(); ~Foo(); Foo(const Foo &); };
  void g(Foo a, Foo b);
  void f() {
    g(Foo(), Foo());
  }

.. code-block:: text

  %struct.Foo = type { i32, i32 }
  declare void @Foo_ctor(%struct.Foo* %this)
  declare void @Foo_dtor(%struct.Foo* %this)
  declare void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)
  declare i8* @llvm.stacksave()
  declare void @llvm.stackrestore(i8*)

  define void @f() {
  entry:
    %base = call i8* @llvm.stacksave()
    %memargs = alloca inalloca <{ %struct.Foo, %struct.Foo }>
    ; Construct b first, in the second field of the argument block.
    %b = getelementptr <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 1
    call void @Foo_ctor(%struct.Foo* %b)

    ; If a's ctor throws, we must destruct b.
    %a = getelementptr <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 0
    invoke void @Foo_ctor(%struct.Foo* %a)
        to label %invoke.cont unwind label %invoke.unwind

  invoke.cont:
    call void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)
    call void @llvm.stackrestore(i8* %base)
    ...

  invoke.unwind:
    call void @Foo_dtor(%struct.Foo* %b)
    call void @llvm.stackrestore(i8* %base)
    ...
  }

To avoid stack leaks, the frontend saves the current stack pointer with
a call to :ref:`llvm.stacksave <int_stacksave>`. Then, it allocates the
argument stack space with alloca and calls the default constructor for
``b``. The default constructor for ``a`` could throw an exception, so
the frontend has to create a landing pad. The landing pad has to destroy
the already constructed argument ``b`` before restoring the stack
pointer. If the constructor does not unwind, ``g`` is called. In the
Microsoft C++ ABI, ``g`` will destroy its arguments, and then the stack
is restored in ``f``.

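As a rough illustration of the ordering described above, the following
portable C++ sketch mimics the lowering by hand. It is not what a
frontend emits: ``Slot``, ``MemArgs``, ``events``, ``ctor_calls`` and
``throw_on_ctor`` are hypothetical names introduced only for this
sketch, a ``try``/``catch`` stands in for the landing pad, and the stack
save and restore calls have no counterpart because ``args`` is an
ordinary local variable.

.. code-block:: c++

  #include <cassert>
  #include <new>
  #include <stdexcept>
  #include <string>
  #include <vector>

  // Event log so the order of operations can be inspected afterwards.
  static std::vector<std::string> events;
  // Hypothetical test switch: the N-th constructor call throws (0 = never).
  static int throw_on_ctor = 0;
  static int ctor_calls = 0;

  struct Foo {
    int a, b;
    Foo() : a(0), b(0) {
      if (++ctor_calls == throw_on_ctor)
        throw std::runtime_error("ctor failed");
      events.push_back("ctor");
    }
    ~Foo() { events.push_back("dtor"); }
    Foo(const Foo &) = delete; // The sketch never copies an argument.
  };

  // Raw storage for one Foo; nothing is constructed or destroyed implicitly.
  union Slot {
    Foo value;
    Slot() {}
    ~Slot() {}
  };

  // Stand-in for the <{ %struct.Foo, %struct.Foo }> argument block.
  struct MemArgs {
    Slot a, b;
  };

  // The callee destroys its own arguments, as in the Microsoft C++ ABI.
  void g(MemArgs *args) {
    events.push_back("g");
    args->a.value.~Foo();
    args->b.value.~Foo();
  }

  void f() {
    MemArgs args;                // plays the role of %memargs
    new (&args.b.value) Foo();   // construct b first
    try {
      new (&args.a.value) Foo(); // may throw, like the invoke
    } catch (...) {
      args.b.value.~Foo();       // the "landing pad" destroys b
      events.push_back("unwind");
      throw;
    }
    g(&args);                    // g takes over and destroys a and b
  }

Calling ``f()`` with ``throw_on_ctor`` left at ``0`` records two
constructions, the call to ``g``, and two destructions. Setting
``throw_on_ctor`` to ``2`` makes the constructor of ``a`` throw, so only
``b`` is constructed and then destroyed on the unwind path, and ``g`` is
never reached.
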
Design Considerations
=====================

Lifetime
--------

The biggest design consideration for this feature is object lifetime.
We cannot model the arguments as static allocas in the entry block,
because all calls need to use the memory at the top of the stack to pass
arguments. We cannot vend pointers to that memory at function entry
because after code generation they will alias.

The rule against allocas between argument allocations and the call site
avoids this problem, but it creates a cleanup problem. Cleanup and
lifetime are handled explicitly with stack save and restore calls. In
the future, we may want to introduce a new construct such as ``freea``
or ``afree`` to make it clear that this stack adjusting cleanup is less
powerful than a full stack save and restore.

Nested Calls and Copy Elision
-----------------------------

We also want to be able to support copy elision into these argument
slots. This means we have to support multiple live argument
allocations.

Consider the evaluation of:

.. code-block:: c++

  // Foo is non-trivial.
  struct Foo { int a; Foo(); Foo(const Foo &); ~Foo(); };
  Foo bar(Foo b);
  int main() {
    bar(bar(Foo()));
  }

In this case, we want to be able to elide copies into ``bar``'s argument
slots. That means we need to have more than one set of argument frames
active at the same time. First, we need to allocate the frame for the
outer call so we can pass it in as the hidden struct return pointer to
the inner call. Then we do the same for the inner call, allocating a
frame and passing its address to ``Foo``'s default constructor. By
wrapping the evaluation of the inner ``bar`` with stack save and
restore, we can have multiple overlapping active call frames.

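The nesting described above can be pictured with a toy stack model. The
sketch below is only an illustration under simplifying assumptions:
``ToyStack``, ``Trace`` and ``evaluate_nested_call`` are hypothetical
names, the stack is a plain counter that grows downward, and the calls
to ``bar`` are represented by comments rather than executed.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>

  // Toy model of a downward-growing stack (hypothetical, for illustration).
  class ToyStack {
  public:
    explicit ToyStack(std::size_t size) : sp_(size), size_(size) {}
    // Plays the role of llvm.stacksave.
    std::size_t save() const { return sp_; }
    // Plays the role of llvm.stackrestore.
    void restore(std::size_t saved) {
      assert(saved >= sp_ && saved <= size_);
      sp_ = saved;
    }
    // Plays the role of a dynamic alloca; returns the new top of stack.
    std::size_t alloca_bytes(std::size_t n) {
      assert(n <= sp_);
      sp_ -= n;
      return sp_;
    }
    std::size_t sp() const { return sp_; }

  private:
    std::size_t sp_, size_;
  };

  // Stack positions observed while evaluating bar(bar(Foo())).
  struct Trace {
    std::size_t outer_frame, inner_frame;
    std::size_t sp_during_inner, sp_after_inner, sp_after_outer;
  };

  Trace evaluate_nested_call(ToyStack &stack, std::size_t frame_size) {
    Trace t{};
    std::size_t outer_save = stack.save();
    // Argument frame of the outer bar; also the return slot of the inner bar.
    t.outer_frame = stack.alloca_bytes(frame_size);

    std::size_t inner_save = stack.save();
    // Argument frame of the inner bar; Foo() would be constructed here.
    t.inner_frame = stack.alloca_bytes(frame_size);
    t.sp_during_inner = stack.sp(); // both frames are live at this point
    // ... the inner bar would run here and write into the outer frame ...
    stack.restore(inner_save);
    t.sp_after_inner = stack.sp();  // the outer frame is on top again

    // ... the outer bar would run here ...
    stack.restore(outer_save);
    t.sp_after_outer = stack.sp();
    return t;
  }

With a 64-byte ``ToyStack`` and 8-byte frames, the outer frame lands at
offset 56 and the inner frame at offset 48. Restoring the inner save
point brings the stack pointer back to 56, so the outer frame is at the
top of the stack when the outer call is made.
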
Callee-cleanup Calling Conventions
----------------------------------

Another wrinkle is the existence of callee-cleanup conventions. On
Windows, all methods and many other functions adjust the stack to clear
the memory used to pass their arguments. In some sense, this means that
the allocas are automatically cleared by the call. However, LLVM models
this as a write of undef to all of the inalloca values passed to the
call rather than as a stack adjustment. Frontends should still restore
the stack pointer to avoid a stack leak.

Exceptions
----------

There is also the possibility of an exception. If argument evaluation
or copy construction throws an exception, the landing pad must do
cleanup, which includes adjusting the stack pointer to avoid a stack
leak. This means the cleanup of the stack memory cannot be tied to the
call itself. There needs to be a separate IR-level instruction that can
perform independent cleanup of arguments.

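Efficiency
----------

Eventually, it should be possible to generate efficient code for this
construct. In particular, using inalloca should not require a base
pointer. If the backend can prove that all points in the CFG only have
one possible stack level, then it can address the stack directly from
the stack pointer. While this is not yet implemented, the plan is that
the inalloca attribute should not change much, but the frontend IR
generation recommendations may change.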
				157	one possible stack level, then it can address the stack directly from
				158	the stack pointer. While this is not yet implemented, the plan is that
				159	the inalloca attribute should not change much, but the frontend IR
				160	generation recommendations may change.