============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors, *not*
   anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.


Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the ``cmake-language``
manpage and in the `cmake-language online documentation
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake will generate build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but those will
be discussed later.

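As a rough illustration of a cached check, the sketch below uses the
``CheckIncludeFileCXX`` module that ships with CMake. The header and the
``HAVE_UNISTD_H`` result variable are chosen only for illustration, and the
fragment assumes it appears after a ``project()`` call that enables C++.

.. code-block:: cmake

   include(CheckIncludeFileCXX)

   # The first configure run compiles a small test program and stores the
   # result in the CMakeCache. Later runs find HAVE_UNISTD_H in the cache and
   # skip the check.
   check_include_file_cxx(unistd.h HAVE_UNISTD_H)
   if(HAVE_UNISTD_H)
     message(STATUS "unistd.h is available")
   endif()
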
Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined commands.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of foreach loops
and if blocks. To make the example above more complicated you could add an if
block to define "APPLE" when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()

Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable in ``${}`` dereferences it
and results in a literal substitution of the value for the name. CMake refers to
this as "variable evaluation" in its documentation. Dereferences are performed
*before* the command being called receives the arguments. This means
dereferencing a list in an unquoted argument results in multiple separate
arguments being passed to the command.

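To make the effect on lists concrete, here is a small sketch (the variable name
is arbitrary) contrasting an unquoted dereference with a quoted one:

.. code-block:: cmake

   set(my_list a b c)

   # Unquoted: message() receives three arguments, which it concatenates.
   message(${my_list})    # prints: abc

   # Quoted: message() receives the single argument "a;b;c".
   message("${my_list}")  # prints: a;b;c
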
Variable dereferences can be nested and be used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"

Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to set a variable conditionally, knowing that it will also be
used in code paths where the variable isn't set. There are examples of this
throughout the LLVM CMake build system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})

In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets the ``extra_sources`` will be
evaluated as empty before ``add_executable`` is given its arguments.

Lists
-----

In CMake lists are semicolon-delimited strings, and it is strongly advised that
you avoid using semicolons in list elements; it doesn't go smoothly. A few
examples of defining lists:

.. code-block:: cmake

   # Creates a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")

   # Creates a string "a b c d"
   set(my_string "a b c d")

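Lists are usually manipulated with the built-in ``list`` command rather than by
editing the underlying string. A brief sketch, with arbitrary variable names:

.. code-block:: cmake

   set(my_list a b c d)
   list(APPEND my_list e)      # my_list is now a;b;c;d;e
   list(LENGTH my_list count)  # count is 5
   list(GET my_list 0 first)   # first is a
   message("${my_list}")       # prints: a;b;c;d;e
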
Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
cannot contain an element with a semicolon, to construct a list of lists you
make a list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)

With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()

You'll notice that the inner foreach loop's list is doubly dereferenced. The
first dereference, ``${list_name}``, turns ``list_name`` into the name of the
sub-list (a, b, or c in the example); the second, performed by ``IN LISTS``,
gets the value of that list.

This pattern is used throughout CMake; the most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.

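A minimal sketch of reading the flags variables through a computed name. The
``lang`` and ``build_type`` variables are illustrative; the build type is
upper-cased because CMake's per-configuration variables use upper-case suffixes
such as ``CMAKE_CXX_FLAGS_RELEASE``.

.. code-block:: cmake

   set(lang CXX)
   string(TOUPPER "${CMAKE_BUILD_TYPE}" build_type)

   # The inner dereferences build the variable name; the outer one reads it.
   message("${CMAKE_${lang}_FLAGS}")
   message("${CMAKE_${lang}_FLAGS_${build_type}}")
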
Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. A variable's type generally doesn't impact evaluation;
however, CMake does have special handling for some types, such as ``PATH``.
You can read more about the special handling in `CMake's set documentation
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.

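For illustration, typed cache entries might be declared as follows. The
``HELLO_*`` names are hypothetical and not part of LLVM or CMake.

.. code-block:: cmake

   # PATH entries get a file dialog in cmake-gui and STRING entries get a text
   # field. option() creates a BOOL entry, which is shown as a checkbox.
   set(HELLO_DATA_DIR "" CACHE PATH "Directory containing data files")
   set(HELLO_GREETING "Hello" CACHE STRING "Greeting to print")
   option(HELLO_ENABLE_TESTS "Build the tests" OFF)

A type can also be given on the command line, for example
``cmake -DHELLO_DATA_DIR:PATH=data <source-dir>``.
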
Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable into the parent scope, and not the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.

.. note::
   Unlike C-based languages, CMake's loop and control flow blocks do not have
   their own scopes.

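The two ``set`` options can be sketched as follows. The function and variable
names are made up for the example.

.. code-block:: cmake

   function(compute_answer out_var)
     # PARENT_SCOPE sets the variable in the caller's scope, not in this one.
     set(${out_var} 42 PARENT_SCOPE)
   endfunction()

   compute_answer(answer)
   message("${answer}")  # prints: 42

   # Without FORCE, the second set() leaves the existing cache entry alone.
   set(HELLO_MODE "fast" CACHE STRING "Example cache entry")
   set(HELLO_MODE "slow" CACHE STRING "Example cache entry")
   # FORCE overwrites the cached value.
   set(HELLO_MODE "safe" CACHE STRING "Example cache entry" FORCE)
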
Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
   For the full documentation on the CMake ``if`` command see the
   `CMake if documentation <https://cmake.org/cmake/help/v3.4/command/if.html>`_.
   That resource is far more complete.

In general CMake if blocks work the way you'd expect:

.. code-block:: cmake

   if(<condition>)
     message("do stuff")
   elseif(<condition>)
     message("do other stuff")
   else()
     message("do other other stuff")
   endif()

The single most important thing to know about CMake's if blocks coming from a C
background is that they do not have their own scope. Variables set inside
conditional blocks persist after the ``endif()``.

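A short sketch of a variable outliving the block that set it (the variable name
is arbitrary):

.. code-block:: cmake

   if(TRUE)
     set(set_inside_if "still here")
   endif()
   message("${set_inside_if}")  # prints: still here
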
Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

   foreach(var ...)
     message("do stuff")
   endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

   foreach(var foo bar baz)
     message(${var})
   endforeach()
   # prints:
   #  foo
   #  bar
   #  baz

   set(my_list 1 2 3)
   foreach(var ${my_list})
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3

   foreach(var ${my_list} out_of_bounds)
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3
   #  out_of_bounds

There is also a more modern CMake foreach syntax. The code below is equivalent
to the code above:

.. code-block:: cmake

   foreach(var IN ITEMS foo bar baz)
     message(${var})
   endforeach()
   # prints:
   #  foo
   #  bar
   #  baz

   set(my_list 1 2 3)
   foreach(var IN LISTS my_list)
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3

   foreach(var IN LISTS my_list ITEMS out_of_bounds)
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3
   #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.

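For completeness, a minimal ``while`` loop might look like the sketch below,
which uses the built-in ``math`` command to advance an arbitrary counter:

.. code-block:: cmake

   set(i 0)
   while(i LESS 3)
     message("${i}")
     math(EXPR i "${i} + 1")
   endwhile()
   # prints:
   #  0
   #  1
   #  2
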
Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.

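A minimal sketch of defining and using a module follows. The file name
``cmake/Greeting.cmake`` and the ``greet`` function are hypothetical. The module
file could contain:

.. code-block:: cmake

   # cmake/Greeting.cmake
   message(STATUS "Greeting module loaded")  # runs when the module is included

   function(greet name)
     message("Hello, ${name}!")
   endfunction()

A CMakeLists file in the directory above ``cmake/`` could then use it like
this:

.. code-block:: cmake

   # Tell include() where to look for modules, then load the module by name.
   list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")
   include(Greeting)
   greet(World)  # prints: Hello, World!
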
Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments that are required at every call site. In
addition, all commands will implicitly accept a variable number of extra
arguments (in C parlance, all commands are varargs functions). When a command is
invoked with extra arguments (beyond the named ones) CMake will store the full
list of arguments (both named and unnamed) in a list named ``ARGV``, and the
sublist of unnamed arguments in ``ARGN``. Below is a trivial example of
providing a wrapper function for CMake's built-in command ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another command passing through the first
argument and all trailing arguments.

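To see what ``ARGV`` and ``ARGN`` contain, consider this small sketch (the
``show_args`` function is made up for the example):

.. code-block:: cmake

   function(show_args first)
     message("first: ${first}")
     message("ARGV:  ${ARGV}")
     message("ARGN:  ${ARGN}")
   endfunction()

   show_args(a b c)
   # prints:
   #  first: a
   #  ARGV:  a;b;c
   #  ARGN:  b;c
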
CMake provides a module ``CMakeParseArguments`` that implements advanced
argument parsing. We use this all over LLVM, and it is recommended for any
function that has complex argument-based behaviors or optional arguments.
CMake's official documentation for the module is in the ``cmake-modules``
manpage, and is also available at the
`cmake-modules online documentation
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.

.. note::
   As of CMake 3.5 the ``cmake_parse_arguments`` command has become a native
   command and the ``CMakeParseArguments`` module is empty and only left around
   for compatibility.

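A hedged sketch of ``cmake_parse_arguments`` in use. The ``add_sample``
function and its keywords are invented for illustration; only the
``cmake_parse_arguments`` call itself is CMake API.

.. code-block:: cmake

   include(CMakeParseArguments)  # only needed before CMake 3.5

   function(add_sample name)
     cmake_parse_arguments(ARG
       "VERBOSE"          # options (flags that take no value)
       "OUTPUT_DIR"       # keywords that take one value
       "SOURCES;DEPENDS"  # keywords that take multiple values
       ${ARGN})
     if(ARG_VERBOSE)
       message("${name}: sources=${ARG_SOURCES} out=${ARG_OUTPUT_DIR}")
     endif()
   endfunction()

   add_sample(demo VERBOSE OUTPUT_DIR bin SOURCES a.cpp b.cpp)
   # prints: demo: sources=a.cpp;b.cpp out=bin

Each parsed keyword becomes a variable named with the chosen prefix, here
``ARG_``, so call sites can pass arguments in any order.
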
Functions Vs Macros
-------------------

Functions and macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.

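The scoping difference can be sketched with two otherwise identical commands
(the names are arbitrary):

.. code-block:: cmake

   function(set_in_function)
     set(from_function "visible")
   endfunction()

   macro(set_in_macro)
     set(from_macro "visible")
   endmacro()

   set_in_function()
   set_in_macro()
   message("[${from_function}]")  # prints: []
   message("[${from_macro}]")     # prints: [visible]
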
The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables; instead, dereferences to
the parameters are resolved across the macro before executing it. This can
result in some unexpected behavior when a parameter name is used without being
dereferenced. For example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # a
   # b
   # c
   # d

Generally speaking this issue is uncommon because it requires using
non-dereferenced variables with names that overlap in the parent scope, but it
is important to be aware of because it can lead to subtle bugs.

LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable`` is the
wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are all
defined in ``AddLLVM.cmake``, which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.

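As a rough sketch, an out-of-tree project might pull in ``AddLLVM.cmake`` and
use one of the wrappers like this. The project name, target name, and source
file are hypothetical, and the sketch omits linking against LLVM component
libraries, which a real tool would need.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(MyTool)

   # Locate an installed or built LLVM through its CMake package files.
   find_package(LLVM REQUIRED CONFIG)

   # Make LLVM's modules, including AddLLVM.cmake, visible to include().
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   include_directories(${LLVM_INCLUDE_DIRS})

   # add_llvm_executable is LLVM's wrapper around add_executable.
   add_llvm_executable(my-tool MyTool.cpp)
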
Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. A few useful commands to highlight:

* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_

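A brief taste of a few of these commands, using arbitrary variable names and a
hypothetical output file:

.. code-block:: cmake

   string(TOUPPER "release" upper)  # upper is RELEASE
   math(EXPR total "2 + 3 * 4")     # total is 14
   set(items c a b)
   list(SORT items)                 # items is a;b;c

   # Write a small file into the current build directory.
   file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/hello.txt" "hello\n")

   message("${upper} ${total} ${items}")  # prints: RELEASE 14 a;b;c
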
The full documentation for CMake commands is in the ``cmake-commands`` manpage
and available on `CMake's website
<https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.