================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`. In order to build and run these
fuzzers, see :ref:`building-fuzzers`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

  % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
--------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

  % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

  % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
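
As a minimal sketch, a binary with that name could be produced by copying the
regular fuzzer. This assumes the fuzzer was built into ``bin/`` and that a copy
is preferred over a move so the original stays available:

.. code-block:: shell

  % cp bin/llvm-isel-fuzzer bin/llvm-isel-fuzzer--aarch64-O0-gisel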

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer closely mirrors that of ``llvm-isel-fuzzer``. Both
the ``mtriple`` and ``passes`` arguments are required. Passes are specified in
a format suitable for the new pass manager. You can find some documentation
about this format in the doxygen for ``PassBuilder::parsePassPipeline``.

.. code-block:: shell

  % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, the arguments for some predefined configurations
can be embedded directly in the binary file name:

.. code-block:: shell

  % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

  llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good for stressing the surface layers of a
program, and is good at testing things like lexers, parsers, or binary
protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.
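
A configure step along these lines might look like the sketch below. The Ninja
generator, the ``<path-to-llvm-src>`` placeholder, and the
``clang-proto-fuzzer`` build target name are assumptions made for illustration,
and the sketch further assumes that clang is part of the build:

.. code-block:: shell

  % cmake -G Ninja <path-to-llvm-src> \
      -DLLVM_USE_SANITIZER=Address \
      -DLLVM_USE_SANITIZE_COVERAGE=On \
      -DCLANG_ENABLE_PROTO_FUZZER=ON
  % ninja clang-proto-fuzzer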

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

.. note:: You may run into issues if you build with BFD ld, which is the
          default linker on many unix systems. These issues are being tracked
          in https://llvm.org/PR34636.
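
Put together, a build-and-run session might look like the following sketch. The
Ninja generator, the ``<path-to-llvm-src>`` placeholder, and the choice of
``llvm-isel-fuzzer`` as the target to build are assumptions for illustration;
only the two ``-DLLVM_USE_...`` flags come from the description above:

.. code-block:: shell

  % cmake -G Ninja <path-to-llvm-src> \
      -DLLVM_USE_SANITIZER=Address \
      -DLLVM_USE_SANITIZE_COVERAGE=On
  % ninja llvm-isel-fuzzer
  % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64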

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.

There is also some handling of the CMake config for fuzzers, where you should
use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function
works similarly to functions such as ``add_llvm_tool``, but it takes care of
linking to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN``
argument to enable standalone testing.