| 1 | .\" Copyright (c) 1985 Regents of the University of California. |
| 2 | .\" All rights reserved. |
| 3 | .\" |
| 4 | .\" Redistribution and use in source and binary forms, with or without |
| 5 | .\" modification, are permitted provided that the following conditions |
| 6 | .\" are met: |
| 7 | .\" 1. Redistributions of source code must retain the above copyright |
| 8 | .\" notice, this list of conditions and the following disclaimer. |
| 9 | .\" 2. Redistributions in binary form must reproduce the above copyright |
| 10 | .\" notice, this list of conditions and the following disclaimer in the |
| 11 | .\" documentation and/or other materials provided with the distribution. |
| 12 | .\" 3. All advertising materials mentioning features or use of this software |
| 13 | .\" must display the following acknowledgement: |
| 14 | .\" This product includes software developed by the University of |
| 15 | .\" California, Berkeley and its contributors. |
| 16 | .\" 4. Neither the name of the University nor the names of its contributors |
| 17 | .\" may be used to endorse or promote products derived from this software |
| 18 | .\" without specific prior written permission. |
| 19 | .\" |
| 20 | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND |
| 21 | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE |
| 22 | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE |
| 23 | .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE |
| 24 | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL |
| 25 | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS |
| 26 | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) |
| 27 | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT |
| 28 | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY |
| 29 | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF |
| 30 | .\" SUCH DAMAGE. |
| 31 | .\" |
| 32 | .\" from: @(#)ieee.3 6.4 (Berkeley) 5/6/91 |
| 33 | .\" $FreeBSD: src/lib/msun/man/ieee.3,v 1.22 2005/06/16 21:55:45 ru Exp $ |
| 34 | .\" |
| 35 | .Dd January 26, 2005 |
| 36 | .Dt IEEE 3 |
| 37 | .Os |
| 38 | .Sh NAME |
| 39 | .Nm ieee |
| 40 | .Nd IEEE standard 754 for floating-point arithmetic |
| 41 | .Sh DESCRIPTION |
| 42 | The IEEE Standard 754 for Binary Floating-Point Arithmetic |
| 43 | defines representations of floating-point numbers and abstract |
| 44 | properties of arithmetic operations relating to precision, |
| 45 | rounding, and exceptional cases, as described below. |
| 46 | .Ss IEEE STANDARD 754 Floating-Point Arithmetic |
| 47 | Radix: Binary. |
| 48 | .Pp |
| 49 | Overflow and underflow: |
| 50 | .Bd -ragged -offset indent -compact |
| 51 | Overflow goes by default to a signed \*(If. |
| 52 | Underflow is |
| 53 | .Em gradual . |
| 54 | .Ed |
| 55 | .Pp |
| 56 | Zero is represented ambiguously as +0 or \-0. |
| 57 | .Bd -ragged -offset indent -compact |
| 58 | Its sign transforms correctly through multiplication or |
| 59 | division, and is preserved by addition of zeros |
| 60 | with like signs; but x\-x yields +0 for every |
| 61 | finite x. |
| 62 | The only operations that reveal zero's |
| 63 | sign are division by zero and |
| 64 | .Fn copysign x \(+-0 . |
| 65 | In particular, comparison (x > y, x \(>= y, etc.)\& |
| 66 | cannot be affected by the sign of zero; but if |
| 67 | finite x = y then \*(If = 1/(x\-y) \(!= \-1/(y\-x) = \-\*(If. |
| 68 | .Ed |
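The behavior of signed zeros described above can be sketched in C. This is an illustrative sketch, not part of the library: the helper names is_negative_zero and reciprocal_of_difference are hypothetical, and the code assumes the default round-to-nearest mode with no floating-point traps enabled.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when z is a zero whose sign bit is set (-0).
 * Comparison alone cannot tell +0 from -0, so copysign() is used. */
bool is_negative_zero(double z)
{
    return z == 0.0 && copysign(1.0, z) < 0.0;
}

/* Hypothetical helper: returns 1/(x-y).  For finite x == y the
 * difference is +0 in the default rounding mode, so the quotient
 * is +Inf.  The volatile store discourages compile-time folding. */
double reciprocal_of_difference(double x, double y)
{
    volatile double d = x - y;
    return 1.0 / d;
}
```

With these definitions, 0.0 == -0.0 compares equal, is_negative_zero(-0.0) is true, and for x = y the values reciprocal_of_difference(x, y) and -reciprocal_of_difference(y, x) are infinities of opposite sign.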
| 69 | .Pp |
| 70 | Infinity is signed. |
| 71 | .Bd -ragged -offset indent -compact |
| 72 | It persists when added to itself |
| 73 | or to any finite number. |
| 74 | Its sign transforms |
| 75 | correctly through multiplication and division, and |
| 76 | (finite)/\(+-\*(If\0=\0\(+-0 and |
| 77 | (nonzero)/0 = \(+-\*(If. |
| 78 | But |
| 79 | \*(If\-\*(If, \*(If\(**0 and \*(If/\*(If |
| 80 | are, like 0/0 and sqrt(\-3), |
| 81 | invalid operations that produce \*(Na. ... |
| 82 | .Ed |
| 83 | .Pp |
| 84 | Reserved operands (\*(Nas): |
| 85 | .Bd -ragged -offset indent -compact |
| 86 | An \*(Na is |
| 87 | .Em ( N Ns ot Em a N Ns umber ) . |
| 88 | Some \*(Nas, called Signaling \*(Nas, trap any floating-point operation |
| 89 | performed upon them; they are used to mark missing |
| 90 | or uninitialized values, or nonexistent elements |
| 91 | of arrays. |
| 92 | The rest are Quiet \*(Nas; they are |
| 93 | the default results of Invalid Operations, and |
| 94 | propagate through subsequent arithmetic operations. |
| 95 | If x \(!= x then x is \*(Na; every other predicate |
| 96 | (x > y, x = y, x < y, ...) is FALSE if \*(Na is involved. |
| 97 | .Ed |
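The NaN predicates above can be checked with a short C sketch. The function names are hypothetical, and the code assumes the compiler is not run with options such as -ffast-math that discard NaN semantics.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: detects a NaN with the x != x test from the text.
 * The volatile copy keeps the comparison from being optimized away. */
bool is_nan_by_self_compare(double x)
{
    volatile double v = x;
    return v != v;
}

/* Hypothetical helper: counts how many of x > y, x == y and x < y hold.
 * For ordinary numbers exactly one holds; with a NaN operand none do. */
int count_true_orderings(double x, double y)
{
    return (x > y) + (x == y) + (x < y);
}
```

For example, count_true_orderings(NAN, 1.0) is 0, while count_true_orderings(1.0, 2.0) is 1.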
| 98 | .Pp |
| 99 | Rounding: |
| 100 | .Bd -ragged -offset indent -compact |
| 101 | Every algebraic operation (+, \-, \(**, /, |
| 102 | \(sr) |
| 103 | is rounded by default to within half an |
| 104 | .Em ulp , |
| 105 | and when the rounding error is exactly half an |
| 106 | .Em ulp |
| 107 | then |
| 108 | the rounded value's least significant bit is zero. |
| 109 | (An |
| 110 | .Em ulp |
| 111 | is one |
| 112 | .Em U Ns nit |
| 113 | in the |
| 114 | .Em L Ns ast |
| 115 | .Em P Ns lace . ) |
| 116 | This kind of rounding is usually the best kind, |
| 117 | sometimes provably so; for instance, for every |
| 118 | x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find |
| 119 | (x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ... |
| 120 | even though both the quotients and the products |
| 121 | have been rounded. |
| 122 | Only rounding like IEEE 754 can do that. |
| 123 | But no single kind of rounding can be |
| 124 | proved best for every circumstance, so IEEE 754 |
| 125 | provides rounding towards zero or towards |
| 126 | +\*(If or towards \-\*(If |
| 127 | at the programmer's option. |
| 128 | .Ed |
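The round-trip property claimed above for divisors 3.0 and 10.0 can be spot-checked over a small range. The sketch below is only a check of a finite range, not a proof; the function name count_roundtrip_failures is hypothetical, and the code assumes double arithmetic in the default round-to-nearest mode (the volatile variables are there to force each intermediate to be rounded to double).

```c
/* Hypothetical helper: counts the integers x in [1, limit] for which
 * (x/d)*d differs from x when both operations are rounded to double. */
long count_roundtrip_failures(double d, long limit)
{
    long failures = 0;

    for (long i = 1; i <= limit; i++) {
        volatile double x = (double)i;
        volatile double q = x / d;   /* rounded quotient */
        volatile double p = q * d;   /* rounded product */

        if (p != x)
            failures++;
    }
    return failures;
}
```

Calling count_roundtrip_failures(3.0, 100000) or count_roundtrip_failures(10.0, 100000) should return 0 under the stated assumptions.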
| 129 | .Pp |
| 130 | Exceptions: |
| 131 | .Bd -ragged -offset indent -compact |
| 132 | IEEE 754 recognizes five kinds of floating-point exceptions, |
| 133 | listed below in declining order of probable importance. |
| 134 | .Bl -column -offset indent "Invalid Operation" "Gradual Underflow" |
| 135 | .Em "Exception Default Result" |
| 136 | Invalid Operation \*(Na, or FALSE |
| 137 | Overflow \(+-\*(If |
| 138 | Divide by Zero \(+-\*(If |
| 139 | Underflow Gradual Underflow |
| 140 | Inexact Rounded value |
| 141 | .El |
| 142 | .Pp |
| 143 | NOTE: An Exception is not an Error unless handled |
| 144 | badly. |
| 145 | What makes a class of exceptions exceptional |
| 146 | is that no single default response can be satisfactory |
| 147 | in every instance. |
| 148 | On the other hand, if a default |
| 149 | response will serve most instances satisfactorily, |
| 150 | the unsatisfactory instances cannot justify aborting |
| 151 | computation every time the exception occurs. |
| 152 | .Ed |
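The default results in the table above can be observed directly. The following is a minimal sketch with hypothetical function names; it assumes that no floating-point traps are enabled and that subnormal numbers are not flushed to zero.

```c
#include <float.h>

/* Overflow: the default result is a signed infinity (+Inf here). */
double overflow_default(void)
{
    volatile double big = DBL_MAX;
    return big * 2.0;
}

/* Divide by Zero: (nonzero)/0 yields a signed infinity (-Inf here). */
double divide_by_zero_default(void)
{
    volatile double zero = 0.0;
    return -1.0 / zero;
}

/* Invalid Operation: 0/0 yields a NaN. */
double invalid_default(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}

/* Underflow: the result is a subnormal number, tiny but not zero. */
double underflow_default(void)
{
    volatile double tiny = DBL_MIN;
    return tiny / 4.0;
}
```

Each function returns a value and computation continues; none of them aborts the program by default.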
| 153 | .Ss Data Formats |
| 154 | Single-precision: |
| 155 | .Bd -ragged -offset indent -compact |
| 156 | Type name: |
| 157 | .Vt float |
| 158 | .Pp |
| 159 | Wordsize: 32 bits. |
| 160 | .Pp |
| 161 | Precision: 24 significant bits, |
| 162 | roughly like 7 significant decimals. |
| 163 | .Bd -ragged -offset indent -compact |
| 164 | If x and x' are consecutive positive single-precision |
| 165 | numbers (they differ by 1 |
| 166 | .Em ulp ) , |
| 167 | then |
| 168 | .Bd -ragged -compact |
| 169 | 5.9e\-08 < 0.5**24 < (x'\-x)/x \(<= 0.5**23 < 1.2e\-07. |
| 170 | .Ed |
| 171 | .Ed |
| 172 | .Pp |
| 173 | .Bl -column "XXX" -compact |
| 174 | Range: Overflow threshold = 2.0**128 = 3.4e38 |
| 175 | Underflow threshold = 0.5**126 = 1.2e\-38 |
| 176 | .El |
| 177 | .Bd -ragged -offset indent -compact |
| 178 | Underflowed results round to the nearest |
| 179 | integer multiple of 0.5**149 = 1.4e\-45. |
| 180 | .Ed |
| 181 | .Ed |
| 182 | .Pp |
| 183 | Double-precision: |
| 184 | .Bd -ragged -offset indent -compact |
| 185 | Type name: |
| 186 | .Vt double |
| 187 | .Bd -ragged -offset indent -compact |
| 188 | On some architectures, |
| 189 | .Vt long double |
| 190 | is the same as |
| 191 | .Vt double . |
| 192 | .Ed |
| 193 | .Pp |
| 194 | Wordsize: 64 bits. |
| 195 | .Pp |
| 196 | Precision: 53 significant bits, |
| 197 | roughly like 16 significant decimals. |
| 198 | .Bd -ragged -offset indent -compact |
| 199 | If x and x' are consecutive positive double-precision |
| 200 | numbers (they differ by 1 |
| 201 | .Em ulp ) , |
| 202 | then |
| 203 | .Bd -ragged -compact |
| 204 | 1.1e\-16 < 0.5**53 < (x'\-x)/x \(<= 0.5**52 < 2.3e\-16. |
| 205 | .Ed |
| 206 | .Ed |
| 207 | .Pp |
| 208 | .Bl -column "XXX" -compact |
| 209 | Range: Overflow threshold = 2.0**1024 = 1.8e308 |
| 210 | Underflow threshold = 0.5**1022 = 2.2e\-308 |
| 211 | .El |
| 212 | .Bd -ragged -offset indent -compact |
| 213 | Underflowed results round to the nearest |
| 214 | integer multiple of 0.5**1074 = 4.9e\-324. |
| 215 | .Ed |
| 216 | .Ed |
| 217 | .Pp |
| 218 | Extended-precision: |
| 219 | .Bd -ragged -offset indent -compact |
| 220 | Type name: |
| 221 | .Vt long double |
| 222 | (when supported by the hardware) |
| 223 | .Pp |
| 224 | Wordsize: 96 bits. |
| 225 | .Pp |
| 226 | Precision: 64 significant bits, |
| 227 | roughly like 19 significant decimals. |
| 228 | .Bd -ragged -offset indent -compact |
| 229 | If x and x' are consecutive positive extended-precision |
| 230 | numbers (they differ by 1 |
| 231 | .Em ulp ) , |
| 232 | then |
| 233 | .Bd -ragged -compact |
| 234 | 5.4e\-20 < 0.5**64 < (x'\-x)/x \(<= 0.5**63 < 1.1e\-19. |
| 235 | .Ed |
| 236 | .Ed |
| 237 | .Pp |
| 238 | .Bl -column "XXX" -compact |
| 239 | Range: Overflow threshold = 2.0**16384 = 1.2e4932 |
| 240 | Underflow threshold = 0.5**16382 = 3.4e\-4932 |
| 241 | .El |
| 242 | .Bd -ragged -offset indent -compact |
| 243 | Underflowed results round to the nearest |
| 244 | integer multiple of 0.5**16445 = 3.6e\-4951. |
| 245 | .Ed |
| 246 | .Ed |
| 247 | .Pp |
| 248 | Quad-extended-precision: |
| 249 | .Bd -ragged -offset indent -compact |
| 250 | Type name: |
| 251 | .Vt long double |
| 252 | (when supported by the hardware) |
| 253 | .Pp |
| 254 | Wordsize: 128 bits. |
| 255 | .Pp |
| 256 | Precision: 113 significant bits, |
| 257 | roughly like 34 significant decimals. |
| 258 | .Bd -ragged -offset indent -compact |
| 259 | If x and x' are consecutive positive quad-extended-precision |
| 260 | numbers (they differ by 1 |
| 261 | .Em ulp ) , |
| 262 | then |
| 263 | .Bd -ragged -compact |
| 264 | 9.6e\-35 < 0.5**113 < (x'\-x)/x \(<= 0.5**112 < 2.0e\-34. |
| 265 | .Ed |
| 266 | .Ed |
| 267 | .Pp |
| 268 | .Bl -column "XXX" -compact |
| 269 | Range: Overflow threshold = 2.0**16384 = 1.2e4932 |
| 270 | Underflow threshold = 0.5**16382 = 3.4e\-4932 |
| 271 | .El |
| 272 | .Bd -ragged -offset indent -compact |
| 273 | Underflowed results round to the nearest |
| 274 | integer multiple of 0.5**16494 = 6.5e\-4966. |
| 275 | .Ed |
| 276 | .Ed |
| 277 | .Ss Additional Information Regarding Exceptions |
| 278 | .Pp |
| 279 | For each kind of floating-point exception, IEEE 754 |
| 280 | provides a Flag that is raised each time its exception |
| 281 | is signaled, and stays raised until the program resets |
| 282 | it. |
| 283 | Programs may also test, save and restore a flag. |
| 284 | Thus, IEEE 754 provides three ways by which programs |
| 285 | may cope with exceptions for which the default result |
| 286 | might be unsatisfactory: |
| 287 | .Bl -enum |
| 288 | .It |
| 289 | Test for a condition that might cause an exception |
| 290 | later, and branch to avoid the exception. |
| 291 | .It |
| 292 | Test a flag to see whether an exception has occurred |
| 293 | since the program last reset its flag. |
| 294 | .It |
| 295 | Test a result to see whether it is a value that only |
| 296 | an exception could have produced. |
| 297 | .Pp |
| 298 | CAUTION: The only reliable ways to discover |
| 299 | whether Underflow has occurred are to test whether |
| 300 | products or quotients lie closer to zero than the |
| 301 | underflow threshold, or to test the Underflow |
| 302 | flag. |
| 303 | (Sums and differences cannot underflow in |
| 304 | IEEE 754; if x \(!= y then x\-y is correct to |
| 305 | full precision and certainly nonzero regardless of |
| 306 | how tiny it may be.) |
| 307 | Products and quotients that |
| 308 | underflow gradually can lose accuracy gradually |
| 309 | without vanishing, so comparing them with zero |
| 310 | (as one might on a VAX) will not reveal the loss. |
| 311 | Fortunately, if a gradually underflowed value is |
| 312 | destined to be added to something bigger than the |
| 313 | underflow threshold, as is almost always the case, |
| 314 | digits lost to gradual underflow will not be missed |
| 315 | because they would have been rounded off anyway. |
| 316 | So gradual underflows are usually |
| 317 | .Em provably |
| 318 | ignorable. |
| 319 | The same cannot be said of underflows flushed to 0. |
| 320 | .El |
| 321 | .Pp |
| 322 | At the option of an implementor conforming to IEEE 754, |
| 323 | other ways to cope with exceptions may be provided: |
| 324 | .Bl -enum |
| 325 | .It |
| 326 | ABORT. |
| 327 | This mechanism classifies an exception in |
| 328 | advance as an incident to be handled by means |
| 329 | traditionally associated with error-handling |
| 330 | statements like "ON ERROR GO TO ...". |
| 331 | Different |
| 332 | languages offer different forms of this statement, |
| 333 | but most share the following characteristics: |
| 334 | .Bl -dash |
| 335 | .It |
| 336 | No means is provided to substitute a value for |
| 337 | the offending operation's result and resume |
| 338 | computation from what may be the middle of an |
| 339 | expression. |
| 340 | An exceptional result is abandoned. |
| 341 | .It |
| 342 | In a subprogram that lacks an error-handling |
| 343 | statement, an exception causes the subprogram to |
| 344 | abort within whatever program called it, and so |
| 345 | on back up the chain of calling subprograms until |
| 346 | an error-handling statement is encountered or the |
| 347 | whole task is aborted and memory is dumped. |
| 348 | .El |
| 349 | .It |
| 350 | STOP. |
| 351 | This mechanism, requiring an interactive |
| 352 | debugging environment, is more for the programmer |
| 353 | than the program. |
| 354 | It classifies an exception in |
| 355 | advance as a symptom of a programmer's error; the |
| 356 | exception suspends execution as near as it can to |
| 357 | the offending operation so that the programmer can |
| 358 | look around to see how it happened. |
| 359 | Quite often |
| 360 | the first several exceptions turn out to be quite |
| 361 | unexceptionable, so the programmer ought ideally |
| 362 | to be able to resume execution after each one as if |
| 363 | execution had not been stopped. |
| 364 | .It |
| 365 | \&... Other ways lie beyond the scope of this document. |
| 366 | .El |
| 367 | .Pp |
| 368 | Ideally, each |
| 369 | elementary function should act as if it were indivisible, or |
| 370 | atomic, in the sense that ... |
| 371 | .Bl -enum |
| 372 | .It |
| 373 | No exception should be signaled that is not deserved by |
| 374 | the data supplied to that function. |
| 375 | .It |
| 376 | Any exception signaled should be identified with that |
| 377 | function rather than with one of its subroutines. |
| 378 | .It |
| 379 | The internal behavior of an atomic function should not |
| 380 | be disrupted when a calling program changes from |
| 381 | one to another of the five or so ways of handling |
| 382 | exceptions listed above, although the definition |
| 383 | of the function may be correlated intentionally |
| 384 | with exception handling. |
| 385 | .El |
| 386 | .Pp |
| 387 | The functions in |
| 388 | .Nm libm |
| 389 | are only approximately atomic. |
| 390 | They signal no inappropriate exception except possibly ... |
| 391 | .Bl -tag -width indent -offset indent -compact |
| 392 | .It Xo |
| 393 | Over/Underflow |
| 394 | .Xc |
| 395 | when a result, if properly computed, might have lain barely within range, and |
| 396 | .It Xo |
| 397 | Inexact in |
| 398 | .Fn cabs , |
| 399 | .Fn cbrt , |
| 400 | .Fn hypot , |
| 401 | .Fn log10 |
| 402 | and |
| 403 | .Fn pow |
| 404 | .Xc |
| 405 | when it happens to be exact, thanks to fortuitous cancellation of errors. |
| 406 | .El |
| 407 | Otherwise, ... |
| 408 | .Bl -tag -width indent -offset indent -compact |
| 409 | .It Xo |
| 410 | Invalid Operation is signaled only when |
| 411 | .Xc |
| 412 | any result but \*(Na would probably be misleading. |
| 413 | .It Xo |
| 414 | Overflow is signaled only when |
| 415 | .Xc |
| 416 | the exact result would be finite but beyond the overflow threshold. |
| 417 | .It Xo |
| 418 | Divide-by-Zero is signaled only when |
| 419 | .Xc |
| 420 | a function takes exactly infinite values at finite operands. |
| 421 | .It Xo |
| 422 | Underflow is signaled only when |
| 423 | .Xc |
| 424 | the exact result would be nonzero but tinier than the underflow threshold. |
| 425 | .It Xo |
| 426 | Inexact is signaled only when |
| 427 | .Xc |
| 428 | greater range or precision would be needed to represent the exact result. |
| 429 | .El |
| 430 | .Sh SEE ALSO |
| 431 | .Xr fenv 3 , |
| 432 | .Xr ieee_test 3 , |
| 433 | .Xr math 3 |
| 434 | .Pp |
| 435 | An explanation of IEEE 754 and its proposed extension p854 |
| 436 | was published in the IEEE magazine MICRO in August 1984 under |
| 437 | the title "A Proposed Radix- and Word-length-independent |
| 438 | Standard for Floating-point Arithmetic" by |
| 439 | .An "W. J. Cody" |
| 440 | et al. |
| 441 | The manuals for Pascal, C and BASIC on the Apple Macintosh |
| 442 | document the features of IEEE 754 pretty well. |
| 443 | Articles in the IEEE magazine COMPUTER vol.\& 14 no.\& 3 (Mar.\& |
| 444 | 1981), and in the ACM SIGNUM Newsletter Special Issue of |
| 445 | Oct.\& 1979, may be helpful although they pertain to |
| 446 | superseded drafts of the standard. |
| 447 | .Sh STANDARDS |
| 448 | .St -ieee754 |