======================================
Syntax of AMDGPU Instruction Modifiers
======================================
4
5.. contents::
6 :local:
7
8Conventions
9===========
10
11The following notation is used throughout this document:
12
13 =================== =============================================================
14 Notation Description
15 =================== =============================================================
16 {0..N} Any integer value in the range from 0 to N (inclusive).
17 <x> Syntax and meaning of *x* is explained elsewhere.
18 =================== =============================================================
19
20.. _amdgpu_syn_modifiers:
21
22Modifiers
23=========
24
25DS Modifiers
26------------
27
28.. _amdgpu_synid_ds_offset8:
29
offset8
~~~~~~~

Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
34
35Used with DS instructions which have 2 addresses.
36
37 =================== =====================================================
38 Syntax Description
39 =================== =====================================================
40 offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive
41 :ref:`integer number <amdgpu_synid_integer_number>`.
42 =================== =====================================================
43
44Examples:
45
.. parsed-literal::

 offset:255
 offset:0xff
50
51.. _amdgpu_synid_ds_offset16:
52
offset16
~~~~~~~~

Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
57
58Used with DS instructions which have 1 address.
59
60 ==================== ======================================================
61 Syntax Description
62 ==================== ======================================================
63 offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive
64 :ref:`integer number <amdgpu_synid_integer_number>`.
65 ==================== ======================================================
66
67Examples:
68
.. parsed-literal::

 offset:65535
 offset:0xffff
73
74.. _amdgpu_synid_sw_offset16:
75
pattern
~~~~~~~

This is a special modifier which may be used with the *ds_swizzle_b32* instruction only.
It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
81
82See AMD documentation for more information.
83
 ======================================================= ===========================================================
 Syntax                                                  Description
 ======================================================= ===========================================================
 offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
 offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern.

                                                         Each number is a lane *id*.
 offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.

                                                         The pattern converts a 5-bit lane *id* to another
                                                         lane *id* with which the lane interacts.

                                                         *mask* is a 5-character sequence which
                                                         specifies how to transform the bits of the
                                                         lane *id*.

                                                         The following characters are allowed:

                                                         * "0" - set bit to 0.

                                                         * "1" - set bit to 1.

                                                         * "p" - preserve bit.

                                                         * "i" - invert bit.

 offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.

                                                         Broadcasts the value of any particular lane to
                                                         all lanes in its group.

                                                         The first numeric parameter is a group
                                                         size and must be equal to 2, 4, 8, 16 or 32.

                                                         The second numeric parameter is an index of the
                                                         lane being broadcast.

                                                         The index must be less than the group size.
 offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.

                                                         Swaps the neighboring groups of
                                                         1, 2, 4, 8 or 16 lanes.
 offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.

                                                         Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
 ======================================================= ===========================================================
130
131Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
132:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
133
134Examples:
135
.. parsed-literal::

 offset:255
 offset:0xffff
 offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
 offset:swizzle(BITMASK_PERM, "01pi0")
 offset:swizzle(BROADCAST, 2, 0)
 offset:swizzle(SWAP, 8)
 offset:swizzle(REVERSE, 30 + 2)
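
Each symbolic mode can be read as a rule that tells a lane which lane it exchanges data with.
The following Python sketch models those rules for lane ids 0..31. It is only an illustration:
the function names are hypothetical, the hardware encoding of the pattern is not modeled, and
reading the leftmost *mask* character as the most significant bit of the lane *id* is an assumption.

.. code-block:: python

  def bitmask_perm(mask, lane):
      """Apply a 5-character BITMASK_PERM mask to a 5-bit lane id."""
      if len(mask) != 5 or any(c not in "01pi" for c in mask):
          raise ValueError("mask must be 5 characters from '0', '1', 'p', 'i'")
      result = 0
      for pos, ch in enumerate(mask):
          bit = 4 - pos  # assumption: the leftmost character controls bit 4
          old = (lane >> bit) & 1
          new = {"0": 0, "1": 1, "p": old, "i": old ^ 1}[ch]
          result |= new << bit
      return result

  def _check_group(size, allowed):
      if size not in allowed:
          raise ValueError(f"group size must be one of {allowed}")

  def broadcast(size, index, lane):
      """Source lane for BROADCAST: each lane reads lane `index` of its group."""
      _check_group(size, (2, 4, 8, 16, 32))
      if not 0 <= index < size:
          raise ValueError("index must be less than the group size")
      return (lane // size) * size + index

  def swap(size, lane):
      """Partner lane for SWAP: neighboring groups of `size` lanes trade places."""
      _check_group(size, (1, 2, 4, 8, 16))
      return lane ^ size

  def reverse(size, lane):
      """Source lane for REVERSE: lane order is reversed inside each group."""
      _check_group(size, (2, 4, 8, 16, 32))
      return lane ^ (size - 1)

For instance, under these assumptions ``bitmask_perm("01pi0", 0b00110)`` yields ``0b01100``:
bit 4 is cleared, bit 3 is set, bit 2 is preserved, bit 1 is inverted and bit 0 is cleared.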
145
146.. _amdgpu_synid_gds:
147
148gds
149~~~
150
151Specifies whether to use GDS or LDS memory (LDS is the default).
152
153 ======================================== ================================================
154 Syntax Description
155 ======================================== ================================================
156 gds Use GDS memory.
157 ======================================== ================================================
158
159
160EXP Modifiers
161-------------
162
163.. _amdgpu_synid_done:
164
165done
166~~~~
167
Specifies if this is the last export from the shader to the target. By default, the current
instruction does not finish an export sequence.
170
171 ======================================== ================================================
172 Syntax Description
173 ======================================== ================================================
174 done Indicates the last export operation.
175 ======================================== ================================================
176
177.. _amdgpu_synid_compr:
178
179compr
180~~~~~
181
182Indicates if the data are compressed (data are not compressed by default).
183
184 ======================================== ================================================
185 Syntax Description
186 ======================================== ================================================
187 compr Data are compressed.
188 ======================================== ================================================
189
190.. _amdgpu_synid_vm:
191
192vm
193~~
194
195Specifies valid mask flag state (off by default).
196
197 ======================================== ================================================
198 Syntax Description
199 ======================================== ================================================
200 vm Set valid mask flag.
201 ======================================== ================================================
202
203FLAT Modifiers
204--------------
205
206.. _amdgpu_synid_flat_offset12:
207
offset12
~~~~~~~~

Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
212
213Cannot be used with *global/scratch* opcodes. GFX9 only.
214
215 ================= ======================================================
216 Syntax Description
217 ================= ======================================================
218 offset:{0..4095} Specifies a 12-bit unsigned offset as a positive
219 :ref:`integer number <amdgpu_synid_integer_number>`.
220 ================= ======================================================
221
222Examples:
223
.. parsed-literal::

 offset:4095
 offset:0xff
228
.. _amdgpu_synid_flat_offset13s:

offset13s
~~~~~~~~~

Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
235
236Can be used with *global/scratch* opcodes only. GFX9 only.
237
 ============================ =======================================================
 Syntax                       Description
 ============================ =======================================================
 offset:{-4096..4095}         Specifies a 13-bit signed offset as an
                              :ref:`integer number <amdgpu_synid_integer_number>`.
 ============================ =======================================================
244
245Examples:
246
.. parsed-literal::

 offset:-4000
 offset:0x10
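
The range -4096..4095 is exactly what a 13-bit signed field can hold. The sketch below checks
that range and shows how a negative offset would map onto 13 bits, assuming a plain
two's-complement field; the helper names are hypothetical and not part of the assembler.

.. code-block:: python

  def encode_offset13s(value):
      """Return the 13-bit two's-complement field for a signed byte offset."""
      if not -4096 <= value <= 4095:
          raise ValueError("offset must be in the range -4096..4095")
      return value & 0x1FFF  # keep the low 13 bits

  def decode_offset13s(field):
      """Recover the signed offset from a 13-bit field."""
      if not 0 <= field <= 0x1FFF:
          raise ValueError("field must fit in 13 bits")
      # Bit 12 is the sign bit of the field.
      return field - 0x2000 if field & 0x1000 else field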
251
252glc
253~~~
254
255See a description :ref:`here<amdgpu_synid_glc>`.
256
257slc
258~~~
259
260See a description :ref:`here<amdgpu_synid_slc>`.
261
262tfe
263~~~
264
265See a description :ref:`here<amdgpu_synid_tfe>`.
266
267nv
268~~
269
270See a description :ref:`here<amdgpu_synid_nv>`.
271
272MIMG Modifiers
273--------------
274
275.. _amdgpu_synid_dmask:
276
277dmask
278~~~~~
279
280Specifies which channels (image components) are used by the operation. By default, no channels
281are used.
282
283 =============== =====================================================
284 Syntax Description
285 =============== =====================================================
286 dmask:{0..15} Specifies image channels as a positive
287 :ref:`integer number <amdgpu_synid_integer_number>`.
288
289 Each bit corresponds to one of 4 image
290 components (RGBA).
291
292 If the specified bit value
293 is 0, the component is not used, value 1 means
294 that the component is used.
295 =============== =====================================================
296
297This modifier has some limitations depending on instruction kind:
298
299 =================================================== ========================
300 Instruction Kind Valid dmask Values
301 =================================================== ========================
302 32-bit atomic *cmpswap* 0x3
303 32-bit atomic instructions except for *cmpswap* 0x1
304 64-bit atomic *cmpswap* 0xF
305 64-bit atomic instructions except for *cmpswap* 0x3
306 *gather4* 0x1, 0x2, 0x4, 0x8
307 Other instructions any value
308 =================================================== ========================
309
310Examples:
311
.. parsed-literal::

 dmask:0xf
 dmask:0b1111
 dmask:3
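
A *dmask* value is a 4-bit set of channels, so it can be decoded with simple bit tests. The
Python sketch below lists the selected channels and checks a value against the per-instruction
limits given above. The kind names used as dictionary keys are made up for this example, and
mapping bit 0 to R (through bit 3 to A) is an assumption.

.. code-block:: python

  CHANNELS = "RGBA"  # assumption: bit 0 selects R, bit 3 selects A

  # Valid dmask values per instruction kind, copied from the table above.
  # Kinds that are not listed accept any value.
  VALID_DMASK = {
      "atomic32_cmpswap": {0x3},
      "atomic32": {0x1},
      "atomic64_cmpswap": {0xF},
      "atomic64": {0x3},
      "gather4": {0x1, 0x2, 0x4, 0x8},
  }

  def dmask_channels(dmask):
      """Return the list of channels selected by a dmask value."""
      if not 0 <= dmask <= 15:
          raise ValueError("dmask must be in the range 0..15")
      return [c for i, c in enumerate(CHANNELS) if (dmask >> i) & 1]

  def dmask_is_valid(kind, dmask):
      """Check a dmask value against the limits for an instruction kind."""
      allowed = VALID_DMASK.get(kind)  # None means "any value"
      return 0 <= dmask <= 15 and (allowed is None or dmask in allowed)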
317
318.. _amdgpu_synid_unorm:
319
320unorm
321~~~~~
322
323Specifies whether the address is normalized or not (the address is normalized by default).
324
325 ======================== ========================================
326 Syntax Description
327 ======================== ========================================
328 unorm Force the address to be unnormalized.
329 ======================== ========================================
330
331glc
332~~~
333
334See a description :ref:`here<amdgpu_synid_glc>`.
335
336slc
337~~~
338
339See a description :ref:`here<amdgpu_synid_slc>`.
340
341.. _amdgpu_synid_r128:
342
343r128
344~~~~
345
346Specifies texture resource size. The default size is 256 bits.
347
348GFX7 and GFX8 only.
349
 =================== ================================================
 Syntax              Description
 =================== ================================================
 r128                Specifies a 128-bit texture resource size.
 =================== ================================================
355
.. WARNING:: Using this modifier should decrease the *rsrc* operand size from 8 to 4 dwords, but the assembler does not currently support this feature.

tfe
~~~
360
361See a description :ref:`here<amdgpu_synid_tfe>`.
362
363.. _amdgpu_synid_lwe:
364
365lwe
366~~~
367
368Specifies LOD warning status (LOD warning is disabled by default).
369
370 ======================================== ================================================
371 Syntax Description
372 ======================================== ================================================
373 lwe Enables LOD warning.
374 ======================================== ================================================
375
376.. _amdgpu_synid_da:
377
378da
379~~
380
381Specifies if an array index must be sent to TA. By default, array index is not sent.
382
383 ======================================== ================================================
384 Syntax Description
385 ======================================== ================================================
386 da Send an array-index to TA.
387 ======================================== ================================================
388
389.. _amdgpu_synid_d16:
390
391d16
392~~~
393
394Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
395
 ======================================== ================================================
 Syntax                                   Description
 ======================================== ================================================
 d16                                      Enables 16-bit data mode.

                                          On loads, convert data in memory to 16-bit
                                          format before storing it in VGPRs.

                                          For stores, convert 16-bit data in VGPRs to
                                          32 bits before going to memory.

                                          Note that GFX8.0 does not support data packing.
                                          Each 16-bit data element occupies 1 VGPR.

                                          GFX8.1 and GFX9 support data packing.
                                          Each pair of 16-bit data elements
                                          occupies 1 VGPR.
 ======================================== ================================================
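
The packing rule determines how many VGPRs a *d16* operand occupies. A minimal sketch of that
arithmetic follows; the function name and the ``packed`` flag are hypothetical, with ``packed``
standing for targets that support data packing (GFX8.1 and GFX9 according to the table).

.. code-block:: python

  def d16_vgpr_count(elements, packed):
      """Number of VGPRs needed for `elements` 16-bit data elements."""
      if elements < 0:
          raise ValueError("element count must not be negative")
      if packed:
          # Two 16-bit elements share one VGPR; an odd element takes a whole one.
          return (elements + 1) // 2
      # Without packing, each 16-bit element occupies its own VGPR.
      return elements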
414
415.. _amdgpu_synid_a16:
416
417a16
418~~~
419
420Specifies size of image address components: 16 or 32 bits (32 bits by default). GFX9 only.
421
 ======================================== ================================================
 Syntax                                   Description
 ======================================== ================================================
 a16                                      Enables 16-bit image address components.
 ======================================== ================================================
427
428Miscellaneous Modifiers
429-----------------------
430
431.. _amdgpu_synid_glc:
432
433glc
434~~~
435
436This modifier has different meaning for loads, stores, and atomic operations.
437The default value is off (0).
438
439See AMD documentation for details.
440
441 ======================================== ================================================
442 Syntax Description
443 ======================================== ================================================
444 glc Set glc bit to 1.
445 ======================================== ================================================
446
447.. _amdgpu_synid_slc:
448
449slc
450~~~
451
452Specifies cache policy. The default value is off (0).
453
454See AMD documentation for details.
455
456 ======================================== ================================================
457 Syntax Description
458 ======================================== ================================================
459 slc Set slc bit to 1.
460 ======================================== ================================================
461
462.. _amdgpu_synid_tfe:
463
464tfe
465~~~
466
467Controls access to partially resident textures. The default value is off (0).
468
469See AMD documentation for details.
470
471 ======================================== ================================================
472 Syntax Description
473 ======================================== ================================================
474 tfe Set tfe bit to 1.
475 ======================================== ================================================
476
477.. _amdgpu_synid_nv:
478
479nv
480~~
481
Specifies whether the instruction operates on non-volatile memory. By default, memory is volatile.
483
484GFX9 only.
485
486 ======================================== ================================================
487 Syntax Description
488 ======================================== ================================================
489 nv Indicates that instruction operates on
490 non-volatile memory.
491 ======================================== ================================================
492
493MUBUF/MTBUF Modifiers
494---------------------
495
496.. _amdgpu_synid_idxen:
497
498idxen
499~~~~~
500
501Specifies whether address components include an index. By default, no components are used.
502
503Can be used together with :ref:`offen<amdgpu_synid_offen>`.
504
505Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
506
507 ======================================== ================================================
508 Syntax Description
509 ======================================== ================================================
510 idxen Address components include an index.
511 ======================================== ================================================
512
513.. _amdgpu_synid_offen:
514
515offen
516~~~~~
517
518Specifies whether address components include an offset. By default, no components are used.
519
520Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
521
522Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
523
524 ======================================== ================================================
525 Syntax Description
526 ======================================== ================================================
527 offen Address components include an offset.
528 ======================================== ================================================
529
530.. _amdgpu_synid_addr64:
531
532addr64
533~~~~~~
534
535Specifies whether a 64-bit address is used. By default, no address is used.
536
537GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
538:ref:`idxen<amdgpu_synid_idxen>` modifiers.
539
540 ======================================== ================================================
541 Syntax Description
542 ======================================== ================================================
543 addr64 A 64-bit address is used.
544 ======================================== ================================================
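
The compatibility rules for *idxen*, *offen* and *addr64* amount to a small consistency check.
The sketch below expresses only the rules stated in this section; the function name is
hypothetical and GPU-specific availability (for example, *addr64* being GFX7 only) is not checked.

.. code-block:: python

  ADDRESS_MODIFIERS = {"idxen", "offen", "addr64"}

  def address_modifiers_compatible(modifiers):
      """Return True if the given address modifiers may be used together."""
      mods = set(modifiers)
      unknown = mods - ADDRESS_MODIFIERS
      if unknown:
          raise ValueError(f"unknown modifiers: {sorted(unknown)}")
      # addr64 cannot be combined with idxen or offen;
      # idxen and offen may be used together.
      return not ("addr64" in mods and mods & {"idxen", "offen"})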
545
546.. _amdgpu_synid_buf_offset12:
547
offset12
~~~~~~~~

Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
552
553 =============================== ======================================================
554 Syntax Description
555 =============================== ======================================================
556 offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive
557 :ref:`integer number <amdgpu_synid_integer_number>`.
558 =============================== ======================================================
559
560Examples:
561
.. parsed-literal::

 offset:0
 offset:0x10
566
567glc
568~~~
569
570See a description :ref:`here<amdgpu_synid_glc>`.
571
572slc
573~~~
574
575See a description :ref:`here<amdgpu_synid_slc>`.
576
577.. _amdgpu_synid_lds:
578
579lds
580~~~
581
582Specifies where to store the result: VGPRs or LDS (VGPRs by default).
583
584 ======================================== ===========================
585 Syntax Description
586 ======================================== ===========================
587 lds Store result in LDS.
588 ======================================== ===========================
589
590tfe
591~~~
592
593See a description :ref:`here<amdgpu_synid_tfe>`.
594
595.. _amdgpu_synid_dfmt:
596
597dfmt
598~~~~
599
600TBD
601
602.. _amdgpu_synid_nfmt:
603
604nfmt
605~~~~
606
607TBD
608
609SMRD/SMEM Modifiers
610-------------------
611
612glc
613~~~
614
615See a description :ref:`here<amdgpu_synid_glc>`.
616
617nv
618~~
619
620See a description :ref:`here<amdgpu_synid_nv>`.
621
622VINTRP Modifiers
623----------------
624
625.. _amdgpu_synid_high:
626
627high
628~~~~
629
630Specifies which half of the LDS word to use. Low half of LDS word is used by default.
631GFX9 only.
632
633 ======================================== ================================
634 Syntax Description
635 ======================================== ================================
636 high Use high half of LDS word.
637 ======================================== ================================
638
639VOP1/VOP2 DPP Modifiers
640-----------------------
641
642GFX8 and GFX9 only.
643
644.. _amdgpu_synid_dpp_ctrl:
645
646dpp_ctrl
647~~~~~~~~
648
649Specifies how data are shared between threads. This is a mandatory modifier.
650There is no default value.
651
652Note. The lanes of a wavefront are organized in four banks and four rows.
653
654 ======================================== ================================================
655 Syntax Description
656 ======================================== ================================================
657 quad_perm:[{0..3},{0..3},{0..3},{0..3}] Full permute of 4 threads.
658 row_mirror Mirror threads within row.
659 row_half_mirror Mirror threads within 1/2 row (8 threads).
660 row_bcast:15 Broadcast 15th thread of each row to next row.
661 row_bcast:31 Broadcast thread 31 to rows 2 and 3.
662 wave_shl:1 Wavefront left shift by 1 thread.
663 wave_rol:1 Wavefront left rotate by 1 thread.
664 wave_shr:1 Wavefront right shift by 1 thread.
665 wave_ror:1 Wavefront right rotate by 1 thread.
666 row_shl:{1..15} Row shift left by 1-15 threads.
667 row_shr:{1..15} Row shift right by 1-15 threads.
668 row_ror:{1..15} Row rotate right by 1-15 threads.
669 ======================================== ================================================
670
671Note: Numeric parameters may be specified as either
672:ref:`integer numbers<amdgpu_synid_integer_number>` or
673:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
674
675Examples:
676
Dmitry Preobrazhensky23da1102018-12-17 18:53:10 +0000677.. parsed-literal::
Dmitry Preobrazhensky51120d72018-12-17 17:38:11 +0000678
679 quad_perm:[0, 1, 2, 3]
680 row_shl:3
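
The permute and mirror controls can be modeled as functions from a thread index to the index
of the thread it reads from. The Python sketch below covers *quad_perm*, *row_mirror* and
*row_half_mirror* only. The function names are hypothetical, the row size of 16 threads is
inferred from the "1/2 row (8 threads)" entry in the table, and reading entry *j* of
*quad_perm* as the source thread for thread *j* of each group of 4 is an assumption.

.. code-block:: python

  ROW = 16  # threads per row, inferred from "1/2 row (8 threads)"

  def quad_perm(perm, lane):
      """Source thread for quad_perm:[a,b,c,d] within each group of 4 threads."""
      if len(perm) != 4 or any(not 0 <= p <= 3 for p in perm):
          raise ValueError("quad_perm takes four values in the range 0..3")
      return (lane // 4) * 4 + perm[lane % 4]

  def row_mirror(lane):
      """Source thread for row_mirror: thread order is reversed within a row."""
      return (lane // ROW) * ROW + (ROW - 1 - lane % ROW)

  def row_half_mirror(lane):
      """Source thread for row_half_mirror: reversed within each half row."""
      half = ROW // 2
      return (lane // half) * half + (half - 1 - lane % half)

With these definitions, ``quad_perm([0, 1, 2, 3], lane)`` is the identity for every thread.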
681
682.. _amdgpu_synid_row_mask:
683
684row_mask
685~~~~~~~~
686
687Controls which rows are enabled for data sharing. By default, all rows are enabled.
688
689Note. The lanes of a wavefront are organized in four banks and four rows.
690
691 ======================================== =====================================================
692 Syntax Description
693 ======================================== =====================================================
694 row_mask:{0..15} Specifies a *row mask* as a positive
695 :ref:`integer number <amdgpu_synid_integer_number>`.
696
697 Each of 4 bits in the mask controls one
698 row (0 - disabled, 1 - enabled).
699 ======================================== =====================================================
700
701Examples:
702
Dmitry Preobrazhensky23da1102018-12-17 18:53:10 +0000703.. parsed-literal::
Dmitry Preobrazhensky51120d72018-12-17 17:38:11 +0000704
705 row_mask:0xf
706 row_mask:0b1010
707 row_mask:0b1111
708
709.. _amdgpu_synid_bank_mask:
710
711bank_mask
712~~~~~~~~~
713
714Controls which banks are enabled for data sharing. By default, all banks are enabled.
715
716Note. The lanes of a wavefront are organized in four banks and four rows.
717
718 ======================================== =======================================================
719 Syntax Description
720 ======================================== =======================================================
721 bank_mask:{0..15} Specifies a *bank mask* as a positive
722 :ref:`integer number <amdgpu_synid_integer_number>`.
723
724 Each of 4 bits in the mask controls one
725 bank (0 - disabled, 1 - enabled).
726 ======================================== =======================================================
727
728Examples:
729
Dmitry Preobrazhensky23da1102018-12-17 18:53:10 +0000730.. parsed-literal::
Dmitry Preobrazhensky51120d72018-12-17 17:38:11 +0000731
732 bank_mask:0x3
733 bank_mask:0b0011
734 bank_mask:0b1111
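
Both *row_mask* and *bank_mask* are 4-bit masks, so the same decoding applies to each of them.
The sketch below lists which rows (or banks) a mask enables. The function name is hypothetical,
and the correspondence of bit *i* to row or bank *i* is an assumption.

.. code-block:: python

  def enabled_units(mask):
      """Indices of the rows (or banks) whose mask bit is 1."""
      if not 0 <= mask <= 15:
          raise ValueError("mask must be in the range 0..15")
      # assumption: bit i of the mask controls row (or bank) i
      return [i for i in range(4) if (mask >> i) & 1]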
735
736.. _amdgpu_synid_bound_ctrl:
737
738bound_ctrl
739~~~~~~~~~~
740
741Controls data sharing when accessing an invalid lane. By default, data sharing with
742invalid lanes is disabled.
743
744 ======================================== ================================================
745 Syntax Description
746 ======================================== ================================================
747 bound_ctrl:0 Enables data sharing with invalid lanes.
748
749 Accessing data from an invalid lane will
750 return zero.
751 ======================================== ================================================
752
753VOP1/VOP2/VOPC SDWA Modifiers
754-----------------------------
755
756GFX8 and GFX9 only.
757
758clamp
759~~~~~
760
761See a description :ref:`here<amdgpu_synid_clamp>`.
762
763omod
764~~~~
765
766See a description :ref:`here<amdgpu_synid_omod>`.
767
768GFX9 only.
769
770.. _amdgpu_synid_dst_sel:
771
772dst_sel
773~~~~~~~
774
775Selects which bits in the destination are affected. By default, all bits are affected.
776
777 ======================================== ================================================
778 Syntax Description
779 ======================================== ================================================
780 dst_sel:DWORD Use bits 31:0.
781 dst_sel:BYTE_0 Use bits 7:0.
782 dst_sel:BYTE_1 Use bits 15:8.
783 dst_sel:BYTE_2 Use bits 23:16.
784 dst_sel:BYTE_3 Use bits 31:24.
785 dst_sel:WORD_0 Use bits 15:0.
786 dst_sel:WORD_1 Use bits 31:16.
787 ======================================== ================================================
788
789
790.. _amdgpu_synid_dst_unused:
791
792dst_unused
793~~~~~~~~~~
794
795Controls what to do with the bits in the destination which are not selected
796by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
797By default, unused bits are preserved.
798
799 ======================================== ================================================
800 Syntax Description
801 ======================================== ================================================
802 dst_unused:UNUSED_PAD Pad with zeros.
803 dst_unused:UNUSED_SEXT Sign-extend upper bits, zero lower bits.
804 dst_unused:UNUSED_PRESERVE Preserve bits.
805 ======================================== ================================================
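
The interaction between *dst_sel* and *dst_unused* can be illustrated by modeling a 32-bit
destination register as an integer. The following Python sketch is one reading of the two
tables above, not a description of the hardware; the function name and its parameters are
hypothetical.

.. code-block:: python

  # Bit ranges from the dst_sel table: name -> (lowest bit, width).
  SEL = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
      "WORD_0": (0, 16), "WORD_1": (16, 16),
  }

  def sdwa_write(old, value, dst_sel="DWORD", dst_unused="UNUSED_PRESERVE"):
      """Return the new 32-bit destination after writing `value` into dst_sel."""
      low, width = SEL[dst_sel]
      field_mask = ((1 << width) - 1) << low
      field = (value << low) & field_mask
      if dst_unused == "UNUSED_PRESERVE":
          rest = old & ~field_mask          # keep all bits outside the field
      elif dst_unused == "UNUSED_PAD":
          rest = 0                          # zero all bits outside the field
      elif dst_unused == "UNUSED_SEXT":
          sign = (field >> (low + width - 1)) & 1
          upper = 0xFFFFFFFF & ~((1 << (low + width)) - 1)
          rest = upper if sign else 0       # sign-extend upward, zero below
      else:
          raise ValueError("unknown dst_unused value")
      return (field | rest) & 0xFFFFFFFF

For example, writing ``0x80`` to ``BYTE_1`` of ``0xAABBCCDD`` gives ``0xAABB80DD`` with
``UNUSED_PRESERVE``, ``0x00008000`` with ``UNUSED_PAD`` and ``0xFFFF8000`` with ``UNUSED_SEXT``.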
806
807.. _amdgpu_synid_src0_sel:
808
809src0_sel
810~~~~~~~~
811
812Controls which bits in the src0 are used. By default, all bits are used.
813
814 ======================================== ================================================
815 Syntax Description
816 ======================================== ================================================
817 src0_sel:DWORD Use bits 31:0.
818 src0_sel:BYTE_0 Use bits 7:0.
819 src0_sel:BYTE_1 Use bits 15:8.
820 src0_sel:BYTE_2 Use bits 23:16.
821 src0_sel:BYTE_3 Use bits 31:24.
822 src0_sel:WORD_0 Use bits 15:0.
823 src0_sel:WORD_1 Use bits 31:16.
824 ======================================== ================================================
825
826.. _amdgpu_synid_src1_sel:
827
828src1_sel
829~~~~~~~~
830
831Controls which bits in the src1 are used. By default, all bits are used.
832
833 ======================================== ================================================
834 Syntax Description
835 ======================================== ================================================
836 src1_sel:DWORD Use bits 31:0.
837 src1_sel:BYTE_0 Use bits 7:0.
838 src1_sel:BYTE_1 Use bits 15:8.
839 src1_sel:BYTE_2 Use bits 23:16.
840 src1_sel:BYTE_3 Use bits 31:24.
841 src1_sel:WORD_0 Use bits 15:0.
842 src1_sel:WORD_1 Use bits 31:16.
843 ======================================== ================================================
844
845.. _amdgpu_synid_sdwa_operand_modifiers:
846
847VOP1/VOP2/VOPC SDWA Operand Modifiers
848-------------------------------------
849
850Operand modifiers are not used separately. They are applied to source operands.
851
852GFX8 and GFX9 only.
853
854abs
855~~~
856
857See a description :ref:`here<amdgpu_synid_abs>`.
858
859neg
860~~~
861
862See a description :ref:`here<amdgpu_synid_neg>`.
863
864.. _amdgpu_synid_sext:
865
866sext
867~~~~
868
Sign-extends the value of a (sub-dword) operand to fill all 32 bits.
Has no effect on 32-bit operands.
871
872Valid for integer operands only.
873
874 ======================================== ================================================
875 Syntax Description
876 ======================================== ================================================
877 sext(<operand>) Sign-extend operand value.
878 ======================================== ================================================
879
880Examples:
881
.. parsed-literal::

 sext(v4)
 sext(v255)
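
The effect of *sext* on a sub-dword operand can be shown together with the source selectors
described earlier. The sketch below models a 32-bit source register as an integer; the function
name and parameters are hypothetical, and the bit ranges are those of the *src0_sel* and
*src1_sel* tables.

.. code-block:: python

  # Bit ranges from the src0_sel/src1_sel tables: name -> (lowest bit, width).
  SEL = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
      "WORD_0": (0, 16), "WORD_1": (16, 16),
  }

  def sdwa_source(value, sel="DWORD", sext=False):
      """Extract the selected bits of `value`, optionally sign-extended."""
      low, width = SEL[sel]
      field = (value >> low) & ((1 << width) - 1)
      if sext and (field >> (width - 1)) & 1:
          # Fill the bits above the field with ones; for DWORD there are none.
          field |= 0xFFFFFFFF & ~((1 << width) - 1)
      return field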
886
887VOP3 Modifiers
888--------------
889
890.. _amdgpu_synid_vop3_op_sel:
891
op_sel
~~~~~~

Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
By default, low bits are used for all operands.
897
898The number of values specified with the op_sel modifier must match the number of instruction
899operands (both source and destination). First value controls src0, second value controls src1
900and so on, except that the last value controls destination.
901The value 0 selects the low bits, while 1 selects the high bits.
902
903Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
904by op_sel must be 0.
905
906GFX9 only.
907
908 ======================================== ============================================================
909 Syntax Description
910 ======================================== ============================================================
911 op_sel:[{0..1},{0..1}] Select operand bits for instructions with 1 source operand.
912 op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 2 source operands.
913 op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
914 ======================================== ============================================================
915
916Examples:
917
.. parsed-literal::

 op_sel:[0,0]
 op_sel:[0,1]
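
The positional rule for *op_sel* values (sources first, destination last) can be sketched as a
small helper that labels each value. The function name and the returned dictionary layout are
hypothetical; the check that 32-bit operands use 0 is not modeled.

.. code-block:: python

  def describe_op_sel(values, num_sources):
      """Map each op_sel value to the operand it controls."""
      if len(values) != num_sources + 1:
          raise ValueError("op_sel needs one value per source plus one for dst")
      if any(v not in (0, 1) for v in values):
          raise ValueError("op_sel values must be 0 or 1")
      # The last value controls the destination, the others control src0, src1, ...
      names = [f"src{i}" for i in range(num_sources)] + ["dst"]
      return {name: "high" if v else "low" for name, v in zip(names, values)}

For an instruction with one source operand, ``describe_op_sel([0, 1], 1)`` reports that *src0*
uses the low bits and the destination uses the high bits.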
922
923.. _amdgpu_synid_clamp:
924
925clamp
926~~~~~
927
The meaning of the clamp modifier depends on the instruction.
929
930For *v_cmp* instructions, clamp modifier indicates that the compare signals
931if a floating point exception occurs. By default, signaling is disabled.
932Not supported by GFX7.
933
934For integer operations, clamp modifier indicates that the result must be clamped
935to the largest and smallest representable value. By default, there is no clamping.
936Integer clamping is not supported by GFX7.
937
938For floating point operations, clamp modifier indicates that the result must be clamped
939to the range [0.0, 1.0]. By default, there is no clamping.
940
941Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
942
943 ======================================== ================================================
944 Syntax Description
945 ======================================== ================================================
946 clamp Enables clamping (or signaling).
947 ======================================== ================================================
948
.. _amdgpu_synid_omod:

omod
~~~~

Specifies whether an output modifier must be applied to the result.
By default, no output modifiers are applied.

Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).

Output modifiers are valid for f32 and f64 floating point results only.
They must not be used with f16.

Note. *v_cvt_f16_f32* is an exception. This instruction produces an f16 result
but accepts output modifiers.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    mul:2                                    Multiply the result by 2.
    mul:4                                    Multiply the result by 4.
    div:2                                    Multiply the result by 0.5.
    ======================================== ============================================================

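The ordering of output modifiers and floating point clamping matters when both are
present. The following Python sketch illustrates it under simplifying assumptions:
the function ``apply_result_modifiers`` and the table ``OMOD_FACTORS`` are hypothetical
names, ordinary Python floats stand in for f32/f64 results, and special values such as
NaN are not modelled.

```python
# Scale factor for each output modifier; None means "no output modifier".
OMOD_FACTORS = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}


def apply_result_modifiers(result, omod=None, clamp=False):
    """Apply the output modifier first, then clamp to [0.0, 1.0]."""
    value = result * OMOD_FACTORS[omod]
    if clamp:
        value = min(max(value, 0.0), 1.0)
    return value


scaled_then_clamped = apply_result_modifiers(0.75, omod="mul:2", clamp=True)
halved = apply_result_modifiers(0.75, omod="div:2")
```

With *mul:2* and *clamp*, a raw result of 0.75 is first scaled to 1.5 and then clamped
to 1.0. Applying the two steps in the opposite order would give 1.5 instead.
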
.. _amdgpu_synid_vop3_operand_modifiers:

VOP3 Operand Modifiers
----------------------

Operand modifiers are not used separately. They are applied to source operands.

.. _amdgpu_synid_abs:

abs
~~~

Computes the absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
Valid for floating point operands only.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    abs(<operand>)                           Get absolute value of operand.
    \|<operand>|                             The same as above.
    ======================================== ============================================================

Examples:

.. parsed-literal::

    abs(v36)
    \|v36|

.. _amdgpu_synid_neg:

neg
~~~

Computes the negative value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
Valid for floating point operands only.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    neg(<operand>)                           Get negative value of operand.
    -<operand>                               The same as above.
    ======================================== ============================================================

Examples:

.. parsed-literal::

    neg(v[0])
    -v4

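Because *abs* is applied before *neg*, combining both modifiers on one operand always
yields a non-positive value. A minimal Python sketch of that ordering follows; the
function name ``apply_source_modifiers`` is hypothetical and Python floats stand in
for the floating point operand.

```python
def apply_source_modifiers(value, use_abs=False, use_neg=False):
    """Apply abs first (if requested), then neg (if requested)."""
    if use_abs:
        value = abs(value)
    if use_neg:
        value = -value
    return value


# Both modifiers on the same operand: the result is -|value|.
from_negative = apply_source_modifiers(-3.5, use_abs=True, use_neg=True)
from_positive = apply_source_modifiers(3.5, use_abs=True, use_neg=True)
```

Both calls produce -3.5, regardless of the sign of the input.
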
VOP3P Modifiers
---------------

This section describes modifiers of *regular* VOP3P instructions.

*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16*
instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.

GFX9 only.

.. _amdgpu_synid_op_sel:

op_sel
~~~~~~

Selects the low [15:0] or high [31:16] operand bits as input to the operation
which results in the lower-half of the destination.
By default, low bits are used for all operands.

The number of values specified by the *op_sel* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 selects the low bits, while 1 selects the high bits.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    op_sel:[{0..1}]                          Select operand bits for instructions with 1 source operand.
    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 2 source operands.
    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 3 source operands.
    ======================================== ============================================================

Examples:

.. parsed-literal::

    op_sel:[0,0]
    op_sel:[0,1,0]

.. _amdgpu_synid_op_sel_hi:

op_sel_hi
~~~~~~~~~

Selects the low [15:0] or high [31:16] operand bits as input to the operation
which results in the upper-half of the destination.
By default, high bits are used for all operands.

The number of values specified by the *op_sel_hi* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 selects the low bits, while 1 selects the high bits.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    op_sel_hi:[{0..1}]                       Select operand bits for instructions with 1 source operand.
    op_sel_hi:[{0..1},{0..1}]                Select operand bits for instructions with 2 source operands.
    op_sel_hi:[{0..1},{0..1},{0..1}]         Select operand bits for instructions with 3 source operands.
    ======================================== ============================================================

Examples:

.. parsed-literal::

    op_sel_hi:[0,0]
    op_sel_hi:[0,0,1]

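The interplay of *op_sel* and *op_sel_hi* on a packed instruction can be sketched in
Python as follows. This is a simplified model, not a description of any particular
instruction: ``half``, ``packed_op`` and ``add16`` are hypothetical names, registers are
plain 32-bit integers, and a wrapping 16-bit integer addition is assumed as the example
operation.

```python
def half(reg32, sel):
    """Return the low (sel=0) or high (sel=1) 16 bits of a 32-bit value."""
    return (reg32 >> 16) & 0xFFFF if sel else reg32 & 0xFFFF


def packed_op(op, sources, op_sel=None, op_sel_hi=None):
    """Compute both 16-bit halves of the destination of a packed operation."""
    n = len(sources)
    if op_sel is None:
        op_sel = [0] * n        # default: low bits feed the lower half
    if op_sel_hi is None:
        op_sel_hi = [1] * n     # default: high bits feed the upper half
    if len(op_sel) != n or len(op_sel_hi) != n:
        raise ValueError("one value per source operand is required")
    lower = op(*[half(s, sel) for s, sel in zip(sources, op_sel)]) & 0xFFFF
    upper = op(*[half(s, sel) for s, sel in zip(sources, op_sel_hi)]) & 0xFFFF
    return (upper << 16) | lower


def add16(a, b):
    """Example operation (an assumption): 16-bit addition."""
    return a + b


a = 0x00020001   # high half = 0x0002, low half = 0x0001
b = 0x00400030   # high half = 0x0040, low half = 0x0030
default = packed_op(add16, [a, b])
swapped = packed_op(add16, [a, b], op_sel=[1, 0], op_sel_hi=[0, 1])
```

With the defaults, each half of the destination is computed from the matching halves of
the sources. With *op_sel:[1,0]* and *op_sel_hi:[0,1]*, the two halves of the first
source are swapped before the operation.
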
.. _amdgpu_synid_neg_lo:

neg_lo
~~~~~~

Specifies whether to change the sign of operand values selected by
:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
as input to the operation which results in the lower-half of the destination.

The number of values specified by this modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 indicates that the corresponding operand value is used unmodified,
the value 1 indicates that the negative value of the operand must be used.

By default, operand values are used unmodified.

This modifier is valid for floating point operands only.

    ======================================== ======================================================================
    Syntax                                   Description
    ======================================== ======================================================================
    neg_lo:[{0..1}]                          Select affected operands for instructions with 1 source operand.
    neg_lo:[{0..1},{0..1}]                   Select affected operands for instructions with 2 source operands.
    neg_lo:[{0..1},{0..1},{0..1}]            Select affected operands for instructions with 3 source operands.
    ======================================== ======================================================================

Examples:

.. parsed-literal::

    neg_lo:[0]
    neg_lo:[0,1]

.. _amdgpu_synid_neg_hi:

neg_hi
~~~~~~

Specifies whether to change the sign of operand values selected by
:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
as input to the operation which results in the upper-half of the destination.

The number of values specified by this modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 indicates that the corresponding operand value is used unmodified,
the value 1 indicates that the negative value of the operand must be used.

By default, operand values are used unmodified.

This modifier is valid for floating point operands only.

    ======================================== ======================================================================
    Syntax                                   Description
    ======================================== ======================================================================
    neg_hi:[{0..1}]                          Select affected operands for instructions with 1 source operand.
    neg_hi:[{0..1},{0..1}]                   Select affected operands for instructions with 2 source operands.
    neg_hi:[{0..1},{0..1},{0..1}]            Select affected operands for instructions with 3 source operands.
    ======================================== ======================================================================

Examples:

.. parsed-literal::

    neg_hi:[1,0]
    neg_hi:[0,1,1]

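The way *neg_lo* and *neg_hi* combine with *op_sel* and *op_sel_hi* can be sketched in
Python. The model is deliberately simplified and rests on assumptions: the function
``packed_inputs`` is a hypothetical name, and each source register is represented as a
``(low, high)`` pair of Python floats rather than as packed 16-bit encodings.

```python
def packed_inputs(sources, op_sel, op_sel_hi, neg_lo, neg_hi):
    """Return the inputs of the lower-half and upper-half operations.

    sources is a list of (low, high) float pairs, one per source operand.
    op_sel and neg_lo control the inputs of the lower-half operation;
    op_sel_hi and neg_hi control the inputs of the upper-half operation.
    """
    lower = [-src[sel] if neg else src[sel]
             for src, sel, neg in zip(sources, op_sel, neg_lo)]
    upper = [-src[sel] if neg else src[sel]
             for src, sel, neg in zip(sources, op_sel_hi, neg_hi)]
    return lower, upper


sources = [(1.0, 2.0), (3.0, 4.0)]
lower, upper = packed_inputs(sources,
                             op_sel=[0, 0], op_sel_hi=[1, 1],
                             neg_lo=[0, 1], neg_hi=[1, 0])
```

Here *neg_lo:[0,1]* negates the value of src1 used by the lower-half operation, and
*neg_hi:[1,0]* negates the value of src0 used by the upper-half operation.
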
clamp
~~~~~

See a description :ref:`here<amdgpu_synid_clamp>`.

.. _amdgpu_synid_mad_mix:

VOP3P V_MAD_MIX Modifiers
-------------------------

*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
use *op_sel* and *op_sel_hi* modifiers
in a manner different from *regular* VOP3P instructions.

See a description below.

GFX9 only.

.. _amdgpu_synid_mad_mix_op_sel:

m_op_sel
~~~~~~~~

This modifier has meaning only for 16-bit source operands, as indicated by
:ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
It selects either the low [15:0] or the high [31:16] operand bits
as input to the operation.

The number of values specified by the *op_sel* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 selects the low 16 bits, the value 1 selects the high 16 bits.

By default, low bits are used for all operands.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    op_sel:[{0..1},{0..1},{0..1}]            Select location of each 16-bit source operand.
    ======================================== ============================================================

Examples:

.. parsed-literal::

    op_sel:[0,1,0]

.. _amdgpu_synid_mad_mix_op_sel_hi:

m_op_sel_hi
~~~~~~~~~~~

Selects the size of source operands: either 32 bits or 16 bits.
By default, 32 bits are used for all source operands.

The number of values specified by the *op_sel_hi* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 indicates 32 bits, the value 1 indicates 16 bits.

The location of the 16 bits in the operand may be specified by
:ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
    ======================================== ============================================================

Examples:

.. parsed-literal::

    op_sel_hi:[1,1,1]

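The roles of *op_sel_hi* (operand size) and *op_sel* (location of a 16-bit operand) for
these instructions can be sketched in Python. The sketch makes assumptions that are not
stated above: the helper ``mad_mix_source`` is a hypothetical name, and source registers
are decoded as IEEE 754 single-precision (32-bit) or half-precision (16-bit) values.

```python
import struct


def mad_mix_source(reg32, op_sel, op_sel_hi):
    """Decode one source operand from a 32-bit register value.

    op_sel_hi = 0: use all 32 bits (op_sel has no meaning here).
    op_sel_hi = 1: use 16 bits; op_sel picks the low (0) or high (1) half.
    """
    if op_sel_hi == 0:
        # Reinterpret the 32-bit pattern as a single-precision float.
        return struct.unpack("<f", struct.pack("<I", reg32))[0]
    half_bits = (reg32 >> 16) & 0xFFFF if op_sel else reg32 & 0xFFFF
    # Reinterpret the 16-bit pattern as a half-precision float.
    return struct.unpack("<e", struct.pack("<H", half_bits))[0]


packed = 0x3C004000   # high half encodes 1.0, low half encodes 2.0 (f16)
from_low = mad_mix_source(packed, op_sel=0, op_sel_hi=1)
from_high = mad_mix_source(packed, op_sel=1, op_sel_hi=1)
full = mad_mix_source(0x3F800000, op_sel=0, op_sel_hi=0)
```

In this model, *op_sel_hi:[1,1,1]* treats all three sources as 16-bit values, and the
corresponding *op_sel* values then choose which half of each register is read.
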
abs
~~~

See a description :ref:`here<amdgpu_synid_abs>`.

neg
~~~

See a description :ref:`here<amdgpu_synid_neg>`.

clamp
~~~~~

See a description :ref:`here<amdgpu_synid_clamp>`.