@@ -99,7 +99,7 @@ duration in accordance with C11 section 7.6 "Floating-point environment

=== Vector Register Convention

- .Vector register convention
+ .Vector register convention for standard calling convention
[%autowidth]
|===
| Name | ABI Mnemonic | Meaning | Preserved across calls?
@@ -111,10 +111,28 @@ duration in accordance with C11 section 7.6 "Floating-point environment
| vxsat | | Vector fixed-point saturation flag register | No
|===

+ .Vector register convention for standard vector calling convention variant*
+ [%autowidth]
+ |===
+ | Name | ABI Mnemonic | Meaning | Preserved across calls?

- Vector registers are not used for passing arguments or return values; we
- intend to define a new calling convention variant to allow that as a future
- software optimization.
+ | v0 | | Argument register | No
+ | v1-v7 | | Callee-saved registers | Yes
+ | v8-v23 | | Argument registers | No
+ | v24-v31 | | Callee-saved registers | Yes
+ | vl | | Vector length | No
+ | vtype | | Vector data type register | No
+ | vxrm | | Vector fixed-point rounding mode register | No
+ | vxsat | | Vector fixed-point saturation flag register | No
+ |===
+
+ *: Functions that use vector registers to pass arguments and return values must
+ follow this calling convention. Some programming languages may require
+ additional functions to follow this calling convention (e.g. C/C++ functions
+ with the attribute `riscv_vector_cc`).
+
+ Please refer to the <<Standard Vector Calling Convention Variant>> section for
+ more details about the standard vector calling convention variant.

The `vxrm` and `vxsat` fields of `vcsr` are not preserved across calls and their
values are unspecified upon entry.
@@ -128,8 +146,8 @@ Any procedure that does explicitly write `vstart` to a nonzero value must zero

== Procedure Calling Convention

- This chapter defines standard calling conventions, and describes how to pass
- parameters and return values.
+ This chapter defines standard calling conventions and standard calling
+ convention variants, and describes how to pass arguments and return values.

Functions must follow the register convention defined in calling convention: the
contents of any register without specifying it as an argument register
@@ -329,6 +347,90 @@ type would be passed.
Floating-point registers fs0-fs11 shall be preserved across procedure calls,
provided they hold values no more than ABI_FLEN bits wide.

+ === Standard Vector Calling Convention Variant
+
+ The _RISC-V V Vector Extension_<<riscv-v-extension>> defines a set of thirty-two
+ vector registers, v0-v31. The _RISC-V Vector Extension Intrinsic
+ Document_<<rvv-intrinsic-doc>> defines vector types, which include vector mask
+ types, vector data types, and tuple vector data types. A value of vector type can
+ be stored in vector register groups.
+
+ The remainder of this section applies only to named vector arguments; other
+ named arguments and return values follow the standard calling convention.
+ Variadic vector arguments are passed by reference.
+
+ v0 is used to pass the first vector mask argument to a function, and to return
+ a vector mask result from a function. v8-v23 are used to pass vector data
+ arguments, tuple vector data arguments and the remaining vector mask arguments
+ to a function, and to return vector data and vector tuple results from a
+ function.
+
+ The callee must ensure that the entire contents of v1-v7 and v24-v31 are
+ preserved across the call.
+
+ Each vector data type and vector tuple type has an LMUL attribute that
+ indicates a vector register group. The value of LMUL indicates the number of
+ vector registers in the vector register group and requires that the first
+ vector register number in the group be a multiple of it. For example, the LMUL
+ of `vint64m8_t` is 8, so the v8-v15 vector register group can be allocated to
+ this type, but v9-v16 cannot because register number 9 is not a multiple of 8.
+ If LMUL is less than 1, it is treated as 1. The LMUL of a vector mask type
+ is 1.
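
The alignment requirement in the paragraph above can be illustrated with a small Python model. This is only a sketch, not part of the ABI text: the helper names `group_size` and `is_valid_group_start` are hypothetical, LMUL is modeled as an `int` or a `Fraction`, and only the alignment rule is checked (not whether the group lies within v8-v23).

```python
import math
from fractions import Fraction


def group_size(lmul):
    """Number of registers in a group; a fractional LMUL is treated as 1."""
    return max(1, math.ceil(lmul))


def is_valid_group_start(first_reg, lmul):
    """True if a register group with this LMUL may start at v<first_reg>.

    Hypothetical helper: checks only that the first register number is a
    multiple of the (effective) LMUL.
    """
    return first_reg % group_size(lmul) == 0
```

Under these assumptions, `is_valid_group_start(8, 8)` holds for the v8-v15 example, while `is_valid_group_start(9, 8)` does not. A fractional LMUL such as `Fraction(1, 2)` behaves like LMUL 1, so any register number is an acceptable start.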
+
+ Each vector tuple type also has an NFIELDS attribute that indicates how many
+ vector register groups the type contains. Thus a vector tuple type occupies
+ LMUL×NFIELDS registers.
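
As a worked illustration of the LMUL×NFIELDS count, the following Python sketch computes how many registers a type occupies. The function name `registers_needed` is hypothetical, and the sketch assumes that the "LMUL less than 1 is treated as 1" rule stated earlier also applies to each group of a tuple type.

```python
import math
from fractions import Fraction


def registers_needed(lmul, nfields=1):
    """Registers occupied by a vector type (hypothetical helper).

    lmul may be an int or a Fraction; a fractional LMUL counts as 1.
    nfields is 1 for non-tuple types.
    """
    group = max(1, math.ceil(lmul))
    return group * nfields
```

For instance, a tuple type such as `vint32m1x2_t` (LMUL 1, NFIELDS 2) would need `registers_needed(1, 2)` registers, that is, two.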
+
+ The rules for passing vector arguments are as follows:
+
+ 1. For the first vector mask argument, use v0 to pass it.
+
+ 2. For vector data arguments or the remaining vector mask arguments, starting
+ from the v8 register, if a vector register group between v8-v23 that has not
+ been allocated can be found and the first register number is a multiple of
+ LMUL, then allocate this vector register group to the argument and mark these
+ registers as allocated. Otherwise, pass the argument by reference and replace
+ it in the argument list with its address.
+
+ 3. For tuple vector data arguments, starting from the v8 register, if NFIELDS
+ consecutive vector register groups between v8-v23 that have not been allocated
+ can be found and the first register number is a multiple of LMUL, then allocate
+ these vector register groups to the argument and mark these registers as
+ allocated. Otherwise, pass the argument by reference and replace it in the
+ argument list with its address.
+
+ NOTE: The registers assigned to the tuple vector data argument must be
+ consecutive. For example, for the function
+ `void foo(vint32m1_t a, vint32m2_t b, vint32m1x2_t c)`, v8 will be allocated
+ to `a`, v10-v11 will be allocated to `b`, and v12-v13, rather than v9 and v12,
+ will be allocated to `c`.
+
+ NOTE: It should be stressed that the search for the appropriate vector register
+ groups starts at v8 each time, rather than at the register following those
+ allocated to the previous vector argument. Therefore, it is possible that the
+ vector register number allocated to a vector argument is less than the vector
+ register number allocated to previous vector arguments.
+ For example, for the function
+ `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
+ of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
+ and v9 will be allocated to `c`. This approach allows more vector registers to
+ be allocated to arguments in some cases.
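
The allocation rules and the two examples in the notes above can be modeled with a short Python sketch. It is an illustration under stated assumptions, not a normative algorithm: the function name `allocate_vector_args` and the `(name, kind, lmul, nfields)` tuple encoding are hypothetical, mask arguments are given LMUL 1, and fractional LMUL is treated as 1.

```python
import math

FIRST_ARG_REG = 8   # v8
LAST_ARG_REG = 23   # v23


def allocate_vector_args(args):
    """Assign registers to named vector arguments (illustrative model).

    args is a list of (name, kind, lmul, nfields) tuples, where kind is
    "mask", "data" or "tuple". Returns a dict mapping each name to a list
    of register numbers, or to the string "by-reference".
    """
    allocated = set()
    assignment = {}
    v0_used = False
    for name, kind, lmul, nfields in args:
        # Rule 1: the first vector mask argument goes in v0.
        if kind == "mask" and not v0_used:
            assignment[name] = [0]
            v0_used = True
            continue
        # Masks have LMUL 1; a fractional LMUL is treated as 1.
        group = 1 if kind == "mask" else max(1, math.ceil(lmul))
        count = group * (nfields if kind == "tuple" else 1)
        # Rules 2 and 3: the search restarts at v8 for every argument.
        assignment[name] = "by-reference"
        for start in range(FIRST_ARG_REG, LAST_ARG_REG - count + 2):
            regs = list(range(start, start + count))
            if start % group == 0 and allocated.isdisjoint(regs):
                allocated.update(regs)
                assignment[name] = regs
                break
    return assignment
```

Calling the sketch with `[("a", "data", 1, 1), ("b", "data", 2, 1), ("c", "tuple", 1, 2)]` models the first example, and replacing `c` with `("c", "data", 1, 1)` models the second. Because the search restarts at v8 for each argument, the second case can place `c` in the gap left at v9.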
+
+ Vector values are returned in the same manner as the first named argument of
+ the same type would be passed.
+
+ Vector types are not allowed as members of a struct or union.
+
+ Passing vector arguments to, or returning vector values from, an unprototyped
+ function is disallowed.
+
+ NOTE: Functions that use the standard vector calling convention variant must be
+ marked with `STO_RISCV_VARIANT_CC`; see <<Dynamic Linking>> for the meaning of
+ `STO_RISCV_VARIANT_CC`.
+
+ NOTE: `setjmp`/`longjmp` follow the standard calling convention, which clobbers
+ all vector registers. Hence, the standard vector calling convention variant
+ does not disrupt the `jmp_buf` ABI.
+
=== ILP32E Calling Convention

IMPORTANT: RV32E is not a ratified base ISA and so we cannot guarantee the
@@ -555,3 +657,13 @@ The following definitions apply for all ABIs defined in this document. Here
there is no differentiation between ILP32 and LP64 ABIs.

`wchar_t` is signed. `wint_t` is unsigned.
+
+ [bibliography]
+ == References
+
+ * [[[riscv-v-extension]]] "RISC-V V vector extension specification"
+ https://github.com/riscv/riscv-v-spec
+
+ * [[[rvv-intrinsic-doc]]] "RISC-V Vector Extension Intrinsic Document"
+ https://github.com/riscv-non-isa/rvv-intrinsic-doc
+