Commit d4c38ee

Merge pull request #389 from lhtin/master
Proposal for Vector Calling Convention
2 parents 52a6631 + 0fc69ed commit d4c38ee

1 file changed

+118
-6
lines changed

riscv-cc.adoc

+118-6
Original file line number | Diff line number | Diff line change
@@ -99,7 +99,7 @@ duration in accordance with C11 section 7.6 "Floating-point environment
9999

100100
=== Vector Register Convention
101101

102-
.Vector register convention
102+
.Vector register convention for standard calling convention
103103
[%autowidth]
104104
|===
105105
| Name | ABI Mnemonic | Meaning | Preserved across calls?
@@ -111,10 +111,28 @@ duration in accordance with C11 section 7.6 "Floating-point environment
111111
| vxsat | | Vector fixed-point saturation flag register | No
112112
|===
113113

114+
.Vector register convention for standard vector calling convention variant*
115+
[%autowidth]
116+
|===
117+
| Name | ABI Mnemonic | Meaning | Preserved across calls?
114118

115-
Vector registers are not used for passing arguments or return values; we
116-
intend to define a new calling convention variant to allow that as a future
117-
software optimization.
119+
| v0 | | Argument register | No
120+
| v1-v7 | | Callee-saved registers | Yes
121+
| v8-v23 | | Argument registers | No
122+
| v24-v31 | | Callee-saved registers | Yes
123+
| vl | | Vector length | No
124+
| vtype | | Vector data type register | No
125+
| vxrm | | Vector fixed-point rounding mode register | No
126+
| vxsat | | Vector fixed-point saturation flag register | No
127+
|===
128+
129+
*: Functions that use vector registers to pass arguments and return values must
130+
follow this calling convention. Some programming languages can require extra
131+
functions to follow this calling convention (e.g. C/C++ functions with
132+
attribute `riscv_vector_cc`).
133+
134+
Please refer to the <<Standard Vector Calling Convention Variant>> section for
135+
more details about the standard vector calling convention variant.
118136

119137
The `vxrm` and `vxsat` fields of `vcsr` are not preserved across calls and their
120138
values are unspecified upon entry.
@@ -128,8 +146,8 @@ Any procedure that does explicitly write `vstart` to a nonzero value must zero
128146

129147
== Procedure Calling Convention
130148

131-
This chapter defines standard calling conventions, and describes how to pass
132-
parameters and return values.
149+
This chapter defines standard calling conventions and standard calling
150+
convention variants, and describes how to pass arguments and return values.
133151

134152
Functions must follow the register convention defined in calling convention: the
135153
contents of any register without specifying it as an argument register
@@ -329,6 +347,90 @@ type would be passed.
329347
Floating-point registers fs0-fs11 shall be preserved across procedure calls,
330348
provided they hold values no more than ABI_FLEN bits wide.
331349

350+
=== Standard Vector Calling Convention Variant
351+
352+
The _RISC-V V Vector Extension_<<riscv-v-extension>> defines a set of thirty-two
353+
vector registers, v0-v31. The _RISC-V Vector Extension Intrinsic
354+
Document_<<rvv-intrinsic-doc>> defines vector types which include vector mask
355+
types, vector data types, and tuple vector data types. A value of vector type can
356+
be stored in vector register groups.
357+
358+
The remainder of this section applies only to named vector arguments; other
359+
named arguments and return values follow the standard calling convention.
360+
Variadic vector arguments are passed by reference.
361+
362+
v0 is used to pass the first vector mask argument to a function, and to return
363+
a vector mask result from a function. v8-v23 are used to pass vector data
364+
arguments, tuple vector data arguments and the remaining vector mask arguments to a
365+
function, and to return vector data and vector tuple results from a function.
366+
367+
The callee must ensure that the entire contents of v1-v7 and v24-v31 are preserved
368+
across the call.
369+
370+
Each vector data type and vector tuple type has an LMUL attribute that
371+
indicates a vector register group. The value of LMUL is the number of
372+
vector registers in the vector register group, and the number of the first
373+
vector register in the group must be a multiple of LMUL. For
374+
example, the LMUL of `vint64m8_t` is 8, so the v8-v15 vector register group can be
375+
allocated to this type, but v9-v16 cannot because the register number 9 is
376+
not a multiple of 8. If LMUL is less than 1, it is treated as 1. The LMUL of a
377+
vector mask type is 1.
378+
379+
Each vector tuple type also has an NFIELDS attribute that indicates how many
380+
vector register groups the type contains. Thus a vector tuple type needs to
381+
take up LMUL×NFIELDS registers.
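
The LMUL and NFIELDS rules above can be sketched as a few small helpers. This is only an illustration, not part of the specification: the function names `effective_lmul`, `registers_needed` and `valid_group_bases` are hypothetical, and the v8-v23 range is taken from the argument register range described in this section.

```python
def effective_lmul(lmul, is_mask=False):
    """LMUL as used for register allocation (hypothetical helper).

    Fractional LMUL is treated as 1, and a vector mask type has LMUL 1.
    """
    if is_mask or lmul < 1:
        return 1
    return int(lmul)


def registers_needed(lmul, nfields=1, is_mask=False):
    """Number of vector registers a type occupies: LMUL x NFIELDS."""
    return effective_lmul(lmul, is_mask) * nfields


def valid_group_bases(lmul, first=8, last=23):
    """First-register numbers at which one register group may start.

    The base must be a multiple of LMUL and the whole group must fit
    inside the first..last range (v8-v23 for arguments).
    """
    step = effective_lmul(lmul)
    return [r for r in range(first, last + 1)
            if r % step == 0 and r + step - 1 <= last]
```

For instance, under these assumptions `valid_group_bases(8)` yields only the bases 8 and 16, which corresponds to the `vint64m8_t` example: v8-v15 is a valid group, while a group starting at v9 is not.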
382+
383+
The rules for passing vector arguments are as follows:
384+
385+
1. For the first vector mask argument, use v0 to pass it.
386+
387+
2. For vector data arguments or the remaining vector mask arguments, starting from the
388+
v8 register, if a vector register group between v8-v23 that has not been
389+
allocated can be found and the first register number is a multiple of LMUL,
390+
then allocate this vector register group to the argument and mark these
391+
registers as allocated. Otherwise, pass the argument by reference; it is replaced in
392+
the argument list with its address.
393+
394+
3. For tuple vector data arguments, starting from the v8 register, if NFIELDS
395+
consecutive vector register groups between v8-v23 that have not been allocated
396+
can be found and the first register number is a multiple of LMUL, then allocate
397+
these vector register groups to the argument and mark these registers as
398+
allocated. Otherwise, pass the argument by reference; it is replaced in the argument list
399+
with its address.
400+
401+
NOTE: The registers assigned to the tuple vector data argument must be
402+
consecutive. For example, for the function
403+
`void foo(vint32m1_t a, vint32m2_t b, vint32m1x2_t c)`, v8 will be allocated
404+
to `a`, v10-v11 will be allocated to `b`, and v12-v13 (rather than v9 and v12) will
405+
be allocated to `c`.
406+
407+
NOTE: It should be stressed that the search for the appropriate vector register
408+
groups starts at v8 each time and does not start at the next register after the
409+
registers are allocated for the previous vector argument. Therefore, it is
410+
possible that the vector register number allocated to a vector argument can be
411+
less than the vector register number allocated to previous vector arguments.
412+
For example, for the function
413+
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
414+
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
415+
and v9 will be allocated to `c`. This approach allows more vector registers to
416+
be allocated to arguments in some cases.
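
The allocation rules and the two notes above can be modelled with a short simulation. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `allocate_vector_args`, the `(name, kind, lmul, nfields)` tuple encoding and the `"by reference"` marker are all hypothetical choices made for illustration.

```python
FIRST_ARG_REG = 8    # v8, start of the vector argument register range
LAST_ARG_REG = 23    # v23, end of the vector argument register range


def allocate_vector_args(args):
    """Assign vector registers to named vector arguments.

    args: list of (name, kind, lmul, nfields) with kind in
          {"mask", "data", "tuple"} (hypothetical encoding).
    Returns a dict mapping each name to a list of register numbers,
    or to the string "by reference" when no registers are available.
    """
    allocated = set()
    v0_used = False
    result = {}
    for name, kind, lmul, nfields in args:
        # Rule 1: the first vector mask argument goes in v0.
        if kind == "mask" and not v0_used:
            v0_used = True
            result[name] = [0]
            continue
        # Mask types and fractional LMUL are treated as LMUL 1.
        group = 1 if kind == "mask" or lmul < 1 else int(lmul)
        count = group * (nfields if kind == "tuple" else 1)
        regs = None
        # Rules 2 and 3: the search restarts at v8 for every argument.
        for base in range(FIRST_ARG_REG, LAST_ARG_REG + 1):
            if base % group != 0:
                continue  # first register must be a multiple of LMUL
            candidate = list(range(base, base + count))
            if candidate[-1] > LAST_ARG_REG:
                break     # later bases cannot fit either
            if allocated.isdisjoint(candidate):
                regs = candidate
                break
        if regs is None:
            result[name] = "by reference"
        else:
            allocated.update(regs)
            result[name] = regs
    return result
```

With this model, the arguments of `void foo(vint32m1_t a, vint32m2_t b, vint32m1x2_t c)` would be encoded as `[("a", "data", 1, 1), ("b", "data", 2, 1), ("c", "tuple", 1, 2)]`, and the sketch assigns v8, v10-v11 and v12-v13 respectively, matching the first note. Replacing `c` with a plain `vint32m1_t` gives it v9, matching the second note.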
417+
418+
Vector values are returned in the same manner as the first named argument of
419+
the same type would be passed.
420+
421+
Vector types are not allowed as members of a struct or union.
422+
423+
Vector arguments must not be passed to, and vector values must not be returned from, an unprototyped
424+
function.
425+
426+
NOTE: Functions that use the standard vector calling convention variant must be
427+
marked with `STO_RISCV_VARIANT_CC`, see <<Dynamic Linking>> for the meaning of
428+
`STO_RISCV_VARIANT_CC`.
429+
430+
NOTE: `setjmp`/`longjmp` follow the standard calling convention, which clobbers
431+
all vector registers. Hence, the standard vector calling convention variant
432+
won't disrupt the `jmp_buf` ABI.
433+
332434
=== ILP32E Calling Convention
333435

334436
IMPORTANT: RV32E is not a ratified base ISA and so we cannot guarantee the
@@ -555,3 +657,13 @@ The following definitions apply for all ABIs defined in this document. Here
555657
there is no differentiation between ILP32 and LP64 ABIs.
556658

557659
`wchar_t` is signed. `wint_t` is unsigned.
660+
661+
[bibliography]
662+
== References
663+
664+
* [[[riscv-v-extension]]] "RISC-V V vector extension specification"
665+
https://github.com/riscv/riscv-v-spec
666+
667+
* [[[rvv-intrinsic-doc]]] "RISC-V Vector Extension Intrinsic Document"
668+
https://github.com/riscv-non-isa/rvv-intrinsic-doc
669+
