-
Notifications
You must be signed in to change notification settings - Fork 272
✨[Zxcfu ISA ext.] add option to implement custom RISC-V instructions #264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
34 commits
Select commit
Hold shift + click to select a range
9bcb115
[docs/figures] added 'Zxcfu' extension
stnolting 3aba96f
[rtl/system_integration] added Zxcfu ISA extensions
stnolting 28ba540
[rtl/core] TOP: added generic to enabale Zxcfu ISA extension
stnolting bd73053
[sim] added Zxcfu ISA ext.
stnolting e7c5ce3
[sw/lib] added "custom OPCODES" to intrinsic library
stnolting ccd158e
[sw/lib] CFU: comment typo fix
stnolting cd50dbf
[rtl/core] CPU control: added custom OPCODE logic
stnolting 41ea603
[rtl/core] SYSINFO: added Zxcfu flag
stnolting 5f5f4d6
[rtl/core] ALU: added CFU to CP4 slot
stnolting e383953
[rtl/core] CPU: added Zxcfu ISA extension
stnolting 27be939
[rtl/core] PACKAGE: added Zxcfu ext. and CFU module
stnolting 75607ae
:sparkles: [rtl/core] added CFU hardware module
stnolting 6c228c4
[sw/lib] add CFU drivers and SYSCONFIG bit definition
stnolting 9ae94d3
[sw/lib] RTE: added Zxcfu ISA extension
stnolting e205391
[docs/datasheet] SYSINFO: added Zxcfu flag
stnolting 790c9c1
[docs/datasheet] SOC: added Zxcfu generic
stnolting 8124196
[docs/datasheet] OVERVIEW: added CFU HW file
stnolting b7d515d
[docs/datasheet] CFS: added note/link to CFU
stnolting 2db059b
[docs/datasheet] CPU: added Zxcfu ISA extension
stnolting 62ba20e
[README] add links to Zxcfu extension and CFU module
stnolting ec025d5
[docs/datasheet] added CFU section (still empty)
stnolting 17219d2
minor edits
stnolting c79f5c3
:sparkles: [sw/example] added CFU example program
stnolting ad515d6
[docs/figures] added R2-type CFU instruction diagram
stnolting dbf8a1f
[sw/lib/include] fixed typo in intrinsics library
stnolting 1d97ffa
[docs/datasheet] minor edits -> "extensibility"
stnolting b7d5f35
[docs] added CPU block diagram
stnolting eda0278
minor edits
stnolting ea4d57c
[docs/datasheet] added CFU to SW file list
stnolting 67449f4
:books: [docs/datasheet] added new section "CFU"
stnolting 0e3a715
[docs/userguide] reworked section "Adding Custom Hardware Modules"
stnolting 1470de8
[docs/userguide] custom hardware extension options: minor edits
stnolting 63ecf74
[CHANGELOG] added v1.6.7.1
stnolting 4382a29
typo fixes in comments
stnolting File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,154 @@ | ||
<<< | ||
:sectnums: | ||
=== Custom Functions Unit (CFU) | ||
stnolting marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
The Custom Functions Unit is the central part of the <<_zxcfu_custom_instructions_extension_cfu>> and represents | ||
the actual hardware module, which is used to implement _custom RISC-V instructions_. The concept of the NEORV32 | ||
CFU has been highly inspired by https://github.com/google/CFU-Playground[google's CFU-Playground]. | ||
|
||
The CFU is intended for operations that are inefficient in terms of performance, latency, energy consumption or | ||
program memory requirements when implemented in pure software. Some potential application fields and exemplary | ||
use-cases might include: | ||
|
||
* **AI:** sub-word / vector / SIMD operations like adding all four bytes of a 32-bit data word | ||
* **Cryptographic:** bit substitution and permutation | ||
* **Communication:** conversions like binary to gray-code | ||
* **Image processing:** look-up-tables for color space transformations | ||
* implementing instructions from other RISC-V ISA extensions that are not yet supported by the NEORV32 | ||
|
||
[NOTE] | ||
The CFU is not intended for complex and autonomous functional units that implement complete accelerators | ||
like block-based AES de-/encoding). Such accelerator can be implemented within the <<_custom_functions_subsystem_cfs>>. | ||
A comparison of all chip-internal hardware extension options is provided in the user guide section | ||
https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules]. | ||
|
||
|
||
:sectnums: | ||
==== Custom CFU Instructions - General | ||
|
||
The custom instruction utilize a specific instruction space that has been explicitly reserved for user-defined | ||
extensions by the RISC-V specifications ("_Guaranteed Non-Standard Encoding Space_"). The NEORV32 CFU uses the | ||
_CUSTOM0_ opcode to identify custom instructions. The binary encoding of this opcode is `0001011`. | ||
|
||
The custom instructions processed by the CFU use the 32-bit **R2-type** RISC-V instruction format, which consists | ||
of six bit-fields: | ||
|
||
* `funct7`: 7-bit immediate | ||
* `rs2`: address of second source register | ||
* `rs1`: address of first source register | ||
* `funct3`: 3-bit immediate | ||
* `rd`: address of destination register | ||
* `opcode`: always `0001011` to identify custom instructions | ||
|
||
.CFU instruction format (RISC-V R2-type) | ||
image::cfu_r2type_instruction.png[align=center] | ||
|
||
[NOTE] | ||
Obviously, all bit-fields including the immediates have to be static at compile time. | ||
|
||
.Custom Instructions - Exceptions | ||
[NOTE] | ||
The CPU control logic can only check the _CUSTOM0_ opcode of the custom instructions to check if the | ||
instruction word is valid. It cannot check the `funct3` and `funct7` bit-fields since they are | ||
implementation-defined. Hence, a custom CFU instruction can never raise an illegal instruction exception. | ||
However, custom will raise an illegal instruction exception if the CFU is not enabled/implemented | ||
(i.e. `Zxcfu` ISA extension is not enabled). | ||
|
||
The CFU operates on the two source operands and return the processing result to the destination register. | ||
The actual instruction to be performed can be defined by using the `funct7` and `funct3` bit fields. | ||
These immediate bit-fields can also be used to pass additional data to the CFU like offsets, look-up-tables | ||
addresses or shift-amounts. However, the actual functionality is completely user-defined. | ||
|
||
|
||
:sectnums: | ||
==== Using Custom Instructions in Software | ||
|
||
The custom instructions provided by the CFU are included into plain C code by using **intrinsics**. Intrinsics | ||
behave like "normal" functions but under the hood they are a set of macros that hide the complexity of inline assembly. | ||
Using such intrinsics removes the need to modify the compiler, built-in libraries and the assembler when including custom | ||
instructions. | ||
|
||
The NEORV32 software framework provides 8 pre-defined custom instructions macros, which are defined in | ||
`sw/lib/include/neorv32_cpu_cfu.h`. Each intrinsic provides an implicit definition of the instruction word's | ||
`funct3` bit-field: | ||
|
||
.CFU instruction prototypes | ||
[source,c] | ||
---- | ||
neorv32_cfu_cmd0(funct7, rs1, rs2) // funct3 = 000 | ||
neorv32_cfu_cmd1(funct7, rs1, rs2) // funct3 = 001 | ||
neorv32_cfu_cmd2(funct7, rs1, rs2) // funct3 = 010 | ||
neorv32_cfu_cmd3(funct7, rs1, rs2) // funct3 = 011 | ||
neorv32_cfu_cmd4(funct7, rs1, rs2) // funct3 = 100 | ||
neorv32_cfu_cmd5(funct7, rs1, rs2) // funct3 = 101 | ||
neorv32_cfu_cmd6(funct7, rs1, rs2) // funct3 = 110 | ||
neorv32_cfu_cmd7(funct7, rs1, rs2) // funct3 = 111 | ||
---- | ||
|
||
Each intrinsic functions always returns a 32-bit value (the processing result). Furthermore, | ||
each intrinsic function requires three arguments: | ||
|
||
* `funct7` - 7-bit immediate | ||
* `rs2` - source operand 2, 32-bit | ||
* `rs1` - source operand 1, 32-bit | ||
|
||
The `funct7` bit-field is used to pass a 7-bit literal to the CFU. The `rs1` and `rs2` arguments to pass the | ||
actual data to the CFU. These arguments can be populated with variables or literals. The following example | ||
show how to pass arguments when executing `neorv32_cfu_cmd6`: `funct7` is set to all-zero, `rs1` is given | ||
the literal _2751_ and `rs2` is given a variable that contains the return value from `some_function()`. | ||
|
||
.CFU instruction usage example | ||
[source,c] | ||
---- | ||
uint32_t opb = some_function(); | ||
uint32_t res = neorv32_cfu_cmd6(0b0000000, 2751, opb); | ||
---- | ||
|
||
.CFU Example Program | ||
[TIP] | ||
There is a simple example program for the CFU, which shows how to use the _default_ CFU hardware module. | ||
The example program is located in `sw/example/demo_cfu`. | ||
|
||
|
||
:sectnums: | ||
==== Custom Instructions Hardware | ||
|
||
The actual functionality of the CFU's custom instruction is defined by the logic in the CFU itself. | ||
It is the responsibility of the designer to implement this logic within the CFU hardware module | ||
`rtl/core/neorv32_cpu_cp_cfu.vhd`. | ||
|
||
The CFU hardware module receives the data from instruction word's immediate bit-fields and also | ||
the operation data, which is fetched from the CPU's register file. | ||
|
||
.CFU instruction data passing example | ||
[source,c] | ||
---- | ||
uint32_t opb = 0x12345678; | ||
uint32_t res = neorv32_cfu_cmd6(0b0100111, 0x00cafe00, opb); | ||
---- | ||
|
||
In this example the CFU hardware module receives the two source operands as 32-bit signal | ||
and the immediate values as 7-bit and 3-bit signals: | ||
|
||
* `rs1_i` (32-bit) contains the data from the `rs1` register (here = `0x00cafe00`) | ||
* `rs2_i` (32-bit) contains the data from the `rs2` register (here = 0x12345678) | ||
* `control.funct3` (3-bit) contains the immediate value from the `funct3` bit-field (here = `0b110`; "cmd6") | ||
* `control.funct7` (7-bit) contains the immediate value from the `funct7` bit-field (here = `0b0100111`) | ||
|
||
The CFU executes the according instruction (for example this is selected by the `control.funct3` signal) | ||
and provides the operation result in the 32-bit `control.result` signal. The processing can be entirely | ||
combinatorial, so the result is available at the end of the current clock cycle. Processing can also | ||
take several clock cycles and may also include internal states and memories. As soon as the CFU has | ||
completed operations it sets the `control.done` signal high. | ||
|
||
.CFU Hardware Example & More Details | ||
[TIP] | ||
The default CFU module already implement some exemplary instructions that are used for illustration | ||
by the CFU example program. See the CFU's VHDL source file (`rtl/core/neorv32_cpu_cp_cfu.vhd`), which | ||
is highly commented to explain the available signals and the handshake with the CPU pipeline. | ||
|
||
.CFU Execution Time | ||
[NOTE] | ||
The CFU is not required to finish processing within a bound time. | ||
However, the designer should keep in mind that the CPU is **stalled** until the CFU has finished processing. | ||
This also means the CPU cannot react to pending interrupts. Nevertheless, interrupt requests will still be queued. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.