diff --git a/CHANGELOG.md b/CHANGELOG.md index 82694ab9a..9928f3655 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,7 @@ defined by the `hw_version_c` constant in the main VHDL package file [`rtl/core/ | Date (*dd.mm.yyyy*) | Version | Comment | |:----------:|:-------:|:--------| +| 30.01.2022 | 1.6.7.1 | :sparkles: added **`Zxcfu` ISA extension for user-defined custom RISC-V instructions**; see [PR #264](https://github.com/stnolting/neorv32/pull/264) | | 28.01.2022 |[**:rocket:1.6.7**](https://github.com/stnolting/neorv32/releases/tag/v1.6.7) | **New release** | | 28.01.2022 | 1.6.6.10 | :bug: fixed bug in **bit-manipulation co-processor**: decoding collision between `cpop` and `rol` instructions; :bug: fixed bug in co-processor arbitration when an illegal instruction is detected; added four additional (yet unused) **CPU** co-processor slots; [PR #262](https://github.com/stnolting/neorv32/pull/262) | | 27.01.2022 | 1.6.6.9 | reworked **CFS** "user" logic; added CFS demo program; see [PR #261](https://github.com/stnolting/neorv32/pull/261) | diff --git a/README.md b/README.md index fcf032406..38b50d3e8 100644 --- a/README.md +++ b/README.md @@ -127,7 +127,9 @@ the "Minimal RISC-V Debug Specification Version 0.13.2" and compatible with **Op * _true random_ number generator ([TRNG](https://stnolting.github.io/neorv32/#_true_random_number_generator_trng)) * execute in place module ([XIP](https://stnolting.github.io/neorv32/#_execute_in_place_module_xip)) to directly execute code from SPI flash * custom functions subsystem ([CFS](https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs)) -for tightly-coupled custom co-processor extensions and interfaces +for tightly-coupled custom accelerators and interfaces +* custom functions unit ([CFU](https://stnolting.github.io/neorv32/#_custom_functions_unit_cfu)) for up to 1024 +_custom RISC-V instructions_ [[back to top](#The-NEORV32-RISC-V-Processor)] @@ -187,6 +189,7 @@ documentation section). [[`Zihpm`](https://stnolting.github.io/neorv32/#_zihpm_hardware_performance_monitors)] [[`Zifencei`](https://stnolting.github.io/neorv32/#_zifencei_instruction_stream_synchronization)] [[`Zmmul`](https://stnolting.github.io/neorv32/#_zmmul_integer_multiplication)] +[[`Zxcfu`](https://stnolting.github.io/neorv32/#_zxcfu_custom_instructions_extension_cfu)] [[`PMP`](https://stnolting.github.io/neorv32/#_pmp_physical_memory_protection)] [[`DEBUG`](https://stnolting.github.io/neorv32/#_cpu_debug_mode)]** diff --git a/docs/datasheet/cpu.adoc b/docs/datasheet/cpu.adoc index b9e97123b..e825f4380 100644 --- a/docs/datasheet/cpu.adoc +++ b/docs/datasheet/cpu.adoc @@ -1,7 +1,7 @@ :sectnums: == NEORV32 Central Processing Unit (CPU) -image::riscv_logo.png[width=350,align=center] +image::neorv32_cpu_block.png[width=600,align=center] **Key Features** @@ -20,6 +20,7 @@ image::riscv_logo.png[width=350,align=center] ** `Zihpm` - hardware performance monitors ** `Zifencei` - instruction stream synchronization ** `Zmmul` - integer multiplication hardware +** `Zxcfu` - custom instructions extension ** `PMP` - physical memory protection ** `Debug` - debug mode * Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+) @@ -684,6 +685,30 @@ high for one cycle to inform the memory system (like the i-cache to perform a fl Any additional flags within the `fence.i` instruction word are ignore by the hardware. +==== **`Zxcfu`** Custom Instructions Extension (CFU) + +The `Zxcfu` presents a NEORV32-specific _custom RISC-V_ ISA extension (`Z` = sub-extension, `x` = platform-specific +custom extension, `cfu` = name of the custom extension). When enabled via the `CPU_EXTENSION_RISCV_Zxcfu` configuration +generic, this ISA extensions adds the <<_custom_functions_unit_cfu>> to the CPU core. The CFU is a module that is +allows to add **custom RISC-V instructions** to the processor core. + +The CPU is implemented as ALU co-processor and is integrated right into the CPU's pipeline providing minimal data +transfer latency as it has direct access to the core's register file. Up to 1024 custom instructions can be +implemented within the CFU. These instructions are mapped to an OPCODE space that has been explicitly reserved by +the RISC-V spec for custom extensions. + +Software can utilize the custom instructions by using _intrinsic functions_, which are inline assembly functions that +behave like "regular" C functions. + +[TIP] +For more information regarding the CFU see section <<_custom_functions_unit_cfu>>. + +[TIP] +The CFU / `Zxcfu` ISA extension is intended for application-specific _instructions_. +If you like to add more complex accelerators or interfaces that can also operate independently of +the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>. + + ==== **`PMP`** Physical Memory Protection The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used @@ -796,6 +821,7 @@ configurations are presented in <<_cpu_performance>>. | Bit-manipulation - single-bit | `B(Zbs)` | `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` | 3 | Bit-manipulation - shifted-add | `B(Zba)` | `sh1add` `sh2add` `sh3add` | 3 | Bit-manipulation - carry-less multiply | `B(Zbc)` | `clmul` `clmulh` `clmulr` | 3 + 32 +| CFU: custom instructions | `Zxcfu` | - | min. 4 |======================= [NOTE] @@ -1146,3 +1172,9 @@ be enabled ba enabling a constant in the main VHDL package file (`rtl/core/neorv -- "critical" number of PMP regions -- constant dedicated_reset_c : boolean := false; -- use dedicated hardware reset value for UNCRITICAL registers (FALSE=reset value is irrelevant (might simplify HW), default; TRUE=defined LOW reset value) ---- + + +<<< +// #################################################################################################################### + +include::cpu_cfu.adoc[] diff --git a/docs/datasheet/cpu_cfu.adoc b/docs/datasheet/cpu_cfu.adoc new file mode 100644 index 000000000..2109875fd --- /dev/null +++ b/docs/datasheet/cpu_cfu.adoc @@ -0,0 +1,154 @@ +<<< +:sectnums: +=== Custom Functions Unit (CFU) + +The Custom Functions Unit is the central part of the <<_zxcfu_custom_instructions_extension_cfu>> and represents +the actual hardware module, which is used to implement _custom RISC-V instructions_. The concept of the NEORV32 +CFU has been highly inspired by https://github.com/google/CFU-Playground[google's CFU-Playground]. + +The CFU is intended for operations that are inefficient in terms of performance, latency, energy consumption or +program memory requirements when implemented in pure software. Some potential application fields and exemplary +use-cases might include: + +* **AI:** sub-word / vector / SIMD operations like adding all four bytes of a 32-bit data word +* **Cryptographic:** bit substitution and permutation +* **Communication:** conversions like binary to gray-code +* **Image processing:** look-up-tables for color space transformations +* implementing instructions from other RISC-V ISA extensions that are not yet supported by the NEORV32 + +[NOTE] +The CFU is not intended for complex and autonomous functional units that implement complete accelerators +like block-based AES de-/encoding). Such accelerator can be implemented within the <<_custom_functions_subsystem_cfs>>. +A comparison of all chip-internal hardware extension options is provided in the user guide section +https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules]. + + +:sectnums: +==== Custom CFU Instructions - General + +The custom instruction utilize a specific instruction space that has been explicitly reserved for user-defined +extensions by the RISC-V specifications ("_Guaranteed Non-Standard Encoding Space_"). The NEORV32 CFU uses the +_CUSTOM0_ opcode to identify custom instructions. The binary encoding of this opcode is `0001011`. + +The custom instructions processed by the CFU use the 32-bit **R2-type** RISC-V instruction format, which consists +of six bit-fields: + +* `funct7`: 7-bit immediate +* `rs2`: address of second source register +* `rs1`: address of first source register +* `funct3`: 3-bit immediate +* `rd`: address of destination register +* `opcode`: always `0001011` to identify custom instructions + +.CFU instruction format (RISC-V R2-type) +image::cfu_r2type_instruction.png[align=center] + +[NOTE] +Obviously, all bit-fields including the immediates have to be static at compile time. + +.Custom Instructions - Exceptions +[NOTE] +The CPU control logic can only check the _CUSTOM0_ opcode of the custom instructions to check if the +instruction word is valid. It cannot check the `funct3` and `funct7` bit-fields since they are +implementation-defined. Hence, a custom CFU instruction can never raise an illegal instruction exception. +However, custom will raise an illegal instruction exception if the CFU is not enabled/implemented +(i.e. `Zxcfu` ISA extension is not enabled). + +The CFU operates on the two source operands and return the processing result to the destination register. +The actual instruction to be performed can be defined by using the `funct7` and `funct3` bit fields. +These immediate bit-fields can also be used to pass additional data to the CFU like offsets, look-up-tables +addresses or shift-amounts. However, the actual functionality is completely user-defined. + + +:sectnums: +==== Using Custom Instructions in Software + +The custom instructions provided by the CFU are included into plain C code by using **intrinsics**. Intrinsics +behave like "normal" functions but under the hood they are a set of macros that hide the complexity of inline assembly. +Using such intrinsics removes the need to modify the compiler, built-in libraries and the assembler when including custom +instructions. + +The NEORV32 software framework provides 8 pre-defined custom instructions macros, which are defined in +`sw/lib/include/neorv32_cpu_cfu.h`. Each intrinsic provides an implicit definition of the instruction word's +`funct3` bit-field: + +.CFU instruction prototypes +[source,c] +---- +neorv32_cfu_cmd0(funct7, rs1, rs2) // funct3 = 000 +neorv32_cfu_cmd1(funct7, rs1, rs2) // funct3 = 001 +neorv32_cfu_cmd2(funct7, rs1, rs2) // funct3 = 010 +neorv32_cfu_cmd3(funct7, rs1, rs2) // funct3 = 011 +neorv32_cfu_cmd4(funct7, rs1, rs2) // funct3 = 100 +neorv32_cfu_cmd5(funct7, rs1, rs2) // funct3 = 101 +neorv32_cfu_cmd6(funct7, rs1, rs2) // funct3 = 110 +neorv32_cfu_cmd7(funct7, rs1, rs2) // funct3 = 111 +---- + +Each intrinsic functions always returns a 32-bit value (the processing result). Furthermore, +each intrinsic function requires three arguments: + +* `funct7` - 7-bit immediate +* `rs2` - source operand 2, 32-bit +* `rs1` - source operand 1, 32-bit + +The `funct7` bit-field is used to pass a 7-bit literal to the CFU. The `rs1` and `rs2` arguments to pass the +actual data to the CFU. These arguments can be populated with variables or literals. The following example +show how to pass arguments when executing `neorv32_cfu_cmd6`: `funct7` is set to all-zero, `rs1` is given +the literal _2751_ and `rs2` is given a variable that contains the return value from `some_function()`. + +.CFU instruction usage example +[source,c] +---- +uint32_t opb = some_function(); +uint32_t res = neorv32_cfu_cmd6(0b0000000, 2751, opb); +---- + +.CFU Example Program +[TIP] +There is a simple example program for the CFU, which shows how to use the _default_ CFU hardware module. +The example program is located in `sw/example/demo_cfu`. + + +:sectnums: +==== Custom Instructions Hardware + +The actual functionality of the CFU's custom instruction is defined by the logic in the CFU itself. +It is the responsibility of the designer to implement this logic within the CFU hardware module +`rtl/core/neorv32_cpu_cp_cfu.vhd`. + +The CFU hardware module receives the data from instruction word's immediate bit-fields and also +the operation data, which is fetched from the CPU's register file. + +.CFU instruction data passing example +[source,c] +---- +uint32_t opb = 0x12345678; +uint32_t res = neorv32_cfu_cmd6(0b0100111, 0x00cafe00, opb); +---- + +In this example the CFU hardware module receives the two source operands as 32-bit signal +and the immediate values as 7-bit and 3-bit signals: + +* `rs1_i` (32-bit) contains the data from the `rs1` register (here = `0x00cafe00`) +* `rs2_i` (32-bit) contains the data from the `rs2` register (here = 0x12345678) +* `control.funct3` (3-bit) contains the immediate value from the `funct3` bit-field (here = `0b110`; "cmd6") +* `control.funct7` (7-bit) contains the immediate value from the `funct7` bit-field (here = `0b0100111`) + +The CFU executes the according instruction (for example this is selected by the `control.funct3` signal) +and provides the operation result in the 32-bit `control.result` signal. The processing can be entirely +combinatorial, so the result is available at the end of the current clock cycle. Processing can also +take several clock cycles and may also include internal states and memories. As soon as the CFU has +completed operations it sets the `control.done` signal high. + +.CFU Hardware Example & More Details +[TIP] +The default CFU module already implement some exemplary instructions that are used for illustration +by the CFU example program. See the CFU's VHDL source file (`rtl/core/neorv32_cpu_cp_cfu.vhd`), which +is highly commented to explain the available signals and the handshake with the CPU pipeline. + +.CFU Execution Time +[NOTE] +The CFU is not required to finish processing within a bound time. +However, the designer should keep in mind that the CPU is **stalled** until the CFU has finished processing. +This also means the CPU cannot react to pending interrupts. Nevertheless, interrupt requests will still be queued. diff --git a/docs/datasheet/overview.adoc b/docs/datasheet/overview.adoc index 2562750e8..cc47cd0c2 100644 --- a/docs/datasheet/overview.adoc +++ b/docs/datasheet/overview.adoc @@ -54,6 +54,7 @@ include::rationale.adoc[] * **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU ** RISC-V compatibility: passes the official architecture tests ** base architecture + privileged architecture (optional) + ISA extensions (optional) +** option to add custom RISC-V instructions (as custom ISA extension) ** rich set of customization options (ISA extensions, design goal: performance / area (/ energy), ...) ** aims to support <<_full_virtualization>> capabilities (CPU _and_ SoC) to increase execution safety ** official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID] @@ -78,6 +79,21 @@ include::rationale.adoc[] For more in-depth details regarding the feature provided by he hardware see the according sections: <<_neorv32_central_processing_unit_cpu>> and <<_neorv32_processor_soc>>. +**Extensibility and Customization** + +The NEORV32 processor was designed to ease customization and extensibility and provides several options for adding +application-specific custom hardware modules and accelerators. The three most common options for adding custom +on-chip modules are listed below. + +* <<_processor_external_memory_interface_wishbone_axi4_lite>> for processor-external modules +* <<_custom_functions_subsystem_cfs>> for tightly-coupled processor-internal co-processors +* <<_custom_functions_unit_cfu>> for custom RISC-V instructions + +[TIP] +A more detailed comparison of the extension/customization options can be found in section +https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules] +of the user guide. + <<< // #################################################################################################################### @@ -143,6 +159,7 @@ neorv32_top.vhd - NEORV32 Processor top entity ├neorv32_cpu.vhd - NEORV32 CPU top entity │├neorv32_cpu_alu.vhd - Arithmetic/logic unit ││├neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor (B ext.) +││├neorv32_cpu_cp_cfu.vhd - Custom functions (instruction) co-processor (Zxcfu ext.) ││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.) ││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension) ││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor diff --git a/docs/datasheet/rationale.adoc b/docs/datasheet/rationale.adoc index a9996d0c1..0e198136d 100644 --- a/docs/datasheet/rationale.adoc +++ b/docs/datasheet/rationale.adoc @@ -31,6 +31,8 @@ co-processors and even user-defined instructions. **Why RISC-V?** +image::riscv_logo.png[width=250,align=left] + [quote, RISC-V International, https://riscv.org/about/] ____ RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration. @@ -60,7 +62,7 @@ https://github.com/olofk/serv[SERV] in terms of size. It was build having a diff The project aims to provide _another option_ in the RISC-V / soft-core design space with a different performance vs. size trade-off and a different focus: _embrace_ concepts like documentation, platform-independence / portability, -RISC-V compatibility, _customization_ and _ease of use_ (see the <<_project_key_features>> below). +RISC-V compatibility, _ extensibility & customization_ and _ease of use_ (see the <<_project_key_features>> below). Furthermore, the NEORV32 pays special focus on _execution safety_ using <<_full_virtualization>>. The CPU aims to provide fall-backs for _everything that could go wrong_. This includes malformed instruction words, privilege escalations diff --git a/docs/datasheet/soc.adoc b/docs/datasheet/soc.adoc index 2af132393..a99e1caf8 100644 --- a/docs/datasheet/soc.adoc +++ b/docs/datasheet/soc.adoc @@ -399,6 +399,18 @@ cannot be used together with the `M` extension. See section <<_zmmul_integer_mul |====== +:sectnums!: +===== _CPU_EXTENSION_RISCV_Zxcfu_ + +[cols="4,4,2"] +[frame="all",grid="none"] +|====== +| **CPU_EXTENSION_RISCV_Zxcfu** | _boolean_ | false +3+| NEORV32-specific "custom RISC-V" ISA extensions: Implement the <<_custom_functions_unit_cfu>> for user-defined +custom instruction when _true_. See section <<_zxcfu_custom_instructions_extension_cfu>> for more information. +|====== + + // #################################################################################################################### :sectnums: ==== Extension Options diff --git a/docs/datasheet/soc_cfs.adoc b/docs/datasheet/soc_cfs.adoc index 45e8d0b0e..612059461 100644 --- a/docs/datasheet/soc_cfs.adoc +++ b/docs/datasheet/soc_cfs.adoc @@ -35,7 +35,11 @@ dedicated hardware accelerators for en-/decryption (AES), signal processing (FFT (CNNs) as well as custom IO systems like fast memory interfaces (DDR) and mass storage (SDIO), networking (CAN) or real-time data transport (I2S). -[INFO] +[TIP] +If you like to implement _custom instructions_ that are executed right within the CPU's ALU +see the <<_zxcfu_custom_instructions_extension_cfu>> and the according <<_custom_functions_unit_cfu>>. + +[TIP] Take a look at the template CFS VHDL source file (`rtl/core/neorv32_cfs.vhd`). The file is highly commented to illustrate all aspects that are relevant for implementing custom CFS-based co-processor designs. diff --git a/docs/datasheet/soc_sysinfo.adoc b/docs/datasheet/soc_sysinfo.adoc index a19645096..3307b53ec 100644 --- a/docs/datasheet/soc_sysinfo.adoc +++ b/docs/datasheet/soc_sysinfo.adoc @@ -52,6 +52,7 @@ will signal a "DEVICE ERROR" in this case. | `0` | _SYSINFO_CPU_ZICSR_ | `Zicsr` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zicsr>> generic) | `1` | _SYSINFO_CPU_ZIFENCEI_ | `Zifencei` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zifencei>> generic) | `2` | _SYSINFO_CPU_ZMMUL_ | `Zmmul` extension (`M` sub-extension) available when set (via top's <<_cpu_extension_riscv_zmmul>> generic) +| `3` | _SYSINFO_CPU_ZXCFU_ | `Zxcfu` extension (custom functions unit for custom instructions) available when set (via top's <<_cpu_extension_riscv_zxcfu>> generic) | `5` | _SYSINFO_CPU_ZFINX_ | `Zfinx` extension (`F` sub-/alternative-extension) available when set (via top's <<_cpu_extension_riscv_zfinx>> generic) | `6` | _SYSINFO_CPU_ZXSCNT_ | Custom extension - _Small_ CPU counters: `[m]cycle` & `[m]instret` CSRs have less than 64-bit when set (via top's <<_cpu_cnt_width>> generic) | `7` | _SYSINFO_CPU_ZXNOCNT_ | Custom extension - _NO_ CPU counters: `[m]cycle` & `[m]instret` CSRs are NOT available at all when set (via top's <<_cpu_cnt_width>> generic) diff --git a/docs/datasheet/software.adoc b/docs/datasheet/software.adoc index 0f5130540..9bd3ad36d 100644 --- a/docs/datasheet/software.adoc +++ b/docs/datasheet/software.adoc @@ -63,24 +63,25 @@ files are currently part of the NEORV32 core library: [options="header",grid="rows"] |======================= | C source file | C header file | Description -| - | `neorv32.h` | main NEORV32 definitions and library file -| `neorv32_cfs.c` | `neorv32_cfs.h` | HW driver (stub)footnote:[This driver file only represents a stub, since the real CFS drivers are defined by the actual CFS implementation.] functions for the custom functions subsystem -| `neorv32_cpu.c` | `neorv32_cpu.h` | HW driver functions for the NEORV32 **CPU** -| `neorv32_gpio.c` | `neorv32_gpio.h` | HW driver functions for the **GPIO** -| `neorv32_gptmr.c` | `neorv32_gptmr.h` | HW driver functions for the **GPTRM** -| - | `neorv32_intrinsics.h` | macros for (custom) intrinsics/instructions -| `neorv32_mtime.c` | `neorv32_mtime.h` | HW driver functions for the **MTIME** -| `neorv32_neoled.c` | `neorv32_neoled.h` | HW driver functions for the **NEOLED** -| `neorv32_pwm.c` | `neorv32_pwm.h` | HW driver functions for the **PWM** -| `neorv32_rte.c` | `neorv32_rte.h` | NEORV32 **runtime environment** and helpers -| `neorv32_slink.c` | `neorv32_slink.h` | HW driver functions for the **SLINK** -| `neorv32_spi.c` | `neorv32_spi.h` | HW driver functions for the **SPI** -| `neorv32_trng.c` | `neorv32_trng.h` | HW driver functions for the **TRNG** -| `neorv32_twi.c` | `neorv32_twi.h` | HW driver functions for the **TWI** -| `neorv32_uart.c` | `neorv32_uart.h` | HW driver functions for the **UART0** and **UART1** -| `neorv32_wdt.c` | `neorv32_wdt.h` | HW driver functions for the **WDT** -| `neorv32_xip.c` | `neorv32_xip.h` | HW driver functions for the **XIP** -| `neorv32_xirq.c` | `neorv32_xirq.h` | HW driver functions for the **XIRQ** +| - | `neorv32.h` | main NEORV32 definitions and library file +| `neorv32_cfs.c` | `neorv32_cfs.h` | HW driver (stub)footnote:[This driver file only represents a stub, since the real CFS drivers are defined by the actual CFS implementation.] functions for the custom functions subsystem +| `neorv32_cpu.c` | `neorv32_cpu.h` | HW driver functions for the NEORV32 **CPU** +| `neorv32_cpu_cfu.c` | `neorv32_cpu_cfu.h` | HW driver functions for the NEORV32 **CFU** (custom instructions) +| `neorv32_gpio.c` | `neorv32_gpio.h` | HW driver functions for the **GPIO** +| `neorv32_gptmr.c` | `neorv32_gptmr.h` | HW driver functions for the **GPTRM** +| - | `neorv32_intrinsics.h` | macros for (custom) intrinsics/instructions +| `neorv32_mtime.c` | `neorv32_mtime.h` | HW driver functions for the **MTIME** +| `neorv32_neoled.c` | `neorv32_neoled.h` | HW driver functions for the **NEOLED** +| `neorv32_pwm.c` | `neorv32_pwm.h` | HW driver functions for the **PWM** +| `neorv32_rte.c` | `neorv32_rte.h` | NEORV32 **runtime environment** and helpers +| `neorv32_slink.c` | `neorv32_slink.h` | HW driver functions for the **SLINK** +| `neorv32_spi.c` | `neorv32_spi.h` | HW driver functions for the **SPI** +| `neorv32_trng.c` | `neorv32_trng.h` | HW driver functions for the **TRNG** +| `neorv32_twi.c` | `neorv32_twi.h` | HW driver functions for the **TWI** +| `neorv32_uart.c` | `neorv32_uart.h` | HW driver functions for the **UART0** and **UART1** +| `neorv32_wdt.c` | `neorv32_wdt.h` | HW driver functions for the **WDT** +| `neorv32_xip.c` | `neorv32_xip.h` | HW driver functions for the **XIP** +| `neorv32_xirq.c` | `neorv32_xirq.h` | HW driver functions for the **XIRQ** |======================= .Documentation diff --git a/docs/figures/cfu_r2type_instruction.png b/docs/figures/cfu_r2type_instruction.png new file mode 100644 index 000000000..4c397d964 Binary files /dev/null and b/docs/figures/cfu_r2type_instruction.png differ diff --git a/docs/figures/neorv32_cpu.png b/docs/figures/neorv32_cpu.png index cf5a490a6..9cf349384 100644 Binary files a/docs/figures/neorv32_cpu.png and b/docs/figures/neorv32_cpu.png differ diff --git a/docs/figures/neorv32_cpu_block.png b/docs/figures/neorv32_cpu_block.png new file mode 100644 index 000000000..c6a48c765 Binary files /dev/null and b/docs/figures/neorv32_cpu_block.png differ diff --git a/docs/figures/neorv32_processor.png b/docs/figures/neorv32_processor.png index f0b436b34..e8646a37f 100644 Binary files a/docs/figures/neorv32_processor.png and b/docs/figures/neorv32_processor.png differ diff --git a/docs/userguide/adding_custom_hw_modules.adoc b/docs/userguide/adding_custom_hw_modules.adoc index 4d78ddf75..33f95b591 100644 --- a/docs/userguide/adding_custom_hw_modules.adoc +++ b/docs/userguide/adding_custom_hw_modules.adoc @@ -4,6 +4,7 @@ In resemblance to the RISC-V ISA, the NEORV32 processor was designed to ease customization and _extensibility_. The processor provides several predefined options to add application-specific custom hardware modules and accelerators. +A <<_comparative_summary>> is given at the end of this section. === Standard (_External_) Interfaces @@ -15,47 +16,94 @@ https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_an https://stnolting.github.io/neorv32/#_serial_peripheral_interface_controller_spi[SPI] and https://stnolting.github.io/neorv32/#_two_wire_serial_interface_controller_twi[TWI]. -The SPI and (especially) the GPIO interfaces might be the most straightforward approaches since they -have a minimal protocol overhead. Device-specific interrupt capabilities can be added using the +The SPI and especially the GPIO interfaces might be the most straightforward approaches since they +have a minimal protocol overhead. Device-specific interrupt capabilities could be added using the https://stnolting.github.io/neorv32/#_external_interrupt_controller_xirq[External Interrupt Controller (XIRQ)]. + Beyond simplicity, these interface only provide a very limited bandwidth and require more sophisticated -software handling ("bit-banging" for the GPIO). +software handling ("bit-banging" for the GPIO). Hence, i is not recommend to use them for _chip-internal_ communication. === External Bus Interface The https://stnolting.github.io/neorv32/#_processor_external_memory_interface_wishbone_axi4_lite[External Bus Interface] -provides the classic approach to connect to custom IP. By default, the bus interface implements the widely adopted -Wishbone interface standard. However, this project also includes wrappers to bridge to other protocol standards like ARM's -AXI4-Lite or Intel's Avalon. By using a full-featured bus protocol, complex SoC structures can be implemented (including -several modules and even multi-core architectures). Many FPGA EDA tools provide graphical editors to build and customize -whole SoC architectures and even include pre-defined IP libraries. +provides the classic approach for attaching custom IP. By default, the bus interface implements the widely adopted +Wishbone interface standard. This project also includes wrappers to convert to other protocol standards like ARM's +AXI4-Lite or Intel's Avalon protocols. By using a full-featured bus protocol, complex SoC designs can be implemented +including several modules and even multi-core architectures. Many FPGA EDA tools provide graphical editors to build +and customize whole SoC architectures and even include pre-defined IP libraries. .Example AXI SoC using Xilinx Vivado image::neorv32_axi_soc.png[] +Custom hardware modules attached to the processor's bus interface have no limitations regarding their functionality. +User-defined interfaces (like DDR memory access) can be implemented and the hardware module can operate completely +independent of the CPU. + The bus interface uses a memory-mapped approach. All data transfers are handled by simple load/store operations since the external bus interface is mapped into the processor's https://stnolting.github.io/neorv32/#_address_space[address space]. -This allows a very simple still high-bandwidth communications. +This allows a very simple still high-bandwidth communications. However, high bus traffic may increase access latencies. === Stream Link Interface -The NEORV32 https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface] provides -point-to-point, unidirectional and parallel data channels that can be used to transfer streaming data. In -contrast to the external bus interface, the streaming data does not provide any kind of "direction" control, -so it can be seen as "constant address bursts". The stream link interface provides less protocol overhead -and less latency than the bus interface. Furthermore, FIFOs can be be configured to each direction (RX/TX) to -allow more CPU-independent operation. +The https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface (SLINK)] provides a +point-to-point, unidirectional and parallel data interface that can be used to transfer _streaming_ data. In +contrast to the external bus interface, the streaming interface does not provide any kind of advanced control, +so it can be seen as "constant address bursts" where data is transmitted _sequentially_ (no random accesses). +While the CPU needs to "feed" the stream link interfaces with data (and read back incoming data), the actual +processor-external processing of the data run independently of the CPU. + +The stream link interface provides less protocol overhead and less latency than the bus interface. Furthermore, +FIFOs can be be configured to each direction (RX/TX) to allow more CPU-independent operation. === Custom Functions Subsystem -The NEORV32 https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[Custom Functions Subsystem] is -an "empty" template for a processor-internal module. It provides 32 32-bit memory-mapped interface -registers that can be used to communicate with any arbitrary custom design logic. The intentions of this -subsystem is to provide a simple base, where the user can concentrate on implementing the actual design logic -rather than taking care of the communication between the CPU/software and the design logic. The interface -registers are already allocated within the processor's address space and are supported by the software framework -via low-level hardware access mechanisms. Additionally, the CFS provides a direct pre-defined interrupt channel to -the CPU, which is also supported by the NEORV32 runtime environment. +The https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[Custom Functions Subsystem (CFS)] is +an "empty" template for a memory-mapped, processor-internal module. + +The basic idea of this subsystem is to provide a convenient, simple and flexible platform, where the user can +concentrate on implementing the actual design logic rather than taking care of the communication between the +CPU/software and the design logic. Note that the CFS does not have direct access to memory. All data (and control +instruction) have to be send by the CPU. + +The use-cases for the CFS include medium-scale hardware accelerators that need to be tightly-coupled to the CPU. +Potential use cases could be DSP modules like CORDIC, cryptographic accelerators or custom interfaces (like IIS). + + +=== Custom Functions Unit + +The https://stnolting.github.io/neorv32/#_custom_functions_unit_cfu[Custom Functions Unit (CFU)] is a functional +unit that is integrated right into the CPU's pipeline. It allows to implement custom RISC-V instructions. +This extension option is intended for rather small logic that implements operations, which cannot be emulated +in pure software in an efficient way. Since the CFU has direct access to the core's register file it can operate +with minimal data latency. + + +=== Comparative Summary + +The following table gives a comparative summary of the most important factors when choosing one of the +chip-internal extension options: + +* https://stnolting.github.io/neorv32/#_custom_functions_unit_cfu[Custom Functions Unit] for CPU-internal custom RISC-V instructions +* https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[Custom Functions Subsystem] for tightly-coupled processor-internal co-processors +* https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface] for processor-external streaming modules +* https://stnolting.github.io/neorv32/#_processor_external_memory_interface_wishbone_axi4_lite[External Bus Interface] for processor-external memory-mapped modules + +.Comparison of On-Chip Extension Options +[cols="<1,^1,^1,^1,^1"] +[options="header",grid="rows"] +|======================= +| | Custom Functions Unit | Custom Functions Subsystem | Stream Link Interface | External Bus Interface +| **SoC location** | CPU-internal | processor-internal | processor-external | processor-external +| **HW complexity/size** | small | medium | unlimited | unlimited +| **CPU-independent operation** | no | partly | partly | completely +| **CPU interface** | register-file access | memory-mapped | memory-mapped | memory-mapped +| **Low-level CPU access scheme** | custom instructions | load/store | load/store | load/store +| **Random access** | - | yes | no, only sequential | yes +| **Access latency** | minimal | low | low | medium to high +| **External IO interfaces** | no | yes, but limited | yes | yes +| **Interrupt-capable** | no | yes | yes | user-defined +|======================= + diff --git a/rtl/core/neorv32_cpu.vhd b/rtl/core/neorv32_cpu.vhd index 5a26df0bc..43ee4b9de 100644 --- a/rtl/core/neorv32_cpu.vhd +++ b/rtl/core/neorv32_cpu.vhd @@ -5,6 +5,7 @@ -- # * neorv32_cpu.vhd - CPU top entity # -- # * neorv32_cpu_alu.vhd - Arithmetic/logic unit # -- # * neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor # +-- # * neorv32_cpu_cp_cfu.vhd - Custom instructions co-processor # -- # * neorv32_cpu_cp_fpu.vhd - Single-precision FPU co-processor # -- # * neorv32_cpu_cp_muldiv.vhd - Integer multiplier/divider co-processor # -- # * neorv32_cpu_cp_shifter.vhd - Base ISA shifter unit # @@ -76,6 +77,7 @@ entity neorv32_cpu is CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode? -- Extension Options -- FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier @@ -182,6 +184,7 @@ begin cond_sel_string_f(CPU_EXTENSION_RISCV_Zifencei, "_Zifencei", "") & cond_sel_string_f(CPU_EXTENSION_RISCV_Zfinx, "_Zfinx", "") & cond_sel_string_f(CPU_EXTENSION_RISCV_Zmmul, "_Zmmul", "") & + cond_sel_string_f(CPU_EXTENSION_RISCV_Zxcfu, "_Zxcfu", "") & cond_sel_string_f(CPU_EXTENSION_RISCV_DEBUG, "_DEBUG", "") & "" severity note; @@ -225,6 +228,9 @@ begin -- Mul-extension -- assert not ((CPU_EXTENSION_RISCV_Zmmul = true) and (CPU_EXTENSION_RISCV_M = true)) report "NEORV32 CPU CONFIG ERROR! and extensions cannot co-exist!" severity error; + -- Custom Functions Unit -- + assert not (CPU_EXTENSION_RISCV_Zxcfu = true) report "NEORV32 CPU CONFIG NOTE: Implementing Custom Functions Unit (CFU) as ISA extension." severity note; + -- Debug mode -- assert not ((CPU_EXTENSION_RISCV_DEBUG = true) and (CPU_EXTENSION_RISCV_Zicsr = false)) report "NEORV32 CPU CONFIG ERROR! Debug mode requires extension to be enabled." severity error; assert not ((CPU_EXTENSION_RISCV_DEBUG = true) and (CPU_EXTENSION_RISCV_Zifencei = false)) report "NEORV32 CPU CONFIG ERROR! Debug mode requires extension to be enabled." severity error; @@ -257,6 +263,7 @@ begin CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG => CPU_EXTENSION_RISCV_DEBUG, -- implement CPU debug mode? -- Extension Options -- CPU_CNT_WIDTH => CPU_CNT_WIDTH, -- total width of CPU cycle and instret counters (0..64) @@ -349,6 +356,7 @@ begin CPU_EXTENSION_RISCV_M => CPU_EXTENSION_RISCV_M, -- implement mul/div extension? CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension? CPU_EXTENSION_RISCV_Zfinx => CPU_EXTENSION_RISCV_Zfinx, -- implement 32-bit floating-point extension (using INT reg!) + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier FAST_SHIFT_EN => FAST_SHIFT_EN -- use barrel shifter for shift operations diff --git a/rtl/core/neorv32_cpu_alu.vhd b/rtl/core/neorv32_cpu_alu.vhd index 818b1c8eb..b73655076 100644 --- a/rtl/core/neorv32_cpu_alu.vhd +++ b/rtl/core/neorv32_cpu_alu.vhd @@ -1,7 +1,7 @@ -- ################################################################################################# -- # << NEORV32 - Arithmetical/Logical Unit >> # -- # ********************************************************************************************* # --- # Main data and address ALU and co-processor interface/arbiter. # +-- # Main data/address ALU and ALU co-processor (= multi-cycle function units). # -- # ********************************************************************************************* # -- # BSD 3-Clause License # -- # # @@ -48,6 +48,7 @@ entity neorv32_cpu_alu is CPU_EXTENSION_RISCV_M : boolean; -- implement mul/div extension? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? CPU_EXTENSION_RISCV_Zfinx : boolean; -- implement 32-bit floating-point extension (using INT reg!) + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier FAST_SHIFT_EN : boolean -- use barrel shifter for shift operations @@ -334,10 +335,31 @@ begin end generate; - -- Co-Processor 4: Reserved --------------------------------------------------------------- + -- Co-Processor 4: Custom (Instructions) Functions Unit ('Zxcfu' Extension) --------------- -- ------------------------------------------------------------------------------------------- - cp_result(4) <= (others => '0'); - cp_valid(4) <= '0'; + neorv32_cpu_cp_cfu_inst_true: + if (CPU_EXTENSION_RISCV_Zxcfu = true) generate + neorv32_cpu_cp_cfu_inst: neorv32_cpu_cp_cfu + port map ( + -- global control -- + clk_i => clk_i, -- global clock, rising edge + rstn_i => rstn_i, -- global reset, low-active, async + ctrl_i => ctrl_i, -- main control bus + start_i => cp_start(4), -- trigger operation + -- data input -- + rs1_i => rs1_i, -- rf source 1 + rs2_i => rs2_i, -- rf source 2 + -- result and status -- + res_o => cp_result(4), -- operation result + valid_o => cp_valid(4) -- data output valid + ); + end generate; + + neorv32_cpu_cp_cfu_inst_false: + if (CPU_EXTENSION_RISCV_Zxcfu = false) generate + cp_result(4) <= (others => '0'); + cp_valid(4) <= '0'; + end generate; -- Co-Processor 5: Reserved --------------------------------------------------------------- diff --git a/rtl/core/neorv32_cpu_control.vhd b/rtl/core/neorv32_cpu_control.vhd index 10658957f..af6e433a1 100644 --- a/rtl/core/neorv32_cpu_control.vhd +++ b/rtl/core/neorv32_cpu_control.vhd @@ -67,6 +67,7 @@ entity neorv32_cpu_control is CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode? -- Extension Options -- CPU_CNT_WIDTH : natural; -- total width of CPU cycle and instret counters (0..64) @@ -1127,6 +1128,17 @@ begin end if; + when opcode_cust0_c => -- CFU: custom RISC-V instructions (CUSTOM0 OPCODE space) + -- ------------------------------------------------------------ + if (CPU_EXTENSION_RISCV_Zxcfu = true) then + ctrl_nxt(ctrl_cp_id_msb_c downto ctrl_cp_id_lsb_c) <= cp_sel_cfu_c; -- trigger CFU CP + ctrl_nxt(ctrl_alu_func1_c downto ctrl_alu_func0_c) <= alu_func_copro_c; + execute_engine.state_nxt <= ALU_WAIT; + else + execute_engine.state_nxt <= SYS_WAIT; + end if; + + when others => -- system/csr access OR illegal opcode - nothing bad (= no commits) will happen here if there is an illegal opcode -- ------------------------------------------------------------ if (CPU_EXTENSION_RISCV_Zicsr = true) then @@ -1188,7 +1200,7 @@ begin ctrl_nxt(ctrl_alu_func1_c downto ctrl_alu_func0_c) <= alu_func_copro_c; -- wait for completion or abort on illegal instruction exception (the co-processor will also terminate operations) if (alu_idone_i = '1') or (trap_ctrl.exc_buf(exception_iillegal_c) = '1') then - ctrl_nxt(ctrl_rf_wb_en_c) <= '1'; -- valid RF write-back + ctrl_nxt(ctrl_rf_wb_en_c) <= '1'; -- valid RF write-back (won't happen in case of an illegal instruction) execute_engine.state_nxt <= DISPATCH; end if; @@ -1556,7 +1568,17 @@ begin end if; -- illegal E-CPU register? -- -- FIXME: rs2 is not checked! - illegal_register <= execute_engine.i_reg(instr_rs1_msb_c) or execute_engine.i_reg(instr_rd_msb_c); + illegal_register <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zfinx) and (execute_engine.i_reg(instr_rs1_msb_c) or execute_engine.i_reg(instr_rd_msb_c)); + + when opcode_cust0_c => -- CFU: custom instructions + -- ------------------------------------------------------------ + if (CPU_EXTENSION_RISCV_Zxcfu = true) then -- CFU extension implemented + illegal_instruction <= '0'; + else + illegal_instruction <= '1'; + end if; + -- illegal E-CPU register? -- + illegal_register <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zxcfu) and (execute_engine.i_reg(instr_rs2_msb_c) or execute_engine.i_reg(instr_rs1_msb_c) or execute_engine.i_reg(instr_rd_msb_c)); when others => -- undefined instruction -> illegal! -- ------------------------------------------------------------ diff --git a/rtl/core/neorv32_cpu_cp_cfu.vhd b/rtl/core/neorv32_cpu_cp_cfu.vhd new file mode 100644 index 000000000..360fa2a61 --- /dev/null +++ b/rtl/core/neorv32_cpu_cp_cfu.vhd @@ -0,0 +1,189 @@ +-- ################################################################################################# +-- # << NEORV32 - CPU Co-Processor: Custom (Instructions) Functions Unit >> # +-- # ********************************************************************************************* # +-- # Intended for user-defined custom instructions (R2-type format only). # +-- # See the CPU's documentation for more information. # +-- # # +-- # NOTE: Take a look at the "software-counterpart" of this CFU example in 'sw/example/demo_cfu'. # +-- # # +-- # NOTE: This is a very early and very exemplary implementation of the custom functions unit. # +-- # Hence, it is not yet optimized for minimal interface latency. # +-- # ********************************************************************************************* # +-- # BSD 3-Clause License # +-- # # +-- # Copyright (c) 2022, Stephan Nolting. All rights reserved. # +-- # # +-- # Redistribution and use in source and binary forms, with or without modification, are # +-- # permitted provided that the following conditions are met: # +-- # # +-- # 1. Redistributions of source code must retain the above copyright notice, this list of # +-- # conditions and the following disclaimer. # +-- # # +-- # 2. Redistributions in binary form must reproduce the above copyright notice, this list of # +-- # conditions and the following disclaimer in the documentation and/or other materials # +-- # provided with the distribution. # +-- # # +-- # 3. Neither the name of the copyright holder nor the names of its contributors may be used to # +-- # endorse or promote products derived from this software without specific prior written # +-- # permission. # +-- # # +-- # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # +-- # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # +-- # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # +-- # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # +-- # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # +-- # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # +-- # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # +-- # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # +-- # OF THE POSSIBILITY OF SUCH DAMAGE. # +-- # ********************************************************************************************* # +-- # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # +-- ################################################################################################# + +library ieee; +use ieee.std_logic_1164.all; +use ieee.numeric_std.all; + +library neorv32; +use neorv32.neorv32_package.all; + +entity neorv32_cpu_cp_cfu is + port ( + -- global control -- + clk_i : in std_ulogic; -- global clock, rising edge + rstn_i : in std_ulogic; -- global reset, low-active, async + ctrl_i : in std_ulogic_vector(ctrl_width_c-1 downto 0); -- main control bus + start_i : in std_ulogic; -- trigger operation + -- data input -- + rs1_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 1 + rs2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 2 + -- result and status -- + res_o : out std_ulogic_vector(data_width_c-1 downto 0); -- operation result + valid_o : out std_ulogic -- data output valid + ); +end neorv32_cpu_cp_cfu; + +architecture neorv32_cpu_cp_cfu_rtl of neorv32_cpu_cp_cfu is + + -- CFU controller - do not modify -- + type control_t is record + busy : std_ulogic; -- CFU is busy + done : std_ulogic; -- set to '1' when processing is done + result : std_ulogic_vector(data_width_c-1 downto 0); -- user's processing result (for write-back to register file) + funct3 : std_ulogic_vector(2 downto 0); -- "funct3" bit-field from custom instruction + funct7 : std_ulogic_vector(6 downto 0); -- "funct7" bit-field from custom instruction + end record; + signal control : control_t; + +begin + + -- CFU Controller ------------------------------------------------------------------------- + -- ------------------------------------------------------------------------------------------- + -- This controller is required to handle the CPU/pipeline interface. Do not modify! + cfu_control: process(rstn_i, clk_i) + begin + if (rstn_i = '0') then + control.busy <= '0'; + res_o <= (others => '0'); + elsif rising_edge(clk_i) then + res_o <= (others => '0'); -- default + if (control.busy = '0') then -- idle + if (start_i = '1') then + control.busy <= '1'; + end if; + else -- busy + if (control.done = '1') or (ctrl_i(ctrl_trap_c) = '1') then -- processing done? abort if trap + res_o <= control.result; + control.busy <= '0'; + end if; + end if; + end if; + end process cfu_control; + + -- CPU feedback -- + valid_o <= control.busy and control.done; -- set one cycle before result data + + -- pack user-defined instruction function bits -- + control.funct3 <= ctrl_i(ctrl_ir_funct3_2_c downto ctrl_ir_funct3_0_c); + control.funct7 <= ctrl_i(ctrl_ir_funct12_11_c downto ctrl_ir_funct12_5_c); + + +-- **************************************************************************************************************************** +-- Actual CFU user logic - Add your custom logic below +-- **************************************************************************************************************************** + + -- The CFU only supports the R2-type RISC-V instruction format. This format consists of two source registers (rs1 and rs2), + -- a destination register (rd) and two "immediate" bit-fields (funct7 and funct3). It is up to the user to decide which + -- of these fields are actually used by the CFU logic. + -- + -- The user logic of the CFU has access to the following pre-defined signals: + -- + -- ------------------------------------------------------------------------------------------- + -- Input Operands + -- ------------------------------------------------------------------------------------------- + -- > rs1_i (input, 32-bit): source register 1 + -- > rs2_i (input, 32-bit): source register 2 + -- > control.funct3 (input, 3-bit): 3-bit function select / immediate, driven by instruction word's funct3 bit field + -- > control.funct7 (input, 7-bit): 7-bit function select / immediate, driven by instruction word's funct7 bit field + -- + -- The two signal rs1_i and rs2_i provide the data read from the CPU's register file, which is adressed by the + -- instruction word's rs1 and rs2 bit-fields. + -- + -- The actual CFU operation can be defined by using the funct3 and funct7 signals. Both signals are directly driven by + -- the according bit-fields of the custom instruction. Note that these signals represent "immediates" that have to be + -- static already at compile time. These immediates can be used to select the actual function to be executed or they + -- can be used as immediates for certain operations (like shift amounts, addresses or offsets). + -- + -- [NOTE]: rs1_i and rs2_i are directly driven by the register file (block RAM). It is recommended to buffer these signals + -- using CFU-internal registers before using them for computations as the rs1 and rs2 nets need to drive a lot of logic + -- in the CPU. + -- + -- [NOTE]: It is not possible for the CFU and it's according instruction words to cause any kind of exception. The CPU + -- control logic only verifies the custom instructions OPCODE and checks if the CFU is implemented at all. No combination + -- of funct7 and funct3 will cause an exception. + -- + -- ------------------------------------------------------------------------------------------- + -- Result output + -- ------------------------------------------------------------------------------------------- + -- > control.result (output, 32-bit): processing result + -- + -- When the CFU has finished computation, the data in the control.result signal will be written to the CPU's register + -- file. The destination register is addressed by the rd bit-field in the instruction. The CFU result output is + -- registered in the CFU controller (see above) so do not worry too much about increasing the CPU's critical path. ;) + -- + -- ------------------------------------------------------------------------------------------- + -- Control + -- ------------------------------------------------------------------------------------------- + -- > rstn_i (input, 1-bit): asynchronous reset, low-active + -- > clk_i (input, 1-bit): main clock + -- > start_i (input, 1-bit): operation trigger (start processing, high for one cycle) + -- > control.done (output, 1-bit): set high when the processing is done + -- + -- For pure-combinatorial instructions (without internal state) a subset of those signals is sufficient; see the minimal + -- example below. If the CFU shall also include states (like memories, registers or "buffers") the start_i signal can be + -- used to trigger a new CFU operation. As soon as all internal computations have completed, the control.done signal has + -- to be set to indicate completion. This will write the result data (control.result) to the CPU register file. + -- + -- [IMPORTANT]: The control.done *has to be set at some time*, otherwise the CPU will be halted forever. + + + -- User Logic Example --------------------------------------------------------------------- + -- ------------------------------------------------------------------------------------------- + user_logic_function_select: process(control, rs1_i, rs2_i) + begin + -- This is a simple ALU that implements four pure-combinatorial instructions. + -- The actual function to-be-executed is selected by the "funct3" bit-field of the custom instruction. + case control.funct3 is + when "000" => control.result <= bin_to_gray_f(rs1_i); -- funct3 = "000": convert rs1 from binary to gray + when "001" => control.result <= gray_to_bin_f(rs1_i); -- funct3 = "001": convert rs1 from gray to binary + when "010" => control.result <= bit_rev_f(rs1_i); -- funct3 = "010": bit-reversal of rs1 + when "011" => control.result <= rs1_i xnor rs2_i; -- funct3 = "011": XNOR input operands + when others => control.result <= (others => '0'); -- not implemented, set to zero + end case; + end process user_logic_function_select; + + -- processing done? -- + control.done <= '1'; -- we are just doing pure-combinatorial data processing here, which is done "immediately" + + +end neorv32_cpu_cp_cfu_rtl; diff --git a/rtl/core/neorv32_package.vhd b/rtl/core/neorv32_package.vhd index 09e94ce77..69b9923e5 100644 --- a/rtl/core/neorv32_package.vhd +++ b/rtl/core/neorv32_package.vhd @@ -63,7 +63,7 @@ package neorv32_package is -- Architecture Constants (do not modify!) ------------------------------------------------ -- ------------------------------------------------------------------------------------------- constant data_width_c : natural := 32; -- native data path width - do not change! - constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01060700"; -- no touchy! + constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01060701"; -- no touchy! constant archid_c : natural := 19; -- official NEORV32 architecture ID - hands off! -- Check if we're inside the Matrix ------------------------------------------------------- @@ -454,6 +454,9 @@ package neorv32_package is constant opcode_atomic_c : std_ulogic_vector(6 downto 0) := "0101111"; -- atomic operations (A extension) -- floating point operations (Zfinx-only) (F/D/H/Q) -- constant opcode_fop_c : std_ulogic_vector(6 downto 0) := "1010011"; -- dual/single operand instruction + -- official "custom0/1" RISC-V opcodes - free for custom instructions -- + constant opcode_cust0_c : std_ulogic_vector(6 downto 0) := "0001011"; -- custom instructions 0 +--constant opcode_cust1_c : std_ulogic_vector(6 downto 0) := "0101011"; -- custom instructions 1 -- RISC-V Funct3 -------------------------------------------------------------------------- -- ------------------------------------------------------------------------------------------- @@ -780,7 +783,7 @@ package neorv32_package is constant cp_sel_muldiv_c : std_ulogic_vector(2 downto 0) := "001"; -- CP1: multiplication/division operations ('M' extensions) constant cp_sel_bitmanip_c : std_ulogic_vector(2 downto 0) := "010"; -- CP2: bit manipulation ('B' extensions) constant cp_sel_fpu_c : std_ulogic_vector(2 downto 0) := "011"; -- CP3: floating-point unit ('Zfinx' extension) ---constant cp_sel_res0_c : std_ulogic_vector(2 downto 0) := "100"; -- CP4: reserved + constant cp_sel_cfu_c : std_ulogic_vector(2 downto 0) := "100"; -- CP4: custom instructions CFU ('Zxcfu' extension) --constant cp_sel_res1_c : std_ulogic_vector(2 downto 0) := "101"; -- CP5: reserved --constant cp_sel_res2_c : std_ulogic_vector(2 downto 0) := "110"; -- CP6: reserved --constant cp_sel_res3_c : std_ulogic_vector(2 downto 0) := "111"; -- CP7: reserved @@ -945,6 +948,7 @@ package neorv32_package is CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier FAST_SHIFT_EN : boolean := false; -- use barrel shifter for shift operations @@ -1104,6 +1108,7 @@ package neorv32_package is CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode? -- Extension Options -- FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier @@ -1181,6 +1186,7 @@ package neorv32_package is CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode? -- Extension Options -- CPU_CNT_WIDTH : natural; -- total width of CPU cycle and instret counters (0..64) @@ -1267,6 +1273,7 @@ package neorv32_package is CPU_EXTENSION_RISCV_M : boolean; -- implement mul/div extension? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? CPU_EXTENSION_RISCV_Zfinx : boolean; -- implement 32-bit floating-point extension (using INT reg!) + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier FAST_SHIFT_EN : boolean -- use barrel shifter for shift operations @@ -1379,6 +1386,24 @@ package neorv32_package is ); end component; + -- Component: CPU Co-Processor Custom (Instr.) Functions Unit ('Zxcfu' extension) --------- + -- ------------------------------------------------------------------------------------------- + component neorv32_cpu_cp_cfu + port ( + -- global control -- + clk_i : in std_ulogic; -- global clock, rising edge + rstn_i : in std_ulogic; -- global reset, low-active, async + ctrl_i : in std_ulogic_vector(ctrl_width_c-1 downto 0); -- main control bus + start_i : in std_ulogic; -- trigger operation + -- data input -- + rs1_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 1 + rs2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 2 + -- result and status -- + res_o : out std_ulogic_vector(data_width_c-1 downto 0); -- operation result + valid_o : out std_ulogic -- data output valid + ); + end component; + -- Component: CPU Bus Interface ----------------------------------------------------------- -- ------------------------------------------------------------------------------------------- component neorv32_cpu_bus @@ -2026,6 +2051,7 @@ package neorv32_package is CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode? -- Extension Options -- FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier diff --git a/rtl/core/neorv32_sysinfo.vhd b/rtl/core/neorv32_sysinfo.vhd index cd6cf8fed..aaf72af56 100644 --- a/rtl/core/neorv32_sysinfo.vhd +++ b/rtl/core/neorv32_sysinfo.vhd @@ -54,6 +54,7 @@ entity neorv32_sysinfo is CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode? -- Extension Options -- FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier @@ -141,8 +142,9 @@ begin sysinfo_mem(1)(00) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zicsr); -- Zicsr sysinfo_mem(1)(01) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zifencei); -- Zifencei sysinfo_mem(1)(02) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zmmul); -- Zmmul + sysinfo_mem(1)(03) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zxcfu); -- Zxcfu -- - sysinfo_mem(1)(04 downto 03) <= (others => '0'); -- reserved + sysinfo_mem(1)(04) <= '0'; -- reserved -- sysinfo_mem(1)(05) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zfinx); -- Zfinx ("F-alternative") sysinfo_mem(1)(06) <= bool_to_ulogic_f(boolean(CPU_CNT_WIDTH /= 64)); -- reduced-size CPU counters (Zxscnt) diff --git a/rtl/core/neorv32_top.vhd b/rtl/core/neorv32_top.vhd index 4b8e1455f..fd592db8d 100644 --- a/rtl/core/neorv32_top.vhd +++ b/rtl/core/neorv32_top.vhd @@ -67,6 +67,7 @@ entity neorv32_top is CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier @@ -492,6 +493,7 @@ begin CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG => ON_CHIP_DEBUGGER_EN, -- implement CPU debug mode? -- Extension Options -- FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier @@ -1503,6 +1505,7 @@ begin CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit? CPU_EXTENSION_RISCV_DEBUG => ON_CHIP_DEBUGGER_EN, -- implement CPU debug mode? -- Extension Options -- FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier diff --git a/rtl/system_integration/neorv32_ProcessorTop_stdlogic.vhd b/rtl/system_integration/neorv32_ProcessorTop_stdlogic.vhd index c2b169101..0db5e4b74 100644 --- a/rtl/system_integration/neorv32_ProcessorTop_stdlogic.vhd +++ b/rtl/system_integration/neorv32_ProcessorTop_stdlogic.vhd @@ -59,6 +59,8 @@ entity neorv32_ProcessorTop_stdlogic is CPU_EXTENSION_RISCV_Zicntr : boolean := true; -- implement base counters? CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.? + CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier FAST_SHIFT_EN : boolean := false; -- use barrel shifter for shift operations @@ -296,6 +298,8 @@ begin CPU_EXTENSION_RISCV_Zicntr => CPU_EXTENSION_RISCV_Zicntr, -- implement base counters? CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.? + CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier FAST_SHIFT_EN => FAST_SHIFT_EN, -- use barrel shifter for shift operations diff --git a/rtl/system_integration/neorv32_SystemTop_AvalonMM.vhd b/rtl/system_integration/neorv32_SystemTop_AvalonMM.vhd index 94f1ae210..8c905ce4a 100644 --- a/rtl/system_integration/neorv32_SystemTop_AvalonMM.vhd +++ b/rtl/system_integration/neorv32_SystemTop_AvalonMM.vhd @@ -65,6 +65,7 @@ entity neorv32_top_avalonmm is CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier @@ -259,6 +260,7 @@ begin CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- Extension Options -- FAST_MUL_EN => FAST_MUL_EN, diff --git a/rtl/system_integration/neorv32_SystemTop_axi4lite.vhd b/rtl/system_integration/neorv32_SystemTop_axi4lite.vhd index 01ce8a5c9..abb725253 100644 --- a/rtl/system_integration/neorv32_SystemTop_axi4lite.vhd +++ b/rtl/system_integration/neorv32_SystemTop_axi4lite.vhd @@ -65,6 +65,8 @@ entity neorv32_SystemTop_axi4lite is CPU_EXTENSION_RISCV_Zicntr : boolean := true; -- implement base counters? CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.? + CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier FAST_SHIFT_EN : boolean := false; -- use barrel shifter for shift operations @@ -303,6 +305,8 @@ begin CPU_EXTENSION_RISCV_Zicntr => CPU_EXTENSION_RISCV_Zicntr, -- implement base counters? CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.? + CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier FAST_SHIFT_EN => FAST_SHIFT_EN, -- use barrel shifter for shift operations diff --git a/sim/neorv32_tb.vhd b/sim/neorv32_tb.vhd index 6de688015..912e7182d 100644 --- a/sim/neorv32_tb.vhd +++ b/sim/neorv32_tb.vhd @@ -294,6 +294,8 @@ begin CPU_EXTENSION_RISCV_Zicntr => true, -- implement base counters? CPU_EXTENSION_RISCV_Zihpm => true, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => true, -- implement instruction stream sync.? + CPU_EXTENSION_RISCV_Zmmul => false, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => true, -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN => false, -- use DSPs for M extension's multiplier FAST_SHIFT_EN => false, -- use barrel shifter for shift operations diff --git a/sim/simple/neorv32_tb.simple.vhd b/sim/simple/neorv32_tb.simple.vhd index 4d35c41b5..762d42b0e 100644 --- a/sim/simple/neorv32_tb.simple.vhd +++ b/sim/simple/neorv32_tb.simple.vhd @@ -187,6 +187,7 @@ begin CPU_EXTENSION_RISCV_Zihpm => true, -- implement hardware performance monitors? CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.? CPU_EXTENSION_RISCV_Zmmul => false, -- implement multiply-only M sub-extension? + CPU_EXTENSION_RISCV_Zxcfu => true, -- implement custom (instr.) functions unit? -- Extension Options -- FAST_MUL_EN => false, -- use DSPs for M extension's multiplier FAST_SHIFT_EN => false, -- use barrel shifter for shift operations diff --git a/sw/example/demo_cfu/main.c b/sw/example/demo_cfu/main.c new file mode 100644 index 000000000..c3e38dbb0 --- /dev/null +++ b/sw/example/demo_cfu/main.c @@ -0,0 +1,183 @@ +// ################################################################################################# +// # << NEORV32 - CFU Custom Instructions Example Program >> # +// # ********************************************************************************************* # +// # BSD 3-Clause License # +// # # +// # Copyright (c) 2022, Stephan Nolting. All rights reserved. # +// # # +// # Redistribution and use in source and binary forms, with or without modification, are # +// # permitted provided that the following conditions are met: # +// # # +// # 1. Redistributions of source code must retain the above copyright notice, this list of # +// # conditions and the following disclaimer. # +// # # +// # 2. Redistributions in binary form must reproduce the above copyright notice, this list of # +// # conditions and the following disclaimer in the documentation and/or other materials # +// # provided with the distribution. # +// # # +// # 3. Neither the name of the copyright holder nor the names of its contributors may be used to # +// # endorse or promote products derived from this software without specific prior written # +// # permission. # +// # # +// # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # +// # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # +// # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # +// # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # +// # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # +// # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # +// # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # +// # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # +// # OF THE POSSIBILITY OF SUCH DAMAGE. # +// # ********************************************************************************************* # +// # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # +// ################################################################################################# + + +/**********************************************************************//** + * @file demo_cfu/main.c + * @author Stephan Nolting + * @brief Example program showing how to use the CFU's custom instructions. + **************************************************************************/ +#include + + +/**********************************************************************//** + * @name User configuration + **************************************************************************/ +/**@{*/ +/** UART BAUD rate */ +#define BAUD_RATE 19200 +/** Number of test cases per CFU instruction */ +#define TESTCASES 4 +/**@}*/ + + +/**********************************************************************//** + * @name Prototypes + **************************************************************************/ +uint32_t xorshift32(void); + + +/**********************************************************************//** + * Main function + * + * @note This program requires the CFU and UART0. + * + * @return 0 if execution was successful + **************************************************************************/ +int main() { + + // initialize NEORV32 run-time environment + neorv32_rte_setup(); + + // setup UART0 at default baud rate, no parity bits, no HW flow control + neorv32_uart0_setup(BAUD_RATE, PARITY_NONE, FLOW_CONTROL_NONE); + + // check if UART0 is implemented + if (neorv32_uart0_available() == 0) { + return 1; // UART0 not available, exit + } + + // check if the CFU is implemented at all + // note that the CFU is wrapped in the core's "Zxcfu" ISA extension + if (neorv32_cpu_cfu_available() == 0) { + neorv32_uart0_printf("ERROR! CFU ('Zxcfu' ISA extensions) not implemented!\n"); + return 1; + } + + + // intro + neorv32_uart0_printf("\n<<< NEORV32 Custom Functions Unit (CFU) 'Custom Instructions' Example Program >>>\n\n"); + + neorv32_uart0_printf("NOTE: This program assumes the _default_ CFU hardware module, which implements\n" + " four simple data conversion instructions.\n\n"); + + neorv32_uart0_printf("NOTE: This program (and it's comments) just shows how to USE the CFU's custom\n" + " instructions. The actual implementation of these instructions is done\n" + " in the CFU hardware module (-> rtl/core/neorv32_cpu_cp_cfu.vhd).\n\n"); + + + // custom instructions usage examples + uint32_t i, opa, opb; + + neorv32_uart0_printf("\n--- CFU 'binary to gray' instruction (funct3 = 000) ---\n"); + for (i=0; i "intrinsics"). + // + // Each neorv32_cfu_cmd* function requires three arguments: + // - funct7: a compile-time static 7-bit immediate (put in the instruction's "funct7" bit field) + // - rs1: a 32-bit operand A (this is the first register file source rs1) + // - rs2: a 32-bit operand B (this is the first register second source rs2) + // + // The operands can be literals, variables, function return values, ... you name it. + // + // Each neorv32_cfu_cmd* function returns a 32-bit uint32_t data word, which represents + // the result of the according instruction. + // + // The 7-bit immediate ("funct7") can be used to pass small _static_ literals to the CFU + // or to do a more fine-grained function selection - it all depends on your hardware implementation! ;) + neorv32_uart0_printf("Result = 0x%x\n", neorv32_cfu_cmd0(0b0000000, opa, opb)); + } + + neorv32_uart0_printf("\n--- CFU 'gray to binary' instruction (funct3 = 001) ---\n"); + for (i=0; i> 17; + x32 ^= x32 << 5; + + return x32; +} diff --git a/sw/example/demo_cfu/makefile b/sw/example/demo_cfu/makefile new file mode 100644 index 000000000..cd8a97a6c --- /dev/null +++ b/sw/example/demo_cfu/makefile @@ -0,0 +1,40 @@ +################################################################################################# +# << NEORV32 - Application Makefile >> # +# ********************************************************************************************* # +# Make sure to add the RISC-V GCC compiler's bin folder to your PATH environment variable. # +# ********************************************************************************************* # +# BSD 3-Clause License # +# # +# Copyright (c) 2021, Stephan Nolting. All rights reserved. # +# # +# Redistribution and use in source and binary forms, with or without modification, are # +# permitted provided that the following conditions are met: # +# # +# 1. Redistributions of source code must retain the above copyright notice, this list of # +# conditions and the following disclaimer. # +# # +# 2. Redistributions in binary form must reproduce the above copyright notice, this list of # +# conditions and the following disclaimer in the documentation and/or other materials # +# provided with the distribution. # +# # +# 3. Neither the name of the copyright holder nor the names of its contributors may be used to # +# endorse or promote products derived from this software without specific prior written # +# permission. # +# # +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # +# OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # +# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # +# COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # +# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # +# GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # +# AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # +# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # +# OF THE POSSIBILITY OF SUCH DAMAGE. # +# ********************************************************************************************* # +# The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # +################################################################################################# + +# Modify this variable to fit your NEORV32 setup (neorv32 home folder) +NEORV32_HOME ?= ../../.. + +include $(NEORV32_HOME)/sw/common/common.mk diff --git a/sw/lib/include/neorv32.h b/sw/lib/include/neorv32.h index 3a6d214fa..4f11178df 100644 --- a/sw/lib/include/neorv32.h +++ b/sw/lib/include/neorv32.h @@ -1254,6 +1254,7 @@ enum NEORV32_SYSINFO_CPU_enum { SYSINFO_CPU_ZICSR = 0, /**< SYSINFO_CPU (0): Zicsr extension (I sub-extension) available when set (r/-) */ SYSINFO_CPU_ZIFENCEI = 1, /**< SYSINFO_CPU (1): Zifencei extension (I sub-extension) available when set (r/-) */ SYSINFO_CPU_ZMMUL = 2, /**< SYSINFO_CPU (2): Zmmul extension (M sub-extension) available when set (r/-) */ + SYSINFO_CPU_ZXCFU = 3, /**< SYSINFO_CPU (3): Zxcfu extension (custom functions unit for custom instructions) available when set (r/-) */ SYSINFO_CPU_ZFINX = 5, /**< SYSINFO_CPU (5): Zfinx extension (F sub-/alternative-extension) available when set (r/-) */ SYSINFO_CPU_ZXSCNT = 6, /**< SYSINFO_CPU (6): Custom extension - Small CPU counters: "cycle" & "instret" CSRs have less than 64-bit when set (r/-) */ @@ -1322,14 +1323,15 @@ enum NEORV32_SYSINFO_SOC_enum { // ---------------------------------------------------------------------------- -// Include all IO driver headers +// Include all system header files // ---------------------------------------------------------------------------- -// cpu core -#include "neorv32_cpu.h" - // intrinsics #include "neorv32_intrinsics.h" +// cpu core +#include "neorv32_cpu.h" +#include "neorv32_cpu_cfu.h" + // neorv32 runtime environment #include "neorv32_rte.h" diff --git a/sw/lib/include/neorv32_cpu_cfu.h b/sw/lib/include/neorv32_cpu_cfu.h new file mode 100644 index 000000000..255ae2394 --- /dev/null +++ b/sw/lib/include/neorv32_cpu_cfu.h @@ -0,0 +1,71 @@ +// ################################################################################################# +// # << NEORV32: neorv32_cfu.h - CPU Core - CFU Co-Processor Hardware Driver >> # +// # ********************************************************************************************* # +// # BSD 3-Clause License # +// # # +// # Copyright (c) 2022, Stephan Nolting. All rights reserved. # +// # # +// # Redistribution and use in source and binary forms, with or without modification, are # +// # permitted provided that the following conditions are met: # +// # # +// # 1. Redistributions of source code must retain the above copyright notice, this list of # +// # conditions and the following disclaimer. # +// # # +// # 2. Redistributions in binary form must reproduce the above copyright notice, this list of # +// # conditions and the following disclaimer in the documentation and/or other materials # +// # provided with the distribution. # +// # # +// # 3. Neither the name of the copyright holder nor the names of its contributors may be used to # +// # endorse or promote products derived from this software without specific prior written # +// # permission. # +// # # +// # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # +// # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # +// # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # +// # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # +// # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # +// # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # +// # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # +// # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # +// # OF THE POSSIBILITY OF SUCH DAMAGE. # +// # ********************************************************************************************* # +// # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # +// ################################################################################################# + + +/**********************************************************************//** + * @file neorv32_cpu_cfu.h + * @author Stephan Nolting + * @brief CPU Core custom functions unit HW driver header file. + **************************************************************************/ + +#ifndef neorv32_cpu_cfu_h +#define neorv32_cpu_cfu_h + +// prototypes +int neorv32_cpu_cfu_available(void); + + +/**********************************************************************//** + * @name CFU custom instructions (intrinsic) + **************************************************************************/ +/**@{*/ +/** CFU custom instruction 0 (funct3 = 000) */ +#define neorv32_cfu_cmd0(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 0, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 1 (funct3 = 001) */ +#define neorv32_cfu_cmd1(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 1, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 2 (funct3 = 010) */ +#define neorv32_cfu_cmd2(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 2, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 3 (funct3 = 011) */ +#define neorv32_cfu_cmd3(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 3, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 4 (funct3 = 100) */ +#define neorv32_cfu_cmd4(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 4, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 5 (funct3 = 101) */ +#define neorv32_cfu_cmd5(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 5, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 6 (funct3 = 110) */ +#define neorv32_cfu_cmd6(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 6, RISCV_OPCODE_CUSTOM0) +/** CFU custom instruction 7 (funct3 = 111) */ +#define neorv32_cfu_cmd7(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 7, RISCV_OPCODE_CUSTOM0) +/**@}*/ + +#endif // neorv32_cpu_cfu_h diff --git a/sw/lib/include/neorv32_intrinsics.h b/sw/lib/include/neorv32_intrinsics.h index 804654cfc..e28127d6b 100644 --- a/sw/lib/include/neorv32_intrinsics.h +++ b/sw/lib/include/neorv32_intrinsics.h @@ -154,6 +154,10 @@ asm(".set regnum_t3 , 28"); asm(".set regnum_t4 , 29"); asm(".set regnum_t5 , 30"); asm(".set regnum_t6 , 31"); + +/** Official RISC-V opcodes for custom extensions (CUSTOM0, CUSTOM1) */ +asm(".set RISCV_OPCODE_CUSTOM0 , 0b0001011"); +asm(".set RISCV_OPCODE_CUSTOM1 , 0b0101011"); /**@}*/ @@ -193,7 +197,8 @@ asm(".set regnum_t6 , 31"); asm volatile ( \ "" \ : [output] "=r" (__return) \ - : [input_i] "r" (rs1), [input_j] "r" (rs2) \ + : [input_i] "r" (rs1), \ + [input_j] "r" (rs2) \ ); \ asm volatile ( \ ".word ( \ @@ -205,7 +210,8 @@ asm(".set regnum_t6 , 31"); (((" #opcode ") & 0x7f) << 0) \ );" \ : [rd] "=r" (__return) \ - : "r" (rs1), "r" (rs2) \ + : "r" (rs1), \ + "r" (rs2) \ ); \ __return; \ }) @@ -220,7 +226,9 @@ asm(".set regnum_t6 , 31"); asm volatile ( \ "" \ : [output] "=r" (__return) \ - : [input_i] "r" (rs1), [input_j] "r" (rs2), [input_k] "r" (rs3) \ + : [input_i] "r" (rs1), \ + [input_j] "r" (rs2), \ + [input_k] "r" (rs3) \ ); \ asm volatile ( \ ".word ( \ @@ -232,7 +240,9 @@ asm(".set regnum_t6 , 31"); (((" #opcode ") & 0x7f) << 0) \ );" \ : [rd] "=r" (__return) \ - : "r" (rs1), "r" (rs2), "r" (rs3) \ + : "r" (rs1), \ + "r" (rs2), \ + "r" (rs3) \ ); \ __return; \ }) diff --git a/sw/lib/source/neorv32_cfs.c b/sw/lib/source/neorv32_cfs.c index 97fded1ef..f56da9a9e 100644 --- a/sw/lib/source/neorv32_cfs.c +++ b/sw/lib/source/neorv32_cfs.c @@ -3,7 +3,7 @@ // # ********************************************************************************************* # // # BSD 3-Clause License # // # # -// # Copyright (c) 2021, Stephan Nolting. All rights reserved. # +// # Copyright (c) 2022, Stephan Nolting. All rights reserved. # // # # // # Redistribution and use in source and binary forms, with or without modification, are # // # permitted provided that the following conditions are met: # @@ -49,7 +49,7 @@ /**********************************************************************//** - * Check if custom functions unit 0 was synthesized. + * Check if custom functions subsystem was synthesized. * * @return 0 if CFS was not synthesized, 1 if CFS is available. **************************************************************************/ diff --git a/sw/lib/source/neorv32_cpu_cfu.c b/sw/lib/source/neorv32_cpu_cfu.c new file mode 100644 index 000000000..c6fb8064f --- /dev/null +++ b/sw/lib/source/neorv32_cpu_cfu.c @@ -0,0 +1,60 @@ +// ################################################################################################# +// # << NEORV32: neorv32_cfu.c - CPU Core - CFU Co-Processor Hardware Driver >> # +// # ********************************************************************************************* # +// # BSD 3-Clause License # +// # # +// # Copyright (c) 2022, Stephan Nolting. All rights reserved. # +// # # +// # Redistribution and use in source and binary forms, with or without modification, are # +// # permitted provided that the following conditions are met: # +// # # +// # 1. Redistributions of source code must retain the above copyright notice, this list of # +// # conditions and the following disclaimer. # +// # # +// # 2. Redistributions in binary form must reproduce the above copyright notice, this list of # +// # conditions and the following disclaimer in the documentation and/or other materials # +// # provided with the distribution. # +// # # +// # 3. Neither the name of the copyright holder nor the names of its contributors may be used to # +// # endorse or promote products derived from this software without specific prior written # +// # permission. # +// # # +// # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS # +// # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # +// # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # +// # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # +// # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE # +// # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED # +// # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # +// # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # +// # OF THE POSSIBILITY OF SUCH DAMAGE. # +// # ********************************************************************************************* # +// # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting # +// ################################################################################################# + + +/**********************************************************************//** + * @file neorv32_cpu_cfu.c + * @author Stephan Nolting + * @brief CPU Core custom functions unit HW driver source file. + **************************************************************************/ + +#include "neorv32.h" +#include "neorv32_cpu_cfu.h" + + +/**********************************************************************//** + * Check if custom functions unit was synthesized. + * + * @return 0 if CFU was not synthesized, 1 if CFU is available. + **************************************************************************/ +int neorv32_cpu_cfu_available(void) { + + // this is an ISA extension - not a SoC module + if (NEORV32_SYSINFO.CPU & (1 << SYSINFO_CPU_ZXCFU)) { + return 1; + } + else { + return 0; + } +} diff --git a/sw/lib/source/neorv32_rte.c b/sw/lib/source/neorv32_rte.c index 5434253b5..b23cf98a5 100644 --- a/sw/lib/source/neorv32_rte.c +++ b/sw/lib/source/neorv32_rte.c @@ -355,6 +355,9 @@ void neorv32_rte_print_hw_config(void) { if (tmp & (1<