NMSIS: support the terapines libdsp and libnn function libraries. #65
Closed
kaishaoshao wants to merge 103 commits into Nuclei-Software:master
Signed-off-by: dongyongtao <dongyongtao@nucleisys.com>
RT-Thread/ThreadX/FreeRTOS/UCOSII support is updated. See https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
Integer calling convention: "The stack grows downwards (towards lower addresses) and the stack pointer shall be aligned to a 128-bit boundary upon procedure entry."
ILP32E calling convention: "The ILP32E calling convention is designed to be usable with the RV32E ISA. This calling convention is the same as the integer calling convention, except for the following differences. The stack pointer need only be aligned to a 32-bit boundary."
Make the task frame 16-byte aligned by changing the number of saved registers from 30 to 32. See https://github.com/riscv-mcu/riscv-gcc/blob/553a166de2dd7451652a0973908e1f2459118506/gcc/config/riscv/riscv.h#L201-L211 Signed-off-by: Huaqi Fang <578567190@qq.com>
…t 16 bytes aligned. This makes sure sp is 16-byte aligned when a C function is called during the task switch process; see 1c24496
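The alignment argument behind the two commits above can be checked with simple arithmetic. The following is a minimal host-side sketch, assuming 4-byte registers (RV32) and a frame made only of saved register slots; the helper names `frame_bytes`, `is_stack_aligned`, and `align_up` are hypothetical and are not taken from the port.

```c
#include <assert.h>
#include <stddef.h>

/* 128-bit stack alignment required by the RISC-V psABI integer calling convention. */
#define STACK_ALIGN_BYTES 16u

/* Size in bytes of a context frame holding `nregs` slots of `reg_bytes` each. */
static size_t frame_bytes(size_t nregs, size_t reg_bytes)
{
    return nregs * reg_bytes;
}

/* Non-zero when a frame of `bytes` keeps sp on a 16-byte boundary. */
static int is_stack_aligned(size_t bytes)
{
    return (bytes % STACK_ALIGN_BYTES) == 0u;
}

/* Round a frame size up to the next 16-byte multiple. */
static size_t align_up(size_t bytes)
{
    return (bytes + STACK_ALIGN_BYTES - 1u) & ~(size_t)(STACK_ALIGN_BYTES - 1u);
}
```

Under these assumptions, 30 slots give 120 bytes, which is not a multiple of 16, while 32 slots give 128 bytes, which is. Padding the frame by two slots therefore keeps sp aligned when the task switch code calls into C.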
… enabled Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: dongyongtao <dongyongtao@nucleisys.com>
…eRTOS SMP

The previous implementation of `vPortRecursiveLock` was unsafe on systems with weak memory ordering (like RISC-V), leading to potential race conditions. This commit corrects the implementation by introducing the necessary memory barriers and optimizes the spin-wait loop.

**Correctness Fixes:**

* **Acquire Barrier**: An acquire memory barrier (`__RWMB()`) is added immediately after a successful atomic swap (`__AMOSWAP_W`). This is critical to prevent the compiler or CPU from reordering memory operations from within the critical section to before the lock is actually acquired. Without this, the lock provides no protection.
* **Release Barrier**: A release memory barrier (`__RWMB()`) is added before the lock variable is cleared. This ensures that all memory writes within the critical section are globally visible *before* the lock is released. This prevents other cores from acquiring the lock and seeing stale data.

**Performance and Logic Improvements:**

* **Test-and-Test-and-Set (TTS)**: The lock acquisition logic has been restructured into a more efficient TTS pattern. The code now spins on a cheap, non-atomic read (`while (*pxSpinLock == 0)`) and only attempts the expensive atomic swap when the lock appears to be free. This significantly reduces bus contention and improves system performance when multiple cores are contending for a lock.
* **NOP in Spin Loop**: A `__NOP()` has been added to the spin-wait loop. This can help reduce power consumption and pipeline pressure on some CPU architectures during tight spins.
* **Improved Readability**: Added comments to clarify the logic for recursive locking, lock acquisition, and the purpose of the memory barriers.

Signed-off-by: Huaqi Fang <578567190@qq.com>
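The test-and-test-and-set pattern with acquire and release ordering described in this commit can be sketched in portable C11. This is not the port's code: it swaps in `<stdatomic.h>` for `__AMOSWAP_W` and `__RWMB()`, it is non-recursive, and it assumes the convention that 0 means free and 1 means held, which may differ from the polarity used in `vPortRecursiveLock`. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct {
    atomic_uint locked;
} sketch_spinlock_t;

/* One atomic swap with acquire ordering; returns 1 if the lock was taken. */
static int sketch_spin_trylock(sketch_spinlock_t *lock)
{
    return atomic_exchange_explicit(&lock->locked, 1u, memory_order_acquire) == 0u;
}

static void sketch_spin_lock(sketch_spinlock_t *lock)
{
    for (;;) {
        /* "Test": spin on a cheap relaxed load while the lock looks held. */
        while (atomic_load_explicit(&lock->locked, memory_order_relaxed) != 0u) {
            /* A pause or NOP hint would go here on real hardware. */
        }
        /* "Test-and-set": attempt the swap only when the lock looks free. */
        if (sketch_spin_trylock(lock)) {
            return;
        }
    }
}

static void sketch_spin_unlock(sketch_spinlock_t *lock)
{
    assert(atomic_load_explicit(&lock->locked, memory_order_relaxed) == 1u);
    /* Release ordering publishes critical-section writes before the lock clears. */
    atomic_store_explicit(&lock->locked, 0u, memory_order_release);
}
```

The acquire ordering on the swap and the release ordering on the store play the roles that the commit assigns to the barriers placed after the swap and before the clear.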
The `ucOwnedByCore` and `ucRecursionCountByLock` arrays are used to manage recursive spinlocks and are accessed by multiple cores. Without the `volatile` keyword, the compiler might optimize away memory accesses and keep the array values in registers. This could lead to a core reading a stale value, causing incorrect lock behavior, race conditions, or deadlocks in a multi-core environment. Marking these arrays as `volatile` ensures that the compiler emits a real load or store for every access instead of reusing a register copy, so each core observes the current state. Signed-off-by: Huaqi Fang <578567190@qq.com>
n300e is rv32emac; n300e is added to the build system, npk, and doc. Signed-off-by: Huaqi Fang <578567190@qq.com>
…anges Signed-off-by: Huaqi Fang <578567190@qq.com>
This application demonstrates how to switch from machine mode to user mode on Nuclei RISC-V processors. It showcases the usage of PMP (Physical Memory Protection) configuration, ECLIC (Enhanced Core-Local Interrupt Controller) interrupt handling, and SysTimer functionality.
- Add safety recommendation for changing MTH:
  - Disable all interrupts before modifying MTH
  - Perform fence operation after changing MTH
  - Re-enable interrupts after the changes
- Similar recommendation added for setting STH
Setting MTH may not take effect immediately, so it is necessary to disable and re-enable interrupts around the critical MTH set routine.
- Add code to disable interrupts before setting BASEPRI in the vPortRaiseBASEPRI, ulPortRaiseBASEPRI, and vPortSetBASEPRI functions
- This change ensures proper synchronization and prevents potential race conditions when modifying the BASEPRI register
…rt functions. A rare race can occur: the ECLIC MTH has been set, but an interrupt is still taken, and MTH is then modified by other tasks that get switched in. On return to the previous vPortEnterCritical, the following assert fails: configASSERT((__ECLIC_GetMth() & portMTH_MASK) == uxMaxSysCallMTH);
- Add MSTATUS_MIE save and restore in the vPortRaiseBASEPRI, ulPortRaiseBASEPRI, and vPortSetBASEPRI functions
- Ensure interrupts are disabled before setting MTH to prevent potential race conditions
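The save, modify, fence, restore sequence recommended in the commits above can be sketched on a host with stand-in state. This is only an illustration under stated assumptions: `fake_mstatus` and `fake_mth` replace the real CSR and ECLIC register, and `sketch_read_clear_mstatus`, `sketch_set_mstatus`, `sketch_fence`, and `sketch_set_mth` are hypothetical helpers, not NMSIS or FreeRTOS port APIs. The only hardware fact used is that MIE is bit 3 of mstatus.

```c
#include <assert.h>
#include <stdint.h>

#define MSTATUS_MIE 0x8u  /* mstatus.MIE is bit 3 in the RISC-V privileged spec */

/* Stand-ins for hardware state so the sketch can run on a host. */
static uint32_t fake_mstatus = MSTATUS_MIE;
static uint8_t fake_mth = 0;

/* Models an atomic read-and-clear (csrrc-style) of mstatus bits. */
static uint32_t sketch_read_clear_mstatus(uint32_t mask)
{
    uint32_t old = fake_mstatus;
    fake_mstatus &= ~mask;
    return old;
}

static void sketch_set_mstatus(uint32_t mask)
{
    fake_mstatus |= mask;
}

static void sketch_fence(void)
{
    /* On target this would be a fence instruction; nothing to do on a host. */
}

/* Change the threshold with interrupts masked, then restore the previous MIE. */
static void sketch_set_mth(uint8_t new_mth)
{
    uint32_t saved = sketch_read_clear_mstatus(MSTATUS_MIE);

    assert((fake_mstatus & MSTATUS_MIE) == 0u);  /* interrupts are masked here */
    fake_mth = new_mth;
    sketch_fence();

    if (saved & MSTATUS_MIE) {
        sketch_set_mstatus(MSTATUS_MIE);  /* re-enable only if previously enabled */
    }
}
```

Restoring the saved MIE bit, rather than unconditionally enabling interrupts, keeps the helper safe to call from code that already runs with interrupts disabled.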
At most 16 sPMP and sMPU entries are currently supported.
- Add conditional compilation to limit SPMP and SMPU entry numbers to 16
- Preserve original logic for cases where CFG_PMP_ENTRY_NUM <= 16
- Improve compatibility with systems having more than 16 PMP entries
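The conditional compilation described above might look like the following sketch. Only `CFG_PMP_ENTRY_NUM` comes from the commit message; `SKETCH_SPMP_ENTRY_LIMIT`, `SKETCH_SPMP_ENTRY_NUM`, and `sketch_clamp_spmp_entries` are hypothetical names, and the value 32 is an example used only when the macro is not already defined.

```c
#include <assert.h>

#ifndef CFG_PMP_ENTRY_NUM
#define CFG_PMP_ENTRY_NUM 32        /* example value for this sketch only */
#endif

#define SKETCH_SPMP_ENTRY_LIMIT 16  /* upper bound stated in the commit message */

/* Compile-time clamp: keep the original count when it is 16 or less. */
#if CFG_PMP_ENTRY_NUM > SKETCH_SPMP_ENTRY_LIMIT
#define SKETCH_SPMP_ENTRY_NUM SKETCH_SPMP_ENTRY_LIMIT
#else
#define SKETCH_SPMP_ENTRY_NUM CFG_PMP_ENTRY_NUM
#endif

/* Run-time equivalent of the same rule, convenient for checking the logic. */
static unsigned sketch_clamp_spmp_entries(unsigned pmp_entries)
{
    return (pmp_entries > SKETCH_SPMP_ENTRY_LIMIT) ? SKETCH_SPMP_ENTRY_LIMIT
                                                   : pmp_entries;
}
```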
…king
- Add support for using MSTATUS.MIE instead of ECLIC.MTH for interrupt masking when configMAX_SYSCALL_INTERRUPT_PRIORITY >= 255
- Update port.c and portmacro.h to handle both interrupt masking methods
- Modify FreeRTOSConfig.h files to set configMAX_SYSCALL_INTERRUPT_PRIORITY to 255
- Add comments to explain the behavior of configMAX_SYSCALL_INTERRUPT_PRIORITY
- Remove unnecessary kernel debug and assertion code
- Eliminate redundant critical section management functions
- Simplify interrupt enable/disable macros
- Remove unused variables and functions related to max syscall priority
- Optimize task switching and tick handling
- The original portable code was modified from FreeRTOS; this port now uses only MIE for interrupt masking
Signed-off-by: Huaqi Fang <578567190@qq.com>
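The selection rule in this commit, use MSTATUS.MIE when configMAX_SYSCALL_INTERRUPT_PRIORITY is 255 or more and ECLIC.MTH otherwise, can be sketched as below. Only `configMAX_SYSCALL_INTERRUPT_PRIORITY` and the threshold of 255 come from the commit message; the enum, `sketch_select_mask_method`, and `SKETCH_USE_MIE_MASKING` are hypothetical and do not reflect the actual contents of port.c or portmacro.h.

```c
#include <assert.h>

#ifndef configMAX_SYSCALL_INTERRUPT_PRIORITY
#define configMAX_SYSCALL_INTERRUPT_PRIORITY 255  /* value the commit sets in FreeRTOSConfig.h */
#endif

typedef enum {
    SKETCH_MASK_WITH_ECLIC_MTH = 0,   /* raise the ECLIC threshold */
    SKETCH_MASK_WITH_MSTATUS_MIE = 1  /* clear the global interrupt enable bit */
} sketch_mask_method_t;

/* Run-time form of the rule, convenient for checking the boundary at 255. */
static sketch_mask_method_t sketch_select_mask_method(unsigned max_syscall_priority)
{
    return (max_syscall_priority >= 255u) ? SKETCH_MASK_WITH_MSTATUS_MIE
                                          : SKETCH_MASK_WITH_ECLIC_MTH;
}

/* Compile-time form, as a port header might express it. */
#if configMAX_SYSCALL_INTERRUPT_PRIORITY >= 255
#define SKETCH_USE_MIE_MASKING 1
#else
#define SKETCH_USE_MIE_MASKING 0
#endif
```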
Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
…emetal applications
Now supports more than 16 PMP entries, up to 64. Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
- Enhance NMSIS to support more PMP entries and enable __LD/__SD macro for rv32
- Add comments for ECLIC threshold MTH recommendations
- Update FreeRTOS demo to use MSTATUS.MIE for interrupt masking
- Fix FreeRTOS task stack alignment and optimize SMP spinlock implementation
- Introduce new interrupt masking feature for FreeRTOS
- Limit sPMP/sMPU entry numbers for evalsoc
- Add demo_eclic_umode nsdk_cli configuration for CI
Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: qiujiandong <qiujiandong@nucleisys.com>
Signed-off-by: qiujiandong <qiujiandong@nucleisys.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
This requires Nuclei Studio and CPU Model >= 2025.10. Signed-off-by: Huaqi Fang <578567190@qq.com>
Signed-off-by: Huaqi Fang <578567190@qq.com>
…ne stalls on branch misprediction Signed-off-by: Huaqi Fang <578567190@qq.com>
Still not working; still debugging. Signed-off-by: Huaqi Fang <578567190@qq.com>
Still not working; only the porting code is added. Signed-off-by: Huaqi Fang <578567190@qq.com>
exception
ux900_best_config_2c_ku060_50M_3274b1812_2f700b650_202402261123.bit
**** ThreadX SMP Linux Demonstration **** (c) 1996-2020 Microsoft Corporation
thread 0 events sent 65221, thread 0 cpu 1
thread 1 messages sent: 696771070, thread 1 cpu 0
thread 2 messages received: 696771515, thread 2 cpu 0
thread 3 obtained semaphore: 244577, thread 3 cpu 0
thread 4 obtained semaphore: 244576, thread 4 cpu 0
thread 5 events received: 65221, thread 5 cpu 0
thread 6 mutex obtained: 244577, thread 6 cpu 0
thread 7 mutex obtained: 244577, thread 7 cpu 0
**** ThreadX SMP Linux Demonstration **** (c) 1996-2020 Microsoft Corporation
thread 0 events sent 65222, thread 0 cpu 0
thread 1 messages sent: 696781710, thread 1 cpu 1
thread 2 messages received: 696782221, thread 2 cpu 1
thread 3 obtained semaphore: 244580, thread 3 cpu 1
thread 4 obtained semaphore: 244580, thread 4 cpu 1
thread 5 events received: 65222, thread 5 cpu 1
thread 6 mutex obtained: 244581, thread 6 cpu 1
thread 7 mutex obtained: 244580, thread 7 cpu 1
ux900_best_config_4c_vcu118_50M_42f9d913d_2f700b650_202402261855.bit
This is not working for SMPx4.
Nuclei SDK Build Time: Dec 5 2025, 11:47:02
Download Mode: SRAM
CPU Frequency 50322472 Hz
CPU HartID: 0
**** ThreadX SMP Linux Demonstration **** (c) 1996-2020 Microsoft Corporation
thread 0 events sent 1, thread 0 cpu 0
thread 1 messages sent: 719, thread 1 cpu 1
thread 2 messages received: 1077, thread 2 cpu 3
thread 3 obtained semaphore: 2, thread 3 cpu 3
thread 4 obtained semaphore: 1, thread 4 cpu 2
thread 5 events received: 1, thread 5 cpu 1
thread 6 mutex obtained: 2, thread 6 cpu 2
thread 7 mutex obtained: 2, thread 7 cpu 1
**** ThreadX SMP Linux Demonstration **** (c) 1996-2020 Microsoft Corporation
thread 0 events sent 2, thread 0 cpu 3
thread 1 messages sent: 11049, thread 1 cpu 2
thread 2 messages received: 11409, thread 2 cpu 1
thread 3 obtained semaphore: 5, thread 3 cpu 0
thread 4 obtained semaphore: 5, thread 4 cpu 1
2 thread 5 events received: U 20 thr2ad M CAUS2: t r P 0xa:012f08
8tex MTVaL ::0x0
, tr t2: 0x000100, d 7 utexthread 7 :utex obt ined e 5, thr0ad 7:cpu 2,
t
: 0x4, t5: 0x2, t6: 0xa0010870
a0: 0x2, a1: 0xa0010748, a2: 0x8, a3: 0x31000, a4: 0x1, a5: 0x18031008, a6: 0xa0010b10, a7: 0xf
cause: 0x38000002, epc: 0xa0012f88
msubm: 0x80
**** ThreadX SMP Linux Demonstration **** (c) 1996-2020 Microsoft Corporation
thread 0 events sent 3, thread 0 cpu 0
thread 1 messages sent: 20129, thread 1 cpu 1
thread 2 messages received: 20462, thread 2 cpu 3
thread 3 obtained semaphore: 9, thread 3 cpu 1
thread 4 obtained semaphore: 9, thread 4 cpu 1
thread 5 events received: 3, thread 5 cpu 3
thread 6 mutex obtained: 6, thread 6 cpu 1
thread 7 mutex obtained: 5, thread 7 cpu 2
**** ThreadX SMP Linux Demonstration **** (c) 1996-2020 Microsoft Corporation
thread 0 events sent 4, thread 0 cpu 3
MCAUSE : 0x38000002
MDCAUSE: 0x0
MEPC : 0xa0010000
MTVAL : 0x0
HARTID : 3
MCAUSE : 0x30000002
MDCAUSE: 0x0
MEPC : 0xa0010af0
MTVAL : 0x0
HARTID : 3
ra: 0xa0010af0, tp: 0xa00106d0, t0: 0xdeadbeef, t1: 0xdeadbeef, t2: 0xdeadbeef, t3: 0xdeadbeef, t4: 0xdeadbeef, t5: 0xdeadbeef, t6: 0xdeadbeef
Signed-off-by: Huaqi Fang <578567190@qq.com>
Tested on SMPx4 with ux900k_smp4_ecc-rv64imafdcb_zfh_dsp-i64d64ic64dc64l2c2048s2G-pa32_plic_eclic_ecc_pf1_pmp8-vcu118_50M_17651beacb_407314831_202511241738_v4.4.1.bit. Still not working. Signed-off-by: Huaqi Fang <578567190@qq.com>
…dler mcause and msubm need to be saved and restored to make sure the interrupt status is correct.
Thread 1 "riscv.cpu.0" hit Breakpoint 1, eclic_msip_handler () at ../../../OS/ThreadX/ports/nuclei/gcc/context.S:195
195 mret
1: /x ($mintstatus >> 24) & 0xF = 0xf
2: /x ($msubm >> 6) & 0x3 = 0x1
3: /x ($msubm >> 8) & 0x3 = 0x0
(gdb) si
_tx_thread_system_return () at ../../../OS/ThreadX/ports/nuclei/tx_port.h:313
313 __RWMB();
1: /x ($mintstatus >> 24) & 0xF = 0x0
2: /x ($msubm >> 6) & 0x3 = 0x0
3: /x ($msubm >> 8) & 0x3 = 0x0
Both mcause and msubm should be saved and restored during idle task emulation.
…uch as FreeRTOS and ThreadX; see riscv-mcu/qemu#9
The `volatile` restriction is unnecessary for the `vec_base` local variable. Signed-off-by: qiujiandong <qiujiandong@nucleisys.com>
…libdsp and libnn function libraries. Related commit id: f481c68
Author
Sorry, due to network issues I had to submit the PR again. Duplicate of #66; closing this one because of the wrong base branch.
No description provided.