reason:
To remove the "sync pause" and decouple the critical section from the dependency on enabling interrupts,
after that we need to further implement "schedlock + spinlock".
changelist
1 Modify the implementation of critical sections to no longer involve enabling interrupts or handling synchronous pause events.
2 GIC_SMP_CPUCALL attach to pause handler to remove arch interface up_cpu_paused_restore up_cpu_paused_save
3 Completely remove up_cpu_pause, up_cpu_resume, up_cpu_paused, and up_cpu_pausereq
4 change up_cpu_pause_async to up_send_cpu_sgi
Signed-off-by: hujun5 <hujun5@xiaomi.com>
for the citimon stats:
thread 0: thread 1:
enter_critical (t0)
up_switch_context
note suspend thread0 (t1)
thread running
IRQ happen, in ISR:
post thread0
up_switch_context
note resume thread0 (t2)
ISR continue f1
ISR continue f2
...
ISR continue fn
leave_critical (t3)
You will see, the thread 0, critical_section time is:
(t1 - t0) + (t3 - t2)
BUT, this result contains f1 f2 .. fn time spent, it is wrong
to tell user thead0 hold the critical lots of time but actually
not belong to it.
Resolve:
change the nxsched_suspend/resume_scheduler to real hanppends
Signed-off-by: ligd <liguiding1@xiaomi.com>
reason:
Currently, if we need to schedule a task to another CPU, we have to completely halt the other CPU,
manipulate the scheduling linked list, and then resume the operation of that CPU. This process is both time-consuming and unnecessary.
During this process, both the current CPU and the target CPU are inevitably subjected to busyloop.
The improved strategy is to simply send a cross-core interrupt to the target CPU.
The current CPU continues to run while the target CPU responds to the interrupt, eliminating the certainty of a busyloop occurring.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
We need to record the parent's integer register context upon exception
entry to a separate non-volatile area. Why?
Because xcp.regs can move due to a context switch within the fork() system
call, be it either via interrupt or a synchronization point.
Fix this by adding a "sregs" area where the saved user context is placed.
The critical section within fork() is also unnecessary.
There was an error in the fork() routine when system calls are in use:
the child context is saved on the child's user stack, which is incorrect,
the context must be saved on the kernel stack instead.
The result is a full system crash if (when) the child executes on a
different CPU which does not have the same MMU mappings active.
This commit fixes the regression from https://github.com/apache/nuttx/pull/13561
In order to determine whether a context switch has occurred,
we can use g_running_task to store the current regs.
This allows us to compare the current register state with the previously
stored state to identify if a context switch has taken place.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
with other functionalities removed.
reason:
by doing this we can reduce context switch time,
When we exit from an interrupt handler, we directly use tcb->xcp.regs
before
text data bss dec hex filename
138805 337 24256 163398 27e46 nuttx
after
text data bss dec hex filename
138499 337 24240 163076 27d04 nuttx
szie change -322
Signed-off-by: hujun5 <hujun5@xiaomi.com>
1. Similar to asan, supports single byte out of bounds detection
2. Fix the script to address the issue of not supporting the big end
Signed-off-by: wangmingrong1 <wangmingrong1@xiaomi.com>
The simple improvement is designed to speed up compilation and reduce download errors on github and local.
Added a folder nxtmpdir for storing third-party packages
nuttxworkspace
|
|- nuttx
|- apps
|- nxtmpdir
tools/Unix.mk:
added export NXTMPDIR := $(WSDIR)/nxtmpdir
tools/configure.sh:
added option -S creates the nxtmpdir folder for third-party packages.
tools/Config.mk:
added macro
CLONE - Git clone repository.
CHECK_COMMITSHA - Check if the branch contains the commit SHA-1.
tools/testbuild.sh:
added option -S
For now I added in the folder this package
ESP_HAL_3RDPARTY_URL = https://github.com/espressif/esp-hal-3rdparty.git
ARCH
arch/xtensa/src/esp32/Make.defs
arch/xtensa/src/esp32s2/Make.defs
arch/xtensa/src/esp32s3/Make.defs
arch/risc-v/src/common/espressif/Make.defs
arch/risc-v/src/esp32c3-legacy/Make.defs
but you can also add other packages (maybe also of apps)
The number of exception for risc-v is 16 (0 ~ 15)
for the machine ISA version 1.12 or earlier, the number of exception is 20
(0 ~ 19) from the ISA version 1.13. And maybe changed in the future.
Using a dedicated option to control the exception number to allow the earlier
version chip with customized exception number (e.g. 16 ~ 19 used) to define
the exception reason string correctly.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Revert "Parallelize depend file generation"
This reverts commit d5b6ec450f.
parallel depend ddc does not significantly speed up compilation,
intermediately generated .ddc files can cause problems if compilation is interrupted unexpectedly
Signed-off-by: xuxin19 <xuxin19@xiaomi.com>
This commit fixed the issue where the hardware timer wraps around and causes the system to halt.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
reason:
1 On different architectures, we can utilize more optimized strategies
to implement up_current_regs/up_set_current_regs.
eg. use interrupt registersor percpu registers.
code size
before
text data bss dec hex filename
262848 49985 63893 376726 5bf96 nuttx
after
text data bss dec hex filename
262844 49985 63893 376722 5bf92 nuttx
size change -4
Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Using the ts/tick conversion functions provided in clock.h
Do this caused we want speed up the time calculation, so change:
clock_time2ticks, clock_ticks2time, clock_timespec_add,
clock_timespec_compare, clock_timespec_subtract... to MACRO
Signed-off-by: ligd <liguiding1@xiaomi.com>
Attempt to service all interrupts pending in the PLIC's claim register. Ideally, this is more efficient than switching context for each interrupt received.
Provides two implementations:
- CSR_CYCLE: Cores which implement hardware performance monitoring.
- CSR_TIME: Uses the machine time registers.
Using the up_perf_xx bindings directory is more efficient than performing a nanosecond conversion on every gettime event.
This patch unifies the extended context save/restore for RISC-V,
allowing the customized context save/restore to be used, for example,
the extended context in rv32m1.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
reason:
In SMP, when a context switch occurs, restore_critical_section is executed.
To reduce the time taken for context switching, we directly pass the required
parameters to restore_critical_section instead of acquiring them repeatedly.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
We can see them in ifconfig:
ap> ifconfig
wlan0 Link encap:Ethernet HWaddr 42:64:7f:b3:12:03 at UP mtu 1500
inet addr:10.0.1.2 DRaddr:10.0.1.1 Mask:255.255.255.0
inet6 DRaddr: ::
RX: Received Fragment Errors Bytes
00000b9b 00000000 00000000 21daf5
IPv4 IPv6 ARP Dropped
00000a33 00000137 00000031 00000000
TX: Queued Sent Errors Timeouts Bytes
00000ac4 00000ac4 00000000 00000000 1a2103
Total Errors: 00000000
Signed-off-by: meijian <meijian@xiaomi.com>
Make this_cpu is arch independent and up_cpu_index do that.
In AMP mode, up_cpu_index() may return the index of the physical core.
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
This commit fixes a deadlock in `esp32s3-devkit:sta_softap`
defconfig: `spin_lock_irqsave` was being used to enter a critical
section that calls `nxsem_post`. In this case, it's recommended
to use `[enter|leave]_critical_section` to avoid deadlocks when a
context switch may happen, for instance.
Some USB controllers can receive or send multiple data packets then
generate one interrupt. This mechanism can reduce the number of data
copies. Extend req buf to accommodate this.
Signed-off-by: yangsong8 <yangsong8@xiaomi.com>
This PR configures the BL808 MMU to cache the the User Text, Data and Heap. We enable the T-Head MMU Flags for Shareable, Bufferable and Cacheable, as explained in the previous PR: https://github.com/apache/nuttx/pull/13199
This PR fixes the Slow Memory Access for NuttX Apps on Ox64 BL808 SBC: https://github.com/apache/nuttx/issues/12696. With this fix, Ox64 NuttX CoreMark jumps from 19 to 1,104. (Close to Buildroot Linux CoreMark)
Modified Files:
`arch/risc-v/Kconfig`: Enabled `ARCH_MMU_EXT_THEAD` for BL808 SoC.
This PR configures the T-Head MMU to cache the the User Text, Data and Heap. We enable the MMU Flags for Shareable, Bufferable and Cacheable, as explained in this article: https://lupyuen.github.io/articles/plic3#appendix-mmu-caching-for-t-head-c906
This PR fixes the Slow Memory Access for NuttX Apps on BL808 and SG2000 SoCs: https://github.com/apache/nuttx/issues/12696. With this fix, SG2000 NuttX CoreMark jumps from 21 to 2,423. (Close to SG2000 Debian CoreMark)
We introduce a Kconfig Option: `ARCH_MMU_EXT_THEAD` ("System Type > Enable T-Head MMU extension support"). Enabling this Kconfig Option will configure the T-Head MMU to cache the User Text, Data and Heap.
This PR enables the MMU cache for only SG2000 SoC (Milk-V Duo S SBC). The next PR will apply the same settings to BL808 SoC (Pine64 Ox64 SBC).
Modified Files:
`arch/risc-v/Kconfig`: Added Kconfig Option `ARCH_MMU_EXT_THEAD` that will configure the T-Head MMU. Enabled `ARCH_MMU_EXT_THEAD` for SG2000 SoC.
`arch/risc-v/src/common/riscv_mmu.h`: Set the T-Head MMU Flags (Shareable, Bufferable and Cacheable) for User Text, Data and Heap, if `ARCH_MMU_EXT_THEAD` is enabled
`arch/risc-v/src/common/riscv_addrenv.c`: Extended the MMU Flags from 32 bits to 64 bits, to accommodate the T-Head MMU Flags
`arch/risc-v/src/common/riscv_exception.c`: Extended the MMU Flags from 32 bits to 64 bit, to accommodate the T-Head MMU Flags. This code is enabled only for MMU Paging (`CONFIG_PAGING`).
Only in the non-critical region, nuttx can the respond to the irq and not hold the lock
When returning from the irq, there is no need to check whether the lock needs to be released
we also need keep restore_critical_section in svc call
test:
Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Summary:
1.Modify the conditions for entering different include header files
2.Added pre-definition for _Atomic _Bool when it is missing
3.Added nuttx for stdatomic implementation. When toolchain does not support atomic, use lib/stdatomic to implement it
Signed-off-by: chenrun1 <chenrun1@xiaomi.com>