reason:
1 On different architectures, we can use more optimized strategies
to implement up_current_regs/up_set_current_regs,
e.g. using interrupt registers or per-CPU registers, as sketched below.
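For illustration, a minimal sketch of the per-CPU-register approach on AArch64 only, assuming TPIDR_EL1 (or another scratch system register) is free to hold the "current regs" pointer; the names are hypothetical, not the real NuttX implementation:

#include <stdint.h>

/* Sketch: keep the per-CPU "current registers" pointer in a system register
 * instead of indexing a global array with the CPU number. */

static inline uint64_t *up_current_regs_sketch(void)
{
  uint64_t *regs;

  __asm__ __volatile__("mrs %0, tpidr_el1" : "=r"(regs));
  return regs;
}

static inline void up_set_current_regs_sketch(uint64_t *regs)
{
  __asm__ __volatile__("msr tpidr_el1, %0" : : "r"(regs));
}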
code size:
before
   text    data     bss     dec     hex filename
 262848   49985   63893  376726   5bf96 nuttx
after
   text    data     bss     dec     hex filename
 262844   49985   63893  376722   5bf92 nuttx
size change: -4
Configure NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Run with qemu:
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Use the ts/tick conversion functions provided in clock.h.
We do this because we want to speed up the time calculations, so change
clock_time2ticks, clock_ticks2time, clock_timespec_add,
clock_timespec_compare, clock_timespec_subtract... to macros, as sketched below.
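For illustration, a minimal sketch of the macro forms, assuming CONFIG_USEC_PER_TICK is available; the real definitions live in include/nuttx/clock.h and may differ in naming and rounding:

#include <nuttx/config.h>   /* CONFIG_USEC_PER_TICK (assumed) */
#include <time.h>

#define NSEC_PER_TICK_SKETCH (CONFIG_USEC_PER_TICK * 1000L)

/* Convert a relative timespec to ticks, rounding up (sketch only) */

#define clock_time2ticks_sketch(ts) \
  ((clock_t)((ts)->tv_sec * (1000000L / CONFIG_USEC_PER_TICK) + \
             ((ts)->tv_nsec + NSEC_PER_TICK_SKETCH - 1) / NSEC_PER_TICK_SKETCH))

/* ts3 = ts1 + ts2 (sketch only) */

#define clock_timespec_add_sketch(ts1, ts2, ts3) \
  do \
    { \
      long _ns = (ts1)->tv_nsec + (ts2)->tv_nsec; \
      (ts3)->tv_sec  = (ts1)->tv_sec + (ts2)->tv_sec + (_ns >= 1000000000L); \
      (ts3)->tv_nsec = _ns >= 1000000000L ? _ns - 1000000000L : _ns; \
    } \
  while (0)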
Signed-off-by: ligd <liguiding1@xiaomi.com>
1 Only the idle task can have the flag TCB_FLAG_CPU_LOCKED.
According to the code logic, btcb cannot be an idle task, so this check can be removed.
2 Optimized the preemption check and removed the call to nxsched_add_prioritized.
3 This speeds up scheduling while avoiding the potential for
tasks to be moved multiple times between g_assignedtasks and g_readytorun.
Configure NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Run with qemu:
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Most tools used for compliance and SBOM generation use SPDX identifiers.
This change brings us a step closer to easy SBOM generation.
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
After the wdog refactor:
We conducted a latency measurement using the rt-tests/cyclictest (commit cadd661) on an x86_64 NUC12 equipped with an i7-1255U processor and 16GB of LPDDR5 memory. The specific command used for this microbenchmark was cyclictest -q -l 100000 -h 30000, which is designed to assess the responsiveness of the cyclic timer.
The findings from our benchmark are summarized below, highlighting the minimum, median, and maximum latency values for each operating system tested:
Operating System    Min (us)   Median (us)   Max (us)
Linux                     48            53        410
PreemptRT                  6            57        148
Xenomai                   53            53         64
NuttX                     64           626       1212
NuttX (refactor)           1             1          3
In this table, "Min" indicates the shortest latency observed, "Median" represents the middle value of the latency distribution, and "Max" denotes the longest latency encountered.
The systems tested were as follows:
Linux: ACRN version 6.1.80 (commit f528146)
PreemptRT: Linux kernel 5.4.251 with the 5.4.254-rt85 patch applied
Xenomai: Linux kernel 5.4.251 patched with ipipe-core-5.4.239-x86-13
These results clearly demonstrate the varying timer latency of the different operating systems, with the refactored NuttX showing particularly low latency values.
Signed-off-by: ligd <liguiding1@xiaomi.com>
Now we have CONFIG_USEC_PER_TICK, and our timer system does all of its calculations in 'ticks'.
Every timespec must be converted to 'ticks' before wd_start() is called, so USEC2TICK() cannot be avoided.
This means there is always a 'less than one tick' loss.
One resolution:
Always do ticks++ in wd_start(). But this makes the timer expire up to one tick late.
Another resolution:
Change the test case and allow the following logic:
t1 = current_time();
sleep(3);
t2 = current_time();
allow: t2 - t1 >= 3;
(the original test requires: t2 - t1 > 3)
The original test assumes that time is always elapsing, so (t2 - t1) must be strictly greater than 3,
but our system uses the 'tick' as the minimal wdog unit, so some precision loss is unavoidable.
We choose the first resolution, as sketched below.
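A minimal sketch of the first resolution, assuming CONFIG_USEC_PER_TICK; the helper name is hypothetical:

#include <nuttx/config.h>   /* CONFIG_USEC_PER_TICK (assumed) */
#include <stdint.h>
#include <time.h>

static clock_t usec_to_wdticks_sketch(uint32_t usec)
{
  /* Round up to whole ticks, then add one extra tick because the current
   * tick has already partially elapsed: the timer may expire up to one
   * tick late, but never early. */

  clock_t ticks = (usec + CONFIG_USEC_PER_TICK - 1) / CONFIG_USEC_PER_TICK;
  return ticks + 1;
}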
Signed-off-by: ligd <liguiding1@xiaomi.com>
This patch addresses an issue where the elapsed time was incorrectly calculated.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
For nested interrupts, one thing should be made clear:
being in ISR context does not mean that interrupts are disabled.
Signed-off-by: ligd <liguiding1@xiaomi.com>
This patch moves g_wdtimernested to wd_start.c.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
If g_wdactivelist has been changed by a wdog callback, traversing the list with a cached next pointer will cause problems (see the sketch below).
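A minimal sketch of the safe pattern, using hypothetical types rather than the real g_wdactivelist structures: because the expiration callback may start or cancel other watchdogs, the loop re-reads the list head instead of following a next pointer cached before the callback ran.

#include <stddef.h>

struct wdog_sketch_s
{
  struct wdog_sketch_s *next;
  void (*func)(void *arg);
  void *arg;
  long expire_tick;
};

static struct wdog_sketch_s *g_active_sketch;

static void wd_expiration_sketch(long now)
{
  struct wdog_sketch_s *wdog;

  /* Re-read the head on every iteration: the callback below may insert or
   * remove entries, so a cached next pointer could be stale. */

  while ((wdog = g_active_sketch) != NULL && wdog->expire_tick <= now)
    {
      g_active_sketch = wdog->next;   /* unlink before running the callback */
      wdog->next = NULL;
      wdog->func(wdog->arg);          /* may modify the active list */
    }
}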
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
This commit refactors the wdog module to use an absolute time representation internally (sketched after the list below). The main improvements include:
1. Fixed recursive watchdog handling caused by calling wd_start within a watchdog timeout callback.
2. Simplified timer processing to improve performance and enhance code readability.
3. Improved accuracy of timers.
4. Reduced critical section and interrupt disable time, improving real-time performance.
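A minimal sketch of the absolute-time idea, assuming clock_systime_ticks() and the sclock_t type are available; the structure and helpers are hypothetical:

#include <stdbool.h>
#include <nuttx/clock.h>   /* clock_systime_ticks(), clock_t, sclock_t (assumed) */

struct abs_wdog_sketch_s
{
  clock_t expire;          /* absolute tick count at which the timer fires */
};

static void abs_wd_start_sketch(struct abs_wdog_sketch_s *wdog, clock_t delay)
{
  /* Store the absolute expiration tick: re-arming from inside the timeout
   * callback is then just another start with a new absolute value. */

  wdog->expire = clock_systime_ticks() + delay;
}

static bool abs_wd_expired_sketch(const struct abs_wdog_sketch_s *wdog)
{
  /* Signed difference tolerates tick-counter wrap-around */

  return (sclock_t)(clock_systime_ticks() - wdog->expire) >= 0;
}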
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
reason:
In SMP mode, when a context switch occurs, restore_critical_section is executed.
In order to reduce the time taken for context switching,
we inline the restore_critical_section function.
Given that restore_critical_section is small in size
and is called from only one location, inlining it does not increase the size of the image.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
The file descriptors of kernel threads should be clean and should
not be generated based on user threads. Instead, an idle thread
should be chosen.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
Configure NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Run with qemu:
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
reason:
If the tcb's state is TSTATE_TASK_ASSIGNED, removing it with nxsched_remove_not_running
and then putting it back into the queue may result in a context switch.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
1 We place interrupt handling functions of the same priority into the work queue corresponding
to that priority, allowing high-priority interrupts to preempt low-priority ones,
thus ensuring the real-time performance of high-priority interrupts.
2 The sole purpose of the interrupt handler is to wake
up the work queue of the corresponding priority and execute the interrupt handling function.
3 Compared to the functionality of ISR threads, this approach saves more memory,
particularly when the number of interrupts is large (see the sketch below).
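A minimal sketch of that pattern, using a hypothetical mydev driver and the high-priority work queue (assuming CONFIG_SCHED_HPWORK is enabled):

#include <nuttx/config.h>
#include <nuttx/irq.h>
#include <nuttx/wqueue.h>

static struct work_s g_mydev_work;

static void mydev_worker(FAR void *arg)
{
  /* The real interrupt handling runs here, in worker-thread context, at the
   * priority of the chosen work queue. */
}

/* Attached at init time with irq_attach(MYDEV_IRQ, mydev_isr, priv)
 * (MYDEV_IRQ and priv are hypothetical). */

static int mydev_isr(int irq, FAR void *context, FAR void *arg)
{
  /* The hard ISR only defers the work; zero delay means run as soon as the
   * worker thread is scheduled. */

  return work_queue(HPWORK, &g_mydev_work, mydev_worker, arg, 0);
}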
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Since pthread_mutex is implemented on top of sem, it is impossible to see in ps who holds the lock and is causing the wait.
Replace the sem with a mutex implementation to solve the above problem (see the sketch below).
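A minimal sketch of why a mutex helps here, using a hypothetical lock type rather than the real NuttX mutex: unlike a bare semaphore, it records the holder's pid, which is what lets a tool such as ps attribute the wait.

#include <sys/types.h>
#include <semaphore.h>
#include <unistd.h>

struct holder_mutex_sketch_s
{
  sem_t sem;               /* underlying wait/wake primitive */
  pid_t holder;            /* task that currently owns the lock, -1 if free */
};

static int holder_mutex_lock_sketch(struct holder_mutex_sketch_s *m)
{
  int ret = sem_wait(&m->sem);
  if (ret == 0)
    {
      m->holder = getpid();    /* now visible to diagnostic tools */
    }

  return ret;
}

static int holder_mutex_unlock_sketch(struct holder_mutex_sketch_s *m)
{
  m->holder = -1;
  return sem_post(&m->sem);
}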
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
reason:
Since the SMP call handler may lead to a context switch,
we need to update the context information by calling up_cpu_paused_[save|restore].
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Make this_cpu() architecture-independent and let up_cpu_index() do the architecture-specific work.
In AMP mode, up_cpu_index() may return the index of the physical core.
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
reason:
Dynamically create g_irqmap to reduce the use of the data segment (see the sketch below).
CONFIG_ARCH_NUSER_INTERRUPTS should be one more than the number of IRQs actually used.
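A minimal sketch of the dynamic allocation, with a hypothetical element type and init hook; the real g_irqmap definition may differ:

#include <nuttx/config.h>
#include <nuttx/kmalloc.h>
#include <errno.h>

static FAR int *g_irqmap_sketch;   /* vector number -> table slot */

static int irqmap_initialize_sketch(void)
{
  /* Allocate once at boot for exactly the configured number of vectors
   * instead of reserving a worst-case static array in the data/bss segment. */

  g_irqmap_sketch = kmm_zalloc(CONFIG_ARCH_NUSER_INTERRUPTS *
                               sizeof(*g_irqmap_sketch));

  return g_irqmap_sketch != NULL ? 0 : -ENOMEM;
}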
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Calling malloc inside the critical section may cause a thread switch.
Add a retry mechanism to handle the case where another thread has already completed the expansion successfully, as sketched below.
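A minimal sketch of the retry pattern with a hypothetical table type: the allocation happens outside the critical section, and the state is re-checked under the lock in case another thread expanded it first.

#include <stddef.h>
#include <errno.h>
#include <nuttx/irq.h>
#include <nuttx/kmalloc.h>

struct table_sketch_s
{
  FAR void *block;         /* lazily allocated storage */
};

static int table_ensure_sketch(FAR struct table_sketch_s *tab, size_t size)
{
  FAR void *block;
  irqstate_t flags;

  while (tab->block == NULL)
    {
      /* Allocate outside the critical section: malloc may block or cause a
       * thread switch. */

      block = kmm_zalloc(size);
      if (block == NULL)
        {
          return -ENOMEM;
        }

      flags = enter_critical_section();
      if (tab->block == NULL)
        {
          tab->block = block;   /* we won the race */
          block = NULL;
        }

      leave_critical_section(flags);

      if (block != NULL)
        {
          /* Another thread expanded first: free ours and re-check (retry) */

          kmm_free(block);
        }
    }

  return 0;
}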
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
Enabling the scheduler may cause the context to switch to a high-priority task,
which results in pending events not being cleared correctly (see the sketch below).
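A possible reading of the fix, sketched with a hypothetical driver state: clear the pending-event state while preemption is still disabled, because sched_unlock() may switch away immediately.

#include <nuttx/config.h>
#include <stdint.h>
#include <sched.h>

struct priv_sketch_s
{
  volatile uint32_t pending;   /* hypothetical pending-event bitmap */
};

static void handle_events_sketch(FAR struct priv_sketch_s *priv)
{
  sched_lock();

  /* ... process the events recorded in priv->pending ... */

  priv->pending = 0;           /* clear BEFORE re-enabling preemption */

  sched_unlock();              /* may context-switch to a higher-priority task */
}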
Signed-off-by: chao an <anchao@lixiang.com>