There will be a large performance loss after SCHED_CRITMONITOR is enabled.
By isolating thread running time-related functions, CPU load can be run with less overhead.
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
Signed-off-by: buxiasen <buxiasen@xiaomi.com>
When statistics when sched_cpuload_critMonitor, thread switching only
calculates the current CPU's running time. Other CPU running threads
are updated for updates
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
Using the ts/tick conversion functions provided in clock.h
Do this caused we want speed up the time calculation, so change:
clock_time2ticks, clock_ticks2time, clock_timespec_add,
clock_timespec_compare, clock_timespec_subtract... to MACRO
Signed-off-by: ligd <liguiding1@xiaomi.com>
1 Only the idle task can have the flag TCB_FLAG_CPU_LOCKED.
According to the code logic, btcb cannot be an idle task, so this check can be removed.
2 Optimized the preemption logic check and removed the call to nxsched_add_prioritized.
3 Speed up the scheduling time while avoiding the potential for
tasks to be moved multiple times between g_assignedtasks and g_readytorun.
Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Most tools used for compliance and SBOM generation use SPDX identifiers
This change brings us a step closer to an easy SBOM generation.
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
This patch addresses an issue where the elapsed time was uncorrectly calculated.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
For the nested interrupt, one thing should decleared:
We are in ISR context, but no meaning we are disabled the interrupts.
Signed-off-by: ligd <liguiding1@xiaomi.com>
This patch moved the g_wdtimernested to wd_start.c
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
This commit refactors the wdog module to use absolute time representation internally. The main improvements include:
1. Fixed recursive watchdog handling caused by calling wd_start within watchdog timeout callback function.
2. Simplified timer processing to improve performance and enhance code readability.
3. Improved accuracy of timers.
4. Reduced critical section and interrupt disable time, improving real-time performance.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Signed-off-by: ligd <liguiding1@xiaomi.com>
reason:
In the SMP, when a context switch occurs, restore_critical_section is executed.
In order to reduce the time taken for context switching,
we inline the restore_critical_section function.
Given that restore_critical_section is small in size
and is called from only one location, inlining it does not increase the size of the image.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
The file descriptors of kernel threads should be clean and should
not be generated based on user threads. Instead, an idle thread
hould be chosen.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
reason:
If the type of tcb is TSTATE_TASK_ASSIGNED, removing it using nxsched_remove_not_running
and then putting it back into the queue may result in a context switch.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
Since smp call handler may lead to context switching,
we need to update the context information by calling up_cpu_paused_[save|restore].
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Make this_cpu is arch independent and up_cpu_index do that.
In AMP mode, up_cpu_index() may return the index of the physical core.
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
"/mnt/yang/qixinwei_cmake/nuttx/sched/sched/sched_removeblocked.c", line 58: warning #188-D:
enumerated type mixed with another type
tstate_t task_state = btcb->task_state;
"/mnt/yang/qixinwei_cmake/nuttx/sched/sched/sched_setpriority.c", line 243: warning #188-D:
enumerated type mixed with another type
tstate_t task_state = tcb->task_state;
Signed-off-by: yanghuatao <yanghuatao@xiaomi.com>
when the thread to backtrace is exiting, get_tcb and up_backtrace in
different critical section may cause try to dump invalid pointer, have
to ensure the nxsched_get_tcb and up_backtrace inside same critical
section procedure.
Signed-off-by: buxiasen <buxiasen@xiaomi.com>
Thread args have already been saved to stack after the TLS section by
nxtask_setup_stackargs. This is to retrieve it for use.
Signed-off-by: Yanfeng Liu <yfliu2008@qq.com>
we can use g_cpu_lockset to determine whether we are currently in the scheduling lock,
and all accesses and modifications to g_cpu_lockset, g_cpu_irqlock, g_cpu_irqset
are in the critical section, so we can directly operate on it.
test:
We can use qemu for testing.
compiling
make distclean -j20; ./tools/configure.sh -l qemu-armv8a:nsh_smp ;make -j20
running
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic -machine virt,virtualization=on,gic-version=3 -net none -chardev stdio,id=con,mux=on -serial chardev:con -mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
sched implementation not depends on macro abstraction, so revert below commit:
This reverts commit 4e62d0005a
This reverts commit 0f0c370520
This reverts commit ad0efd04ee
Signed-off-by: chao an <anchao@lixiang.com>
we removed "select SCHED_RESUMESCHEDULER",
As SCHED_RESUMESCHEDULER is not a necessary feature in SMP,
turning it on by default may affect performance.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
test:
We can use qemu for testing.
compiling
make distclean -j20; ./tools/configure.sh -l qemu-armv8a:nsh_smp ;make -j20 running
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic -machine virt,virtualization=on,gic-version=3 -net none -chardev stdio,id=con,mux=on -serial chardev:con -mon chardev=con,mode=readline -kernel ./nuttx
We have also tested this patch on other ARM hardware platforms.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
purpose:
1 sched_lock is very time-consuming, and reducing its invocations can improve performance.
2 sched_lock is prone to misuse, and narrowing its scope of use is to prevent people from referencing incorrect code and using it
test:
We can use qemu for testing.
compiling
make distclean -j20; ./tools/configure.sh -l qemu-armv8a:nsh_smp ;make -j20
running
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic -machine virt,virtualization=on,gic-version=3 -net none -chardev stdio,id=con,mux=on -serial chardev:con -mon chardev=con,mode=readline -kernel ./nuttx
We have also tested this patch on other ARM hardware platforms.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Prior to this commit, doswitch is returned uninitialized if
(task_state == TSTATE_TASK_ASSIGNED || task_state == TSTATE_TASK_RUNNING)
&& no context switch occurs
&& (cpu == me).
Signed-off-by: Mingjie Shen <shen497@purdue.edu>
In SMP mode, up_cpu_index()/this_cpu() are the same, both return the index of the physical core.
In AMP mode, up_cpu_index() will return the index of the physical core, and this_cpu() will always return 0
| #ifdef CONFIG_SMP
| # define this_cpu() up_cpu_index()
| #elif defined(CONFIG_AMP)
| # define this_cpu() (0)
| #else
| # define this_cpu() (0)
| #endif
Signed-off-by: chao an <anchao@lixiang.com>
1. add support to join main task
| static pthread_t self;
|
| static void *join_task(void *arg)
| {
| int ret;
| ret = pthread_join(self, NULL); <--- /* Fix Task could not be joined */
| return NULL;
| }
|
| int main(int argc, char *argv[])
| {
| pthread_t thread;
|
| self = pthread_self();
|
| pthread_create(&thread, NULL, join_task, NULL);
| sleep(1);
|
| pthread_exit(NULL);
| return 0;
| }
2. Detach active thread will not alloc for additional join, just update the task flag.
3. Remove the return value waiting lock logic (data_sem),
the return value will be stored in the waiting tcb.
4. Revise the return value of pthread_join(), consistent with linux
e.g:
Joining a detached and canceled thread should return EINVAL, not ESRCH
https://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_join.html
[EINVAL]
The value specified by thread does not refer to a joinable thread.
NOTE:
This PR will not increase stack usage, but struct tcb_s will increase 32 bytes.
Signed-off-by: chao an <anchao@lixiang.com>
The delete flag is not synchronized with the life cycle of the group,
if the flag set before waitpid(), the tcb will be mistakenly deleted
by group_del_waiter(), use-after-free will happen.
Regression by:
| commit 29e50ffa73 (origin/master, origin/HEAD)
| Author: chao an <anchao@lixiang.com>
| Date: Mon Mar 4 09:19:27 2024 +0800
|
| sched/group: move task group into task_tcb_s to improve performance
|
| move task group into task_tcb_s to avoid access allocator to improve performance
|
| for Task Termination, the time consumption will be reduced ~2us (Tricore TC397 300MHZ):
| 15.97(us) -> 13.55(us)
|
| Signed-off-by: chao an <anchao@lixiang.com>
Signed-off-by: chao an <anchao@lixiang.com>
move task group into task_tcb_s to avoid access allocator to improve performance
for Task Termination, the time consumption will be reduced ~2us (Tricore TC397 300MHZ):
15.97(us) -> 13.55(us)
Signed-off-by: chao an <anchao@lixiang.com>