Summary:
- I found a deadlock during a Wi-Fi audio streaming test combined with a stress test
- The testing environment was spresense:wifi_smp (NCPUS=4)
- The deadlock happened because two CPUs called up_cpu_pause() almost simultaneously
- This situation should not happen, because up_cpu_pause() is called in a critical section
- Actually, the latter call was from nxsem_post() in an IRQ handler
- And when enter_critical_section() was called, irq_waitlock() detected a deadlock
- Then it called up_cpu_paused() to break the deadlock
- However, this resulted in setting g_cpu_irqset on the CPU
- Even though another CPU already held g_cpu_irqlock
- This situation violates the critical section and should be avoided
- To avoid the situation, if a CPU sets g_cpu_irqset after calling up_cpu_paused()
- The CPU must release g_cpu_irqlock first
- Then retry irq_waitlock() to acquire g_cpu_irqlock
Impact:
- Affects SMP
Testing:
- Tested with spresense:wifi_smp (NCPUS=2 and 4)
- Tested with spresense:smp
- Tested with sim:smp
- Tested with sabre-6quad:smp (QEMU)
- Tested with maix-bit:smp (QEMU)
- Tested with esp32-core:smp (QEMU)
- Tested with lc823450-xgevk:rndis
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Follow the POSIX description.
SIGTSTP should be sent when the Ctrl-Z character is encountered, not SIGSTP.
Testing:
Built with hifive1-revb:nsh (CONFIG_SERIAL_TERMIOS=y, CONFIG_SIG_DEFAULT=y and CONFIG_TTY_SIGTSTP=y)
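A minimal sketch of the character-to-signal mapping that POSIX describes for a terminal with signal generation enabled. The function name and the two character constants are hypothetical and chosen for illustration only; this is not the serial driver code.

```c
#include <assert.h>
#include <signal.h>

#define EXAMPLE_CTRL_C 0x03 /* ASCII ETX, the conventional INTR character */
#define EXAMPLE_CTRL_Z 0x1a /* ASCII SUB, the conventional SUSP character */

/* Return the signal to raise for a received character, or 0 for none.
 * Hypothetical helper, not part of any driver.
 */

static int example_tty_signal(int ch)
{
  switch (ch)
    {
      case EXAMPLE_CTRL_C:
        return SIGINT;

      case EXAMPLE_CTRL_Z:
        return SIGTSTP; /* Terminal stop, as POSIX names it */

      default:
        return 0;
    }
}

/* Small usage check for the mapping above */

static void example_tty_signal_check(void)
{
  assert(example_tty_signal(EXAMPLE_CTRL_Z) == SIGTSTP);
  assert(example_tty_signal(EXAMPLE_CTRL_C) == SIGINT);
  assert(example_tty_signal('a') == 0);
}
```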
Summary:
- I noticed sched_lock() logic is different from sched_unlock()
- I think sched_lock() should use critical section
- Also, the code should be simple like sched_unlock()
- This commit fixes these issues
Impact:
- Affects SMP only
Testing:
- Tested with spresense:wifi_smp (both NCPUS=2 and 3)
- Tested with lc823450-xgevk:rndis
- Tested with maix-bit:smp (QEMU)
- Tested with esp32-core:smp (QEMU)
- Tested with sabre-6quad:smp (QEMU)
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Summary:
- I noticed DEBUGASSERT() happens in sched_unlock()
- The test was a Wi-Fi audio streaming stress test on Spresense with 3 cores
- Actually, g_cpu_schedlock was locked but g_cpu_lockset was incorrect
- Finally, I found that cpu was obtained before enter_critical_section()
- And the task was moved from one cpu to another cpu
- However, that call should be done within the critical section
- This commit fixes this issue
Impact:
- Affects SMP only
Testing:
- Tested with spresense:wifi_smp (both NCPUS=2 and 3)
- Tested with lc823450-xgevk:rndis
- Tested with maix-bit:smp (QEMU)
- Tested with esp32-core:smp (QEMU)
- Tested with sabre-6quad:smp (QEMU)
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
to ensure the basic info (e.g. pid) is set up correctly before calling the arch API
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: I851cb0fdf22f45844938dafc5981b3f576100dba
Summary:
- During Wi-Fi audio streaming test, I found a deadlock in nxtask_exit()
- Actually, nxtask_exit() was called and tried to enter critical section
- In enter_critical_section(), there is a deadlock avoidance logic
- However, if switched to a new rtcb with irqcount=0, the logic did not work
- Because the 2nd critical section was treated as if it were the 1st one
- Actually, it tried to run the deadlock avoidance logic
- But nxtask_exit() was called with critical section (i.e. IRQ already disabled)
- So the logic did not work as expected because up_irq_restore() did not enable the IRQ.
- This commit fixes this issue by incrementing irqcount before calling nxtask_terminate()
- Also it adjusts g_cpu_irqlock and g_cpu_lockset
Impact:
- Affects SMP only
Testing:
- Tested with spresense:wifi_smp (smp, ostest, nxplayer, telnetd)
- Tested with sabre-6quad:smp with QEMU (smp, ostest)
- Tested with maix-bit:smp with QEMU (smp, ostest)
- Tested with esp32-core:smp with QEMU (smp, ostest)
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
since nxtask_startup will initialize c++ global variables which shouldn't
be done inside the kernel thread
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Move sched/task/task/task_gettid.c to libs/libc/unistd/lib_gettid.c. gettid() is a dumb wrapper around getpid(). It is wasteful of resources to support TWO system calls, one for getpid() and one for gettid(). Instead, move gettid() into the C library, where it calls the single system call, getpid(). Much cleaner.
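A minimal sketch of the wrapper relationship described above. The name example_gettid is hypothetical, used here to avoid clashing with any gettid() that the host C library may already declare. On a host such as Linux the two values differ in secondary threads, so this only mirrors the relationship stated in the message.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Library-level wrapper: no system call of its own, it simply forwards
 * to getpid().  Hypothetical name, for illustration only.
 */

static pid_t example_gettid(void)
{
  return getpid();
}

/* Small usage check: the wrapper returns exactly what getpid() returns */

static void example_gettid_check(void)
{
  assert(example_gettid() == getpid());
}
```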
1.Reduce the default size of task_group_s(~512B each task)
2.Scale better between simple and complex application
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: Ia872137504fddcf64d89c48d6f0593d76d582710
Summary:
- ARCH_GLOBAL_IRQDISABLE was initially introduced for LC823450 SMP
- At that time, i.MX6 (quad Cortex-A9) did not use this config
- However, this option is now used for all CPUs which support SMP
- So it's good timing for refactoring the code
Impact:
- Should have no impact because the logic is the same for SMP
Testing:
- Tested with board: spresense:smp, spresense:wifi_smp
- Tested with qemu: esp32-core:smp, maix-bit:smp, sabre-6quad:smp
- Build only: lc823450-xgevk:rndis, sam4cmp-db:nsh
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
it is useful to pass the nonempty argument to change the init task behaviour
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: I684e9c76b9eac54404d0e4e63ab78e51e039c9a8
to save the preserved space(1KB) and also avoid the heap overhead
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: I694073f68e1bd63960cedeea1ddec441437be025
Change the preallocated message and descriptor from 32/24 to 4.
The total size is reduced from 1892 to 532
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: I79d199465daef678986868f773876289859f42fc
1.Don't preallocate sigaction list since it's used only in the task context
2.Reduce the preserved item which is used only in the task context from 16 to 4
The total memory decreases from 1280B to 480B
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: Ib5d5a7365c7d443fc0e99c0d3ea943e85f67ca8c
since the maximum number of arguments passed to wd_start in the whole
code base is 2, and change CONFIG_MAX_WDOGPARMS in some defconfigs
from 1 to 2, otherwise pthread_condclockwait will fail
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: Ib6cb28b8c0722058849e7be916e164513431d21c
Found by clang-check:
sched/sched_waitpid.c:380:33: warning: Although the value stored to 'ret' is used in the enclosing expression, the value is never actually read from 'ret'
(pid != (pid_t)-1 && (ret = nxsig_kill(pid, 0)) < 0))
^ ~~~~~~~~~~~~~~~~~~
1 warning generated.
because nx_task_idle doesn't call sched_note_start. To prevent the
same error from happening again in the future, nx_task_idle is removed.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
The Make.dep file should be updated when .config changes after the first make.
There are 2 cases affected for this problem:
1) Add source files by config symbol
2) Include header files in #ifdef directive
These 2 cases may not be included in Make.dep and this may prevent the
differential build from working correctly.
Since up_release_stack() automatically detects whether the memory comes from the built-in heap:
  if (ttype == TCB_FLAG_TTYPE_KERNEL)
    {
      if (kmm_heapmember(dtcb->stack_alloc_ptr))
        {
          kmm_free(dtcb->stack_alloc_ptr);
        }
    }
  else
    {
      /* Use the user-space allocator if this is a task or pthread */
      if (umm_heapmember(dtcb->stack_alloc_ptr))
        {
          kumm_free(dtcb->stack_alloc_ptr);
        }
    }
This reverts commit 124e6ee53d.
add PR_SET_NAME_EXT/PR_GET_NAME_EXT extension to avoid semantic
conflicts, use extended version for pthread_setname_np/pthread_getname_np
Change-Id: I40404c737977a623130dcd37feb4061b5526e466
Signed-off-by: chao.an <anchao@xiaomi.com>
to avoid the similar code spread around each application
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: I8967d647eaf2ecae47f29f83e7fa322ef1b42a02
change the stack pointer type from (uint32_t *) to (void *)
Change-Id: I90bb7d6a9cb0184c133578a1a2ae9a19c233ad30
Signed-off-by: chao.an <anchao@xiaomi.com>
If there's no child status available immediately,
return 0 without blocking as specified by the standards.
I checked the following version of the standard.
I believe it has always been this way though.
The Open Group Base Specifications Issue 7, 2018 edition
IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008)
If there's no child status available immediately,
return 0 without blocking as specified by the standards.
The implementation for non CONFIG_SCHED_HAVE_PARENT case
seems ok in this regard.
I checked the following version of the standard.
I believe it has always been this way though.
The Open Group Base Specifications Issue 7, 2018 edition
IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008)
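A minimal host-side sketch of the non-blocking behavior described above. It assumes a POSIX system with fork() and uses waitpid() with WNOHANG as the example interface; the message itself does not name the function, and poll_running_child is a hypothetical helper.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that blocks, poll it once with WNOHANG, then clean up.
 * Returns the result of the non-blocking waitpid() call, or -1 if
 * fork() failed.
 */

static pid_t poll_running_child(void)
{
  int status;
  pid_t polled;
  pid_t child = fork();

  if (child < 0)
    {
      return -1;
    }

  if (child == 0)
    {
      pause(); /* Child: block until a signal terminates it */
      _exit(0);
    }

  /* The child has not terminated, so no status is available yet */

  polled = waitpid(child, &status, WNOHANG);

  /* Terminate and reap the child so that no zombie is left behind */

  kill(child, SIGKILL);
  waitpid(child, &status, 0);
  return polled;
}

/* Small usage check: no status available means a return value of 0 */

static void poll_running_child_check(void)
{
  assert(poll_running_child() == 0);
}
```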
Prohibit use of the pthread_cleanup APIs by kernel threads. The pthread_cleanup functions MUST run in user mode, making them unusable for kernel threads.
See Issue #1263
since it is impossible that the current running thread is
in the waiting state and then needs to wake itself up.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: Ie2ba55c382eb3eb7c8d9f04bba1b9e294aaf6196
utilize the call inside nxtask_exit instead, also move
nxsched_suspend_scheduler to nxtask_exit for symmetry
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: I219fc15faf0026e452b0db3906aa40b40ac677f3
In the FLAT build, if CONFIG_LIB_SYSCALL=y, then the function task_spawn() will be duplicated: one version in libs/libc/spawn and one version in sched/task.
The version of task_spawn() in libs/libc/spawn exists only if CONFIG_LIB_SYSCALL is selected. In that case, the one in sched/task/task_spawn.c should be static, at least in the FLAT build.
The version of task_spawn() in libs/libc/spawn simply marshals the parameters into a structure and calls nx_task_spawn(). If CONFIG_LIB_SYSCALL is defined, then nx_task_spawn() will un-marshal the data and call the real task_spawn(). This nonsense is only necessary because task_spawn() has 8 parameters and the maximum number of parameters in a system call is only 6.
Without syscalls: The application should call directly into task_spawn() in sched/task/task_spawn.c and, hence, it must not be static.
With syscalls: The application should call the marshalling task_spawn() in libs/libc/spawn/lib_task_spawn.c -> That will call the autogenerated nx_task_spawn() proxy -> And generate a system call -> The system call will invoke the unmarshalling nx_task_spawn() in sched/task/task_spawn.c -> Which will, finally, call the real task_spawn().
The side-effect of making task_spawn() static is that it then cannot be used within the OS. But as far as I can tell, nothing in the OS itself currently uses task_spawn() so I think it is safe to make it conditionally static. But that only protects from duplicate symbols in the useless case mentioned above.
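The marshalling scheme described above can be sketched roughly as follows. All names and the eight integer parameters are hypothetical stand-ins; the real functions take pointers and attributes, and the system call boundary is reduced here to an ordinary function call taking one pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter block: eight values travel as one pointer,
 * which fits within a six-argument system call limit.
 */

struct example_spawn_parms_s
{
  int a1, a2, a3, a4, a5, a6, a7, a8;
};

/* Stand-in for the real eight-parameter function.  The weighted sum
 * makes any reordering of the arguments visible to the caller.
 */

static int example_real_spawn(int a1, int a2, int a3, int a4,
                              int a5, int a6, int a7, int a8)
{
  return a1 + 2 * a2 + 3 * a3 + 4 * a4 +
         5 * a5 + 6 * a6 + 7 * a7 + 8 * a8;
}

/* "Kernel" side: un-marshal the block and call the real function */

static int example_nx_spawn(const struct example_spawn_parms_s *parms)
{
  assert(parms != NULL);
  return example_real_spawn(parms->a1, parms->a2, parms->a3, parms->a4,
                            parms->a5, parms->a6, parms->a7, parms->a8);
}

/* "Library" side: marshal the eight arguments into one block */

static int example_lib_spawn(int a1, int a2, int a3, int a4,
                             int a5, int a6, int a7, int a8)
{
  struct example_spawn_parms_s parms =
  {
    a1, a2, a3, a4, a5, a6, a7, a8
  };

  return example_nx_spawn(&parms);
}
```

Calling example_lib_spawn() and example_real_spawn() with the same arguments should give the same result, which is the property the marshalling layer has to preserve.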
Block and MTD drivers may be opened and managed as though they were character drivers. But this is really sleight of hand; there is a hidden character driver proxy that mediates the interface to the block and MTD drivers in this case.
fstat(), however, did not account for this. It would report the characteristics of the proxy character driver, not of the underlying block or MTD driver.
This change corrects that. fstat now checks if the character driver is such a proxy and, if so, reports the characteristics of the underlying block or MTD driver, not the proxy character driver.
sched_releasetcb() will normally free the stack allocated for a task. However, a task with a custom, user-managed stack may be created using nxtask_init() followed by nxtask_activate(). If such a custom stack is used, then it must not be freed in this manner or a crash will most likely result.
This change adds a flag called TCB_FLAG_CUSTOM_STACK that may be passed in the pre-allocated TCB to nxtask_init(). This flag is not used internally anywhere in the OS except that, if set, it will prevent sched_releasetcb() from freeing that custom stack.
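A minimal sketch of the idea, assuming a simplified TCB with only a flags field and a stack pointer. The structure, the flag value, and the function name are hypothetical; only the rule itself (skip the free when the flag is set) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_FLAG_CUSTOM_STACK (1u << 0) /* Hypothetical bit value */

/* Hypothetical, heavily reduced TCB */

struct example_tcb_s
{
  uint16_t flags;
  void *stack_alloc_ptr;
};

/* Release the stack unless the caller manages it.  Returns true if the
 * stack memory was actually freed.
 */

static bool example_release_stack(struct example_tcb_s *tcb)
{
  bool freed = false;

  assert(tcb != NULL);

  if (tcb->stack_alloc_ptr != NULL &&
      (tcb->flags & EXAMPLE_FLAG_CUSTOM_STACK) == 0)
    {
      free(tcb->stack_alloc_ptr);
      freed = true;
    }

  /* Either way, the TCB no longer refers to the stack */

  tcb->stack_alloc_ptr = NULL;
  return freed;
}
```

With the flag set, a statically allocated or otherwise user-owned buffer is left alone; without it, the stack is assumed to come from the allocator and is freed.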
Add trivial function nxtask_uninit(). This function will undo all operations on a TCB performed by task_init() and release the TCB by calling kmm_free(). This is intended primarily to support error recovery operations after a successful call to task_init(), such as when a subsequent call to task_activate() fails.
That error recovery is trivial but not obvious. This helper function should eliminate confusion about what to do to recover after calling nxtask_init()
-Move task_init() and task_activate() prototypes from include/sched.h to include/nuttx/sched.h. These are internal OS functions and should not be exposed to the user.
-Remove references to task_init() and task_activate() from the User Manual.
-Rename task_init() to nxtask_init() since it is an OS internal function
-Rename task_activate() to nxtask_activate() since it is an OS internal function
Functions within the OS must never set the errno value. fs_fdopen() was setting the errno value. Now, after some parameter changes, it reports errors via a negated errno integer return value, as do most other internal OS functions.
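The convention can be sketched as follows, with a hypothetical internal function that returns a negated errno value and a hypothetical user-facing wrapper that is the only place where errno is set. Neither function is the actual fs_fdopen() code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Internal style: report failure as a negated errno value and leave
 * errno itself untouched.  Hypothetical function.
 */

static int example_internal_open(const char *path)
{
  if (path == NULL)
    {
      return -EINVAL;
    }

  if (path[0] == '\0')
    {
      return -ENOENT;
    }

  return 0;
}

/* User-facing style: translate the negated value into errno and -1.
 * Hypothetical wrapper.
 */

static int example_user_open(const char *path)
{
  int ret = example_internal_open(path);

  if (ret < 0)
    {
      errno = -ret;
      return -1;
    }

  return ret;
}

/* Small usage check for both layers */

static void example_errno_check(void)
{
  errno = 0;
  assert(example_internal_open(NULL) == -EINVAL);
  assert(errno == 0); /* The internal call did not touch errno */
  assert(example_user_open("") == -1);
  assert(errno == ENOENT);
}
```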