Summary:
- This commit removes the critical section in mm_sem.c, which was
added to stabilize the NuttX SMP kernel in Mar 2018.
Impact:
- SMP only
Testing:
- Tested with ostest with the following configs
- maix-bit:smp (QEMU), esp32-devkitc:smp (QEMU)
- sabre-6quad:smp (QEMU), spresense:smp, sim:smp
- Tested with nxplayer with the following configs
- spresense:wifi_smp, spresense:rndis_smp
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
* Simplify EINTR/ECANCEL error handling
1. Add semaphore uninterruptible wait function
2. Replace semaphore wait loop with a single uninterruptible wait
3. Replace all sem_xxx with nxsem_xxx
* Unify the void cast usage
1. Remove void casts on function calls because many places already ignore the returned value without a cast
2. Replace void cast for variable with UNUSED macro
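The "single uninterruptible wait" in item 2 above can be sketched roughly as follows. This is a minimal standalone illustration, not the NuttX implementation: `demo_sem`, `demo_sem_wait()` and `demo_sem_wait_uninterruptible()` are hypothetical names, and the assumption is that the wrapper simply retries while the underlying wait reports -EINTR or -ECANCELED.

```c
#include <errno.h>

/* Hypothetical stand-in for a semaphore whose wait can be interrupted. */

struct demo_sem
{
  int count;            /* Available count */
  int pending_signals;  /* Number of waits that fail with -EINTR first */
};

/* Hypothetical interruptible wait: returns 0 or a negated errno value.
 * This single-threaded demo cannot block, so an empty semaphore is
 * reported as -EAGAIN instead.
 */

static int demo_sem_wait(struct demo_sem *sem)
{
  if (sem->pending_signals > 0)
    {
      sem->pending_signals--;
      return -EINTR;
    }

  if (sem->count <= 0)
    {
      return -EAGAIN;
    }

  sem->count--;
  return 0;
}

/* Retry until the wait ends for a reason other than a signal or a
 * cancellation, so that callers no longer need their own loop.
 */

static int demo_sem_wait_uninterruptible(struct demo_sem *sem)
{
  int ret;

  do
    {
      ret = demo_sem_wait(sem);
    }
  while (ret == -EINTR || ret == -ECANCELED);

  return ret;
}
```

With a helper like this, each open-coded retry loop at a call site collapses to one call whose return value is either 0 or a genuine error.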
This commit repartitions the logic by moving some of the changes from mm_sem.c into task_getpid.c. The logic is equivalent for the case of mm_trysemaphore(), but now has wider impact since it potentially affects all callers of getpid(). Hence, this change may also introduce some other issues that will need to be addressed.
This change adds a check to mm_trysemaphore() (the root implementation of both kmm_trysemaphore() and umm_trysemaphore()). It checks if the task that is apparently executing is marked as RUNNING. If not, how could the non-running task be trying to get the MM semaphore? I think only in the exact scenario that Eunbong Song has described.
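A rough sketch of the RUNNING check described above, written as standalone C. All names here (`demo_tcb`, `demo_heap`, `demo_trysemaphore()`) are hypothetical, and the -ESRCH error code for the "not really running" case is an assumption made for illustration only.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical task states and task control block. */

enum demo_state
{
  DEMO_STATE_READYTORUN,
  DEMO_STATE_RUNNING
};

struct demo_tcb
{
  int pid;
  enum demo_state state;
};

/* Hypothetical heap lock with a recorded holder. */

struct demo_heap
{
  bool locked;
  int holder;
};

/* Try to take the heap lock on behalf of the apparently running task. */

static int demo_trysemaphore(struct demo_heap *heap,
                             const struct demo_tcb *tcb)
{
  /* A task that is not marked RUNNING cannot legitimately be here, so
   * refuse instead of recording a bogus holder.
   */

  if (tcb->state != DEMO_STATE_RUNNING)
    {
      return -ESRCH;
    }

  /* Held by some other task: report busy without blocking. */

  if (heap->locked && heap->holder != tcb->pid)
    {
      return -EAGAIN;
    }

  heap->locked = true;
  heap->holder = tcb->pid;
  return 0;
}
```

The sketch omits the recursive hold count that a real implementation would need. It only shows where the state check sits relative to the holder update.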
So I think the solution should provide the same protection as 91aa26774b but without the horrific consequences to memory usage.
Fix SMP related bugs
* sched/sched: Fix a deadlock in SMP mode
Two months ago, I introduced sched_tasklist_lock() and
sched_tasklist_unlock() to protect tasklists in SMP mode.
Actually, this change works pretty well for HTTP audio
streaming aging test with lc823450-xgevk.
However, I found a deadlock in the scheduler when I tried
similar aging tests with DVFS autonomous mode where CPU
clock speed changed based on CPU load. In this case, the call
sequences were as follows:
cpu1: sched_unlock()->sched_mergepending()->sched_addreadytorun()->up_cpu_pause()
cpu0: sched_lock()->sched_mergepending()
To avoid this deadlock, I added sched_tasklist_unlock() when calling
up_cpu_pause() and sched_addreadytorun(). Also, added
sched_tasklist_lock() after the call.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
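The unlock/relock ordering described for the deadlock fix above can be illustrated with a single-threaded sketch. Every name below is hypothetical, and the "lock" is a plain flag, so this only shows the ordering (drop the tasklist lock before pausing another CPU, retake it afterwards), not real SMP behavior.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tasklist lock and the pause request. */

static bool g_tasklist_locked;
static bool g_lock_held_during_pause;

static void demo_tasklist_lock(void)
{
  g_tasklist_locked = true;
}

static void demo_tasklist_unlock(void)
{
  g_tasklist_locked = false;
}

/* Records whether the caller still held the tasklist lock.  Holding it
 * here is the situation that can deadlock if the paused CPU is itself
 * spinning on the same lock.
 */

static void demo_cpu_pause(int cpu)
{
  (void)cpu;
  g_lock_held_during_pause = g_tasklist_locked;
}

static void demo_mergepending(int other_cpu)
{
  demo_tasklist_lock();

  /* ... examine the pending task list under the lock ... */

  demo_tasklist_unlock();     /* Release before pausing the other CPU */
  demo_cpu_pause(other_cpu);
  demo_tasklist_lock();       /* Re-acquire after the call */

  /* ... finish updating the task lists ... */

  demo_tasklist_unlock();
}
```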
* libc: Add critical section in lib_filesem.c for SMP
To set my_pid into fs_holder atomically in SMP mode,
critical section API must be used.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
* mm: Add critical section in mm_sem.c for SMP
To set my_pid into mm_holder atomically in SMP mode,
critical section API must be used.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
* net: Add critical section in net_lock.c for SMP
To set my pid (me) into the holder variable atomically in SMP mode,
critical section API must be used.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Approved-by: Gregory Nutt <gnutt@nuttx.org>
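The pattern shared by the three critical-section commits above (record the holder PID inside a critical section) might look roughly like this. It is a standalone sketch: the C11 `atomic_flag` spin loop is only a stand-in for the real critical section API, and `demo_lock`, `demo_set_holder()` and the `demo_*_critical_section()` helpers are hypothetical names.

```c
#include <stdatomic.h>

/* Stand-in for the OS critical section: a simple C11 spin flag. */

static atomic_flag g_demo_cs = ATOMIC_FLAG_INIT;

static void demo_enter_critical_section(void)
{
  while (atomic_flag_test_and_set_explicit(&g_demo_cs,
                                           memory_order_acquire))
    {
      /* Spin until the other CPU leaves the critical section */
    }
}

static void demo_leave_critical_section(void)
{
  atomic_flag_clear_explicit(&g_demo_cs, memory_order_release);
}

/* Hypothetical lock bookkeeping: who holds it and how many times. */

struct demo_lock
{
  int holder;
  int count;
};

/* Update holder and count as one unit so that another CPU never
 * observes a new holder paired with a stale count.
 */

static void demo_set_holder(struct demo_lock *lock, int my_pid)
{
  demo_enter_critical_section();
  lock->holder = my_pid;
  lock->count  = 1;
  demo_leave_critical_section();
}
```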
binfmt/, configs/, graphics/, libc/, mm/, net/, sched/: OS references to the errno variable should always use the set_errno(), get_errno() macros
arch/arm/src/stm32 and stm32f7: Architecture-specific code is not permitted to modify the errno variable. drivers/ and libc/: OS references to the errno variable should always use the set_errno(), get_errno() macros
Replace all usage kill() in the OS proper with nxsig_kill().
sched/signal: Add nxsig_kill() which is functionally equivalent to kill() except that it does not modify the errno variable.
This commit backs out most of commit b4747286b1. That change was added because sem_wait() would sometimes cause cancellation points inappropriately. But with these recent changes, nxsem_wait() is used instead and it is not a cancellation point.
In the OS, all calls to sem_wait() changed to nxsem_wait(). nxsem_wait() does not return errors via errno so each place where nxsem_wait() is now called must not examine the errno variable.
In all OS functions (not libraries), change sem_wait() to nxsem_wait(). This will prevent the OS from creating bogus cancellation points and from modifying the per-task errno variable.
sched/semaphore: Add the function nxsem_wait(). This is a new internal OS interface. It is functionally equivalent to sem_wait() except that (1) it is not a cancellation point, and (2) it does not set the per-thread errno value on return.
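The split between an internal call that returns a negated errno value and an application-facing wrapper that sets errno can be sketched as below. This is a standalone illustration with hypothetical names (`demo_nx_trywait()`, `demo_trywait()`). It uses a non-blocking "try" operation so that it runs single-threaded, and it does not model cancellation points at all.

```c
#include <errno.h>

/* Internal-style interface: never touches errno.  Returns 0 on success
 * or a negated errno value on failure.
 */

static int demo_nx_trywait(int *count)
{
  if (*count <= 0)
    {
      return -EAGAIN;
    }

  (*count)--;
  return 0;
}

/* Application-style wrapper: converts the negated errno value into the
 * usual "return -1 and set errno" convention.
 */

static int demo_trywait(int *count)
{
  int ret = demo_nx_trywait(count);

  if (ret < 0)
    {
      errno = -ret;
      return -1;
    }

  return 0;
}
```

Callers of the internal-style function check the return value directly and never read errno, which is the rule the commits above apply to OS code.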
sched/semaphore: Add nxsem_post() which is identical to sem_post() except that it never modifies the errno variable. Changed all references to sem_post in the OS to nxsem_post().
sched/semaphore: Add nxsem_destroy() which is identical to sem_destroy() except that it never modifies the errno variable. Changed all references to sem_destroy() in the OS to nxsem_destroy().
libc/semaphore and sched/semaphore: Add nxsem_getprotocol() and nxsem_setprotocol() which are identical to sem_getprotocol() and sem_setprotocol() except that they never modify the errno variable. Changed all references to sem_setprotocol() in the OS to nxsem_setprotocol(). sem_getprotocol() was not used in the OS.
libc/semaphore: Add nxsem_getvalue() which is identical to sem_getvalue() except that it never modifies the errno variable. Changed all references to sem_getvalue in the OS to nxsem_getvalue().
sched/semaphore: Rename all internal private functions from sem_xyz to nxsem_xyz. The sem_ prefix is (will be) reserved only for the application semaphore interfaces.
libc/semaphore: Add nxsem_init() which is identical to sem_init() except that it never modifies the errno variable. Changed all references to sem_init in the OS to nxsem_init().
sched/semaphore: Rename sem_tickwait() to nxsem_tickwait() so that it is clear this is an internal OS function.
sched/semaphore: Rename sem_reset() to nxsem_reset() so that it is clear this is an internal OS function.