The previous implementation of clearing global IRQ in sched_addreadytorun()
and sched_removereadytorun() was done too early. As a result, nxsem_post()
would have a chance to enter the critical section even nxsem_wait() is
still not in blocked state. This patch moves clearing global IRQ controls
from sched_addreadytorun() and sched_removereadytorun() to sched_resumescheduler()
to ensure that nxsem_post() can enter the critical section correctly.
For this change, sched_resumescheduler.c is always necessary for SMP configuration.
In addition, by this change, task_exit() had to be modified so that it calls
sched_resumescheduler() because it calls sched_removescheduler() inside the
function, otherwise it will cause a deadlock.
However, I encountered another DEBUGASSERT() in sched_cpu_select() during
HTTP streaming aging test on lc823450-xgevk. Actually sched_cpu_select()
accesses the g_assignedtasks which might be changed by another CPU. Similarly,
other tasklists might be modified simultaneously if both CPUs are executing
scheduling logic. To avoid this, I introduced tasklist protetion APIs.
With these changes, SMP kernel stability has been much improved.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
sched/sched: Correct some build issues introduced by last set of changes.
sched/sched: Add new internal OS function nxsched_setaffinity() that is identical to sched_isetaffinity() except that it does not modify the errno value. All usage of sched_setaffinity() within the OS is replaced with nxsched_setaffinity().
sched/sched: Internal functions sched_reprioritize() and sched_setpriority() no longer movidify the errno value. Also renamed to nxsched_reprioritize() and sched_setpriority().
sched/sched: Add new internal OS function nxsched_getscheduler() that is identical to sched_getscheduler() except that it does not modify the errno value. All usage of sched_getscheduler() within the OS is replaced with nxsched_getscheduler().
sched/sched: Add new internal OS function nxsched_setparam() that is identical to sched_setparam() except that it does not modify the errno value. All usage of sched_setparam() within the OS is replaced with nxsched_setparam().
sched/sched: Add new internal OS function nxsched_getparam() that is identical to sched_getparam() except that it does not modify the errno value (actually, the previous value erroneously neglected to set the errno value to begin with, but this fixes both issues). All usage of sched_getparam() within the OS is replaced with nxsched_getparam().
These APIs are used in sched_note.c to protect instumentation data.
The deffrence between these APIs to exsiting spin_lock() and spin_unlock()
is that they do not perform insturumentation to avoid recursive call
when SCHED_INSTRUMENTATION_SPINLOCKS=y.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
sched/sched: Remove DEBUGASSERT() in sched_mergepending()
Because this DEBUGASSERT() assumes that while loop executes only once.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Approved-by: Gregory Nutt <gnutt@nuttx.org>
Replace all usage kill() in the OS proper with nxsig_kill().
sched/signal: Add nxsig_kill() which is functionally equivalent to kill() except that it does not modify the errno variable.
sigtimedwait() -> nxsig_timedwait()
sigwaitinfo() -> nxsig_waitinfo()
nanosleep() -> nxsig_nanosleep()
The internal OS versions differ from the standard application interfaces in that:
- They do not create cancellation points, and
- they do not modify the application's errno variable
Squashed commit of the following:
sched/signal: Replace all usage of sigwaitinfo(), sigtimedwait(), and nanosleep() with the OS internal counterparts nxsig_waitinfo(), nxsig_timedwait(), and nxsig_nanosleep().
sched/signal: Add nxsig_nanosleep(). This is an internal OS version of nanosleep(). It differs in that it does not set the errno varaiable and does not create a cancellation point.
sched/signal: Add nxsig_timedwait() and nxsig_waitinfo(). These are internal OS versions of sigtimedwait() and sigwaitinfo(). They differ in that they do not set the errno varaiable and they do not create cancellation points.
This commit backs out most of commit b4747286b1. That change was added because sem_wait() would sometimes cause cancellation points inappropriated. But with these recent changes, nxsem_wait() is used instead and it is not a cancellation point.
In the OS, all calls to sem_wait() changed to nxsem_wait(). nxsem_wait() does not return errors via errno so each place where nxsem_wait() is now called must not examine the errno variable.
In all OS functions (not libraries), change sem_wait() to nxsem_wait(). This will prevent the OS from creating bogus cancellation points and from modifying the per-task errno variable.
sched/semaphore: Add the function nxsem_wait(). This is a new internal OS interface. It is functionally equivalent to sem_wait() except that (1) it is not a cancellation point, and (2) it does not set the per-thread errno value on return.
This commit corrects this. This is matching logic in sched_addreadytorun to avoid starting new tasks within the critical section (unless the CPU is the holder of the lock). The holder of the IRQ lock must be permitted to do whatever it needs to do.
The three fixes are to handle cases in the SMP configuration where one CPU does need to make modifications to TCB and data structures on a task that could be running running on another CPU. Those three cases are task_delete(), task_restart(), and execution of signal handles. In all three cases the solutions is basically the same: (1) Call sched_cpu_pause(tcb) to pause the CPU on which the task is running, (2) perform the necessary operations, then (3) call up_cpu_resume() to restart the paused CPU.