The previous implementation of clearing global IRQ in sched_addreadytorun()
and sched_removereadytorun() was done too early. As a result, nxsem_post()
would have a chance to enter the critical section even nxsem_wait() is
still not in blocked state. This patch moves clearing global IRQ controls
from sched_addreadytorun() and sched_removereadytorun() to sched_resumescheduler()
to ensure that nxsem_post() can enter the critical section correctly.
For this change, sched_resumescheduler.c is always necessary for SMP configuration.
In addition, by this change, task_exit() had to be modified so that it calls
sched_resumescheduler() because it calls sched_removescheduler() inside the
function, otherwise it will cause a deadlock.
However, I encountered another DEBUGASSERT() in sched_cpu_select() during
HTTP streaming aging test on lc823450-xgevk. Actually sched_cpu_select()
accesses the g_assignedtasks which might be changed by another CPU. Similarly,
other tasklists might be modified simultaneously if both CPUs are executing
scheduling logic. To avoid this, I introduced tasklist protetion APIs.
With these changes, SMP kernel stability has been much improved.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
binfmt/, configs/, grahics/, libc/, mm/, net/, sched/: OS references to the errno variable should always use the set_errno(), get_errno() macros
arch/arm/src/stm32 and stm32f7: Architecture-specific code is not permitted to modify the errno variable. drivers/ and libc/: OS references to the errno variable should always use the set_errno(), get_errno() macros
sched/sched: Correct some build issues introduced by last set of changes.
sched/sched: Add new internal OS function nxsched_setaffinity() that is identical to sched_isetaffinity() except that it does not modify the errno value. All usage of sched_setaffinity() within the OS is replaced with nxsched_setaffinity().
sched/sched: Internal functions sched_reprioritize() and sched_setpriority() no longer movidify the errno value. Also renamed to nxsched_reprioritize() and sched_setpriority().
sched/sched: Add new internal OS function nxsched_getscheduler() that is identical to sched_getscheduler() except that it does not modify the errno value. All usage of sched_getscheduler() within the OS is replaced with nxsched_getscheduler().
sched/sched: Add new internal OS function nxsched_setparam() that is identical to sched_setparam() except that it does not modify the errno value. All usage of sched_setparam() within the OS is replaced with nxsched_setparam().
sched/sched: Add new internal OS function nxsched_getparam() that is identical to sched_getparam() except that it does not modify the errno value (actually, the previous value erroneously neglected to set the errno value to begin with, but this fixes both issues). All usage of sched_getparam() within the OS is replaced with nxsched_getparam().
These APIs are used in sched_note.c to protect instumentation data.
The deffrence between these APIs to exsiting spin_lock() and spin_unlock()
is that they do not perform insturumentation to avoid recursive call
when SCHED_INSTRUMENTATION_SPINLOCKS=y.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
If SMP=n or SMP=y && SPINLOCK_IRQ=n, this works in the same way as before.
If SMP=y && SPINLOCK_IRQ=y, performance will be improved.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
In ARM document regarding memory barrires, SP_DMB() must be issued
before changing a spinlock state to SP_UNLOCKED. However, we found
that SP_DSB() is also needed to ensure that spin_unlock() works
correctly for network streaming aging test.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
sched/sched: Remove DEBUGASSERT() in sched_mergepending()
Because this DEBUGASSERT() assumes that while loop executes only once.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Approved-by: Gregory Nutt <gnutt@nuttx.org>
SMP: Introduce spin_lock_irqsave() and spin_unlock_irqrestore()
These APIs are simplified version of enter_critical_section() and
leave_critical_section() to protect data (e.g. registers) in SMP mode.
By using these APIs inside drivers, performace will be improved.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Approved-by: Gregory Nutt <gnutt@nuttx.org>
Change all calls to mq_receive() and mq_timedreceive() in the OS to calls to nxmq_receive() and nxmq_timedreceive(), making appropriate changes for differences in return values.
sched/mqueue: Add nxmq_receive() and mxmq_timedreceive() which are functionally equivalent to the standard mq_receive and mq_timedreceive() except that (1) they do not create cancellation points, and (2) the do not modify the application's errno variable.
Change all calls to mq_send() and mq_timedsend() in the OS to calls to nxmq_send() and nxmq_timedsend(), making appropriate changes for differences in return values.
sched/mqueue: Add internal function nxmq_send() and nxmq_timedsend() that are equivalent to mq_send() and mq_timedsend() except that they do not create cancellation points and do to not modify the errno variable.
sched/mqueue: Rename all private static functions for use the nxmq_ vs. mq_ naming.
sched/mqueue: Rename all OS internal functions declared in sched/mqueue/mqueue.h to begin with nxmq_ vs. mq_. The mq_ prefix is reserved for standard application interfaces.
Replace all calls to sigprocmask() in the OS proper with calls to nxsig_procmask().
sched/signal: Add internal OS interface nxsig_procmask(). This internal interface is equivalent to the standard sigprocmask() used by applications except that it does not modify the errno value. Also fixes a problem in that the original sigprocmask() was not setting the errno.
Replace all calls to sigqueue() in the OS proper with calls to nxsig_queue() to avoid accessing the errno variable.
sched/signal: Add nxsig_queue() which is functionally equivalent to sigqueue() except that it does not modify the errno variable.
Replace all usage kill() in the OS proper with nxsig_kill().
sched/signal: Add nxsig_kill() which is functionally equivalent to kill() except that it does not modify the errno variable.
Squashed commit of the following:
Change all calls to usleep() in the OS proper to calls to nxsig_usleep()
sched/signal: Add a new OS internal function nxsig_usleep() that is functionally equivalent to usleep() but does not cause a cancellaption point and does not modify the errno variable.
sched/signal: Add a new OS internal function nxsig_sleep() that is functionally equivalent to sleep() but does not cause a cancellaption point.
sigtimedwait() -> nxsig_timedwait()
sigwaitinfo() -> nxsig_waitinfo()
nanosleep() -> nxsig_nanosleep()
The internal OS versions differ from the standard application interfaces in that:
- They do not create cancellation points, and
- they do not modify the application's errno variable
Squashed commit of the following:
sched/signal: Replace all usage of sigwaitinfo(), sigtimedwait(), and nanosleep() with the OS internal counterparts nxsig_waitinfo(), nxsig_timedwait(), and nxsig_nanosleep().
sched/signal: Add nxsig_nanosleep(). This is an internal OS version of nanosleep(). It differs in that it does not set the errno varaiable and does not create a cancellation point.
sched/signal: Add nxsig_timedwait() and nxsig_waitinfo(). These are internal OS versions of sigtimedwait() and sigwaitinfo(). They differ in that they do not set the errno varaiable and they do not create cancellation points.
What was I thinking? I missed that litle minus sign and the possibility that the errno might be some positive non-zero value.
This reverts commit 43880878e4.
This is analogous to similar renaming that was done previously for semaphores.
Squashed commit of the following:
sched/signal: Fix a few compile warnings introduced by naming changes.
sched/signal: Rename all private, internal signl functions to use the nxsig_ prefix.
sched/signal: Rename sig_removependingsignal, sig_unmaskpendingsignal, and sig_mqnotempty to nxsig_remove_pendingsignal, nxsig_unmask_pendingsignal, and nxsig_mqnotempty to make it clear that these are OS internal interfaces.
sched/signal: Rename sig_findaction and sig_lowest to nxsig_find_action and nxsig_lowest to make it clear that these are OS internal interfaces.
sched/signal: Rename sig_allocatepingsigaction and sig_deliver to nxsig_alloc_pendingsigaction and nxsig_deliver to make it clear that these are OS internal interfaces.
sched/signal: Rename sig_cleanup, sig_release, sig_releasependingaction, and sig_releasependingsignal to nxsig_cleanup, nxsig_release, nxsig_release_pendingaction, and nxsig_release_pendingsignal to make it clear that these are OS internal interfaces.
sched/signal: Rename sig_tcbdispatch and sig_dispatch to nxsig_tcbdispatch and nxsig_dispatch to make it clear that these are OS internal interfaces.
sched/signal: Rename sig_releaseaction and sig_pendingset to nxsig_release_action and nxsig_pendingset to make it clear that these are OS internal interfaces.
sched/signal: Rename sig_initialize and sig_allocateactionblock to nxsig_initialize and nxsig_alloc_actionblock to make it clear that these are OS internal interfaces.
This commit backs out most of commit b4747286b1. That change was added because sem_wait() would sometimes cause cancellation points inappropriated. But with these recent changes, nxsem_wait() is used instead and it is not a cancellation point.
In the OS, all calls to sem_wait() changed to nxsem_wait(). nxsem_wait() does not return errors via errno so each place where nxsem_wait() is now called must not examine the errno variable.
In all OS functions (not libraries), change sem_wait() to nxsem_wait(). This will prevent the OS from creating bogus cancellation points and from modifying the per-task errno variable.
sched/semaphore: Add the function nxsem_wait(). This is a new internal OS interface. It is functionally equivalent to sem_wait() except that (1) it is not a cancellation point, and (2) it does not set the per-thread errno value on return.
sched/semaphore: Add nxsem_post() which is identical to sem_post() except that it never modifies the errno variable. Changed all references to sem_post in the OS to nxsem_post().
sched/semaphore: Add nxsem_destroy() which is identical to sem_destroy() except that it never modifies the errno variable. Changed all references to sem_destroy() in the OS to nxsem_destroy().
libc/semaphore and sched/semaphore: Add nxsem_getprotocol() and nxsem_setprotocola which are identical to sem_getprotocol() and set_setprotocol() except that they never modifies the errno variable. Changed all references to sem_setprotocol in the OS to nxsem_setprotocol(). sem_getprotocol() was not used in the OS
libc/semaphore: Add nxsem_getvalue() which is identical to sem_getvalue() except that it never modifies the errno variable. Changed all references to sem_getvalue in the OS to nxsem_getvalue().
sched/semaphore: Rename all internal private functions from sem_xyz to nxsem_xyz. The sem_ prefix is (will be) reserved only for the application semaphore interfaces.
libc/semaphore: Add nxsem_init() which is identical to sem_init() except that it never modifies the errno variable. Changed all references to sem_init in the OS to nxsem_init().
sched/semaphore: Rename sem_tickwait() to nxsem_tickwait() so that it is clear this is an internal OS function.
sched/semaphoate: Rename sem_reset() to nxsem_reset() so that it is clear this is an internal OS function.
net/socket/recvfrom.c: Check fromlen integrity before using it.
net/socket/net_sockets.c: Always check for valid psock before using.
net/tcp/tcp_send_unbuffered.c: Avoid using psock beforing checking its integrity.
sched/timer/timer_create.c: Fix watchdog resource leak if cannot allocate a new timer.
Add clock_resynchronize for better synchronization of CLOCK_REALTIME and CLOCK_MONOTONIC to match RTC after resume from low-power state.
Add up_rtc_getdatetime_with_subseconds under CONFIG_ARCH_HAVE_RTC_SUBSECONDS to allow initializing (and resynchronizing) system clock with subseconds accuracy RTC.
Entropy pool gathers environmental noise from device drivers, user-space, etc., and returns good random numbers, suitable for cryptographic use. Based on entropy pool design from *BSDs and uses BLAKE2Xs algorithm for CSPRNG output.
Patch also adds /dev/urandom support for using entropy pool RNG and new 'getrandom' system call for getting randomness without file-descriptor usage (thus avoiding file-descriptor exhaustion attacks). The 'getrandom' interface is similar as 'getentropy' and 'getrandom' available on OpenBSD and Linux respectively.
This was a consequence of the recent robust mutex changes. If robust mutexes are selected, then each mutex that a thread takes is retained in a list in threads TCB. If the thread exits and that list is not empty, then we know that the thread exitted while holding mutexes. And, in that case, each will be marked as inconsistent and the any waiter for the thread is awakened.
For the case of pthread_mutex_trywait(), the mutex was not being added to the list! while not usually a fatal error, this was caught by an assertion when pthread_mutex_unlock() was called: It tried to remove the mutex from the TCB list and it was not there when, of course, it shoule be.
The fix was to add pthread_mutex_trytake() which does sem_trywait() and if successful, does correctly add the mutext to the TCB list. This should eliminated the assertion.
in the case of CONFIG_SEM_PREALLOCHOLDERS=0
The call to sem_restorebaseprio_task context switches in the
sem_foreachholder(sem, sem_restoreholderprioB, stcb); call
prior to releasing the holder. So the running task is left
as a holder as is the started task. Leaving both slots filled
Thus failing to perforem the boost/or restoration on the
correct tcb.
This PR fixes this by releasing the running task slot prior
to reprioritization that can lead to the context switch.
To faclitate this, the interface to sem_restorebaseprio
needed to take the tcb from the holder prior to the
holder being freed. In the failure case where sched_verifytcb
fails it added the overhead of looking up the holder.
There is also the adfitinal thunking on the foreach to
get from holer to holder->tcb.
An alternate approach could be to leve the interface the same
and allocate a holder on the stack of sem_restoreholderprioB
copy the sem's holder to it, free it as is done in this pr
and and then pass that address sem_restoreholderprio as the
holder. It could then get the holder's tcb but we would
keep the same sem_findholder in sched_verifytcb.
Provide a user defined callback context for irq's, such that when
registering a callback users can provide a pointer that will get
passed back when the isr is called.
This commit corrects this. This is matching logic in sched_addreadytorun to avoid starting new tasks within the critical section (unless the CPU is the holder of the lock). The holder of the IRQ lock must be permitted to do whatever it needs to do.
The three fixes are to handle cases in the SMP configuration where one CPU does need to make modifications to TCB and data structures on a task that could be running running on another CPU. Those three cases are task_delete(), task_restart(), and execution of signal handles. In all three cases the solutions is basically the same: (1) Call sched_cpu_pause(tcb) to pause the CPU on which the task is running, (2) perform the necessary operations, then (3) call up_cpu_resume() to restart the paused CPU.
But certain logic interacts with tasks in different ways. The only one that comes to mind are wdogs. There is a tasking interface that to manipulate wdogs, and a different interface in the timer interrupt handling logic to manage wdog expirations.
In the normal case, this is fine. Since the tasking level code calls enter_critical_section, interrupts are disabled an no conflicts can occur. But that may not be the case in the SMP case. Most architectures do not permit disabling interrupts on other CPUs so enter_critical_section must work differently: Locks are required to protect code.
So this change adds locking (via enter_critical section) to wdog expiration logic for the the case if the SMP configuration.