pthread_cond_wait() should be an atomic operation in the mutex lock/unlock.
Since the sched_lock() has been wrongly deleted in the previous commit,
the context switch will occurred after the mutex was unlocked:
--------------------------------------------------------------------
Task1(Priority 100) | Task2(Priority 101)
|
pthread_mutex_lock(mutex); |
| |
pthread_cond_wait(cond, mutex) |
| | |
| | |
| ->enter_critical_section() |
| ->pthread_mutex_give(mutex) | ----> pthread_mutex_lock(mutex); // contex switch to high priority task
| | pthread_cond_signal(cond); // signal before wait
| | <---- pthread_mutex_unlock(mutex); // switch back to original task
| ->pthread_sem_take(cond->sem)| // try to wait the signal, Deadlock.
| ->leave_critical_section() |
|
| ->pthread_mutex_take(mutex) |
| |
pthread_mutex_lock(mutex); |
---------------------------------------------------------------------
This PR will bring back sched_lock()/sched_unlock() to avoid context switch to ensure atomicity
Signed-off-by: chao an <anchao@xiaomi.com>
pthread_cond_wait is preempted after releasing the lock, sched_lock cannot lock threads from other CPUs, use enter_critical_section
Signed-off-by: hujun5 <hujun5@xiaomi.com>
NuttX kernel should not use the syscall functions, especially after
enabling CONFIG_SCHED_INSTRUMENTATION_SYSCALL, all system functions
will be traced to backend, which will impact system performance.
Signed-off-by: chao an <anchao@xiaomi.com>
The "p" format specifier already prepends the pointer address with
"0x" when printing.
Signed-off-by: Gustavo Henrique Nihei <gustavo.nihei@espressif.com>
A mutex may be configured with rather exotic options such as recursive, unsafe, etc. The availability of these mutex options is controlled by configuation settings. When each option is enabled, additional fields are managed inside of the mutex structure.
pthread_cond_wait() and pthread_timed_wait() do the following atomically: (1) unlock the mutex, (2) wait for the condition, and (3) restore the mutex lock. When that lock is restored, pthread_cond_[timed]wait() must also restore the exact configuration of the mutex data structure if these "exotic" features are enabled.
Now that nxsem_wait_uninterruptible() returns an error when the thread is canceled, new logic is being executed. This revealed an existing error in pthread_cond_wait(). The nature of the error is as follows:
* pthread_cond_wait() must atomically (1) release the mutex, (2) wait on the condition, and (3) re-establish the mutex. Errors can occur on any of these steps. In this case the ECANCELED error was (very correctly) reported by nxsem_wait_uninterruptible().
* However, logic in step (3) had a bug; it re-acquired the mutex, but then a bug in existing logic cause the mutex to be improperly initialized. Essentially, the wrong error status value was being testing. This resulted in a nonsequitar and errors reported by the OS test.
This problem was found by apps/testing/ostest/pthread_cleanup.c
* Simplify EINTR/ECANCEL error handling
1. Add semaphore uninterruptible wait function
2 .Replace semaphore wait loop with a single uninterruptible wait
3. Replace all sem_xxx to nxsem_xxx
* Unify the void cast usage
1. Remove void cast for function because many place ignore the returned value witout cast
2. Replace void cast for variable with UNUSED macro
This commit backs out most of commit b4747286b1. That change was added because sem_wait() would sometimes cause cancellation points inappropriated. But with these recent changes, nxsem_wait() is used instead and it is not a cancellation point.
In the OS, all calls to sem_wait() changed to nxsem_wait(). nxsem_wait() does not return errors via errno so each place where nxsem_wait() is now called must not examine the errno variable.
In all OS functions (not libraries), change sem_wait() to nxsem_wait(). This will prevent the OS from creating bogus cancellation points and from modifying the per-task errno variable.
sched/semaphore: Add the function nxsem_wait(). This is a new internal OS interface. It is functionally equivalent to sem_wait() except that (1) it is not a cancellation point, and (2) it does not set the per-thread errno value on return.