pthread_cond_wait() should be an atomic operation in the mutex lock/unlock.
Since the sched_lock() has been wrongly deleted in the previous commit,
the context switch will occurred after the mutex was unlocked:
--------------------------------------------------------------------
Task1(Priority 100) | Task2(Priority 101)
|
pthread_mutex_lock(mutex); |
| |
pthread_cond_wait(cond, mutex) |
| | |
| | |
| ->enter_critical_section() |
| ->pthread_mutex_give(mutex) | ----> pthread_mutex_lock(mutex); // contex switch to high priority task
| | pthread_cond_signal(cond); // signal before wait
| | <---- pthread_mutex_unlock(mutex); // switch back to original task
| ->pthread_sem_take(cond->sem)| // try to wait the signal, Deadlock.
| ->leave_critical_section() |
|
| ->pthread_mutex_take(mutex) |
| |
pthread_mutex_lock(mutex); |
---------------------------------------------------------------------
This PR will bring back sched_lock()/sched_unlock() to avoid context switch to ensure atomicity
Signed-off-by: chao an <anchao@xiaomi.com>
Resolving the issue with the ltp_interfaces_pthread_join_6_2 test case.
In SMP mode, the pthread may still be in the process of exiting when
pthread_join returns, and calling pthread_join again at this time will
result in an error. The error code returned should be ESRCH.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
use PTHREAD_CLEANUP_STACKSIZE to enable or disable interfaces pthread_cleanup_push() and pthread_cleanup_pop().
reasons:(1)same as TLS_TASK_NELEM (2)it is no need to use two variables
Signed-off-by: yanghuatao <yanghuatao@xiaomi.com>
This is preparation to use kernel stack for everything when the user
process enters the kernel. Now the user stack is in use when the user
process runs a system call, which might not be the safest option.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_detach.html
If an implementation detects that the value specified by the thread argument
to pthread_detach() does not refer to a joinable thread, it is recommended
that the function should fail and report an [EINVAL] error.
If an implementation detects use of a thread ID after the end of its lifetime,
it is recommended that the function should fail and report an [ESRCH] error.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
The current implementation requires the use of enter_critical_section, so the source code needs to be moved to kernel space
Signed-off-by: hujun5 <hujun5@xiaomi.com>
pthread_cond_wait is preempted after releasing the lock, sched_lock cannot lock threads from other CPUs, use enter_critical_section
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Refer to issue #8867 for details and rational.
Convert sigset_t to an array type so that more than 32 signals can be supported.
Why not use a uin64_t?
- Using a uin32_t is more flexible if we decide to increase the number of signals beyound 64.
- 64-bit accesses are not atomic, at least not on 32-bit ARMv7-M and similar
- Keeping the base type as uint32_t does not introduce additional overhead due to padding to achieve 64-bit alignment of uin64_t
- Some architectures still supported by NuttX do not support uin64_t
types,
Increased the number of signals to 64. This matches Linux. This will support all xsignals defined by Linux and also 32 real time signals (also like Linux).
This is is a work in progress; a draft PR that you are encouraged to comment on.
Remove calls to the userspace API exit() from the kernel. The problem
with doing such calls is that the exit functions are called with kernel
mode privileges which is a big security no-no.
There is currently a big problem in the address environment handling which
is that the address environment is released too soon when the process is
exiting. The current MMU mappings will always be the exiting process's, which means
the system needs them AT LEAST until the next context switch happens. If
the next thread is a kernel thread, the address environment is needed for
longer.
Kernel threads "lend" the address environment of the previous user process.
This is beneficial in two ways:
- The kernel processes do not need an allocated address environment
- When a context switch happens from user -> kernel or kernel -> kernel,
the TLB does not need to be flushed. This must be done only when
changing to a different user address environment.
Another issue is when a new process is created; the address environment
of the new process must be temporarily instantiated by up_addrenv_select().
However, the system scheduler does not know that the process has a different
address environment to its own and when / if a context restore happens, the
wrong MMU page directory is restored and the process will either crash or
do something horribly wrong.
The following changes are needed to fix the issues:
- Add mm_curr which is the current address environment of the process
- Add a reference counter to safeguard the address environment
- Whenever an address environment is mapped to MMU, its reference counter
is incremented
- Whenever and address environment is unmapped from MMU, its reference
counter is decremented, and tested. If no more references -> drop the
address environment and release the memory as well
- To limit the context switch delay, the address environment is freed in
a separate low priority clean-up thread (LPWORK)
- When a process temporarily instantiates another process's address
environment, the scheduler will now know of this and will restore the
correct mappings to MMU
Why is this not causing more noticeable issues ? The problem only happens
under the aforementioned special conditions, and if a context switch or
IRQ occurs during this time.
Detach the address environment handling from the group structure to the
tcb. This is preparation to fix rare cases where the system (MMU) is left
without a valid page directory, e.g. when a process exits.
NuttX kernel should not use the syscall functions, especially after
enabling CONFIG_SCHED_INSTRUMENTATION_SYSCALL, all system functions
will be traced to backend, which will impact system performance.
Signed-off-by: chao an <anchao@xiaomi.com>
According to posix spec, this function should never return `EINTR`.
This fixes the call to `pthread_mutex_take` so it keeps retrying the
lock and doesn't return `EINTR`
D:\code\incubator-nuttx\sched\pthread\pthread_create.c(154,22):
warning C4189: “pjoin”: local variable is initialized but not referenced
[D:\code\incubator-nuttx\vs20221\sched\sched.vcxproj]
D:\code\incubator-nuttx\sched\group\group_setupidlefiles.c(61,28):
warning C4189: “group”: local variable is initialized but not referenced
[D:\code\incubator-nuttx\vs20221\sched\sched.vcxproj]
Reference:
https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-4-c4189?view=msvc-170
Signed-off-by: chao.an <anchao@xiaomi.com>
The "p" format specifier already prepends the pointer address with
"0x" when printing.
Signed-off-by: Gustavo Henrique Nihei <gustavo.nihei@espressif.com>
since _exit may kill all sibling thread when
HAVE_GROUP_MEMBERS equal true. Regressed by:
commit 622677d4a1
Author: Ville Juven <ville.juven@unikie.com>
Date: Mon May 2 15:15:06 2022 +0300
libc: Implement exit, atexit, on_exit and cxa_exit on the user side
For CONFIG_BUILD_KERNEL using the sched/task/task_exithook implementation
will just not work. It calls user code with kernel privileges which is
a bit of a security issue.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
pthread_condclockwait() can not distinguish between interrupt and timeout,
which cause these API not follow POSIX:
pthread_rwlock_timedrdlock()
pthread_rwlock_timedwrlock()
pthread_condtimedwait()
POSIX:
Upon return from the signal handler the thread resumes waiting for
the condition variable as if it wasnot interrupted
These functions shall not return an error code of [EINTR].
Replacing nxsem_wait() with nxsem_clockwait_uninterruptible() can solve it.
Signed-off-by: jihandong <jihandong@xiaomi.com>
if a pthread set attr is detach,and when call pthread_create,
new thread exit quikly,new thread's tcb be free,then pthread_create
use new thread's tcb will crash.
Signed-off-by: anjiahao <anjiahao@xiaomi.com>