The spawn proxy thread is a special existence in NuttX, usually some developers
spend a lot of time on stack overflow of spawn proxy thread:
https://github.com/apache/nuttx/issues/9046https://github.com/apache/nuttx/pull/9081
In order to avoid similar issues, this PR will remove spawn proxy thread to simplify
the process of task/posix_spawn().
1. Postpone the related processing of spawn file actions until after task_init()
2. Delete the temporary thread of spawn proxy and related global variables
Signed-off-by: chao an <anchao@xiaomi.com>
Instead of using a volatile storage for the address environment in the
binfmt / loadinfo structures, always allocate the address environment
from kheap.
This serves two purposes:
- If the task creation fails, any kernel thread that depends on the
address environment created during task creation will not lose their
mappings (because they hold a reference to it)
- The current address environment variable (g_addrenv) will NEVER contain
a stale / incorrect value
- Releasing the address environment is simplified as any pointer given
to addrenv_drop() can be assumed to be heap memory
- Makes the kludge function addrenv_clear_current irrelevant, as the
system will NEVER have invalid mappings any more
Refer to issue #8867 for details and rational.
Convert sigset_t to an array type so that more than 32 signals can be supported.
Why not use a uin64_t?
- Using a uin32_t is more flexible if we decide to increase the number of signals beyound 64.
- 64-bit accesses are not atomic, at least not on 32-bit ARMv7-M and similar
- Keeping the base type as uint32_t does not introduce additional overhead due to padding to achieve 64-bit alignment of uin64_t
- Some architectures still supported by NuttX do not support uin64_t
types,
Increased the number of signals to 64. This matches Linux. This will support all xsignals defined by Linux and also 32 real time signals (also like Linux).
This is is a work in progress; a draft PR that you are encouraged to comment on.
Remove calls to the userspace API exit() from the kernel. The problem
with doing such calls is that the exit functions are called with kernel
mode privileges which is a big security no-no.
Do not allow a deferred cancellation if the group is exiting, it is too
dangerous to allow the threads to execute any user space code after the
exit has started.
If the cancelled thread is not inside a cancellation point, just kill it
immediately via asynchronous cancellation. This will create far less
problems than allowing it to continue running user code.
There is an issue where the wrong process exit code is given to the parent
when a process exits. This happens when the process has pthreads running
user code i.e. not within a cancel point / system call.
Why does this happen ?
When exit() is called, the following steps are done:
- group_kill_children(), which tells the children to die via pthread_cancel()
Then, one of two things can happen:
1. if the child is in a cancel point, it gets scheduled to allow it to leave
the cancel point and gets destroyed immediately
2. if the child is not in a cancel point, a "cancel pending" flag is set and
the child will die when the next cancel point is encountered
So what is the problem here?
The last thread alive dispatches SIGCHLD to the parent, which carries the
process's exit code. The group head has the only meaningful exit code and
this is what should be passed. However, in the second case, the group head
exits before the child, taking the process exit code to its grave. The child
that was alive will exit next and will pass its "status" to the parent process,
but this status is not the correct value to pass.
This commit fixes the issue by passing the group head's exit code ALWAYS to
the parent process.
Detach the address environment handling from the group structure to the
tcb. This is preparation to fix rare cases where the system (MMU) is left
without a valid page directory, e.g. when a process exits.
NuttX kernel should not use the syscall functions, especially after
enabling CONFIG_SCHED_INSTRUMENTATION_SYSCALL, all system functions
will be traced to backend, which will impact system performance.
Signed-off-by: chao an <anchao@xiaomi.com>
A testcase as following:
child_task()
{
sleep(3);
}
main_task()
{
while (1)
{
ret = task_create("child_task", child_task, );
sleep(1);
task_delete(ret);
}
}
Root casuse:
task_delete hasn's cover the condition that the deleted-task
is justing running on the other CPU.
Fix:
Let the nxsched_remove_readytorun() do the real work
Signed-off-by: ligd <liguiding1@xiaomi.com>
1. When pthread exit, set the default cancellability state to NONCANCELABLE state.
2. Make sure modify tcb->flags is atomic operations.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
It takes about 10 cycles to obtain the task list according to the task
status. In most cases, we know the task status, so we can directly
add the task from the specified task list to reduce time consuming.
It takes about 10 cycles to obtain the task list according to the task
status. In most cases, we know the task status, so we can directly
delete the task from the specified task list to reduce time consuming.
strlcpy ensure the destination is NUL-terminated, and also fix warning:
```c
task/task_prctl.c:138:15: warning: 'strncpy' output may be truncated copying 30 bytes from a string of length 31 [-Wstringop-truncation]
138 | strncpy(name, tcb->name, CONFIG_TASK_NAME_SIZE - 1);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
For CONFIG_BUILD_KERNEL using the sched/task/task_exithook implementation
will just not work. It calls user code with kernel privileges which is
a bit of a security issue.
Deleting a task from another task's context will not do, so shut
this gate down for BUILD_KERNEL. In this case if a task wants another
task to terminate, it must ask the other task to politely kill itself.
Note: kthreads still need this, also, the kernel can delete a task
without asking.
If address environments are in use, it is not possible to simply
memcpy from from one process to another. The current implementation
of env_dup does precisely this and thus, it fails at once when it is
attempted between two user processes.
The solution is to use the kernel's heap as an intermediate buffer.
This is a simple, effective and common way to do a fork().
Obviously this is not needed for kernel processes.
argv is allocated from stack and then belong to userspace,
so task_info_s is a best location to hold this information.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
nxspawn_open() is expected to return "OK" when it success, but
it doesn't return it in case of executing dup2.
Because of this, the "Command as parameter" couldn't work with
Builtin Apps.
'pid' cannot really be used uninitialized, but Clang analyzer does not
see it. Add initializer to silence it and also make debugging slightly
easier.
Explicitly set pid's address to NULL to fix this complaint:
"Address of stack memory associated with local variable 'pid' is still
referred to by the global variable 'g_spawn_parms' upon returning to
the caller."
No functional change.
Signed-off-by: Juha Niskanen <juha.niskanen@haltian.com>
since the standard require the caller pass the name explicitly
https://pubs.opengroup.org/onlinepubs/009695399/functions/posix_spawn.html:
The argument argv is an array of character pointers to null-terminated strings.
The last member of this array shall be a null pointer and is not counted in argc.
These strings constitute the argument list available to the new process image.
The value in argv[0] should point to a filename that is associated with the
process image being started by the posix_spawn() or posix_spawnp() function.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
since the standard require the caller pass the name explicitly
https://pubs.opengroup.org/onlinepubs/009695399/functions/posix_spawn.html:
The argument argv is an array of character pointers to null-terminated strings.
The last member of this array shall be a null pointer and is not counted in argc.
These strings constitute the argument list available to the new process image.
The value in argv[0] should point to a filename that is associated with the
process image being started by the posix_spawn() or posix_spawnp() function.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Change-Id: Id79ffcc501ae9552dc4e908418ff555f498be7f1
It's better to save one argument by returning pid directly.
This change also follow the convention of task_create.
BTW, it is reasonable to adjust the function prototype a
little bit from both implementation and consistency since
task_spawn is NuttX specific API.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
pthread_exit will be called recursive when pthread_cancel
or other cleanup operation with syscalls that support
cancellation, to avoid this by mark current tcb flag as
TCB_FLAG_CANCEL_DOING instead of TCB_FLAG_CANCEL_PENDING.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Drop to user-space in kernel/protected build with up_pthread_exit,
now all pthread_cleanup functions executed in user mode.
* A new syscall SYS_pthread_exit added
* A new tcb flag TCB_FLAG_CANCEL_DOING added
* up_pthread_exit implemented for riscv/arm arch
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
arch: Allocate the space from the beginning in up_stack_frame
and modify the affected portion:
1.Correct the stack dump and check
2.Allocate tls_info_s by up_stack_frame too
3.Move the stack fork allocation from arch to sched
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Summary:
- I noticed that getopt() test in ostest wailed with
esp32-devkitc:smp and spresense:smp
- Finally, I found that the task-specific data is not
initialized.
- This commit fixes this issue
Impact:
- None
Testing:
- Tested with ostest esp32-devkitc:smp and spresense:smp
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
getopt() in the FLAT build environment is not thread safe. This is because global variables that are process-specific in Unix are truly global in the FLAT build. Moving the getopt() variables into TLS resolves this issue.
No side-effects are expected other than to getopt()
Tested with sim:nsh
it is wrong to define a new grpid_t, but not reuse pid_t,
because it make getpid(parent) == getppid(child) impossible.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Summary:
- During investigating critical section with semaphores, I noticed
that nxtask_flushstreams() is called with a critical section.
- The function calls lib_flushall() which handles a semaphore
in userspace.
- So it should be done without a critical section
Impact:
- SMP only
Testing:
- Tested with ostest the following configs
- esp32-devkitc:smp (QEMU), sabre-6quad:smp (QEMU)
- maix-bit:smp (QEMU), sim:smp
- spresense:smp
- Tested with nxplayer and stress test with spresense:wifi_smp
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Summary:
- This commit fixes comments and label in nxtask_assign_pid()
Impact:
- None
Testing:
- Built with spresense:wifi
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>