If the gap between sp and stack_top is too small,
the stack is not dumped.
Fix this by modifying the loop condition.
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
A new locking mechanism: read/write locks
When there is a writer it is not possible to put on a read lock or a write lock; when there is a reader it is possible to reenter the read lock but not the write lock.
Writers are exclusive locks, readers are shared locks.
The waiter count tracks whether any task is currently blocked. If so, unlock wakes up all waiters, and they compete by priority for the lock.
For example:
When a reader blocks two waiting writers, unlocking the reader wakes up both writers. The higher-priority writer runs first, finds the lock condition satisfied, and takes the lock; the second writer wakes up, finds the condition not satisfied, and blocks again.
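The locking rules above can be sketched as a small state machine. This is only an illustration: struct rw_state and the rw_* helpers are hypothetical names, not the API added by this commit, and blocking, waiter counting and wake-up are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock state; names are illustrative only. */

struct rw_state
{
  int  readers;  /* Number of read locks currently held */
  bool writer;   /* True while a writer holds the lock */
};

/* A read lock succeeds unless a writer holds the lock. */

bool rw_try_read(struct rw_state *s)
{
  assert(s != NULL);
  if (s->writer)
    {
      return false;
    }

  s->readers++;
  return true;
}

/* A write lock succeeds only when nobody holds the lock. */

bool rw_try_write(struct rw_state *s)
{
  assert(s != NULL);
  if (s->writer || s->readers > 0)
    {
      return false;
    }

  s->writer = true;
  return true;
}

/* Release the write lock, or one read lock if no writer holds it. */

void rw_unlock(struct rw_state *s)
{
  assert(s != NULL);
  if (s->writer)
    {
      s->writer = false;
    }
  else if (s->readers > 0)
    {
      s->readers--;
    }
}
```

In a blocking implementation, a failed try would put the caller to sleep and rw_unlock() would wake all waiters so that they retry in priority order.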
Signed-off-by: chenrun1 <chenrun1@xiaomi.com>
This moves all the public POSIX semaphore functions into libc and with
this most of the user-space logic is also moved; namely cancel point and
errno handling.
This also removes the need for the _SEM_XX macros used to differentiate
which API is used per user-/kernel mode. Such macros are henceforth
unnecessary.
PR #11165 causes an unnecessary regression; task_delete no longer works,
if the deleted task is from another group.
The logic that prevents this comes from:
nxnotify_cancellation() ->
tls_get_info_pid() ->
nxsched_get_stackinfo()
Which checks for permissions, which does not make sense in this case since
it is the kernel asking for the stack information.
Fix this by partially reverting 11165 and implementing a direct path for
the kernel to query any task's TLS.
Commit 9244b5a737 added support
for non-standard field si_user that is useful for passing context
pointers to signal handlers.
This commit makes it work for all signals, not just SA_KERNELHAND.
Previously si_user for normal signals was uninitialized garbage.
The task files should consult the "spawn action" and "O_CLOEXEC flags"
to determine further whether the file should be duplicated.
This PR will further optimize file list duplicating to avoid the performance
regression caused by additional file operations.
Signed-off-by: chao an <anchao@xiaomi.com>
This moves task / thread cancel point logic from the NuttX kernel into
libc, while the data needed by the cancel point logic is moved to TLS.
The change is an enabler to move user-space APIs to libc as well, for
a coherent user/kernel separation.
Some assertions in extreme cases will cause syslog to be unable to
output logs normally, so this PR will restore the input registers
into the array of last registers to ensure that we can also obtain
some important information.
Signed-off-by: chao an <anchao@xiaomi.com>
If the semaphore is shared, the holder has put its own mmapped address
to pholder->sem. This means we must switch to the holder's address
environment when going through the held semaphores list.
A better option would be to get the kernel mapped address for the
semaphore's physical page, but that mechanism is not functional yet.
This fixes a full system crash when CONFIG_PRIORITY_INHERITANCE=y and
CONFIG_BUILD_KERNEL=y and user makes shared semaphore via:
int semfd = shm_open("sem", O_CREAT | O_RDWR, 0666);
sem_t *sem = mmap(0, sizeof(sem_t), PROT_READ | PROT_WRITE, MAP_SHARED, semfd, 0);
Previous adjtime() implementation was limited to adjusting system
timer tick period. This commit reimplements the internals to use
a kernel watchdog timer. Platform-independent part of the code now
works also for adjusting hires RTC and tickless timer rate.
User code facing API is unchanged. Architecture code API has changed:
up_adj_timer_period() is replaced by up_adjtime().
Other improvements:
- Support query of remaining adjustment by passing NULL to first
argument of adjtime(). This matches Linux behavior.
- Improve resolution available for architecture driver, previously
limited to 1 microsecond per tick. Now 1 nanosecond per second.
Set the newly spawned process's signal mask, if the caller has instructed
to do so by setting POSIX_SPAWN_SETSIGMASK.
This is called after the task has been created but has NOT been started
yet.
Like the name implies, it is supposed to set the spawn attributes for
the NuttX specific "spawn proxy task" which was historically used as
a proxy to spawn new tasks. The proxy handled file actions and the signal
mask which are inherited from the parent.
The proxy task does not exist anymore, thus the proxy task attributes
do not need to be set anymore either.
Also, the function is currently still used, but the signal mask is set
for the spawning process, not the proxy process, and this is most
DEFINITELY an error (as the spawning process's signal mask changes
unexpectedly).
Setting the signal mask for the newly spawned process is simple, just
set it directly, if instructed to do so. This will be done in a later
patch!
VELAPLATFO-18473
refs:
https://man7.org/linux/man-pages/man2/fcntl.2.html
If the FD_CLOEXEC bit is set, the file descriptor will automatically
be closed during a successful execve(2).
(If the execve(2) fails, the file descriptor is left open.)
modify:
1. Ensure that the child task copies all fds of the parent task,
including those with O_CLOEXEC.
2. Make sure spawn_file_action is executed under fd with O_CLOEXEC,
otherwise it will fail.
3. When a new task is activated or exec is called, close all fds
with O_CLOEXEC flags.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
modlib/modlib_symbols.c: In function ‘modlib_symcallback’:
modlib/modlib_symbols.c:215:13: warning: implicit declaration of function ‘modlib_depend’; did you mean ‘modlib_read’? [-Wimplicit-function-declaration]
215 | ret = modlib_depend(exportinfo->modp, modp);
| ^~~~~~~~~~~~~
| modlib_read
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
Handle task spawn attributes as task spawn file actions are handled.
Why? This removes the need for sched_lock() when the task is being
spawned. When loading the new task from a file the scheduler can be
locked for a VERY LONG time, in the order of hundreds of milliseconds!
This is unacceptable for real time operation.
Also fixes a latent bug in exec_module, spawn_file_actions is executed
at a bad location; when CONFIG_ARCH_ADDRENV=y actions will point to the
new process's address environment (as it is temporarily instantiated at
that point). Fix this by moving it to after addrenv_restore.
This commit adds support for custom stream via fopencookie function.
The function allows the programmer to create a custom stream
for I/O operations and hook custom functions to it.
This is a non-POSIX interface defined in the GNU C library and implemented
according to it. The only difference is in usage of off_t instead of
off64_t. Programmer can use 64 bits offset if CONFIG_FS_LARGEFILE is
enabled. In that case off_t is defined as int64_t (int32_t otherwise).
Field fs_fd is removed from file_struct and fs_cookie is used instead
as a shared variable for file descriptor or user defined cookie.
The interface will be useful for future fmemopen implementation.
Signed-off-by: Michal Lenc <michallenc@seznam.cz>
Exit immediately when finished processing the current CPU
if there are no other CPUs to be processed.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Support SMP function calls: calling smp_call_function allows
a specific core to execute a function. Note that the executed
function must not perform any waiting operations.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
both functions aren't suitable to be put into libc,
because they call the kernel internal functions directly.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
This patch only fixes the Mac sim-02 issue.
The compiler requires that the first parameter of atomic_compare_exchange_strong
is an atomic type and the second parameter is an int type.
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
spinlock.c:
Implement read write spinlock.
Readers can take lock simultaneously but only one writer can take lock.
irq_spinlock.c:
Align g_irq_spin_count.
If the lock is NULL, the caller gets the global lock (e.g. g_irq_spin), and spin_lock_irqsave() supports nesting on the same CPU.
If the CPU already holds the write lock, it can call write_lock_irqsave() again (i.e. nesting is supported).
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Co-authored-by: David Sidrane <David.Sidrane@Nscdg.com>
In addition to printing out the thread name (task name in flat mode),
print the parent process's name as well.
It is quite useful to know which process is the parent of a faulting
thread, although this information can be read from the assert dump, in
some cases the dump might be incomplete (due to e.g. stack corruption,
which causes another exception and PANIC().)
test config: ./tools/configure.sh -l qemu-armv8a:nsh_smp
Pass ostest
Regardless of endianness, the ticket spinlock only checks whether
next and owner are equal.
Comparing the two tells whether a task holds the lock or the lock is
free.
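A minimal single-threaded sketch of the equality check follows. It assumes the classic ticket-lock convention, where the lock is free when next equals owner. struct ticket_lock and the ticket_* helpers are hypothetical names, and a real implementation must use atomic operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: two separate 16-bit counters.  Comparing the
 * fields by name does not depend on how they are packed in memory.
 */

struct ticket_lock
{
  uint16_t owner;  /* Ticket currently being served */
  uint16_t next;   /* Next ticket to hand out */
};

/* Classic convention (assumed here): free when owner == next. */

bool ticket_is_free(const struct ticket_lock *l)
{
  assert(l != NULL);
  return l->owner == l->next;
}

/* Take a ticket; NOT atomic, for illustration only. */

uint16_t ticket_take(struct ticket_lock *l)
{
  assert(l != NULL);
  return l->next++;
}

/* The holder of "ticket" owns the lock once owner reaches it. */

bool ticket_is_my_turn(const struct ticket_lock *l, uint16_t ticket)
{
  assert(l != NULL);
  return l->owner == ticket;
}

/* Pass the lock on to the next ticket holder. */

void ticket_release(struct ticket_lock *l)
{
  assert(l != NULL);
  l->owner++;
}
```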
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Co-authored-by: Xiang Xiao <xiaoxiang781216@gmail.com>
clock_getcycle always returns a monotonically increasing cycle value.
If the hardware does not support perf events, it uses arch_alarm's up_perf_gettime.
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
1) Previously adjustments less than 1 microsecond per tick would be
completely ignored. Now they are applied over a shorter period at
a rate of 1 us per tick.
2) Previously CLOCK_ADJTIME_PERIOD was in units of 1/100th of second.
Change to milliseconds, a more generally useful unit.
Change setting name to CLOCK_ADJTIME_PERIOD_MS to make the unit change
easier to notice.
3) Previously CLOCK_ADJTIME_SLEWLIMIT was in percentage.
Most clock crystals have better accuracy than 1%, so the minimum slew
rate was excessive. Change to CLOCK_ADJTIME_SLEWLIMIT_PPM with setting
value in parts per million.
4) No need to use floating point math in clock_adjtime.c.
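As a rough illustration of points 2) to 4), the largest correction available in one adjustment period can be computed with integer math only. This sketch assumes the slew limit caps the correction applied per period; the function names are hypothetical and are not taken from clock_adjtime.c.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum correction in microseconds for one adjustment period.
 * The period in microseconds is period_ms * 1000 and one ppm is 1e-6,
 * so the product reduces to period_ms * slewlimit_ppm / 1000.
 */

int64_t adjtime_max_step_us(uint32_t period_ms, uint32_t slewlimit_ppm)
{
  return ((int64_t)period_ms * slewlimit_ppm) / 1000;
}

/* Number of periods needed to absorb adjust_us (rounded up). */

int64_t adjtime_periods_needed(int64_t adjust_us, uint32_t period_ms,
                               uint32_t slewlimit_ppm)
{
  int64_t step = adjtime_max_step_us(period_ms, slewlimit_ppm);
  int64_t magnitude = adjust_us < 0 ? -adjust_us : adjust_us;

  assert(step > 0);
  return (magnitude + step - 1) / step;
}
```

For example, a 970 ms period with a 500 ppm limit allows at most 485 us of correction per period.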
CONFIG_SCHED_CPULOAD_EXTCLK doesn't actually require tickless mode.
As long as the platform provides external call to nxsched_process_cpuload(),
it will work in either tickless or ticking mode.
Removed Kconfig dependency.
Instead, CONFIG_SCHED_CPULOAD_SYSCLK does require ticking mode to work,
as documented in CONFIG_SCHED_CPULOAD help text.
Added the dependency to Kconfig also.
* build-globals.sh
- Only look in the nuttx for external symbols used when loading
dynamic shared objects
* include/elf64.h
- Correct the type of fields in the Elf64_Phdr structure
* libs/libc/dlfcn/lib_dlclose.c
- Distinguish between ET_DYN and other objects as the former
has both text and data in a single allocation to reserve
GOT offsets
* libs/libc/dlfcn/lib_dlopen.c
- Code formatting
* libs/libc/modlib/modlib_bind.c
- Distinguish between relocation entry sizes by section type
- Handle RELA style relocations
* libs/libc/modlib/modlib_globals.S
- Formatting fixes
- Symbols should not be weak - they exist or they don't
* include/nuttx/lib/modlib.h
- Add an indicator to module_s to distinguish between ET_DYN and other
* libs/libc/modlib/modlib_load.c
- ET_DYN objects need to keep the relative displacement between the text
and data sections due to GOT references from the former to the latter.
This also implies that linking may require modification from the default
for the shared objects being produced. For example, default alignment may
mean nearly 64K of wasted space.
* libs/libc/modlib/modlib_unload.c
sched/module/mod_rmmod.c
- Distinguish between freeing of ET_DYN storage and other as the former
is a single allocation.
* libs/libc/modlib/mod_insmod.c
- Cater for ET_DYN objects having init and preinit sections
bug:
user thread: hpwork:
timer_create() with SIGEV_THREAD
timer_settime()
irq -> work_queue() add nxsig_notification_worker to Q
timer_delete()
nxsig_cancel_notification()
call nxsig_notification_worker()
work_cancel()
timer_free()
nxsig_notification_worker() used after free
root cause:
work_cancel() can't cancel work completely, the worker may already be running.
resolve:
use work_cancel_sync() API to cancel the work completely
Signed-off-by: ligd <liguiding1@xiaomi.com>
In 'wd_timer', the callback function executed by 'wd_expiration' could call wd_start, and g_wdtickbase might be updated. Subsequently, g_wdtickbase is incremented by the value of ticks, causing g_wdtickbase to be greater than the actual passage of time.
Signed-off-by: yangguangcai <yangguangcai@xiaomi.com>
Change the default errno from ENOSYS to EINVAL so that the following
LTP cases pass: ltp_timer_getoverrun_speculative_6_1,
ltp_timer_getoverrun_speculative_6_2, ltp_timer_getoverrun_speculative_6_3
Signed-off-by: guoshichao <guoshichao@xiaomi.com>
The aio client queues asynchronous I/O requests to the worker threads.
If PRIORITY_INHERITANCE is enabled, the client thread's priority is
applied to the worker threads. This leads to repeated boost/restore of
the worker threads' priority and triggers an assertion.
There is no need for repeated priority boost/restore on the worker
threads because the client thread's priority is always the same.
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
vfork uses waitpid to suspend the parent process,
but waitpid releases the child process's information by default.
So when the user calls wait, it fails with errno 10 (ECHILD).
Signed-off-by: yangyalei <yangyalei@xiaomi.com>
If scheduling occurs in file_fsync,
fl_lock may be released, and an error may
occur when calling nxmutex_unlock
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
Determine whether a signal is a real-time signal or a standard signal. Per https://www.man7.org/linux/man-pages/man7/signal.7.html, a real-time signal lies between SIGRTMIN and SIGRTMAX and multiple instances of it can be queued; for any other signal only one pending instance is retained.
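The range check itself is small. A sketch using the standard SIGRTMIN and SIGRTMAX macros is shown here; the helper name sig_is_realtime is hypothetical.

```c
#include <signal.h>
#include <stdbool.h>

/* True if signo lies in the real-time range, i.e. several pending
 * instances may be queued instead of only one.
 */

bool sig_is_realtime(int signo)
{
  return signo >= SIGRTMIN && signo <= SIGRTMAX;
}
```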
Signed-off-by: xinhaiteng <xinhaiteng@xiaomi.com>
If we are running on a single CPU architecture, then we know interrupts
are disabled and there is no need to explicitly call enter_critical_section().
However, in the SMP case, enter_critical_section() is required to prevent
multiple CPUs from entering timer_start.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
core0 may write data used by other CPUs, which can cause cache inconsistency,
so the dcache needs to be flushed before starting the other CPUs.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
The memory allocated with strdup and asprintf is done via lib_malloc
so we need to use lib_free to deallocate memory otherwise the assertion
"Free memory from the wrong heap" is hit with flat mode and user separated
heap enabled mode.
Signed-off-by: Petro Karashchenko <petro.karashchenko@gmail.com>
We can use the driver in nuttx to download
files with debugger
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
Signed-off-by: chao an <anchao@xiaomi.com>
reproduce:
static void *pthread(void *arg)
{
  system(arg);
  return NULL;
}
void test (int argc, char *argv[])
{
  pthread_create(&pthread0, &attr, pthread, argv[1]);
  pthread_create(&pthread1, &attr, pthread, argv[2]);
}
only one pthread's system() returned, the others hung
rootcause:
As we know, system() will create a new task called:
system -c XX
The example:
parent group child groups
pthread0 -> waitpid() -> system -c ps -> exit() -> nxtask_signalparent()
pthread1 -> waitpid() -> system -c ls -> exit() -> nxtask_signalparent()
Each child group exit with function nxtask_signalparent(),
As we expect:
system -c ps will signal pthread0
system -c ls will signal pthread1
But actually:
system -c ps will signal pthread0/1
system -c ls will signal pthread0/1
According to the spec, this behavior is expected:
https://man7.org/linux/man-pages/man2/sigwaitinfo.2.html
So for this situation, when the signo is SIGCHLD, we broadcast.
Signed-off-by: ligd <liguiding1@xiaomi.com>
The POSIX specification at https://pubs.opengroup.org/onlinepubs/9699919799/functions/mq_timedsend.html requires that EBADF be returned when the message queue is not opened for writing. This change updates the related descriptions in this file; there is no functional change.
Signed-off-by: yangjiao <yangjiao@xiaomi.com>
The POSIX specification at https://pubs.opengroup.org/onlinepubs/9699919799/functions/mq_send.html requires that EBADF be returned when mqdes is not open for writing, and that message priorities range from 0 to {MQ_PRIO_MAX}-1. This change updates both to follow the POSIX spec.
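The two checks can be sketched as a standalone helper. This is not the NuttX implementation: mq_send_check and EXAMPLE_MQ_PRIO_MAX are hypothetical names, and the helper returns the errno value instead of setting errno.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical limit for the example; the real value is MQ_PRIO_MAX. */

#define EXAMPLE_MQ_PRIO_MAX 32

/* Returns 0 if a send may proceed, otherwise the errno value that
 * POSIX prescribes for mq_send().
 */

int mq_send_check(int oflags, unsigned int prio)
{
  int accmode = oflags & O_ACCMODE;

  /* The queue must be open for writing */

  if (accmode != O_WRONLY && accmode != O_RDWR)
    {
      return EBADF;
    }

  /* Valid priorities are 0 .. MQ_PRIO_MAX - 1 */

  if (prio >= EXAMPLE_MQ_PRIO_MAX)
    {
      return EINVAL;
    }

  return 0;
}
```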
Signed-off-by: yangjiao <yangjiao@xiaomi.com>
In the POSIX test case "open_posix_testsuite/conformance/interfaces/mq_receive/11-2.c", "EPERM" is returned when the message queue is not opened for reading, but the POSIX specification at https://pubs.opengroup.org/onlinepubs/9699919799/functions/mq_receive.html requires that "EBADF" be returned. This change updates the return value accordingly.
Signed-off-by: yangjiao <yangjiao@xiaomi.com>
When supporting high-priority interrupts, updating
g_running_tasks within a high-priority interrupt may
cause problems. g_running_tasks should only be updated
when it is determined that a task context switch has occurred.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Fix the ltp pthread_cond_wait_1 test failure: the child should inherit the
parent's priority by default.
nsh> ltp_pthread_cond_wait_1
Error: the policy or priority not correct
Signed-off-by: yangyalei <yangyalei@xiaomi.com>
The nxspawn_dup2 function will return a value greater than 0,
so the loop should only exit if ret is less than 0.
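The loop condition can be illustrated with plain POSIX dup2(), which also returns the new descriptor (usually greater than 0) on success. struct dup2_action and apply_dup2_actions are hypothetical names for this sketch; they are not the NuttX spawn structures.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical file action: duplicate fd1 onto fd2. */

struct dup2_action
{
  int fd1;
  int fd2;
};

/* Apply all actions in order.  Returns 0 on success or a negated
 * errno value from the first failing action.
 */

int apply_dup2_actions(const struct dup2_action *actions, size_t count)
{
  int ret = 0;
  size_t i;

  /* dup2() returns the new descriptor on success, so only a negative
   * value may terminate the loop.  Testing "ret == 0" here would stop
   * after the first successful action.
   */

  for (i = 0; i < count && ret >= 0; i++)
    {
      ret = dup2(actions[i].fd1, actions[i].fd2);
      if (ret < 0)
        {
          ret = -errno;
        }
    }

  return ret < 0 ? ret : 0;
}
```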
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Pass ltp sigaction cases 19-5.c, 23-5.c and 28-5.c.
When SIGCONT is dispatched, all members of the task group are resumed,
so there is nothing to do in the default action.
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
1. Since fork can be used to implement vfork, rename vfork to
fork and use fork as the base for implementing vfork
2. Create vfork as a libc function based on the fork
function
Signed-off-by: guoshichao <guoshichao@xiaomi.com>
1. Update all CMakeLists.txt to adapt to new layout
2. Fix cmake build break
3. Update all new file license
4. Fully compatible with current compilation environment(use configure.sh or cmake as you choose)
------------------
How to test
From within nuttx/. Configure:
cmake -B build -DBOARD_CONFIG=sim/nsh -GNinja
cmake -B build -DBOARD_CONFIG=sim:nsh -GNinja
cmake -B build -DBOARD_CONFIG=sabre-6quad/smp -GNinja
cmake -B build -DBOARD_CONFIG=lm3s6965-ek/qemu-flat -GNinja
(or full path in custom board) :
cmake -B build -DBOARD_CONFIG=$PWD/boards/sim/sim/sim/configs/nsh -GNinja
This uses ninja generator (install with sudo apt install ninja-build). To build:
$ cmake --build build
menuconfig:
$ cmake --build build -t menuconfig
--------------------------
2. cmake/build: reformat the cmake style by cmake-format
https://github.com/cheshirekow/cmake_format
$ pip install cmakelang
$ for i in `find -name CMakeLists.txt`;do cmake-format $i -o $i;done
$ for i in `find -name *\.cmake`;do cmake-format $i -o $i;done
Co-authored-by: Matias N <matias@protobits.dev>
Signed-off-by: chao an <anchao@xiaomi.com>
When a normal user task calls pthread_exit(), it crashes at DEBUGASSERT.
Cause: pthread_exit is limited to user pthread tasks.
test case:
int main(int argc, FAR char *argv[])
{
pthread_exit(NULL);
return 0;
}
>> test
>> echo $?
>> 0
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
pthread_cond_wait() should be an atomic operation in the mutex lock/unlock.
Since the sched_lock() has been wrongly deleted in the previous commit,
a context switch can occur after the mutex is unlocked:
--------------------------------------------------------------------
Task1(Priority 100) | Task2(Priority 101)
|
pthread_mutex_lock(mutex); |
| |
pthread_cond_wait(cond, mutex) |
| | |
| | |
| ->enter_critical_section() |
| ->pthread_mutex_give(mutex) | ----> pthread_mutex_lock(mutex); // context switch to high priority task
| | pthread_cond_signal(cond); // signal before wait
| | <---- pthread_mutex_unlock(mutex); // switch back to original task
| ->pthread_sem_take(cond->sem)| // try to wait the signal, Deadlock.
| ->leave_critical_section() |
|
| ->pthread_mutex_take(mutex) |
| |
pthread_mutex_lock(mutex); |
---------------------------------------------------------------------
This PR brings back sched_lock()/sched_unlock() to prevent the context switch and ensure atomicity.
Signed-off-by: chao an <anchao@xiaomi.com>
When asserting, automatically analyze whether
there is a deadlock in the thread, and if there
is a deadlock, print out the deadlocked thread.
The principle is to analyze whether there is
a lock ring through the tcb holder.
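The ring check can be sketched as following a wait-for chain. This is a simplified model and not the NuttX code: the waits_for array stands in for the "tcb -> lock it waits on -> holder tcb" links, and in_lock_ring is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_HOLDER (-1)

/* waits_for[i] is the index of the thread that holds the lock thread i
 * is blocked on, or NO_HOLDER if thread i is not blocked.
 * Returns true if following the holders from "start" leads back to
 * "start", i.e. the thread is part of a lock ring.
 */

bool in_lock_ring(const int *waits_for, size_t nthreads, size_t start)
{
  size_t steps;
  int cur;

  assert(waits_for != NULL && start < nthreads);

  cur = waits_for[start];

  /* At most nthreads hops are needed; the bound also terminates the
   * walk if the chain runs into a ring that does not contain start.
   */

  for (steps = 0; steps < nthreads; steps++)
    {
      if (cur < 0 || (size_t)cur >= nthreads)
        {
          return false;
        }

      if ((size_t)cur == start)
        {
          return true;
        }

      cur = waits_for[cur];
    }

  return false;
}
```

Running the check for every blocked thread at assert time yields the set of deadlocked threads to print.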
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
This PR optimizes priority inheritance for the case of only one
holder. With this change, the lock->unlock path of a mutex that
supports priority inheritance is about 258 cycles faster:
Before: 2000 cycles
After: 1742 cycles
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Fixed ltp_stress_mqueues_multi_send_rev_1 test issue:
In SMP mode, tg_members will operate on different cores.
Adding interrupt locking operations ensures that the operation
of tg_members will not be interrupted by other cores.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Resolving the issue with the ltp_interfaces_pthread_join_6_2 test case.
In SMP mode, the pthread may still be in the process of exiting when
pthread_join returns, and calling pthread_join again at this time will
result in an error. The error code returned should be ESRCH.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Use PTHREAD_CLEANUP_STACKSIZE to enable or disable the pthread_cleanup_push() and pthread_cleanup_pop() interfaces.
Reasons: (1) same approach as TLS_TASK_NELEM; (2) there is no need for two variables.
Signed-off-by: yanghuatao <yanghuatao@xiaomi.com>
This is preparation to use kernel stack for everything when the user
process enters the kernel. Now the user stack is in use when the user
process runs a system call, which might not be the safest option.
This adds functionality to map pages dynamically into kernel virtual
memory. This allows implementing I/O remap for example, which is a useful
(future) feature.
Now, the first target is to support mapping user pages for the kernel.
Why? There are some userspace structures that might be needed when the
userspace process is not running. Semaphores are one such example. Signals
and the WDT timeout both need access to the user semaphore to work
properly. Even though for this only obtaining the kernel addressable
page pool virtual address is needed, for completeness a procedure is
provided to map several pages.
To avoid the infinite recursive dispatch:
*0 myhandler (signo=27, info=0xf3e38b9c, context=0x0) at ltp/testcases/open_posix_testsuite/conformance/interfaces/sigqueue/7-1.c:39
*1 0x58f1c39e in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:167
*2 0x58fa0664 in up_schedule_sigaction (tcb=0xf4e20f40, sigdeliver=0x58f1bab5 <nxsig_deliver>) at sim/sim_schedulesigaction.c:88
*3 0x58f19907 in nxsig_queue_action (stcb=0xf4e20f40, info=0xf4049334) at signal/sig_dispatch.c:115
*4 0x58f1b089 in nxsig_tcbdispatch (stcb=0xf4e20f40, info=0xf4049334) at signal/sig_dispatch.c:435
*5 0x58f31853 in nxsig_unmask_pendingsignal () at signal/sig_unmaskpendingsignal.c:104
*6 0x58f1ca09 in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:199
*7 0x58fa0664 in up_schedule_sigaction (tcb=0xf4e20f40, sigdeliver=0x58f1bab5 <nxsig_deliver>) at sim/sim_schedulesigaction.c:88
*8 0x58f19907 in nxsig_queue_action (stcb=0xf4e20f40, info=0xf4049304) at signal/sig_dispatch.c:115
*9 0x58f1b089 in nxsig_tcbdispatch (stcb=0xf4e20f40, info=0xf4049304) at signal/sig_dispatch.c:435
*10 0x58f31853 in nxsig_unmask_pendingsignal () at signal/sig_unmaskpendingsignal.c:104
*11 0x58f1ca09 in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:199
*12 0x58fa0664 in up_schedule_sigaction (tcb=0xf4e20f40, sigdeliver=0x58f1bab5 <nxsig_deliver>) at sim/sim_schedulesigaction.c:88
*13 0x58f19907 in nxsig_queue_action (stcb=0xf4e20f40, info=0xf40492d4) at signal/sig_dispatch.c:115
*14 0x58f1b089 in nxsig_tcbdispatch (stcb=0xf4e20f40, info=0xf40492d4) at signal/sig_dispatch.c:435
*15 0x58f31853 in nxsig_unmask_pendingsignal () at signal/sig_unmaskpendingsignal.c:104
*16 0x58f1ca09 in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:199
*17 0x58fa0664 in up_schedule_sigaction (tcb=0xf4e20f40, sigdeliver=0x58f1bab5 <nxsig_deliver>) at sim/sim_schedulesigaction.c:88
*18 0x58f19907 in nxsig_queue_action (stcb=0xf4e20f40, info=0xf40492a4) at signal/sig_dispatch.c:115
*19 0x58f1b089 in nxsig_tcbdispatch (stcb=0xf4e20f40, info=0xf40492a4) at signal/sig_dispatch.c:435
*20 0x58f31853 in nxsig_unmask_pendingsignal () at signal/sig_unmaskpendingsignal.c:104
*21 0x58f1ca09 in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:199
*22 0x58fa0664 in up_schedule_sigaction (tcb=0xf4e20f40, sigdeliver=0x58f1bab5 <nxsig_deliver>) at sim/sim_schedulesigaction.c:88
*23 0x58f19907 in nxsig_queue_action (stcb=0xf4e20f40, info=0xf4049274) at signal/sig_dispatch.c:115
*24 0x58f1b089 in nxsig_tcbdispatch (stcb=0xf4e20f40, info=0xf4049274) at signal/sig_dispatch.c:435
*25 0x58f31853 in nxsig_unmask_pendingsignal () at signal/sig_unmaskpendingsignal.c:104
*26 0x58f1ca09 in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:199
*27 0x58fa0664 in up_schedule_sigaction (tcb=0xf4e20f40, sigdeliver=0x58f1bab5 <nxsig_deliver>) at sim/sim_schedulesigaction.c:88
*28 0x58f19907 in nxsig_queue_action (stcb=0xf4e20f40, info=0xf4049244) at signal/sig_dispatch.c:115
*29 0x58f1b089 in nxsig_tcbdispatch (stcb=0xf4e20f40, info=0xf4049244) at signal/sig_dispatch.c:435
*30 0x58f31853 in nxsig_unmask_pendingsignal () at signal/sig_unmaskpendingsignal.c:104
*31 0x58f1ca09 in nxsig_deliver (stcb=0xf4e20f40) at signal/sig_deliver.c:199
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Signal must be masked when it is delivered to a signal handler per:
https://pubs.opengroup.org/onlinepubs/007904875/functions/sigaction.html:
When a signal is caught by a signal-catching function installed by sigaction(), a new signal mask is calculated and installed for the duration of the signal-catching function (or until a call to either sigprocmask() or sigsuspend() is made). This mask is formed by taking the union of the current signal mask and the value of the sa_mask for the signal being delivered [XSI] [Option Start] unless SA_NODEFER or SA_RESETHAND is set, [Option End] and then including the signal being delivered. If and when the user's signal handler returns normally, the original signal mask is restored.
Any action queued for that signal while the signal is masked should be deferred. It should go into the group pending signal list and should not be processed until the signal is unmasked (which should occur when the signal handler returns).
https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_detach.html
If an implementation detects that the value specified by the thread argument
to pthread_detach() does not refer to a joinable thread, it is recommended
that the function should fail and report an [EINVAL] error.
If an implementation detects use of a thread ID after the end of its lifetime,
it is recommended that the function should fail and report an [ESRCH] error.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
1. Get the value of sp from dump regs when an exception occurs,
to avoid getting the value of fp from up_getsp and causing
incomplete stack printing.
2. Determine which stack the value belongs to based on the value
of SP to avoid false reports of stack overflow
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Set the Default CPU bits. The way to use the unset CPU is to call the
sched_setaffinity function to bind a task to the CPU. bit0 means CPU0.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
CURRENT_REGS may change during assert handling, so pass
in the 'regs' parameter at the entry point of _assert.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
The spawn proxy thread is a special case in NuttX, and developers often
spend a lot of time debugging stack overflows of the spawn proxy thread:
https://github.com/apache/nuttx/issues/9046
https://github.com/apache/nuttx/pull/9081
In order to avoid similar issues, this PR will remove spawn proxy thread to simplify
the process of task/posix_spawn().
1. Postpone the related processing of spawn file actions until after task_init()
2. Delete the temporary thread of spawn proxy and related global variables
Signed-off-by: chao an <anchao@xiaomi.com>
This commit adds a Linux-like adjtime() interface that is used to correct
the system time clock if it deviates from the real value. The adjustment is
done by slightly changing the clock period, so it happens without time
jumps (either forward or backward).
The implementation is enabled by CONFIG_CLOCK_ADJTIME and separated from
CONFIG_CLOCK_TIMEKEEPING functions. Options CONFIG_CLOCK_ADJTIME_SLEWLIMIT
and CONFIG_CLOCK_ADJTIME_PERIOD can be used to control the adjustment
speed.
Interfaces up_get_timer_period() and up_adj_timer_period() have to be
defined by architecture level support.
This is not a POSIX interface but derives from 4.3BSD, System V.
It is also supported for Linux compatibility.
Signed-off-by: Michal Lenc <michallenc@seznam.cz>
Store the old environment in a local context so another temporary address
environment can be selected. This can happen especially when a process
is being loaded (the new process's mappings are temporarily instantiated)
and an interrupt occurs.
The current implementation requires the use of enter_critical_section, so the source code needs to be moved to kernel space
Signed-off-by: hujun5 <hujun5@xiaomi.com>
pthread_cond_wait can be preempted after releasing the lock. sched_lock cannot stop threads on other CPUs, so use enter_critical_section instead.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Instead of using a volatile storage for the address environment in the
binfmt / loadinfo structures, always allocate the address environment
from kheap.
This serves several purposes:
- If the task creation fails, any kernel thread that depends on the
address environment created during task creation will not lose their
mappings (because they hold a reference to it)
- The current address environment variable (g_addrenv) will NEVER contain
a stale / incorrect value
- Releasing the address environment is simplified as any pointer given
to addrenv_drop() can be assumed to be heap memory
- Makes the kludge function addrenv_clear_current irrelevant, as the
system will NEVER have invalid mappings any more
Problem:
AppBringup task in default priority 240 ->
board_late_initialize() ->
some driver called work_queue() ->
nxsem_post(&(wqueue).sem) failed because sem_count is 0
hp work_thread in default priority 224 ->
nxsem_wait_uninterruptible(&wqueue->sem);
so hp_work_thread can't wake up, worker can't run immediately.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
- Remove the temporary "saved" variable when temporarily changing MMU
mappings to access another process's memory. The fact that it has an
address environment is enough to make the choice
- Restore nxflat_addrenv_restore-macro. It was accidentally lost when
the address environment handling was re-factored.
- The code will detect an error condition described in
https://cwiki.apache.org/confluence/display/NUTTX/Signaling+Semaphores+and+Priority+Inheritance
- The kernel will go to PANIC if semaphore holder can't be allocated even
if CONFIG_DEBUG_ASSERTIONS is disabled
- Clean-up code that handled posting of semaphore with priority inheritance
enabled from the interrupt context (remove nxsem_restore_baseprio_irq())
Summary:
- Support arm64 pmu api, Currently only the cycle counter function is supported.
- Using ARM64 PMU hardware capability to implement perf interface, modify all
perf interface related code.
- Support for pmu init under smp.
Signed-off-by: wangming9 <wangming9@xiaomi.com>
After enabling this option, instrumented functions are traced automatically, without adding tracepoints manually.
This is similar to the function tracer of the Linux kernel.
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
Assert in nxsem_post if:
- Priority inheritance is enabled on a semaphore
- A thread that does not hold the semaphore attempts to post it
This will detect an error condition described in https://cwiki.apache.org/confluence/display/NUTTX/Signaling+Semaphores+and+Priority+Inheritance
None. The debug instrumentation is only enabled if CONFIG_DEBUG_ASSERTIONS is enabled.
Use sim:ostest. Verify that no assertions occur.
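The check can be sketched roughly as below. This is a simplified model and
not the NuttX implementation: struct toy_sem, TOY_NO_HOLDER and the toy_*
functions are hypothetical names, and only a single holder is tracked,
whereas a real counting semaphore may have several holders.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_HOLDER (-1)  /* Hypothetical marker: nobody holds the semaphore */

struct toy_sem
{
  int  count;         /* Number of available counts */
  bool prio_inherit;  /* True if priority inheritance is enabled */
  int  holder;        /* Id of the holding thread, or TOY_NO_HOLDER */
};

/* Take the semaphore without blocking and remember who holds it */

bool toy_sem_trywait(struct toy_sem *sem, int caller)
{
  if (sem->count <= 0)
    {
      return false;
    }

  sem->count--;
  sem->holder = caller;
  return true;
}

/* The condition that is asserted: with priority inheritance enabled, only
 * the holder may post.  Without it, signaling from any thread is fine.
 */

bool toy_sem_post_is_valid(const struct toy_sem *sem, int caller)
{
  return !sem->prio_inherit || sem->holder == caller;
}

void toy_sem_post(struct toy_sem *sem, int caller)
{
  assert(toy_sem_post_is_valid(sem, caller));

  sem->holder = TOY_NO_HOLDER;
  sem->count++;
}
```

A semaphore used purely for signaling would have prio_inherit cleared in
this model, so posting from a non-holder passes the check.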
A compilation error occurs after SCHED_CRITMONITOR is enabled:
sched/sched_critmonitor.c:315: undefined reference to `serr'
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
sem_t is user memory and the correct mappings are needed to perform
the semaphore wait interruption.
Otherwise either a page fault, or access to the WRONG address environment
happens.
Refer to issue #8867 for details and rationale.
Convert sigset_t to an array type so that more than 32 signals can be supported.
Why not use a uint64_t?
- Using a uint32_t is more flexible if we decide to increase the number of signals beyond 64.
- 64-bit accesses are not atomic, at least not on 32-bit ARMv7-M and similar
- Keeping the base type as uint32_t does not introduce additional overhead due to padding to achieve 64-bit alignment of uint64_t
- Some architectures still supported by NuttX do not support uint64_t
types.
Increased the number of signals to 64. This matches Linux. This will support all signals defined by Linux and also 32 real time signals (also like Linux).
This is a work in progress; a draft PR that you are encouraged to comment on.
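A minimal sketch of an array-based signal set, for discussion only. The
names (demo_sigset_t, DEMO_NSIG, demo_sig*) are hypothetical and are not
the definitions in this PR; signal numbers are assumed to start at 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NSIG   64                       /* Assumed: signals 1..64 */
#define DEMO_NELEM  ((DEMO_NSIG + 31) / 32)  /* Number of 32-bit words */

typedef struct
{
  uint32_t elem[DEMO_NELEM];
} demo_sigset_t;

int demo_sigemptyset(demo_sigset_t *set)
{
  assert(set != NULL);
  memset(set, 0, sizeof(*set));
  return 0;
}

int demo_sigaddset(demo_sigset_t *set, int signo)
{
  assert(set != NULL);
  if (signo < 1 || signo > DEMO_NSIG)
    {
      return -1;
    }

  /* Word index and bit position are derived from the zero-based number */

  set->elem[(signo - 1) / 32] |= UINT32_C(1) << ((signo - 1) % 32);
  return 0;
}

int demo_sigdelset(demo_sigset_t *set, int signo)
{
  assert(set != NULL);
  if (signo < 1 || signo > DEMO_NSIG)
    {
      return -1;
    }

  set->elem[(signo - 1) / 32] &= ~(UINT32_C(1) << ((signo - 1) % 32));
  return 0;
}

int demo_sigismember(const demo_sigset_t *set, int signo)
{
  assert(set != NULL);
  if (signo < 1 || signo > DEMO_NSIG)
    {
      return -1;
    }

  return (int)((set->elem[(signo - 1) / 32] >> ((signo - 1) % 32)) & 1);
}
```

Each operation touches a single 32-bit word, so supporting more signals
only requires raising DEMO_NSIG in this sketch.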
Calling syslog to print logs in clock_gettime will cause the system to have recursive output, i.e., clock_gettime->sinfo->syslog->clock_gettime, with the consequences of stack overflow or non-stop log output.
Decouple the semcount and the work queue length.
Previous Problem:
If a work is queued and cancelled in high priority threads (or queued
by timer and cancelled by another high priority thread) before
work_thread runs, the queue operation will mark work_thread as ready to
run, but the cancel operation decrements the semcount back to -1 and
leaves wqueue->q empty. The work_thread still runs, finds an empty queue,
and waits on the sem again, so semcount becomes -2 (decremented by 1).
This can happen multiple times, so semcount can become a very negative
value. Test case to produce incorrect semcount:
  high_priority_task()
  {
    for (int i = 0; i < 10000; i++)
      {
        work_queue(LPWORK, &work, worker, NULL, 0);
        work_cancel(LPWORK, &work);
        usleep(1);
      }

    /* Now the g_lpwork.sem.semcount is a value near -10000 */
  }
With an incorrect semcount, any queue operation performed while the
work_thread is busy will only increase semcount and push work into the
queue, but cannot trigger work_thread (semcount is negative but
work_thread is not waiting). More and more work is then left in the queue
while the work_thread is waiting on the sem and cannot run it.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
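The drift described above can be reproduced with a small single-threaded
model. This is only an illustration under simplifying assumptions, not the
actual patch: struct toy_wqueue and the *_cycle functions are hypothetical,
and the "decoupled" variant simply assumes that cancel leaves semcount
alone and that queueing posts only when a worker is waiting.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_wqueue
{
  int semcount;  /* Negative value: number of waiting workers */
  int nqueued;   /* Work items currently in the queue */
};

/* Old behaviour: cancel takes the post back, but the worker was already
 * made ready, so it runs, finds nothing and waits once more.
 */

static void coupled_cycle(struct toy_wqueue *wq)
{
  wq->nqueued++;   /* work_queue(): enqueue ... */
  wq->semcount++;  /* ... and post */

  wq->nqueued--;   /* work_cancel(): dequeue ... */
  wq->semcount--;  /* ... and decrement semcount */

  wq->semcount--;  /* Worker runs, queue is empty, waits again */
}

/* Assumed decoupled behaviour: semcount only tracks waiting workers */

static void decoupled_cycle(struct toy_wqueue *wq)
{
  wq->nqueued++;   /* work_queue(): enqueue */
  if (wq->semcount < 0)
    {
      wq->semcount++;  /* Post only if a worker is actually waiting */
    }

  wq->nqueued--;   /* work_cancel(): only the queue is touched */

  wq->semcount--;  /* Worker runs, queue is empty, waits again */
}

/* Run n queue/cancel cycles and return the resulting semcount */

int run_cycles(bool decoupled, int n)
{
  struct toy_wqueue wq = { -1, 0 };  /* One worker waiting, empty queue */
  int i;

  for (i = 0; i < n; i++)
    {
      if (decoupled)
        {
          decoupled_cycle(&wq);
        }
      else
        {
          coupled_cycle(&wq);
        }
    }

  assert(wq.nqueued == 0);
  return wq.semcount;
}
```

In this model the coupled variant loses one count per cycle, while the
decoupled variant returns to -1 (one waiting worker) after every cycle.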
The _unmasked_ signal action was never added if the task is in system call
and waiting for (a different) signal.
This fixes delivery, especially for default signal actions / unmaskable
signals, like SIGTERM.
As far as I can interpret how signal delivery should work when the signal
is blocked, it should still be sent to the pending queue even if the signal
is masked. When the sigmask changes it will be delivered.
The original implementation did not add the pending signal action, if
stcb->task_state == TSTATE_WAIT_SIG is true.
An attempt to patch this was made in #8563 but it is insufficient, as it
creates an issue when the task is not waiting for a signal but is in a
syscall: in this case the signal is incorrectly queued twice.
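The expected behaviour for a masked signal (stay pending, get delivered
when the mask changes) can be checked from user space with the standard
POSIX calls. The sketch below runs on any POSIX host and assumes a
single-threaded process; run_pending_demo() and struct pending_demo are
illustrative names, and nothing here is kernel code from this change.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static volatile sig_atomic_t g_delivered;

static void on_sigusr1(int signo)
{
  (void)signo;
  g_delivered = 1;
}

struct pending_demo
{
  bool pending_while_blocked;    /* Signal was in the pending set */
  bool delivered_while_blocked;  /* Handler ran while still masked */
  bool delivered_after_unblock;  /* Handler ran once the mask changed */
};

struct pending_demo run_pending_demo(void)
{
  struct pending_demo res;
  struct sigaction sa;
  sigset_t block;
  sigset_t pending;
  int rc;

  g_delivered = 0;

  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_sigusr1;
  sigemptyset(&sa.sa_mask);
  rc = sigaction(SIGUSR1, &sa, NULL);
  assert(rc == 0);

  /* Mask SIGUSR1, then send it to ourselves */

  sigemptyset(&block);
  sigaddset(&block, SIGUSR1);
  rc = sigprocmask(SIG_BLOCK, &block, NULL);
  assert(rc == 0);

  rc = raise(SIGUSR1);
  assert(rc == 0);

  /* The signal must be pending, and the handler must not have run */

  rc = sigpending(&pending);
  assert(rc == 0);
  res.pending_while_blocked   = sigismember(&pending, SIGUSR1) == 1;
  res.delivered_while_blocked = g_delivered != 0;

  /* Changing the mask delivers the pending signal */

  rc = sigprocmask(SIG_UNBLOCK, &block, NULL);
  assert(rc == 0);
  (void)rc;

  res.delivered_after_unblock = g_delivered != 0;
  return res;
}
```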
since the chip/board vendor could disable drivers/note and
provide the implementation of sched_note_xxx themselves
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Remove calls to the userspace API exit() from the kernel. The problem
with doing such calls is that the exit functions are called with kernel
mode privileges which is a big security no-no.
Do not allow a deferred cancellation if the group is exiting, it is too
dangerous to allow the threads to execute any user space code after the
exit has started.
If the cancelled thread is not inside a cancellation point, just kill it
immediately via asynchronous cancellation. This will create far fewer
problems than allowing it to continue running user code.
For some reason the signal action was never performed if the receiving
task was within a system call, the pending queue insert was simply missing.
This fixes the issue.
There is an issue where the wrong process exit code is given to the parent
when a process exits. This happens when the process has pthreads running
user code i.e. not within a cancel point / system call.
Why does this happen?
When exit() is called, the following steps are done:
- group_kill_children(), which tells the children to die via pthread_cancel()
Then, one of two things can happen:
1. if the child is in a cancel point, it gets scheduled to allow it to leave
the cancel point and gets destroyed immediately
2. if the child is not in a cancel point, a "cancel pending" flag is set and
the child will die when the next cancel point is encountered
So what is the problem here?
The last thread alive dispatches SIGCHLD to the parent, which carries the
process's exit code. The group head has the only meaningful exit code and
this is what should be passed. However, in the second case, the group head
exits before the child, taking the process exit code to its grave. The child
that was alive will exit next and will pass its "status" to the parent process,
but this status is not the correct value to pass.
This commit fixes the issue by passing the group head's exit code ALWAYS to
the parent process.
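The idea can be sketched as follows. This is a toy model with hypothetical
names (struct toy_group, toy_thread_exit), not the actual kernel
structures: the group head records its exit code in the group, and the
last thread to leave reports that stored value instead of its own status.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_group
{
  int  nmembers;         /* Threads still alive in the group */
  int  exit_status;      /* Exit code recorded by the group head */
  bool reported;         /* True once the parent has been notified */
  int  reported_status;  /* Status passed to the parent with SIGCHLD */
};

void toy_thread_exit(struct toy_group *group, bool is_head, int status)
{
  assert(group->nmembers > 0);

  /* Only the group head carries a meaningful process exit code */

  if (is_head)
    {
      group->exit_status = status;
    }

  /* The last thread alive notifies the parent, ALWAYS with the exit code
   * stored by the group head, never with its own status.
   */

  if (--group->nmembers == 0)
    {
      group->reported        = true;
      group->reported_status = group->exit_status;
    }
}
```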
The function is not relevant any longer, remove it. Also remove
save_addrenv_t, the parameter taken by up_addrenv_restore.
Implement addrenv_select() / addrenv_restore() to handle the temporary
instantiation of address environments, e.g. when a process is being
created.
There is currently a big problem in the address environment handling which
is that the address environment is released too soon when the process is
exiting. The current MMU mappings will always be the exiting process's, which means
the system needs them AT LEAST until the next context switch happens. If
the next thread is a kernel thread, the address environment is needed for
longer.
Kernel threads "borrow" the address environment of the previous user process.
This is beneficial in two ways:
- The kernel processes do not need an allocated address environment
- When a context switch happens from user -> kernel or kernel -> kernel,
the TLB does not need to be flushed. This must be done only when
changing to a different user address environment.
Another issue is when a new process is created; the address environment
of the new process must be temporarily instantiated by up_addrenv_select().
However, the system scheduler does not know that the process has a different
address environment to its own and when / if a context restore happens, the
wrong MMU page directory is restored and the process will either crash or
do something horribly wrong.
The following changes are needed to fix the issues:
- Add mm_curr which is the current address environment of the process
- Add a reference counter to safeguard the address environment
- Whenever an address environment is mapped to MMU, its reference counter
is incremented
- Whenever an address environment is unmapped from MMU, its reference
counter is decremented, and tested. If no more references -> drop the
address environment and release the memory as well
- To limit the context switch delay, the address environment is freed in
a separate low priority clean-up thread (LPWORK)
- When a process temporarily instantiates another process's address
environment, the scheduler will now know of this and will restore the
correct mappings to MMU
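The reference counting can be sketched roughly as below. All names
(struct toy_addrenv, toy_addrenv_*) are hypothetical, the sketch assumes
the owning process holds the initial reference, and the memory is freed
inline, whereas the change described here defers the release to LPWORK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_addrenv
{
  int refs;  /* Owner reference plus one per MMU mapping */
};

/* Allocate from the heap; the creating process holds the first reference */

struct toy_addrenv *toy_addrenv_alloc(void)
{
  struct toy_addrenv *env = calloc(1, sizeof(*env));
  if (env != NULL)
    {
      env->refs = 1;
    }

  return env;
}

void toy_addrenv_take(struct toy_addrenv *env)
{
  env->refs++;
}

/* Drop one reference; returns true if the environment was released */

bool toy_addrenv_drop(struct toy_addrenv *env)
{
  assert(env->refs > 0);
  if (--env->refs > 0)
    {
      return false;
    }

  free(env);
  return true;
}

/* Context switch: map next and unmap the previous environment.  A thread
 * without its own environment (next == NULL, i.e. a kernel thread) keeps
 * using the current mappings.  Returns true if the previous environment
 * was released.
 */

bool toy_addrenv_switch(struct toy_addrenv **curr, struct toy_addrenv *next)
{
  struct toy_addrenv *prev = *curr;

  if (next == NULL || next == prev)
    {
      return false;
    }

  toy_addrenv_take(next);
  *curr = next;
  return prev != NULL && toy_addrenv_drop(prev);
}
```

In this model an exiting process only drops its own reference, so the
mappings stay valid until a different user address environment is mapped.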
Why is this not causing more noticeable issues? The problem only happens
under the aforementioned special conditions, and if a context switch or
IRQ occurs during this time.
Detach the address environment handling from the group structure to the
tcb. This is preparation to fix rare cases where the system (MMU) is left
without a valid page directory, e.g. when a process exits.
NuttX kernel should not use the syscall functions, especially after
enabling CONFIG_SCHED_INSTRUMENTATION_SYSCALL, all system functions
will be traced to backend, which will impact system performance.
Signed-off-by: chao an <anchao@xiaomi.com>
Implement a function for dropping references to the group structure and
finally freeing the allocated memory, if the group has been marked for
destruction
The number of work entries will be inconsistent with the semaphore count
if the work is canceled. In the extreme case, the semaphore count will
overflow and fall back to 0, and the workqueue will stop scheduling the
enqueued work.
Signed-off-by: chao an <anchao@xiaomi.com>
Continue the following work:
commit 43e7b13697
Author: Xiang Xiao <xiaoxiang@xiaomi.com>
Date: Sun Jan 22 19:31:32 2023 +0800
assert: Log the assertion expression in case of fail
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>