cpu0 (thread0):
  sched_yield()
  nxsched_set_priority()
  nxsched_running_setpriority()
  nxsched_reprioritize_rtr()
  nxsched_add_readytorun()
  up_cpu_pause()

cpu1:
  IRQ enter
  arm64_pause_handler()
  enter_critical_section() begin
  up_cpu_paused() picks thread0
  arm64_restorestate() sets thread0 tcb->xcp.regs as CURRENT_REGS

cpu0 (thread0):
  up_switch_context()
    thread0 -> thread1
    arm64_syscall()
    case SYS_switch_context
      change thread0 tcb->xcp.regs

cpu1:
  restore_critical_section()
  enter_critical_section() done
  leave_critical_section()
  IRQ leave, restoring CURRENT_REGS
  ERROR !!!
Reason:
As described above, cpu0 switches task from thread0 to thread1, but the
syscall() executes slowly; in that window cpu1 picks thread0 to run in
up_cpu_paused(). When cpu0's syscall then executes and modifies thread0's
tcb->xcp.regs, cpu1 leaves its IRQ with a stale CURRENT_REGS and fails.
Resolve:
Move arm64_restorestate() to after enter_critical_section() has completed.
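A minimal sketch of the corrected ordering, assuming a simplified handler
shape (the helper names below are hypothetical stand-ins, not the real
arm64 pause-handler code):

  #include <nuttx/irq.h>

  struct tcb_s;                            /* opaque here */
  struct tcb_s *pick_paused_thread(void);  /* stands in for up_cpu_paused() */
  void restore_state(struct tcb_s *tcb);   /* stands in for arm64_restorestate() */

  static void pause_handler_sketch(void)
  {
    irqstate_t flags;

    /* Fully take the critical section first, so a still-running
     * SYS_switch_context on the other CPU can no longer rewrite the
     * picked thread's tcb->xcp.regs behind our back.
     */

    flags = enter_critical_section();

    struct tcb_s *tcb = pick_paused_thread();

    /* Only now install the thread's saved context as CURRENT_REGS. */

    restore_state(tcb);

    leave_critical_section(flags);
  }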
This is a follow-up to the fix in:
https://github.com/apache/nuttx/pull/6833
Signed-off-by: ligd <liguiding1@xiaomi.com>
spinlock.c:
Implement a read-write spinlock.  Multiple readers can hold the lock
simultaneously, but only one writer can hold it.

irq_spinlock.c:
Align with g_irq_spin_count.
If the lock is NULL, the caller gets the global lock (i.e. g_irq_spin),
and spin_lock_irqsave() supports nesting on the same CPU.
If a CPU already holds the write lock, it can call write_lock_irqsave()
again (i.e. nesting is supported).
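As a self-contained illustration of the reader/writer semantics described
above (plain C11 atomics and hypothetical names, not the actual spinlock.c
code):

  #include <stdatomic.h>

  typedef atomic_int rwlock_sketch_t;   /* 0 = free, >0 = readers, -1 = writer */

  static void read_lock_sketch(rwlock_sketch_t *l)
  {
    for (; ; )
      {
        int old = atomic_load(l);
        if (old >= 0 && atomic_compare_exchange_weak(l, &old, old + 1))
          {
            break;                      /* admitted alongside other readers */
          }

        /* A writer holds the lock or we raced with another CPU: retry */
      }
  }

  static void read_unlock_sketch(rwlock_sketch_t *l)
  {
    atomic_fetch_sub(l, 1);
  }

  static void write_lock_sketch(rwlock_sketch_t *l)
  {
    int expected = 0;

    /* A writer needs exclusive ownership: wait for no readers, no writer */

    while (!atomic_compare_exchange_weak(l, &expected, -1))
      {
        expected = 0;
      }
  }

  static void write_unlock_sketch(rwlock_sketch_t *l)
  {
    atomic_store(l, 0);
  }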
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Co-authored-by: David Sidrane <David.Sidrane@Nscdg.com>
test config: ./tools/configure.sh -l qemu-armv8a:nsh_smp
Pass ostest
No matter whether the target is big-endian or little-endian, the ticket
spinlock only checks whether next and owner are equal.
If they are equal, it means either a task holds the lock or the lock is
free.
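A self-contained sketch of that next/owner scheme (illustrative only, not
the NuttX spinlock_t layout): every decision depends on comparing the two
counters, never on how they are packed in memory, which is why byte order
does not matter.

  #include <stdatomic.h>
  #include <stdbool.h>
  #include <stdint.h>

  struct ticket_sketch_s
  {
    atomic_ushort next;     /* ticket handed to the next arriving CPU */
    atomic_ushort owner;    /* ticket currently being served */
  };

  static void ticket_lock(struct ticket_sketch_s *l)
  {
    uint16_t ticket = atomic_fetch_add(&l->next, 1);

    while (atomic_load(&l->owner) != ticket)
      {
        /* spin until our ticket is served */
      }
  }

  static bool ticket_is_free(struct ticket_sketch_s *l)
  {
    /* The only test needed: are next and owner equal? */

    return atomic_load(&l->next) == atomic_load(&l->owner);
  }

  static void ticket_unlock(struct ticket_sketch_s *l)
  {
    atomic_fetch_add(&l->owner, 1);
  }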
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Co-authored-by: Xiang Xiao <xiaoxiang781216@gmail.com>
When high-priority interrupts are supported, updating g_running_tasks
from within a high-priority interrupt may cause problems.
g_running_tasks should only be updated once it is determined that a task
context switch has actually occurred.
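A hedged sketch of that rule (a hypothetical helper, not the actual arch
interrupt code), assuming the scheduler-internal g_running_tasks[] array
and the up_cpu_index() API:

  struct tcb_s;                             /* opaque here */
  extern struct tcb_s *g_running_tasks[];   /* scheduler-internal array */
  int up_cpu_index(void);                   /* real NuttX API */

  void record_switch_sketch(struct tcb_s *prev_running,
                            struct tcb_s *now_running)
  {
    /* Only record a new running task when a context switch really
     * happened while the interrupt was being handled; a high-priority
     * interrupt that did not switch leaves g_running_tasks untouched.
     */

    if (prev_running != now_running)
      {
        g_running_tasks[up_cpu_index()] = now_running;
      }
  }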
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
1. Update all CMakeLists.txt files to adapt to the new layout
2. Fix the cmake build break
3. Update the license header of all new files
4. Fully compatible with the current build environment (use configure.sh or cmake, as you choose)
------------------
How to test
From within nuttx/, configure:
cmake -B build -DBOARD_CONFIG=sim/nsh -GNinja
cmake -B build -DBOARD_CONFIG=sim:nsh -GNinja
cmake -B build -DBOARD_CONFIG=sabre-6quad/smp -GNinja
cmake -B build -DBOARD_CONFIG=lm3s6965-ek/qemu-flat -GNinja
(or use the full path for a custom board):
cmake -B build -DBOARD_CONFIG=$PWD/boards/sim/sim/sim/configs/nsh -GNinja
This uses the Ninja generator (install it with: sudo apt install ninja-build). To build:
$ cmake --build build
menuconfig:
$ cmake --build build -t menuconfig
--------------------------
2. cmake/build: reformat the cmake files with cmake-format
https://github.com/cheshirekow/cmake_format
$ pip install cmakelang
$ for i in `find . -name CMakeLists.txt`; do cmake-format $i -o $i; done
$ for i in `find . -name "*.cmake"`; do cmake-format $i -o $i; done
Co-authored-by: Matias N <matias@protobits.dev>
Signed-off-by: chao an <anchao@xiaomi.com>
Summary:
- Support the arm64 PMU API; currently only the cycle counter is supported
  (see the cycle-counter sketch after this list).
- Use the ARM64 PMU hardware capability to implement the perf interface and
  modify all perf-interface-related code.
- Support PMU initialization under SMP.
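As a rough illustration of the cycle-counter part only (an assumed
register-access sketch, not the code in this patch), the ARMv8 PMU cycle
counter can be read from PMCCNTR_EL0 once the PMU has been enabled and
access granted:

  #include <stdint.h>

  static inline uint64_t read_cycle_counter(void)
  {
    uint64_t cycles;

    /* Assumes PMCR_EL0.E and PMCNTENSET_EL0.C were set during PMU init
     * so that the cycle counter is running and readable.
     */

    __asm__ volatile("mrs %0, pmccntr_el0" : "=r"(cycles));
    return cycles;
  }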
Signed-off-by: wangming9 <wangming9@xiaomi.com>
Situation:
Assume we have 2 CPUs, with task0 kept busy running.

CPU0 (task0 -> task1):
  1. remove task0 from the running list
  2. take task1 as the new tcb
  3. add task0 to the blocked list
  4. clear the spinlock

CPU1 (task2 -> task0), racing in right after step 4:
  4.1 remove task2 from the running list
  4.2 take task0 as the new tcb
  4.3 add task2 to the blocked list
  4.4 use the svc ISR to switch to task0
  4.5 crash

CPU0:
  5. use the svc ISR to switch to task1
Fix:
Move the spinlock clearing to the end of the svc ISR.
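A minimal sketch of the corrected ordering, using hypothetical helper
names for the steps above (not the real SVC handler):

  struct tcb_s;                                  /* opaque here */
  void remove_from_running(struct tcb_s *tcb);   /* step 1 / 4.1 */
  void make_current(struct tcb_s *tcb);          /* step 2 / 4.2 */
  void add_to_blocked(struct tcb_s *tcb);        /* step 3 / 4.3 */
  void restore_context(struct tcb_s *tcb);       /* the svc ISR switch */
  void clear_switch_lock(struct tcb_s *tcb);     /* the "clear spinlock" step */

  void svc_switch_sketch(struct tcb_s *prev, struct tcb_s *next)
  {
    remove_from_running(prev);
    make_current(next);
    add_to_blocked(prev);

    restore_context(next);

    /* Previously the lock was cleared before restore_context(), which
     * let another CPU pick up "prev" and switch to it while this CPU was
     * still switching away from it.  Clearing it here, at the very end
     * of the svc ISR, closes that window.
     */

    clear_switch_lock(prev);
  }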
Signed-off-by: ligd <liguiding1@xiaomi.com>
Reason:
1. g_running_tasks = thread A
2. thread A exits (its tcb is freed) -> thread B
3. thread B is interrupted by an irq
4. the g_running_tasks->flags check -> kasan reports a use-after-free

Root cause:
g_running_tasks has not been fully updated when the syscall happens.

Resolve:
Use rtcb (captured at the beginning of the ISR) instead.
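A hedged sketch of that resolution, assuming a generic ISR shape (all
names are hypothetical stand-ins, not the actual dispatch code):

  struct tcb_s;                                   /* opaque here */
  struct tcb_s *running_tcb_sketch(void);         /* read the running tcb at entry */
  void dispatch_irq_sketch(int irq, void *regs);  /* the normal irq processing */
  void check_flags_sketch(struct tcb_s *rtcb);    /* the flags check from step 4 */

  void isr_sketch(int irq, void *regs)
  {
    /* Capture the running tcb once, at ISR entry, while it is still valid */

    struct tcb_s *rtcb = running_tcb_sketch();

    dispatch_irq_sketch(irq, regs);

    /* Use the captured rtcb rather than re-reading g_running_tasks, which
     * may not be fully updated yet and, as in the sequence above, can
     * still point at a freed tcb.
     */

    check_flags_sketch(rtcb);
  }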
Signed-off-by: ligd <liguiding1@xiaomi.com>
Because not all compilers support the weak attribute, and many features
are either always used or guarded by a config option.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Summary:
- SP_SECTION was introduced to allocate spinlocks in a non-cacheable
  region, mainly for Cortex-A, to stabilize the NuttX SMP kernel
- However, all spinlocks are now allocated in cacheable memory and
  work without any problems
- So SP_SECTION should be removed to simplify the kernel code
Impact:
- None
Testing:
- Build test only
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Summary:
- This commit changes the spinlock APIs (spin_lock_irqsave()/spin_unlock_irqrestore())
- In the previous implementation, the global spinlock (i.e. g_irq_spin) was always used.
- This commit allows a caller-specific spinlock to be used, while still supporting
  g_irq_spin for backward compatibility (in this case, NULL must be specified)
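A short usage sketch of the changed API as described above (g_my_lock is a
hypothetical caller-specific lock):

  #include <nuttx/irq.h>
  #include <nuttx/spinlock.h>

  static spinlock_t g_my_lock = SP_UNLOCKED;   /* caller-specific spinlock */

  static void example(void)
  {
    irqstate_t flags;

    /* Use the caller-specific lock ... */

    flags = spin_lock_irqsave(&g_my_lock);
    /* ... protected region ... */
    spin_unlock_irqrestore(&g_my_lock, flags);

    /* ... or pass NULL to fall back to the global g_irq_spin,
     * which keeps the previous behavior.
     */

    flags = spin_lock_irqsave(NULL);
    /* ... protected region ... */
    spin_unlock_irqrestore(NULL, flags);
  }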
Impact:
- None
Testing:
- Tested with the following configurations
- spresense:wifi, spresense:wifi_smp
- esp32-devkitc:smp (QEMU), sabre-6quad:smp (QEMU)
- maix-bit:smp (QEMU), sim:smp
- stm32f4discovery:wifi
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Summary:
- I noticed that the Cortex-A SGI can be masked
- We had thought the SGI was not maskable
- Although I cannot remember how I tested it before
- It actually works as expected now
- Also, fixed the number of remaining bugs in TODO
Impact:
- No impact
Testing:
- Tested with sabre-6quad:smp (QEMU and dev board)
- Add the following code in up_idle() before the asm("WFI") call:
+ if (0 != up_cpu_index())
+ {
+ up_irq_save();
+ }
- Run the hello app; you can see "Hello, World!!"
- But nsh will soon freeze because arm_pause_handler() is not called.
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Summary:
- I found a deadlock during a Wi-Fi audio streaming test combined with a stress test
- The testing environment was spresense:wifi_smp (NCPUS=4)
- The deadlock happened because two CPUs called up_cpu_pause() almost simultaneously
- This situation should not happen, because up_cpu_pause() is called within a critical section
- Actually, the latter call came from nxsem_post() in an IRQ handler
- And when enter_critical_section() was called, irq_waitlock() detected a deadlock
- Then it called up_cpu_paused() to break the deadlock
- However, this resulted in setting g_cpu_irqset on that CPU
- Even though another CPU still held g_cpu_irqlock
- This situation violates the critical section and must be avoided
- To avoid it, if a CPU has set g_cpu_irqset after calling up_cpu_paused()
- The CPU must release g_cpu_irqlock first
- Then retry irq_waitlock() to acquire g_cpu_irqlock
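A hedged sketch of that retry (hypothetical structure and helper names,
not the actual enter_critical_section() code):

  #include <stdbool.h>

  bool irq_waitlock_sketch(int cpu);     /* spin for g_cpu_irqlock, false on deadlock */
  void release_irqlock_sketch(int cpu);  /* clear g_cpu_irqset bit, drop g_cpu_irqlock */
  int  up_cpu_paused(int cpu);           /* real NuttX API */

  void take_irqlock_sketch(int cpu)
  {
    while (!irq_waitlock_sketch(cpu))
      {
        /* A deadlock was detected: break it by handling the pending pause */

        up_cpu_paused(cpu);

        /* If that processing left this CPU marked in g_cpu_irqset while
         * another CPU still holds g_cpu_irqlock, back out first and then
         * retry irq_waitlock() from scratch.
         */

        release_irqlock_sketch(cpu);
      }
  }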
Impact:
- Affect SMP
Testing:
- Tested with spresense:wifi_smp (NCPUS=2 and 4)
- Tested with spresense:smp
- Tested with sim:smp
- Tested with sabre-6quad:smp (QEMU)
- Tested with maix-bit:smp (QEMU)
- Tested with esp32-core:smp (QEMU)
- Tested with lc823450-xgevk:rndis
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
Summary:
- ARCH_GLOBAL_IRQDISABLE was initially introduced for LC823450 SMP
- At that time, i.MX6 (quad Cortex-A9) did not use this config
- However, this option is now used by all CPUs that support SMP
- So it is a good time to refactor the code
Impact:
- Should have no impact because the logic is the same for SMP
Testing:
- Tested with board: spresense:smp, spresense:wifi_smp
- Tested with qemu: esp32-core:smp, maix-bit:smp, sabre-6quad:smp
- Build only: lc823450-xgevk:rndis, sam4cmp-db:nsh
Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>