Masayuki Ishikawa 6ec94082a1 sched: irq: Fix enter_critical_section() in an irq handler for SMP
Summary:
- I found a deadlock during Wi-Fi audio streaming test plus stress test
- The testing environment was spresense:wifi_smp (NCPUS=4)
- The deadlock happened because two CPUs called up_cpu_pause() almost simultaneously
- This situation should not happen, because up_cpu_pause() is called in a critical section
- Actually, the latter call was from nxsem_post() in an IRQ handler
- And when enter_critical_section() was called, irq_waitlock() detected a deadlock
- Then it called up_cpu_paused() to break the deadlock
- However, this resulted in setting g_cpu_irqset on the CPU
- Even though another CPU had held a g_cpu_irqlock
- This situation violates the critical section and should be avoided
- To avoid the situation, if a CPU sets g_cpu_irqset after calling up_cpu_paused()
- The CPU must release g_cpu_irqlock first
- Then retry irq_waitlock() to acquire g_cpu_irqlock

Impact:
- Affect SMP

Testing:
- Tested with spresense:wifi_smp (NCPUS=2 and 4)
- Tested with spresense:smp
- Tested with sim:smp
- Tested with sabre-6quad:smp (QEMU)
- Tested with maix-bit:smp (QEMU)
- Tested with esp32-core:smp (QEMU)
- Tested with lc823450-xgevk:rndis

Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
2020-11-03 08:19:41 +01:00
..