reason:
To remove the "sync pause" and decouple the critical section from the dependency on enabling interrupts,
after that we need to further implement "schedlock + spinlock".
changelist
1 Modify the implementation of critical sections to no longer involve enabling interrupts or handling synchronous pause events.
2 GIC_SMP_CPUCALL attach to pause handler to remove arch interface up_cpu_paused_restore up_cpu_paused_save
3 Completely remove up_cpu_pause, up_cpu_resume, up_cpu_paused, and up_cpu_pausereq
4 change up_cpu_pause_async to up_send_cpu_sgi
Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
In x86_64, g_current_regs is still used for context switching.
This commit fixes the regression from https://github.com/apache/nuttx/pull/13616
Signed-off-by: hujun5 <hujun5@xiaomi.com>
signal handler stack must be properly aligned, otherwise vector instructions doesn't work in signal handler
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
move initial RSP for AP cores below regs area.
otherwise IDLE thread for AP cores can be corrupted
XCP region now match regs allocation in up_initial_state()
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
for the citimon stats:
thread 0: thread 1:
enter_critical (t0)
up_switch_context
note suspend thread0 (t1)
thread running
IRQ happen, in ISR:
post thread0
up_switch_context
note resume thread0 (t2)
ISR continue f1
ISR continue f2
...
ISR continue fn
leave_critical (t3)
You will see, the thread 0, critical_section time is:
(t1 - t0) + (t3 - t2)
BUT, this result contains f1 f2 .. fn time spent, it is wrong
to tell user thead0 hold the critical lots of time but actually
not belong to it.
Resolve:
change the nxsched_suspend/resume_scheduler to real hanppends
Signed-off-by: ligd <liguiding1@xiaomi.com>
reason:
Currently, if we need to schedule a task to another CPU, we have to completely halt the other CPU,
manipulate the scheduling linked list, and then resume the operation of that CPU. This process is both time-consuming and unnecessary.
During this process, both the current CPU and the target CPU are inevitably subjected to busyloop.
The improved strategy is to simply send a cross-core interrupt to the target CPU.
The current CPU continues to run while the target CPU responds to the interrupt, eliminating the certainty of a busyloop occurring.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Revert "Parallelize depend file generation"
This reverts commit d5b6ec450f.
parallel depend ddc does not significantly speed up compilation,
intermediately generated .ddc files can cause problems if compilation is interrupted unexpectedly
Signed-off-by: xuxin19 <xuxin19@xiaomi.com>
prepare 16550 UART driver to support PCI:
- [breaking change] change argument of uart_ioctl() from `struct file *filep` to `FAR struct u16550_s *priv`
Also fix moxart_16550.c build related to this change
- [breaking change] change argument of uart_getreg() and uart_putreg from `uart_addrwidth_t base` to `FAR struct u16550_s *priv`
Also fix arch/x86/src/qemu/qemu_serial.c and arch/x86_64/src/intel64/intel64_serial.c related to this change
- [breaking change] change argument of uart_dmachan() from `uart_addrwidth_t base` to `FAR struct u16550_s *priv`
- move `struct u16550_s` to public header
- generalize UART_XXX_OFFSET so we can use it with any register increment
- make u16550_bind(), u16550_interrupt(), u16550_interrupt() public
- remove arch/or1k/src/common/or1k_uart.c and use common 16550 MIMO interfacve
- change irq type in `struct u16550_s` from uint8_t to int to match MSI API
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
Some of PCI drivers require OS interfaces that can't be executed in the INIT context.
In that case we have to postpone PCI drivers probing and call it for example
in board initialization logic.
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
reason:
1 On different architectures, we can utilize more optimized strategies
to implement up_current_regs/up_set_current_regs.
eg. use interrupt registersor percpu registers.
code size
before
text data bss dec hex filename
262848 49985 63893 376726 5bf96 nuttx
after
text data bss dec hex filename
262844 49985 63893 376722 5bf92 nuttx
size change -4
Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Intel CET (Control-flow Enforcement Technology) is a hardware enhancement aimed at mitigating the Retpoline vulnerability, but it may impact CPU branch prediction performance. This commit added ARCH_INTEL64_DISABLE_CET, which can disable CET completely with compilation option `-fcf-protection=none`.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
It was discovered that attempting to load x86-64 format ELF files with a multiboot1 header using the qemu `-kernel` command would result in an error, as multiboot1 only allows x86-32 format ELF files. To address this limitation, we have developed a simple x86_32 bootloader. This bootloader is designed to copy the `nuttx.bin` file to the designated memory address (`0x100000`) and then transfer control to NuttX by executing a jump instruction (`jmp 0x100000`).
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
These are changes to make HPET work with ACRN hypervisor:
- FSB interrupt delivery (which works like PCI MSI)
- 32-bit mode support
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
Using the HLT instruction in VM usually traps into the Hypervisor and releases CPU control. This will result in real-time performance degradation. Using the NOP or MWAIT instruction for an IDLE loop can reduce energy consumption while not trapping into the Hypervisor.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
reason:
In SMP, when a context switch occurs, restore_critical_section is executed.
To reduce the time taken for context switching, we directly pass the required
parameters to restore_critical_section instead of acquiring them repeatedly.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Make this_cpu is arch independent and up_cpu_index do that.
In AMP mode, up_cpu_index() may return the index of the physical core.
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
Create version.h
chip/intel64_irq.c:78:34: error: conflicting type qualifiers for ‘g_irq_spin’
78 | static spinlock_t g_irq_spin;
| ^~~~~~~~~~
In file included from chip/intel64_irq.c:40:
include/nuttx/spinlock.h:168:28: note: previous declaration of ‘g_irq_spin’ with type ‘spinlock_t’ {aka ‘volatile unsigned char’}
168 | extern volatile spinlock_t g_irq_spin;
| ^~~~~~~~~~
chip/intel64_cpu.c: In function ‘x86_64_cpu_ready_set’:
chip/intel64_cpu.c:314:3: warning: implicit declaration of function ‘spin_lock’ [-Wimplicit-function-declaration]
314 | spin_lock(&g_ap_boot);
| ^~~~~~~~~
chip/intel64_cpu.c:322:3: warning: implicit declaration of function ‘spin_unlock’; did you mean ‘sched_unlock’? [-Wimplicit-function-declaration]
322 | spin_unlock(&g_ap_boot);
| ^~~~~~~~~~~
Signed-off-by: chao an <anchao@lixiang.com>
HPET can be used as system clock for x86_64
to set HPET as system clock you have to enable:
CONFIG_ONESHOT=y
CONFIG_ALARM_ARCH=y
CONFIG_INTEL64_ONESHOT=y
CONFIG_ARCH_INTEL64_HPET_ALARM=y
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
rename _rdtsc macro to rdtsc to avoid conflict with external code
rename rdtsc macro to rdtscp to be the same as asm instruction used
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
Add addrenv support for x86_64.
For now we support mapping on PT level, so PD, PDT and PML4 are static
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>
use new MMU api to implement up_map_region().
The new implementation support maping over 0xffffffff but requires CONFIG_MM_PGALLOC=y
Signed-off-by: p-szafonimateusz <p-szafonimateusz@xiaomi.com>