Commit Graph

368 Commits

Author SHA1 Message Date
hujun5
9dc3e4ee41 arm64: fix fvp smp faild to boot
reason:
we should give a busy wait addr

This commit fixes the regression from https://github.com/apache/nuttx/pull/13640

Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-24 18:04:49 +08:00
hujun5
6e7d90e195 arm64: remove the operation of clearing interrupts during GIC initialization
To align with the implementation of ARMv7-A, remove the operation of clearing
interrupts during GIC initialization to avoid losing interrupts during asynchronous startup.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-24 18:04:49 +08:00
hujun5
956d77ba23 arm64:add busy wait flag
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-24 18:04:49 +08:00
hujun5
6dd26a3e68 arm64/smp: changing the startup of arm64 SMP from serial to parallel
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-24 18:04:49 +08:00
hujun5
33e30239f1 sched: replace sync pause with async pause for nxtask_terminate
reason:
In the kernel, we are planning to remove all occurrences of up_cpu_pause as one of the steps to
simplify the implementation of critical sections. The goal is to enable spin_lock_irqsave to encapsulate critical sections,
thereby facilitating the replacement of critical sections(big lock) with smaller spin_lock_irqsave(small lock)

Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-17 12:51:14 +02:00
hujun5
c35e25b7e5 arch: rename xxxx_pause.c to xxxx_smpcall.c
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-11 01:30:51 +08:00
hujun5
d54bc8a9f8 arch: remove up_cpu_pause up_cpu_resume up_cpu_paused up_cpu_pausereq
reason:
  To remove the "sync pause" and decouple the critical section from the dependency on enabling interrupts,
  after that we need to further implement "schedlock + spinlock".
changelist
  1 Modify the implementation of critical sections to no longer involve enabling interrupts or handling synchronous pause events.
  2 GIC_SMP_CPUCALL attach to pause handler to remove arch interface up_cpu_paused_restore up_cpu_paused_save
  3 Completely remove up_cpu_pause, up_cpu_resume, up_cpu_paused, and up_cpu_pausereq
  4 change up_cpu_pause_async to up_send_cpu_sgi

Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-11 01:30:51 +08:00
hujun5
4fd92edee7 sched: replace sync pause with async pause for nxsig_process
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-11 01:30:51 +08:00
hujun5
487fcb3bce signal: adjust the signal processing logic to remove the judgment
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-11 01:30:51 +08:00
hujun5
8275a846b1 arch: move sigdeliver to common code
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-11 01:30:51 +08:00
buxiasen
ec58a6ab25 arch: cpu pause when sigaction only necessary if tcb running
Signed-off-by: buxiasen <buxiasen@xiaomi.com>
2024-10-11 01:30:51 +08:00
Ville Juven
192dde1b58 arm64_task/pthread_start: Set sp_el0 upon starting user process
As the handling of sp_el0 was moved from the context switch routine
to exception entry/exit, we must set sp_el0 explicitly when the user
process is first started.
2024-10-11 01:30:51 +08:00
lipengfei28
7975c94d88 Kernel build: enter exception save sp_sl0,exit exception restroe sp_el0
Signed-off-by: lipengfei28 <lipengfei28@xiaomi.com>
2024-10-11 01:30:51 +08:00
ligd
669cd3aa2c arm64: simply the vectors
Signed-off-by: ligd <liguiding1@xiaomi.com>
2024-10-11 01:30:51 +08:00
ligd
7cbcf82bed arm64: save FPU regs every time
Signed-off-by: ligd <liguiding1@xiaomi.com>
2024-10-11 01:30:51 +08:00
qinwei1
d782f6c1ac arm64: add arm64_current_el to obtain current EL
Summary
  Add a macro to obtain current execute level

Signed-off-by: qinwei1 <qinwei1@xiaomi.com>
2024-10-11 01:30:51 +08:00
qinwei1
40d40015f4 arm64: refine the fatal handler
Summary
  The original implement for exception handler is very simple and
haven't framework for breakpoint/watchpoint routine or brk instruction.
  I refine the fatal handler and add framework for debug handler to
register or unregister. this is a prepare for watchpoint/breakpoint
implement

Signed-off-by: qinwei1 <qinwei1@xiaomi.com>
2024-10-11 01:30:51 +08:00
ligd
1fb4f8f50e arch: change nxsched_suspend/resume_scheduler() called position
for the citimon stats:

thread 0:                     thread 1:
enter_critical (t0)
up_switch_context
note suspend thread0 (t1)

                              thread running
                              IRQ happen, in ISR:
                                post thread0
                                up_switch_context
                                note resume thread0 (t2)
                                ISR continue f1
                                ISR continue f2
                                ...
                                ISR continue fn

leave_critical (t3)

You will see, the thread 0, critical_section time is:
(t1 - t0) + (t3 - t2)

BUT, this result contains f1 f2 .. fn time spent, it is wrong
to tell user thead0 hold the critical lots of time but actually
not belong to it.

Resolve:
change the nxsched_suspend/resume_scheduler to real hanppends

Signed-off-by: ligd <liguiding1@xiaomi.com>
2024-10-10 15:31:50 +02:00
lipengfei28
7cf9b73f18 arm64 pci legacy irq do not support yet
Signed-off-by: lipengfei28 <lipengfei28@xiaomi.com>
2024-10-10 15:31:08 +02:00
yezhonghui
52ffca5b8e arm64 support gicv2m for pci irq
Signed-off-by: yezhonghui <yezhonghui@xiaomi.com>
2024-10-10 15:30:00 +02:00
hujun5
a8717c6453 arch: We can use an independent SIG interrupt to handle async pause, which can save processing time.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-09 23:32:31 +08:00
hujun5
ed998c08c4 sched: change the SMP scheduling policy from synchronous to asynchronous
reason:
Currently, if we need to schedule a task to another CPU, we have to completely halt the other CPU,
manipulate the scheduling linked list, and then resume the operation of that CPU. This process is both time-consuming and unnecessary.

During this process, both the current CPU and the target CPU are inevitably subjected to busyloop.

The improved strategy is to simply send a cross-core interrupt to the target CPU.
The current CPU continues to run while the target CPU responds to the interrupt, eliminating the certainty of a busyloop occurring.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-09 23:32:31 +08:00
hujun5
8cd52bee2e arm64: g_current_regs is only used to determine if we are in irq, with other functionalities removed.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-09 18:16:43 +08:00
hujun5
d573952790 irq: use up_interrupt_context to replace up_current_regs
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-10-09 18:16:43 +08:00
Ville Juven
10b40abecc arm64_task/pthread_start: Convert the C / inline ASM code to assembly
The aforementioned functions can/will fail if the C compiler decides
to use the stack for the incoming entrypt/etc. parameters.

Fix this issue by converting the jump to user part into pure assembly,
ensuring the stack is NOT used for the parameters.
2024-09-21 23:24:02 +08:00
Ville Juven
6e15994f4c arm64_addrenv: Add support for 4 level MMU translations
The original code made the incorrect assumption that the amount of
translation levels is 3, but this is incorrect. The amount of levels is 4
and the amount of levels that are utilized / in use is set dynamically
from the amount of VA bits in use.
2024-09-21 08:36:23 -03:00
Ville Juven
a559f3495a arm64_addrenv: Fix the amount of page table levels
The VMSAv8-64 translation system has 4 page table levels in total, ranging
from 0-3. The address environment code assumes only 3 levels, from 1-3 but
this is wrong; the amount of levels _utilized_ depends on the configured
VA size CONFIG_ARM64_VA_BITS. With <= 39 bits 3 levels is enough, while
if the va range is larger, the 4th translation table level is taken into
use dynamically by shifting the base translation table level.

From arm64_mmu.c, where va_bits is the amount of va bits used in address
translations:
(va_bits <= 21)       - base level 3
(22 <= va_bits <= 30) - base level 2
(31 <= va_bits <= 39) - base level 1
(40 <= va_bits <= 48) - base level 0

The base level is what is configured as the page directory root. This also
affects the performance of address translations i.e. if the VA range is
smaller, address translations are also faster as the page table walk is
shorter.
2024-09-21 08:36:23 -03:00
wangmingrong1
469418f3c9 mm/kasan: Kasan global support setting alignment length
1. Similar to asan, supports single byte out of bounds detection
2. Fix the script to address the issue of not supporting the big end

Signed-off-by: wangmingrong1 <wangmingrong1@xiaomi.com>
2024-09-20 21:47:23 +08:00
wangmingrong1
071af0c993 mm/kasan: Tag kasan and generic kasan use the same instrumentation options
1. Tested on QEMU, the two sockets were basically the same, and their performance was not affected. The size of the generated bin file was also the same
2. Extract global detection as a separate file, both types of Kasan support global variable out of bounds detection simultaneously

Signed-off-by: wangmingrong1 <wangmingrong1@xiaomi.com>
2024-09-20 21:47:23 +08:00
hujun5
a754c517cc irq: use per-cpu reg to replace g_current_regs
Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-09-19 08:30:09 +08:00
wangmingrong
ae3facda53 kasan: Implementation of Kasan based on software tags.
Currently, only aarch64 is supported

Signed-off-by: wangmingrong <wangmingrong@xiaomi.com>
2024-09-19 03:15:29 +08:00
Masayuki Ishikawa
df298c186f Revert "build depend:Revert Make.dep intermediate ddc file"
This reverts commit ddc3119c4e.
2024-09-15 19:29:47 +08:00
xuxin19
ddc3119c4e build depend:Revert Make.dep intermediate ddc file
Revert "Parallelize depend file generation"
This reverts commit d5b6ec450f.

parallel depend ddc does not significantly speed up compilation,
intermediately generated .ddc files can cause problems if compilation is interrupted unexpectedly

Signed-off-by: xuxin19 <xuxin19@xiaomi.com>
2024-09-15 10:01:58 +08:00
hujun5
908df725ad arch: use up_current_regs/up_set_current_regs replace CURRENT_REGS
reason:
1 On different architectures, we can utilize more optimized strategies
  to implement up_current_regs/up_set_current_regs.
eg. use interrupt registersor percpu registers.

code size
before
    text    data     bss     dec     hex filename
 262848   49985   63893  376726   5bf96 nuttx

after
       text    data     bss     dec     hex filename
 262844   49985   63893  376722   5bf92 nuttx

size change -4

Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
   -machine virt,virtualization=on,gic-version=3 \
   -net none -chardev stdio,id=con,mux=on -serial chardev:con \
   -mon chardev=con,mode=readline -kernel ./nuttx

Signed-off-by: hujun5 <hujun5@xiaomi.com>
2024-09-13 23:18:58 +08:00
Jukka Laitinen
78d2d884d3 arch/arm64/src/imx9/imx9_lpi2c.c: Ignore spurious RX interrupts
Check remaining data count, just in case an extra RX interrupt occurs
after receiving a message

Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-13 09:18:30 -03:00
Jouni Ukkonen
ac319ba49a arch/arm64/imx9: Add system reset controller
System reset controller to powercycle ml and media blocks
and disable power-isolation

Signed-off-by: Jouni Ukkonen <jouni.ukkonen@unikie.com>
2024-09-13 08:48:34 +08:00
Jouni Ukkonen
101c2f0421 arch/arm64/imx9: Configure ENET clock
Configure ENET clock to 125MHz in clock init

Signed-off-by: Jouni Ukkonen <jouni.ukkonen@unikie.com>
2024-09-13 01:50:22 +08:00
Jukka Laitinen
3f82050623 arch/arm64/src/imx9: Add config option to select TX clk direction
TX clock or ref clock can be driven either from outside (PHY / oscilator) or by the ENET block.

Typical connection with RMII PHY is that the PHY drives the refclk.

Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-13 01:50:22 +08:00
Jukka Laitinen
56c9cbd7af arch/arm64/src/imx9: Add register definitions for imx9 wakeupmix block control
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-13 01:50:22 +08:00
Jouni Ukkonen
c2a300d2b0 arch/arm64/imx9: Change Kconfig logic
New configuration IMX9_HAVE_ATF_FIRMWARE introduced,
it is default on and it selects ARM64_HAVE_PSCI, when compiling
bootloader or when using bootloader that does not have atf
this shall be disabled

Signed-off-by: Jouni Ukkonen <jouni.ukkonen@unikie.com>
2024-09-13 01:41:56 +08:00
Jukka Laitinen
08949dec66 arch/arm64/src/imx9/imx9_lpi2c.c: Cleanups and error fixes
Clean up the interrupt-driven logic in the driver; handle error cases properly,
remove dead code and simplify logic.

Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-12 10:27:12 -03:00
Jukka Laitinen
9bee10f05e imx9/edma: Fix function prototypes
Change "DMACH_HANDLE *handle" into "DMACH_HANDLE handle". The DMACH_HANDLE is already
defined as "void *".

Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-12 10:25:17 -03:00
Jukka Laitinen
201be401c0 arch/arm64/src/imx9/imx9_lowputc.h: Allow linking to C++ by adding extern "C"
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-12 10:24:40 -03:00
Jukka Laitinen
b805b73681 arch/arm64/src/imx9/imx9_lowputc.c: Fix a some preprocessor macros
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
2024-09-12 10:24:40 -03:00
Ville Juven
8e7c0617ff arm64/Kconfig: Make the ARM64_PA/VA_BITS a true Kconfig variable
Enforcing the default 48-bit VA for everyone also implies a 4 page table
translation system. However, if less than 40 bits are needed, a full
translation table level can be dropped, making the translations faster.

Thus, make this into a configurable option, instead of enforcing the same
address widht for everyone.
2024-09-12 17:16:20 +08:00
Ville Juven
ca4bd482a0 arm64/task/pthread_start: Fix rare issue with context register location
There is a tiny possibility that when a process is started a trap is
taken which causes a context switch. This moves the kernel stack
unexpectedly and the task start logic no longer works.

Fix this by recording the initial context location, and use that to
trampoline into the user process with interrupts disabled. This ensures
the context stays intact AND the kernel stack is fully unwound before
the user process starts.
2024-09-11 08:59:01 -03:00
Ville Juven
87d9dac817 arm64/syscall: (Re-)enable interrupts only if they were previously enabled
Don't change the CPU state unexpectedly
2024-09-11 19:51:35 +08:00
Ville Juven
498275ca43 arm64/irq: Add mask for DAIF and SPSR DAIF bits
Use them for critical section handling, removes a bit of copy&pasted
code behind CONFIG_ARM64_DECODEFIQ flag
2024-09-11 19:51:35 +08:00
Ville Juven
48f545d54a arm64/crt0.c: Fix stack alignment when executing signal trampoline
The stack alignment requirement is 16-bytes, not 8-bytes.
2024-09-11 19:49:24 +08:00
Ville Juven
132868b728 arm64_syscall.c: Don't need to set register context during syscall
The register context is not needed, the original idea was to provide
the user stack pointer for signal handler delivery, but the user stack
can be obtained via sp_el0 so the context registers are not needed.

SP0 is not stored upon exception entry anyways, so this code is just
completely redundant and wrong.
2024-09-10 23:23:21 +08:00