RPMSG is associated with the use of HPWORK / LPWORK queues.
After sending a message to the remote end (Linux), the system
waits for an ack before proceeding. Unfortunately this may
take sometimes more time than one would expect. Ack waiting is
also unnecessary: nothing is done with that information. Even
worse, the net_lock() is also held during the blocked time so
it blocks other network stacks that are unrelated to this.
Also reorganize the mpfs_opensbi_*.S so that the trap
handler is easily relocated in the linker .ld file without
the need to relocate the utils.S. This makes it easier to
separate the files into own segments. The trap file should be
located in the zero device.
Moreover, provide support for simultaneous ACK and message
present handling capabilities in both directions. There are
times when both bits are set but only other is being handled.
In the end, the maximum throughput of the RPMSG bus increases
easily 10-20% or even more.
Signed-off-by: Eero Nurkkala <eero.nurkkala@offcode.fi>
SiFive document: "ECC Error Handling Guide" states:
"Any SRAM block or cache memory containing ECC functionality needs to be
initialized prior to use. ECC will correct defective bits based on memory
contents, so if memory is not first initialized to a known state, then ECC
will not operate as expected. It is recommended to use a DMA, if available,
to write the entire SRAM or cache to zeros prior to enabling ECC reporting.
If no DMA is present, use store instructions issued from the processor."
Clean the cache at this early stage so no ECC errors will be flooding later.
Signed-off-by: Eero Nurkkala <eero.nurkkala@offcode.fi>
Check that the base address and region size are properly aligned with
relation to each other.
With NAPOT encoding the area base and size are not arbitrary, as when
the size increases the amount of bits available for encoding the base
address decreases.
Implement the previously empty mpfs_ddr_rand with adapted "seiran128" code
from https://github.com/andanteyk/prng-seiran
This implements a non-secure prng, which is minimal in size. The DDR training
doesn't need cryptographically secure prng, and linking in the NuttX crypto
would increase the code size significantly for bootloaders.
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
Also move the DDRC clock enablement and reset to mpfs_init_ddr. This doesn't
change the functionality, but is the cleaner place for it.
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
Especially the write calibration must bail out if the memory test timeouts,
otherwise the device will get stuck in running the memory test in sequence,
and it will always timeout.
Negative error value was also not properly returned from mpfs_mtc_test.
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
It doesn't make sense to try to auto-determine write latency, it may pass with too low value.
Keep the existing implementation if the write latency has been set to minimum
value, otherwise just set it.
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
Calculate how long an I2C transation will take in microseconds, and use
this as the timeout for mpfs_i2c_sem_waitdone.
The reason for doing this is not to keep an i2c bus reserved for the full
1 second timeout, if e.g. a sensor is not on the bus / is faulty and
non-responsive. Reading the other sensors will be blocked for a relatively
long time (1 second) in this case. This fixes such behavior.
1 page might not be enough, if the task has a bigger stack. Best effort
is to allocate the default amount, however this won't work will all
tasks either.
Currently TX_FIFO_SIZE is not altered in mpfs_ep_set_fifo_size(),
but all paths (RX and TX) change MPFS_USB_RX_FIFO_SIZE only.
Fix the TX_FIFO_SIZE setup.
Signed-off-by: Eero Nurkkala <eero.nurkkala@offcode.fi>
This adds a config flag to remove manual bclksclk training if one wants
to just use the controller's own training.
Manual addcmd training depends on the manual bclksclk training, so this
also adds this dependency in Kconfig.
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
Decreasing the value may increase DQ/DQS window size. Keep the default value
(1) for the existing board configurations.
Signed-off-by: Jukka Laitinen <jukkax@ssrc.tii.ae>
Adds a platform specific implementation for tickless schedular operation. This includes:
- Tickless operation for vexriscv cores.
- Tickless operation for vexriscv-smp cores.
- Ticked operation for vexriscv-smp cores.
Ticked operation for vexriscv core has been refactored.
Additional default configuration added to demonstrate operation.
Both tickless and ticked options use Litex timer0 for scheduling intervals. This is significantly faster than interfaceing with the risc-v mtimer through opensbi.
Adding the CONFIG_ARCH_PERF_EVENTS configuration to enable
hardware performance counting,solve the problem that some platform
hardware counting support is not perfect, you can choose to use
software interface.
This is configured using CONFIG_ARCH_PERF_EVENTS, so weak_functions
are removed to prevent confusion
To use hardware performance counting, must:
1. Configure CONFIG_ARCH_PERF_EVENTS, default selection
2. Call up_perf_init for initialization
Signed-off-by: wangming9 <wangming9@xiaomi.com>
1. virtio devics/drivers match and probe/remote mechanism;
2. virtio mmio transport layer based on OpenAmp (Compatible with both
virtio mmio version 1 and 2);
3. virtio-serial driver based on new virtio framework;
4. virtio-rng driver based on new virtio framework;
5. virtio-net driver based on new virtio framework
(IOB Offload implementation);
6. virtio-blk driver based on new virtio framework;
7. Remove the old virtio mmio framework, the old framework only
support mmio transport layer, and the new framwork support
more transport layer and this commit has implemented all the
old virtio drivers;
8. Refresh the the qemu-arm64 and qemu-riscv virtio related
configs, and update its README.txt;
New virtio-net driver has better performance
Compared with previous virtio-mmio-net:
| | master/-c | master/-s | this/-c | this/-s |
| :--------------------: | :-------: | :-------: | :-----: | :-----: |
| qemu-armv8a:netnsh | 539Mbps | 524Mbps | 906Mbps | 715Mbps |
| qemu-armv8a:netnsh_smp | 401Mbps | 437Mbps | 583Mbps | 505Mbps |
| rv-virt:netnsh | 487Mbps | 512Mbps | 760Mbps | 634Mbps |
| rv-virt:netnsh_smp | 387Mbps | 455Mbps | 447Mbps | 502Mbps |
| rv-virt:netnsh64 | 602Mbps | 595Mbps | 881Mbps | 769Mbps |
| rv-virt:netnsh64_smp | 414Mbps | 515Mbps | 491Mbps | 525Mbps |
| rv-virt:knetnsh64 | 515Mbps | 457Mbps | 606Mbps | 540Mbps |
| rv-virt:knetnsh64_smp | 308Mbps | 389Mbps | 415Mbps | 474Mbps |
Note: Both CONFIG_IOB_NBUFFERS=64, using iperf command, all in Mbits/sec
Tested in QEMU 7.2.2
Signed-off-by: wangbowen6 <wangbowen6@xiaomi.com>
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
VELAPLATFO-12536
This provides the initial hooks for Flattened Device Tree support
with QEMU RV. It also provides a new procfs file that exposes the
fdt to userspace much like the /sys/firmware/fdt endpoint in Linux.
See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-ofw
Nodes in the fdt are not yet usable by the OS.
Signed-off-by: Brennan Ashton <bashton@brennanashton.com>
Signed-off-by: liaoao <liaoao@xiaomi.com>
This PR adds support for the StarFive JH7110 RISC-V SoC. This will be used by the upcoming port of NuttX for PINE64 Star64 SBC. [The source files are explained in the articles here](https://github.com/lupyuen/nuttx-star64)
Modified Files in arch/risc-v:
Kconfig: Added ARCH_CHIP_JH7110 for JH7110 SoC
New Files in arch/risc-v:
include/jh7110/chip.h: JH7110 Definitions
include/jh7110/irq.h: Support 127 External Interrupts
src/jh7110/chip.h: Interrupt Stack Macro
src/jh7110/jh7110_allocateheap.c: Kernel Heap
src/jh7110/jh7110_head.S: Linux Header and Boot Code
src/jh7110/jh7110_irq.c: Configure Interrupts
src/jh7110/jh7110_irq_dispatch.c: Dispatch Interrupts
src/jh7110/jh7110_memorymap.h: Memory Map
src/jh7110/jh7110_mm_init.c, jh7110_mm_init.h: Memory Mgmt
src/jh7110/jh7110_pgalloc.c: Page Allocator
src/jh7110/jh7110_start.c: Startup Code
src/jh7110/jh7110_timerisr.c: Timer Interrupt
src/jh7110/hardware/jh7110_memorymap.h: PLIC Base Address
src/jh7110/hardware/jh7110_plic.h: PLIC Register Addresses
src/jh7110/Kconfig: JH7110 Config
src/jh7110/Make.defs: Makefile
When supporting high-priority interrupts, updating the
g_running_tasks within a high-priority interrupt may be
cause problems. The g_running_tasks should only be updated
when it is determined that a task context switch has occurred.
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
Instead of clearing the fields individually, just wipe the whole register.
This can be done because flags and rm are just parts of the fcsr.
31 8 5 0
+--------------+--------+-----------+
| | | |
| RESERVED | FRM | FSTATUS |
| | | |
+--------------+--------+-----------+
FCSR
- Save the FPU registers into the tcb so they don't get lost if the stack
frame for xcp.regs moves (as it does)
- Handle interger and FPU register save/load separately
- Integer registers are saved/loaded always, like before
- FPU registers are only saved during a context switch:
- Save ONLY if FPU is dirty
- Restore always if FPU has been used (not in FSTATE_OFF, FSTATE_INIT)
- Remove all lazy-FPU related logic from the macros, it is not needed
Why? The tcb can contain info that is needed by the context switch
routine. One example is lazy-FPU handling; the integer registers can
be stored into the stack, because they are always stored & restored.
Lazy-FPU however needs a non-volatile location to store the FPU registers
as the save feature will skip saving a clean FPU, but the restore must
always restore the FPU registers if the thread uses FPU.