.. _nxflat:

======
NXFLAT
======

Overview
========

Functionality
-------------

NXFLAT is a customized and simplified version of a binary format
implemented a few years ago called
`XFLAT <http://xflat.sourceforge.net/>`__. With the NXFLAT binary format
you will be able to do the following:

  - Place separately linked programs in a file system, and
  - Execute those programs by dynamically linking them to the base NuttX
    code.

This allows you to extend the NuttX base code after it has been written
into FLASH. One motivation for implementing NXFLAT is to support clean CGI
under an HTTPD server.

This feature is especially attractive when combined with the NuttX ROMFS
support: ROMFS allows you to execute programs in place (XIP) in flash
without copying anything other than the .data section to RAM. In fact,
the initial NXFLAT release only worked on ROMFS. Later extensions also
support executing NXFLAT binaries from an SRAM copy.

This NuttX feature includes:

  - A dynamic loader that is built into the NuttX core (See
    `GIT <https://github.com/apache/nuttx/blob/master/binfmt/>`__).
  - Minor changes to the RTOS to support position independent code, and
  - A linker to bind ELF binaries to produce the NXFLAT binary format
    (See GIT).

Background
----------

NXFLAT is derived from `XFLAT <http://xflat.sourceforge.net/>`__. XFLAT
is a toolchain add-on that provides full shared library and XIP executable
support for processors that have no Memory Management Unit
(MMU). NXFLAT is greatly simplified for the deeply embedded
environment targeted by NuttX:

  - NXFLAT does not support shared libraries, because
  - NXFLAT does not support *exportation* of symbol values from a module.

Rather, the NXFLAT module only *imports* symbol values. In the NXFLAT
model, the (PIC) NXFLAT module resides in a FLASH file system
and when it is loaded at run time, it is dynamically linked only to the
(non-PIC) base NuttX code: The base NuttX *exports* a symbol table; the
NXFLAT module *imports* those symbol values to dynamically bind the
module to the base code.
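
The import/export relationship can be pictured with a small host-side
sketch. All names here (``demo_symtab_s``, ``demo_import_s``,
``demo_bind``) are hypothetical and only model the idea of binding a
module's imports against a table exported by the base code; they are not
the data structures used by the NuttX loader.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  /* Hypothetical model of one entry in the table exported by the base code */

  struct demo_symtab_s
  {
    const char *name;   /* Symbol name */
    const void *value;  /* Address of the symbol in the base code */
  };

  /* Hypothetical model of one import slot in a module */

  struct demo_import_s
  {
    const char *name;   /* Name that the module wants to import */
    const void *value;  /* Filled in when the module is bound */
  };

  /* Look up one name in the exported table; returns NULL if it is missing */

  static const void *demo_find(const struct demo_symtab_s *tab, size_t ntab,
                               const char *name)
  {
    size_t i;

    for (i = 0; i < ntab; i++)
      {
        if (strcmp(tab[i].name, name) == 0)
          {
            return tab[i].value;
          }
      }

    return NULL;
  }

  /* Bind every import; returns 0 on success, -1 if any symbol is missing */

  int demo_bind(struct demo_import_s *imports, size_t nimports,
                const struct demo_symtab_s *tab, size_t ntab)
  {
    size_t i;

    for (i = 0; i < nimports; i++)
      {
        imports[i].value = demo_find(tab, ntab, imports[i].name);
        if (imports[i].value == NULL)
          {
            return -1;
          }
      }

    return 0;
  }

Note that the binding runs in one direction only: the module never adds
entries to the table, which matches the import-only model described above.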

Limitations
-----------

  - **ROMFS (or RAM mapping) Only**:
    The current NXFLAT release will work only with either (1) NXFLAT
    executable modules residing on a ROMFS file system, or (2) executables
    residing on other file systems provided that CONFIG_FS_RAMMAP is
    defined. This limitation is because the loader depends on the capability
    to mmap() the code segment. See the NuttX User Guide for further information.

    NuttX does not provide any general kind of file mapping capability.
    In fact, true file mapping is only possible with MCUs that provide an MMU.
    Without an MMU, a file system may support eXecution In Place (XIP) to mimic
    file mapping. Only the ROMFS file system supports the kind of XIP execution
    needed by NXFLAT.

    It is also possible to simulate file mapping by allocating memory, copying
    the NXFLAT binary file into memory, and executing from the copy of the
    executable file in RAM. That capability can be enabled with the CONFIG_FS_RAMMAP
    configuration option. With that option enabled, NXFLAT will work on other kinds
    of file systems, but all NXFLAT executables must be copied to RAM.

  - **GCC/ARM/Cortex-M3/4 Only**:
    At present, the NXFLAT toolchain is only available for ARM and Cortex-M3/4 (thumb2) targets.

  - **Read-Only Data in RAM**:
    With older GCC compilers (at least up to 4.3.3), read-only data must
    reside in RAM. In code generated by GCC, all data references are
    indexed by the PIC base register (usually R10 or sl on
    ARM processors). This includes read-only data (.rodata). Embedded
    firmware developers normally like to keep .rodata in FLASH with
    the code sections. But because all data is referenced with the
    PIC base register, all of that data must lie in RAM. An NXFLAT
    change to work around this is under investigation.

    With newer GCC compilers (at least from 4.6.3), read-only data is
    no longer GOT-relative, but is accessed PC-relative.
    With PC-relative addressing, read-only data must reside in the I-Space.

  - **Globally Scoped Function Pointers**:
    If a function pointer is taken to a statically defined function,
    then (at least for ARM) GCC will generate a relocation that NXFLAT
    cannot handle. The workaround is to make all such functions global in
    scope. A fix would involve a change to the GCC compiler as described
    in Appendix B.

  - **Special Handling of Callbacks**:
    Callbacks through function pointers must be avoided or, when
    they cannot be avoided, handled very specially. The reason
    for this is that the PIC module requires a special
    value to be set in the PIC register. If the callback does not set the PIC
    register, then the called-back function will fail because it
    will be unable to correctly access data memory. Special logic
    is in place to handle some NuttX callbacks: signal callbacks
    and watchdog timer callbacks. But other callbacks (like those
    used with qsort()) must be avoided in an NXFLAT module.
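
The callback limitation can be illustrated with a host-side model. In this
sketch the global ``g_pic_base`` is a hypothetical stand-in for the PIC
register, and all function names are invented for illustration; it is not
the special logic that NuttX uses for signal or watchdog callbacks.

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical stand-in for the PIC base register (r10 on ARM) */

  static int *g_pic_base;

  typedef int (*demo_callback_t)(int arg);

  /* Module code: reads its data through the PIC base, as PIC code would */

  int demo_module_callback(int arg)
  {
    return g_pic_base[0] + arg;
  }

  /* Naive caller: invokes the callback without touching the PIC base, so
   * the callback sees whatever data the caller's base happens to select.
   */

  int demo_call_naive(demo_callback_t cb, int arg)
  {
    return cb(arg);
  }

  /* Careful caller: installs the module's base, calls, then restores */

  int demo_call_with_base(demo_callback_t cb, int arg, int *module_base)
  {
    int *saved = g_pic_base;
    int ret;

    g_pic_base = module_base;
    ret = cb(arg);
    g_pic_base = saved;
    return ret;
  }

A generic library routine such as qsort() behaves like ``demo_call_naive``:
it has no knowledge of the module's data base, which is why such callbacks
must be avoided.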

Supported Processors
--------------------

As mentioned `above <#limitations>`__, the NXFLAT toolchain is only
available for ARM and Cortex-M3/4 (thumb2) targets. Furthermore, NXFLAT
has only been tested on the Eagle-100 LM3S6918 Cortex-M3 board.

Development Status
------------------

The initial release of NXFLAT was made in NuttX version 0.4.9. Testing
is limited to the tests found under ``apps/examples/nxflat`` in the
source tree. Some known problems exist (see the
`TODO <https://github.com/apache/nuttx/blob/master/TODO>`__ list). As
such, NXFLAT is currently in an early alpha phase.

NXFLAT Toolchain
================

Building the NXFLAT Toolchain
-----------------------------

In order to use NXFLAT, you must use special NXFLAT tools to create the
binary module in FLASH. To do this, you will need to download the
buildroot package and build it on your Linux or Cygwin machine. The
buildroot can be downloaded from
`Bitbucket.org <https://bitbucket.org/nuttx/buildroot/downloads>`__. You
will need version 0.1.7 or later.

Here are some general build instructions:

-  You must have already configured NuttX in ``<some-dir>/nuttx``
-  Download the buildroot package ``buildroot-0.x.y`` into
   ``<some-dir>``
-  Unpack ``<some-dir>/buildroot-0.x.y.tar.gz`` using a command like ``tar zxf buildroot-0.x.y.tar.gz``.
   This will result in a new directory like ``<some-dir>/buildroot-0.x.y``
-  Move this into position:
   ``mv <some-dir>/buildroot-0.x.y <some-dir>/buildroot``
-  ``cd <some-dir>/buildroot``
-  Copy a configuration file into the top buildroot directory:
   ``cp boards/abc-defconfig-x.y.z .config``.
-  Enable building of the NXFLAT tools by ``make menuconfig``. Select to
   build the NXFLAT toolchain with GCC (you can also omit
   building GCC and only build the NXFLAT toolchain for use with
   your own GCC toolchain).
-  Make the toolchain: ``make``. When the make completes, the tool
   binaries will be available under
   ``<some-dir>/buildroot/build_abc/staging_dir/bin``

mknxflat
--------

``mknxflat`` is used to build a *thunk* file. See below
for usage::

  Usage: mknxflat [options] <bfd-filename>

  Where options are one or more of the following.  Note
  that a space is always required between the option and
  any following arguments.

    -d Use dynamic symbol table. [symtab]
    -f <cmd-filename>
        Take next commands from <cmd-filename> [cmd-line]
    -o <out-filename>
       Output to <out-filename> [stdout]
    -v Verbose output [no output]
    -w Import weakly declared functions, i.e., weakly
       declared functions are expected to be provided at
       load-time [not imported]

ldnxflat
--------

``ldnxflat`` is used to link your object files along with the *thunk*
file generated by ``mknxflat`` to produce the NXFLAT
binary module. See below for usage::

  Usage: ldnxflat [options] <bfd-filename>

  Where options are one or more of the following.  Note
  that a space is always required between the option and
  any following arguments.

    -d Use dynamic symbol table [Default: symtab]
    -e <entry-point>
       Entry point to module [Default: _start]
    -o <out-filename>
       Output to <out-filename> [Default: <bfd-filename>.nxf]
    -s <stack-size>
       Set stack size to <stack-size> [Default: 4096]
    -v Verbose output. If -v is applied twice, additional
       debug output is enabled [Default: no verbose output].

mksymtab
--------

There is a small helper program available in ``nuttx/tools`` called
``mksymtab``. ``mksymtab`` can be used to generate symbol tables for the
NuttX base code that would be usable by the typical NXFLAT application.
``mksymtab`` builds symbol tables from comma-separated value (CSV)
files. In particular, the CSV files:

  #. ``nuttx/syscall/syscall.csv`` that describes the NuttX RTOS
     interface,
  #. ``nuttx/libc/libc.csv`` that describes the NuttX C library interface, and
  #. ``nuttx/libc/math.csv`` that describes any math library.

::

  USAGE: ./mksymtab <cvs-file> <symtab-file>

  Where:

    <cvs-file>   : The path to the input CSV file
    <symtab-file>: The path to the output symbol table file
    -d           : Enable debug output

For example,

::

  cd nuttx/tools
  cat ../syscall/syscall.csv ../libc/libc.csv | sort >tmp.csv
  ./mksymtab.exe tmp.csv tmp.c

Making an NXFLAT module
-----------------------

Below is a snippet from an NXFLAT make file (simplified from the NuttX
`Hello,
World! <https://github.com/apache/nuttx-apps/blob/master/examples/nxflat/tests/hello/Makefile>`__
example).

* Target 1:

  .. code-block:: makefile

    hello.r1: hello.o
      abc-nuttx-elf-ld -r -d -warn-common -o $@ $^

* Target 2:

  .. code-block:: makefile

    hello-thunk.S: hello.r1
      mknxflat -o $@ $^

* Target 3:

  .. code-block:: makefile

    hello.r2: hello-thunk.S
      abc-nuttx-elf-ld -r -d -warn-common -T binfmt/libnxflat/gnu-nxflat-gotoff.ld -no-check-sections -o $@ hello.o hello-thunk.o

* Target 4:

  .. code-block:: makefile

    hello: hello.r2
      ldnxflat -e main -s 2048 -o $@ $^

**Target 1**. This target links all of the module's object files
together into one relocatable object. Two relocatable objects will be
generated; this is the first one (hence, the suffix ``.r1``). In this
"Hello, World!" case, there is only a single object file, ``hello.o``,
that is linked to produce the ``hello.r1`` object.

When the module's object files are compiled, some special compiler
CFLAGS must be provided. First, the option ``-fpic`` is required to tell
the compiler to generate position independent code (other GCC options,
like ``-fno-jump-tables`` might also be desirable). For ARM compilers,
two additional compilation options are required: ``-msingle-pic-base``
and ``-mpic-register=r10``.

**Target 2**. Given the ``hello.r1`` relocatable object, this target
will invoke ``mknxflat`` to make the *thunk* file,
``hello-thunk.S``. This *thunk* file contains all of the information
needed to create the imported function list.

**Target 3**. This target is similar to **Target 1**. In this case, it
will link together the module's object files (only ``hello.o`` here)
along with the assembled *thunk* file, ``hello-thunk.o``, to create the
second relocatable object, ``hello.r2``. The linker script,
``gnu-nxflat-gotoff.ld``, is required at this point to correctly position
the sections. This linker script produces two segments: An *I-Space*
(Instruction Space) segment containing mostly ``.text`` and a *D-Space*
(Data Space) segment containing ``.got``, ``.data``, and ``.bss``
sections. The I-Space segment must originate at address 0 (so that the
segment's addresses are really offsets into the I-Space segment) and the
D-Space segment must also originate at address 0 (so that the segment's
addresses are really offsets into the D-Space segment). The option
``-no-check-sections`` is required to prevent the linker from failing
because these segments overlap.
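
Because both segments are linked at origin 0, every link-time address is
really an offset that must be added to wherever the segment ends up at run
time. A minimal sketch of that arithmetic, using a hypothetical
``demo_segments_s`` structure and made-up base addresses:

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical run-time placement of the two segments of one module */

  struct demo_segments_s
  {
    uintptr_t ispace;  /* Where the I-Space segment sits (e.g. in FLASH) */
    uintptr_t dspace;  /* Where this task's D-Space copy sits in RAM */
  };

  /* Convert a link-time I-Space offset into a run-time address */

  uintptr_t demo_ispace_addr(const struct demo_segments_s *seg,
                             uint32_t offset)
  {
    return seg->ispace + offset;
  }

  /* Convert a link-time D-Space offset into a run-time address */

  uintptr_t demo_dspace_addr(const struct demo_segments_s *seg,
                             uint32_t offset)
  {
    return seg->dspace + offset;
  }

The same offset (say ``0x40``) is valid in both segments at once, which is
exactly the overlap that ``-no-check-sections`` tells the linker to accept.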

**NOTE:** There are two linker scripts located at ``binfmt/libnxflat/``.

  #. ``binfmt/libnxflat/gnu-nxflat-gotoff.ld``. Older versions of GCC
     (at least up to GCC 4.3.3) use GOT-relative addressing to access
     read-only data. In that case, read-only data (.rodata) must reside
     in D-Space and this linker script should be used.
  #. ``binfmt/libnxflat/gnu-nxflat-pcrel.ld``. Newer versions of GCC
     (at least as of GCC 4.6.3) use PC-relative addressing to access
     read-only data. In that case, read-only data (.rodata) must reside
     in I-Space and this linker script should be used.

**Target 4**. Finally, this target will use the ``hello.r2`` relocatable
object to create the final NXFLAT module, ``hello``, by executing
``ldnxflat``.

**binfmt Registration**. NXFLAT calls :c:func:`register_binfmt` to
incorporate itself into the system.
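
The registration idea can be modeled on a host as a list of format
handlers that are tried in turn. This sketch assumes a simple linked list
and a made-up ``"DEMO"`` tag; the ``demo_*`` names are hypothetical, and
the real NuttX structures and signatures may differ.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  /* Hypothetical model of one registered binary format handler */

  struct demo_binfmt_s
  {
    struct demo_binfmt_s *next;
    int (*load)(const unsigned char *image, size_t len); /* 0 if accepted */
  };

  static struct demo_binfmt_s *g_demo_formats;

  /* Add a handler to the head of the list */

  void demo_register_binfmt(struct demo_binfmt_s *fmt)
  {
    fmt->next = g_demo_formats;
    g_demo_formats = fmt;
  }

  /* Offer the image to each handler; returns 0 if one accepts it */

  int demo_load_image(const unsigned char *image, size_t len)
  {
    struct demo_binfmt_s *fmt;

    for (fmt = g_demo_formats; fmt != NULL; fmt = fmt->next)
      {
        if (fmt->load(image, len) == 0)
          {
            return 0;
          }
      }

    return -1;
  }

  /* Example handler: accepts images that start with the made-up tag "DEMO" */

  static int demo_tag_load(const unsigned char *image, size_t len)
  {
    return (len >= 4 && memcmp(image, "DEMO", 4) == 0) ? 0 : -1;
  }

  struct demo_binfmt_s g_demo_tagfmt = { NULL, demo_tag_load };

Until a handler is registered, no image is accepted; after registration,
only images that the handler recognizes are loaded.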

Appendix A: No GOT Operation
============================

When GCC generates position independent code, new code sections will
appear in your programs. One of these is the GOT (Global Offset Table)
and, in ELF environments, another is the PLT (Procedure Linkage Table).
For example, if your C code generated (ARM) assembly language like this
without PIC:

.. code-block:: asm

          ldr     r1, .L0         /* Fetch the offset to 'x' */
          ldr     r0, [r10, r1]   /* Load the value of 'x' with PIC offset */
          /* ... */
  .L0:    .word   x               /* Offset to 'x' */

Then when PIC is enabled (say with the -fpic compiler option), it will
generate code like this:

.. code-block:: asm

          ldr     r1, .L0         /* Fetch the offset to the GOT entry */
          ldr     r1, [r10, r1]   /* Fetch the (relocated) address of 'x' from the GOT */
          ldr     r0, [r1, #0]    /* Fetch the value of 'x' */
          /* ... */
  .L0:    .word   x(GOT)          /* Offset to entry in the GOT */

See
`reference <http://xflat.sourceforge.net/NoMMUSharedLibs.html#shlibsgot>`__

Notice that this generates an extra level of indirection through the GOT.
This indirection is not needed by NXFLAT and only adds more RAM usage
and execution time.
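
The cost of the extra indirection can be seen in a small C model of the
two access patterns. The ``demo_dspace_s`` layout and both function names
are hypothetical; the sketch only mirrors the load sequences in the
assembly fragments above.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  /* Hypothetical D-Space image: one GOT entry followed by the data */

  struct demo_dspace_s
  {
    int *got[1];  /* GOT entry holding the relocated address of x */
    int x;        /* The variable itself */
  };

  /* With a GOT: load the address from the GOT, then load the value */

  int demo_load_via_got(const struct demo_dspace_s *base, size_t got_index)
  {
    int *addr = base->got[got_index];  /* Extra load (and RAM for the GOT) */

    return *addr;
  }

  /* Without a GOT: one load at a fixed offset from the PIC base */

  int demo_load_direct(const struct demo_dspace_s *base, size_t offset)
  {
    int value;

    memcpy(&value, (const unsigned char *)base + offset, sizeof(value));
    return value;
  }

Both functions return the same value; the GOT version simply needs one
more memory access and one more word of RAM per referenced symbol.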

NXFLAT (like `XFLAT <http://xflat.sourceforge.net/>`__) can work even
better without the GOT. Patches against older versions of GCC exist to
eliminate the GOT indirections. Several are available
`here <http://xflat.cvs.sourceforge.net/viewvc/xflat/xflat/gcc/>`__ if
you are inspired to port them to a new GCC version.

Appendix B: PIC Text Workaround
===============================

There is a problem with the memory model in GCC that prevents it from
being used in the way that NXFLAT needs. The problem is
that the GCC PIC model assumes that the executable lies in a flat,
contiguous (virtual) address space like::

  Virtual
  .text
  .got
  .data
  .bss

It assumes that the PIC base register (usually r10 for ARM) points to
the base of ``.text`` so that any address in ``.text``, ``.got``,
``.data``, ``.bss`` can be found with an offset from the same base
address. But that is not the memory arrangement that we need in the XIP
embedded environment. We need two memory regions, one in FLASH
containing shared code and one per task in RAM containing task-specific
data::

  Flash   RAM
  .text   .got
          .data
          .bss

The PIC base register needs to point to the base of the ``.got`` and
only addresses in the ``.got``, ``.data``, and ``.bss`` sections can be
accessed as an offset from the PIC base register. See also this `XFLAT
discussion <http://xflat.cvs.sourceforge.net/viewvc/*checkout*/xflat/xflat/gcc/README?revision=1.1.1.1>`__.
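
The two-region arrangement can be modeled in plain C by passing the data
base explicitly. In this sketch the ``pic_base`` argument plays the role
of the PIC base register, and ``demo_task_data_s`` is a hypothetical
stand-in for one task's ``.got``/``.data``/``.bss`` copy.

.. code-block:: c

  /* Hypothetical per-task D-Space: each task gets its own copy in RAM */

  struct demo_task_data_s
  {
    int counter;  /* Would live in .data or .bss */
  };

  /* Shared "code": a single copy (in FLASH on a real target) that reaches
   * all of its data through the base it is given.
   */

  int demo_increment(struct demo_task_data_s *pic_base)
  {
    return ++pic_base->counter;
  }

Two tasks calling ``demo_increment`` with different bases run the same
code but never see each other's data, which is the behavior that the
unpatched GCC memory model cannot express.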

Patches against older versions of GCC exist to correct this GCC behavior.
Several are available
`here <http://xflat.cvs.sourceforge.net/viewvc/xflat/xflat/gcc/>`__ if
you are inspired to port them to a new GCC version.