diff --git a/TODO b/TODO deleted file mode 100644 index b7f0eac94a..0000000000 --- a/TODO +++ /dev/null @@ -1,2560 +0,0 @@ -NuttX TODO List (Last updated February 24, 2021) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This file summarizes known NuttX bugs, limitations, inconsistencies with -standards, things that could be improved, and ideas for enhancements. This -TODO list does not include issues associated with individual board ports. See -also the individual README.txt files in the boards/ sub-directories for -issues related to each board port. - -nuttx/: - - (16) Task/Scheduler (sched/) - (2) SMP - (1) Memory Management (mm/) - (0) Power Management (drivers/pm) - (5) Signals (sched/signal, arch/) - (2) pthreads (sched/pthread, libs/libc/pthread) - (0) Message Queues (sched/mqueue) - (1) Work Queues (sched/wqueue) - (6) Kernel/Protected Build - (2) C++ Support - (5) Binary loaders (binfmt/) - (17) Network (net/, drivers/net) - (4) USB (drivers/usbdev, drivers/usbhost) - (2) Other drivers (drivers/) - (9) Libraries (libs/libc/, libs/libm/) - (11) File system/Generic drivers (fs/, drivers/) - (10) Graphics Subsystem (graphics/) - (1) Build system / Toolchains - (2) Linux/Cygwin simulation (arch/sim) - (5) ARM (arch/arm/) - -apps/ and other Add-Ons: - - (1) Network Utilities (apps/netutils/) - (1) NuttShell (NSH) (apps/nshlib) - (2) System libraries apps/system (apps/system) - (1) Modbus (apps/modbus) - (5) Other Applications & Tests (apps/examples/) - -o Task/Scheduler (sched/) - ^^^^^^^^^^^^^^^^^^^^^^^ - - Title: CHILD PTHREAD TERMINATION - Description: When a task exits, shouldn't all of its child pthreads also be - terminated? - - This behavior was implemented as an option controlled by the - configuration setting CONFIG_SCHED_EXIT_KILL_CHILDREN. This - option must be used with caution, however. It should not be - used unless you are certain of what you are doing. 
Uninformed use - of this option can often lead to memory leaks since, for - example, memory allocations held by threads are not - automatically freed! - - Status: Closed. No, this behavior will not be implemented unless - specifically selected. - Priority: Medium, required for good emulation of process/pthread model. - The current behavior allows for the main thread of a task to - exit() and any child pthreads will persist. That does raise - some issues: The main thread is treated much like just-another- - pthread but must follow the semantics of a task or a process. - That results in some inconsistencies (for example, with robust - mutexes, what should happen if the main thread exits while - holding a mutex?) - - Title: pause() NON-COMPLIANCE - Description: In the POSIX description of this function the pause() function - must suspend the calling thread until delivery of a signal whose - action is either to execute a signal-catching function or to - terminate the process. The current implementation only waits for - any non-blocked signal to be received. It should only wake up if - the signal is delivered to a handler. - Status: Open. - Priority: Medium Low. - - Title: ON-DEMAND PAGING INCOMPLETE - Description: On-demand paging has recently been incorporated into the RTOS. - The design of this feature is described here: - https://nuttx.apache.org/docs/latest/components/paging.html. - As of this writing, the basic feature implementation is - complete and much of the logic has been verified. The test - harness for the feature exists only for the NXP LPC3131 (see - boards/arm/lpc31xx/ea3131/configs/pgnsh and locked - directories). There are some limitations of this testing so - I still cannot say that the feature is fully functional. - Status: Open. This has been put on the shelf for some time. - Priority: Medium-Low - - Title: GET_ENVIRON_PTR() - Description: get_environ_ptr() (sched/sched_getenvironptr.c) is not implemented. 
- The representation of the environment strings selected for - NuttX is not compatible with the operation. Some significant - re-design would be required to implement this function and that - effort is thought to be not worth the result. - Status: Open. No change is planned. - Priority: Low -- There is no plan to implement this. - - Title: TIMER_GETOVERRUN() - Description: timer_getoverrun() (sched/timer_getoverrun.c) is not implemented. - Status: Open - Priority: Low -- There is no plan to implement this. - - Title: INCOMPATIBILITIES WITH execv() AND execl() - Description: Simplified 'execl()' and 'execv()' functions are provided by - NuttX. NuttX does not support processes and hence the concept - of overlaying a task's process image with a new process image - does not make any sense. In NuttX, these functions are - wrapper functions that: - - 1. Call the non-standard binfmt function 'exec', and then - 2. exit(0). - - As a result, the current implementations of 'execl()' and - 'execv()' suffer from some incompatibilities. The most - serious of these is that the exec'ed task will not have - the same task ID as the vfork'ed function. So the parent - function cannot know the ID of the exec'ed task. - Status: Open - Priority: Medium Low for now - - Title: ISSUES WITH atexit(), on_exit(), AND pthread_cleanup_pop() - Description: These functions execute with the following bad properties: - - 1. They run with interrupts disabled, - 2. They run in supervisor mode (if applicable), and - 3. They do not obey any setup of PIC or address - environments. Do they need to? - 4. In the case of task_delete() and pthread_cancel() without - deferred cancellation, these callbacks will run on the - thread of execution and address context of the caller of - task_delete() or pthread_cancel(). That is very bad! - - The fix for all of these issues is to have the callbacks - run on the caller's thread as is currently done with - signal handlers. 
Signals are delivered differently in - PROTECTED and KERNEL modes: The delivery involves a - signal handling trampoline function in the user address - space and two signal handlers: One to call the signal - handler trampoline in user mode (SYS_signal_handler) and - one within the signal handler trampoline to return to - supervisor mode (SYS_signal_handler_return) - - The primary difference is in the location of the signal - handling trampoline: - - - In PROTECTED mode, there is only a single user space blob - with a header at the beginning of the block (at a well- - known location). There is a pointer to the signal handler - trampoline function in that header. - - In the KERNEL mode, a special process signal handler - trampoline is used at a well-known location in every - process address space (ARCH_DATA_RESERVE->ar_sigtramp). - Status: Open - Priority: Medium Low. This is an important change to some less - important interfaces. For the average user, these - functions are just fine the way they are. - - Title: execv() AND vfork() - Description: There is a problem when vfork() calls execv() (or execl()) to - start a new application: When the parent thread calls vfork() - it receives the pid of the vforked task, and *not* - the pid of the desired execv'ed application. - - The same tasking arrangement is used by the standard function - posix_spawn(). However, posix_spawn uses the non-standard, internal - NuttX interface task_reparent() to replace the child's parent task - with the caller of posix_spawn(). That cannot be done with vfork() - because we don't know what vfork() is going to do. - - Any solution to this is either very difficult or impossible without - an MMU. - Status: Open - Priority: Low (it might as well be low since it isn't going to be fixed). - - Title: errno IS NOT SHARED AMONG THREADS - Description: In NuttX, the errno value is unique for each thread. 
But for - bug-for-bug compatibility, the same errno should be shared by - the task and each thread that it creates. It is *very* easy - to make this change: Just move the tls_errno field from - struct tls_info_s to struct task_group_s. However, I am still - not sure if this should be done or not. - NOTE: glibc behaves this way unless __thread is defined; - in that case, it behaves like NuttX (using TLS to save the - thread local errno). - Status: Closed. The existing solution is better and compatible with - thread-aware GLIBC (although its incompatibilities could show - up in porting some code). I will retain this issue for - reference only. - Priority: N/A - - Title: SCALABILITY - Description: Task control information is retained in simple lists. This - is completely appropriate for small embedded systems where - the number of tasks, N, is relatively small. Most list - operations are O(N). This could become an issue if N gets - very large. - - In that case, these simple lists should be replaced with - something more performant such as a balanced tree in the - case of ordered lists. Fortunately, most internal lists are - hidden behind simple accessor functions and so the internal - data structures can be changed if needed with very little impact. - - Explicit references to the list structure are hidden behind - the macro this_task(). - - Status: Open - Priority: Low. Things are just the way that we want them for the way - that NuttX is used today. - - Title: INTERNAL VERSIONS OF USER FUNCTIONS - Description: The internal NuttX logic uses the same interfaces as does - the application. That sometimes produces a problem because - there is "overloaded" functionality in those user interfaces - that is not desirable. - - For example, having cancellation points hidden inside of the - OS can cause non-cancellation point interfaces to behave - strangely. 
- - Here is another issue: Internal OS functions should not set - errno and should never have to look at the errno value to - determine the cause of the failure. The errno is provided - for compatibility with POSIX application interface - requirements and really doesn't need to be used within the - OS. - - Both of these could be fixed if there were special internal - versions of these functions. For example, there could be an - nxsem_wait() that does all of the same things as sem_wait() - but does not create a cancellation point and does not set - the errno value on failures. - - Everything inside the OS would use nxsem_wait(). - Applications would call sem_wait() which would just be a - wrapper around nxsem_wait() that adds the cancellation point - and that sets the errno value on failures. - - One particularly difficult issue is the use of common memory - manager, C, and NX libraries in the build. For the PROTECTED - and KERNEL builds, this issue is resolved. In that case, - the OS links with a different version of the libraries than - does the application: The OS version would use the OS internal - interfaces and the application would use the standard - interfaces. - - But for the FLAT build, both the OS and the applications use - the same library functions. For applications, the library - functions *must* support errno's and cancellation and, hence, - these are also used within the OS. - - But that raises yet another issue: If the application - version of the libraries use the standard interfaces - internally, then they may generate unexpected cancellation - points. For example, the memory management would take a - semaphore using sem_wait() to get exclusive access to the - heap. That means that every call to malloc() and free() - would be a cancellation point, a clear POSIX violation. - - Changes like that could clean up some of this internal - craziness. - - UPDATE: - 2017-10-03: This change has been completed for the case of - semaphores used in the OS. 
Still need to check out signals - and message queues that are also used in the OS. Also - backed out commit b4747286b19d3b15193b2a5e8a0fe48fa0a8638c. - 2017-10-06: This change has been completed for the case of - signals used in the OS. Still need to check out message - queues that are also used in the OS. - 2017-10-10: This change has been completed for the case of - message queues used in the OS. I am keeping this issue - open because (1) there are some known remaining calls - that will modify the errno (such as dup(), dup2(), - nxtask_activate(), kthread_create(), exec(), mq_open(), - mq_close(), and others) and (2) there may still be calls that - create cancellation points. Need to check things like open(), - close(), read(), write(), and possibly others. - 2018-01-30: This change has been completed for the case of - scheduler functions used within the OS: sched_getparam(), - sched_setparam(), sched_getscheduler(), sched_setscheduler(), - and sched_setaffinity(). - 2018-09-15: This change has been completed for the case of - open() used within the OS. There are places under libs/ and - boards/ that have not been converted. I also note cases - where fopen() is called under libs/libc/netdb/. - 2019-09-11: built_isavail() no longer sets the errno variable. - - Status: Open - Priority: Low. Things are working OK the way they are. But the design - could be improved and made a little more efficient with this - change. - - Title: IDLE THREAD TCB SETUP - Description: There are issues with setting IDLE thread stacks: - - The problem is colorizing that stack to use with stack usage - monitoring logic. There is logic in some start functions to - do this in a function called go_nx_start. 
- It is available in these architectures: - - ./arm/src/efm32/efm32_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/kinetis/kinetis_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/sam34/sam_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/samv7/sam_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/stm32/stm32_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/stm32f7/stm32_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/stm32l4/stm32l4_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/tms570/tms570_boot.c:static void go_nx_start(void *pv, unsigned int nbytes) - ./arm/src/xmc4/xmc4_start.c:static void go_nx_start(void *pv, unsigned int nbytes) - - But no others. - Status: Open - Priority: Low, only needed for more complete debug. - - Title: PRIORITY INHERITANCE WITH SPORADIC SCHEDULER - Description: The sporadic scheduler manages CPU utilization by a task by - alternating between a high and a low priority. In either - state, it may have its priority boosted. However, under - some circumstances, it is impossible in the current design to - switch to the correct priority if a semaphore held by the - sporadic thread is participating in priority inheritance: - - There is an issue when switching from the high to the low - priority state. If the priority was NOT boosted above the - higher priority, it may still need to be boosted with - respect to the lower priority. If the highest priority - thread waiting on a semaphore held by the sporadic thread is - higher in priority than the low priority but less than the - higher priority, then the new thread priority should be set to - that middle priority, not to the lower priority. - - In order to do this we would need to know the highest - priority from among all tasks waiting for all the semaphores - held by the sporadic task. 
That information could be - retained by the priority inheritance logic for use by the - sporadic scheduler. The boost priority could be retained in - a new field of the TCB (say, pend_priority). That - pend_priority could then be used when switching from the - higher to the lower priority. - Status: Open - Priority: Low. Does anyone actually use the sporadic scheduler? - - Title: SIMPLIFY SPORADIC SCHEDULER DESIGN - Description: I have been planning to re-implement sporadic scheduling for - some time. I believe that the current implementation is - unnecessarily complex. There is no clear statement for the - requirements of sporadic scheduling that I could find, so I - based the design on some behaviors of another OS that I saw - published (QNX as I recall). - - But I think that the bottom line requirement for sporadic - scheduling is that it should make a best attempt to - control a fixed percentage of CPU bandwidth for a task - during an interval only by modifying its priority between - a low and a high priority. The current design involves - several timers: A "budget" timer plus a variable number of - "replenishment" timers and a lot of nonsense to duplicate QNX - behavior that I think is not necessary. - - I think that the sporadic scheduler could be re-implemented - with only the single "budget" timer. Instead of starting a - new "replenishment" timer when the task is resumed, that - single timer could just be extended. - Status: Open - Priority: Low. This is an enhancement. And does anyone actually use - the sporadic scheduler? - - Title: REMOVE NESTED CANCELLATION POINT SUPPORT - Description: The current implementation supports nested cancellation points. - The TCB field cpcount keeps track of that nesting level. - However, cancellation points should not be calling other - cancellation points so this design could be simplified by - removing all support for nested cancellation points. - Status: Open - Priority: Low. 
No harm is being done by the current implementation. - This change is primarily for aesthetic reasons. It would - reduce memory usage by a very small but probably - insignificant amount. - - Title: DAEMONIZE ELF PROGRAM - Description: It is a common practice to "daemonize" to detach a task from - its parent. This is used with NSH, for example, so that NSH - will not stall, waiting in waitpid() for the child task to - exit. - - Daemonization is done by creating a new task which continues - to run while the original task exits (sending the SIGCHLD - signal to the parent and awakening waitpid()). In a pure - POSIX system, this is done with fork(), perhaps like: - - if (fork() != 0) - { - exit(0); - } - - but is usually done with task_create() in NuttX. But when - task_create() is called from within an ELF program, a very - perverse situation is created: - - The basic problem involves address environments and task groups: - "Task groups" are emulations of Linux processes. For the - case of the FLAT, ELF module, the address environment is - allocated memory that contains the ELF module. - - When you call task_create() from the ELF program, you now - have two task groups running in the same address environment. - That is a perverse situation for which there is no standard - solution. There is nothing comparable to that. Even in - Linux, fork() creates another address environment (although - it is an exact copy of the original). - - When the ELF program was created, the function exec() in - binfmt/binfmt_exec.c runs. It sets up a call back that will - be invoked when the ELF program exits. - - When the ELF program exits, the address environment is destroyed - and the other task running in the same address environment is - then running in stale memory and will eventually crash. - - Nothing special happens when the other created task running - in the allocated address environment exits since it has no such - call backs. 
- - In order to make this work you would need logic like: - - 1. When the ELF task calls task_create(), it would need to: - - a. Detect that task_create() was called from an ELF program, - b. Increment a reference count on the address environment, and - c. Set up the same exit hook for the newly created task. - - 2. Then when either the ELF program task or the created task - in the same address environment exits, it would decrement - the reference count. When the last task exits, the reference - count would go to zero and the address environment could be - destroyed. - - This is complex work and would take some effort and probably - requires redesign of existing code and interfaces to get a - proper, clean, modular solution. - - Status: Open - Priority: Medium-Low. A simple work-around when using NSH is to use - the '&' postfix to put the started ELF program into background. - -o SMP - ^^^ - - Title: MISUSE OF sched_lock() IN SMP MODE - Description: The OS API sched_lock() disables pre-emption and locks a - task in place. In the single CPU case, it is also often - used to enforce a simple critical section since no other - task can run while pre-emption is locked. - - This, however, does not generalize to the SMP case. In the - SMP case, there are multiple tasks running on multiple CPUs. - The basic behavior is still correct: The task that has - locked pre-emption will not be suspended. However, there - is no longer any protection for use as a critical section: - tasks running on other CPUs may still execute that - unprotected code region. - - The solution is to replace the use of sched_lock() with - stronger protection such as spin_lock_irqsave(). - Status: Open - Priority: Medium for SMP system. Not critical to single CPU systems. - NOTE: There are no known bugs from this potential problem. - - Title: ISSUES WITH ACCESSING CPU INDEX - Description: The CPU number is usually accessed with the macro this_cpu(). 
- The returned CPU number is then used for various things, - typically as an array index. However, if pre-emption is - not disabled, then it is possible that a context switch - could occur and that logic could run on another CPU with - possible fatal consequences. - - We need to evaluate all use of this_cpu() and assure that - it is used in a way that guarantees that the code continues - to execute on the same CPU. - - Status: Open - Priority: Medium. This is a logical problem but I have never seen - any bugs caused by this. But I believe that failures are - possible. - -o Memory Management (mm/) - ^^^^^^^^^^^^^^^^^^^^^^^ - - Title: FREE MEMORY ON TASK EXIT - Description: Add an option to free all memory allocated by a task when the - task exits. This is probably not worth the overhead for a - deeply embedded system. - - There would be complexities with this implementation as well - because often one task allocates memory and then passes the - memory to another: The task that "owns" the memory may not - be the same as the task that allocated the memory. - - Update. From the NuttX forum: - ...there is a good reason why task A should never delete task B. - That is because you will strand memory resources. Another feature - lacking in most flat address space RTOSs is automatic memory - clean-up when a task exits. - - That behavior just comes for free in a process-based OS like Linux: - Each process has its own heap and when you tear down the process - environment, you naturally destroy the heap too. - - But RTOSs have only a single, shared heap. I have spent some time - thinking about how you could clean up memory required by a task - when a task exits. It is not so simple. It is not as simple as - just keeping memory allocated by a thread in a list then freeing - the list of allocations when the task exits. - - It is not that simple because you don't know how the memory is - being used. 
For example, if task A allocates memory that is used - by task B, then when task A exits, you would not want to free that - memory needed by task B. In a process-based system, you would - have to explicitly map shared memory (with reference counting) in - order to share memory. So the life of shared memory in that - environment is easily managed. - - I have thought that the way that this could be solved in NuttX - would be: (1) add links and reference counts to all memory allocated - by a thread. This would increase the memory allocation overhead! - (2) Keep the list head in the TCB, and (3) extend mmap() and munmap() - to include the shared memory operations (which would only manage - the reference counting and the life of the allocation). - - Then what about pthreads? Memory should not be freed until the last - pthread in the group exits. That could be done with an additional - reference count on the whole allocated memory list (just as streams - and file descriptors are now shared and persist until the last - pthread exits). - - I think that would work but to me is very unattractive and - inconsistent with the NuttX "small footprint" objective. ... - - Other issues: - - Memory free time would go up because you would have to remove - the memory from that list in free(). - - There are special cases inside the RTOS itself. For example, - if task A creates task B, then initial memory allocations for - task B are created by task A. Some special allocators would - be required to keep this memory on the correct list (or on - no list at all). - - Updated 2016-06-25: - For processors with an MMU (Memory Management Unit), NuttX can be - built in a kernel mode. In that case, each process will have a - local copy of its heap (filled with sbrk()) and when the process - exits, its local heap will be destroyed and the underlying page - memory is recovered. 
- - So in this case, NuttX works just like Linux or other *nix systems: - All memory allocated by processes or threads in processes will - be recovered when the process exits. - - But not for the flat memory build. In that case, the issues - above do apply. There is no safe way to recover the memory in - that case (and even if there were, the additional overhead would - not be acceptable on most platforms). - - This does not prohibit anyone from creating a wrapper for malloc() - and an atexit() callback that frees memory on task exit. People - are free and, in fact, encouraged, to do that. However, since - it is inherently unsafe, I would never incorporate anything - like that into NuttX. - - Status: Open. No changes are planned. NOTE: This applies to the FLAT - and PROTECTED builds only. There is no such leaking of memory - in the KERNEL build mode. - Priority: Medium/Low, a good feature to prevent memory leaks but would - have negative impact on memory usage and code size. - -o Power Management (drivers/pm) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -o Signals (sched/signal, arch/) - ^^^^^^^^^^^^^^^^^^^^^^^ - - Title: STANDARD SIGNALS - Description: 'Standard' signals and signal actions are not fully - supported. 
The SIGCHLD signal is supported and, if the - option CONFIG_SIG_DEFAULT=y is included, some signals will - perform their default actions (dependent upon additional - configuration settings): - - Signal Action Additional Configuration - ------- -------------------- ------------------------- - SIGUSR1 Abnormal Termination CONFIG_SIG_SIGUSR1_ACTION - SIGUSR2 Abnormal Termination CONFIG_SIG_SIGUSR2_ACTION - SIGALRM Abnormal Termination CONFIG_SIG_SIGALRM_ACTION - SIGPOLL Abnormal Termination CONFIG_SIG_SIGPOLL_ACTION - SIGSTOP Suspend task CONFIG_SIG_SIGSTOP_ACTION - SIGTSTP Suspend task CONFIG_SIG_SIGSTOP_ACTION - SIGCONT Resume task CONFIG_SIG_SIGSTOP_ACTION - SIGINT Abnormal Termination CONFIG_SIG_SIGKILL_ACTION - SIGKILL Abnormal Termination CONFIG_SIG_SIGKILL_ACTION - - Status: Open. No further changes are planned. - Priority: Low, required by standards but not so critical for an - embedded system. - - Title: SIGEV_THREAD - Description: Implementation of support for SIGEV_THREAD is available - only in the FLAT build mode because it uses the OS work queues to - perform the callback. The alternative for the PROTECTED and KERNEL - builds would be to create pthreads in the user space to perform the - callbacks. That is not a very attractive solution due to performance - issues. It would also require some additional logic to specify the - TCB of the parent so that the pthread could be bound to the correct - group. - - There is also some user-space logic in libs/libc/aio/lio_listio.c. - That logic could use the user-space work queue for the callbacks. - Status: Low, there are alternative designs. However, these features - are required by the POSIX standard. - Priority: Low for now - - Title: SIGNAL NUMBERING - Description: In signal.h, the range of valid signals is listed as 0-31. However, - in many interfaces, 0 is not a valid signal number. The valid - signal number should be 1-32. The signal set operations would need - to map bits appropriately. 
- Status: Open - Priority: Low. Even if there are only 31 usable signals, that is still a lot. - - Title: NO QUEUING of SIGNAL ACTIONS - Description: In the architecture specific implementation of struct xcptcontext, - there are fields used by signal handling logic to pass the state - information needed to dispatch signal actions to the appropriate - handler. - - There is only one copy of this state information in the - implementations of struct xcptcontext and, as a consequence, - if there is a signal handler executing on a thread, then additional - signal actions will be lost until that signal handler completes - and releases those resources. - Status: Open - Priority: Low. This design flaw has been around for ages and no one has yet - complained about it. Apparently the visibility of the problem is - very low. - - Title: QUEUED SIGNAL ACTIONS ARE INAPPROPRIATELY DEFERRED - Description: The implementation of nxsig_deliver() does the following in a loop: - - It takes the next queued signal action from a list - - Calls the architecture-specific up_sigdeliver() to perform - the signal action (through some sleight of hand in - up_schedule_sigaction()) - - up_sigdeliver() is a trampoline function that performs the - actual signal action as well as some housekeeping functions - then - - up_sigdeliver() performs a context switch back to the normal, - uninterrupted thread instead of returning to nxsig_deliver(). - - The loop in nxsig_deliver() then will not have the opportunity to - run again until that normal, uninterrupted thread is suspended. - Then the loop will continue with the next queued signal - action. - - Normally signals execute immediately. That is the whole reason - why almost all blocking APIs return when a signal is received - (with errno equal to EINTR). - Status: Open - Priority: Low. This design flaw has been around for ages and no one has yet - complained about it. Apparently the visibility of the problem is - very low. 
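For the SIGNAL NUMBERING item in this section, the bit mapping that a 1-32 signal range would need can be sketched as follows. This is only an illustration under stated assumptions: the sketch_* names and the 32-bit set type are hypothetical and are not the NuttX sigset_t implementation.

```c
#include <stdint.h>

/* Hypothetical 32-bit signal set: bit (signo - 1) represents signal signo,
 * so signals 1..32 map onto bits 0..31 and 0 is never a valid signal.
 */

typedef uint32_t sketch_sigset_t;

#define SKETCH_MIN_SIGNO 1
#define SKETCH_MAX_SIGNO 32

/* Return non-zero if signo lies in the valid range 1..32 */

static int sketch_signo_valid(int signo)
{
  return signo >= SKETCH_MIN_SIGNO && signo <= SKETCH_MAX_SIGNO;
}

/* Add signo to the set.  Returns 0 on success, -1 for an invalid number */

static int sketch_sigaddset(sketch_sigset_t *set, int signo)
{
  if (!sketch_signo_valid(signo))
    {
      return -1;
    }

  *set |= UINT32_C(1) << (signo - 1);
  return 0;
}

/* Returns 1 if signo is in the set, 0 if not, -1 for an invalid number */

static int sketch_sigismember(const sketch_sigset_t *set, int signo)
{
  if (!sketch_signo_valid(signo))
    {
      return -1;
    }

  return (*set & (UINT32_C(1) << (signo - 1))) != 0 ? 1 : 0;
}
```

With this mapping, signal 0 is rejected by every set operation while all 32 bits of the set remain usable.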
- -o pthreads (sched/pthread, libs/libc/pthread) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: PTHREAD_PRIO_PROTECT - Description: Extend pthread_mutexattr_setprotocol(). It should support - PTHREAD_PRIO_PROTECT (and so should its non-standard counterpart - sem_setproto()). - - "When a thread owns one or more mutexes initialized with the - PTHREAD_PRIO_PROTECT protocol, it shall execute at the higher of its - priority or the highest of the priority ceilings of all the mutexes - owned by this thread and initialized with this attribute, regardless of - whether other threads are blocked on any of these mutexes or not. - - "While a thread is holding a mutex which has been initialized with - the PTHREAD_PRIO_INHERIT or PTHREAD_PRIO_PROTECT protocol attributes, - it shall not be subject to being moved to the tail of the scheduling queue - at its priority in the event that its original priority is changed, - such as by a call to sched_setparam(). Likewise, when a thread unlocks - a mutex that has been initialized with the PTHREAD_PRIO_INHERIT or - PTHREAD_PRIO_PROTECT protocol attributes, it shall not be subject to - being moved to the tail of the scheduling queue at its priority in the - event that its original priority is changed." - - Status: Open. No changes planned. - Priority: Low -- about zero, probably not that useful. Priority inheritance is - already supported and is a much better solution. And it turns out - that priority protection is just about as complex as priority inheritance. - Excerpted from my post in a Linked-In discussion: - - "I started to implement this HLS/"PCP" semaphore in an RTOS that I - work with (https://apache.nuttx.org) and I discovered after doing the - analysis and basic code framework that a complete solution for the - case of a counting semaphore is still quite complex -- essentially - as complex as is priority inheritance. - - "For example, suppose that a thread takes 3 different HLS semaphores - A, B, and C. 
Suppose that they are prioritized in that order with - A the lowest and C the highest. Suppose the thread takes 5 counts - from A, 3 counts from B, and 2 counts from C. What priority should - it run at? It would have to run at the priority of the highest - priority semaphore C. This means that the RTOS must maintain - internal information of the priority of every semaphore held by - the thread. - - "Now suppose it releases one count on semaphore B. How does the - RTOS know that it still holds 2 counts on B? With some complex - internal data structure. The RTOS would have to maintain internal - information about how many counts from each semaphore are held - by each thread. - - "How does the RTOS know that it should not decrement the priority - from the priority of C? Again, only with internal complexity. It - would have to know the priority of every semaphore held by - every thread. - - "Providing the HLS capability on a simple pthread mutex would not - be quite such a complex job if you allow only one mutex per - thread. However, the more general case seems almost as complex - as priority inheritance. I decided that the implementation does - not have value to me. I only wanted it for its reduced - complexity; in all other ways I believe that it is the inferior - solution. So I discarded a few hours of programming. Not a - big loss from the experience I gained." - - Title: INAPPROPRIATE USE OF sched_lock() BY pthreads - Description: In the implementation of standard pthread functions, the non- - standard, NuttX function sched_lock() is used. This is very - strong since it disables pre-emption for all threads in all - task groups. I believe it is only really necessary in most - cases to lock threads in the task group with a new non- - standard interface, say pthread_lock(). - - This is because the OS resources used by a thread such as - mutexes, condition variables, barriers, etc. are only - meaningful from within the task group.
So, in order to - perform exclusive operations on these resources, it is - only necessary to block other threads executing within the - task group. - - This is an easy change: pthread_lock() and pthread_unlock() - would simply operate on a semaphore retained in the task - group structure. I am, however, hesitant to make this change: - In the FLAT build model, there is nothing that prevents people - from accessing the inter-thread controls from threads in - different task groups. Making this change, while correct, - might introduce subtle bugs in code by people who are not - using NuttX correctly. - Status: Open - Priority: Low. This change would improve real-time performance of the - OS but is not otherwise required. - -o Message Queues (sched/mqueue) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -o Work Queues (sched/wqueue) - ^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: WORK QUEUE DELAY INACCURACIES - Description: Each queued work may have an optional delay value associated - with it. That delay should be with respect to the time that the - work is queued. However, since we do not know the time the - work is queued, the actual delay will be with respect to the time - that the work is processed. Under certain conditions, the - work may sit in the queue for some time before it is - processed, leading to an inaccuracy in the delay. - - One solution might involve saving the time in the work - structure when the work is queued. Then the delay logic can - take the difference between the processing time and the - queued time to get a more accurate delay. - Status: Open - Priority: In all known use cases, the priority is low. A problem - would only occur if the work queue is overloaded or if work in - the work queue suspends waiting for a resource (both of which - are much bigger problems).
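The timestamp-based fix suggested under WORK QUEUE DELAY INACCURACIES can be sketched roughly as follows. This is only an illustration under stated assumptions: struct demo_work_s, demo_queue(), and demo_remaining() are hypothetical names (not the NuttX work queue API), and time is assumed to be a free-running 32-bit tick counter.

```c
#include <stdint.h>

/* Hypothetical work item; this is not the real NuttX struct work_s. */

struct demo_work_s
{
  uint32_t qtime;  /* Tick count when the work was queued */
  uint32_t delay;  /* Requested delay in ticks, relative to qtime */
};

/* Record the queue time along with the requested delay. */

static void demo_queue(struct demo_work_s *work, uint32_t now,
                       uint32_t delay)
{
  work->qtime = now;
  work->delay = delay;
}

/* Return the ticks still to wait when the item is examined at 'now'.
 * Unsigned subtraction keeps the elapsed time correct across a single
 * wrap of the tick counter.
 */

static uint32_t demo_remaining(const struct demo_work_s *work,
                               uint32_t now)
{
  uint32_t elapsed = now - work->qtime;

  return elapsed >= work->delay ? 0 : work->delay - elapsed;
}
```

Because the delay is measured from the stored queue time, time that the item spends waiting in the queue counts against the delay instead of being added on top of it.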
- -o Kernel/Protected Build - ^^^^^^^^^^^^^^^^^^^^^^ - - Title: C++ CONSTRUCTORS HAVE TOO MANY PRIVILEGES (PROTECTED MODE) - Description: When a C++ ELF module is loaded, its C++ constructors are called - via sched/task_starthook.c logic. This logic runs in protected mode. - This is a security hole because the user code runs with kernel - privileges when the constructor executes. - - Destructors likely have the opposite problem. They probably try to - execute some kernel logic in user mode. Obviously this needs to - be investigated further. - Status: Open - Priority: Low (unless you need to build a secure C++ system). - - Title: TOO MANY SYSCALLS - Description: There are a few syscalls that operate very often in user space. - Since syscalls are (relatively) time consuming this could be - a performance issue. Here are some numbers that I collected - in an application that was doing mostly printf output: - - sem_post - 18% of syscalls - sem_wait - 18% of syscalls - getpid - 59% of syscalls - -------------------------- - 95% of syscalls - - Obviously system performance could be improved greatly by simply - optimizing these functions so that they do not need to make system calls - so frequently. This getpid() call is part of the re-entrant - semaphore logic used with printf() and other C buffered I/O. - Something like TLS might be used to retain the thread's ID - locally. - - Linux, for example, has functions called up() and down(). up() - increments the semaphore count but does not call into the kernel - unless incrementing the count unblocks a task; similarly, down() - decrements the count and does not call into the kernel unless - the count becomes negative and the caller must be blocked. - - Update: - "I am thinking that there should be a "magic" global, user- - accessible variable that holds the PID of the currently - executing thread; basically the PID of the task at the head - of the ready-to-run list.
This variable would have to be reset - each time the head of the ready-to-run list changes. - - "Then getpid() could be implemented in user space with no system call - by simply reading this variable. - - "This one would be easy: Just a change to include/nuttx/userspace.h, - boards////kernel/up_userspace.c, libs/libc/, - sched/sched_addreadytorun.c, and sched/sched_removereadytorun.c. - That would eliminate 59% of the syscalls." - - Update: - This is probably also just a symptom of the OS test that does mostly - console output. The requests for the PID are part of the - I/O's re-entrant semaphore implementation and - would not be an issue in the more general case. - - Update: - One solution might be to use TLS and add the PID to struct - tls_info_s. Then the PID could be obtained without a system call. - TLS is not very useful in the FLAT build, however. TLS works by - putting per-thread data at the bottom of an aligned stack. The - current stack pointer is then ANDed with the alignment mask to - obtain the per-thread data address. - - There are problems with this in the FLAT and PROTECTED builds: - First, the maximum size of the stack is limited by the number - of bits in the mask. This means that you need to have a very - high alignment to support tasks with large stacks. But - secondly, the higher the alignment of the stacks, the - more memory is lost to fragmentation. - - In the KERNEL build, the stack lies at a virtual address - and it is possible to have highly aligned stacks with no such - penalties. - Status: Open - Priority: Low-Medium. Right now, I do not know if these syscalls are a - real performance issue or not. The above statistics were collected - from an atypical application (the OS test) that does an excessive - amount of console output. There is probably no issue with more typical - embedded applications.
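The TLS lookup described in the last update of TOO MANY SYSCALLS, and the stack size limit that it implies, can be illustrated with a small sketch. The names demo_tls_s and demo_tls_get() are hypothetical (the real structure mentioned above is struct tls_info_s, whose layout is not assumed here), and the stack is assumed to occupy one power-of-two aligned block with the per-thread data at its lowest address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread data kept at the lowest address of the stack. */

struct demo_tls_s
{
  int pid;  /* Thread ID cached so that no system call is needed */
};

/* Locate the per-thread data from any stack pointer value inside the
 * stack.  'align' is both the alignment and the maximum size of the
 * stack and must be a power of two.
 */

static struct demo_tls_s *demo_tls_get(uintptr_t sp, uintptr_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (struct demo_tls_s *)(sp & ~(align - 1));
}
```

With a 4 KiB alignment, every stack pointer value from 0x20001000 through 0x20001fff maps to the same base address, which is why the stack cannot be larger than its alignment and why large stacks force large (and fragmenting) alignments.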
- - Title: SECURITY ISSUES - Description: In the current design, the kernel code calls into the user-space - allocators to allocate user-space memory. It is a security risk to - call into user-space in kernel-mode because that could be exploited - to gain control of the system. That could be fixed by dropping to - user mode before trapping into the memory allocators; the memory - allocators would then need to trap in order to return (this is - already done to return from signal handlers; that logic could be - renamed more generally and just used for a generic return trap). - - Another place where the system calls into the user code in kernel - mode is work_usrstart() to start the user work queue. That is - another security hole that should be plugged. - Status: Open - Priority: Low (unless security becomes an issue). - - Title: MICRO-KERNEL - Description: The initial kernel build cut many interfaces at a very high level. - The resulting monolithic kernel is then rather large. It would - not be a prohibitively large task to reorganize the interfaces so - that NuttX is built as a micro-kernel, i.e., with only the core - OS services within the kernel and with other OS facilities, such - as the file system, message queues, etc., residing in user-space - and interfacing with those core OS facilities through traps. - Status: Open - Priority: Low. This is a good idea and certainly an architectural - improvement. However, there is no strong motivation now to - do that partitioning work. - - Title: USER MODE TASKS CAN MODIFY PRIVILEGED TASKS - Description: Certain interfaces, such as sched_setparam(), - sched_setscheduler(), etc. can be used by user mode tasks to - modify the behavior of privileged kernel threads. - For a truly secure system, privileges need to be checked in - every interface that permits one thread to modify the - properties of another thread. - - NOTE: It would be a simple matter to simply disable user - threads from modifying privileged threads.
However, you - might also want to be able to modify privileged threads from - user tasks with certain permissions. Permissions are a much - more complex issue. - - task_delete(), for example, is not permitted to kill a kernel - thread. But should not a privileged user task be able to do - so? - Status: Open - Priority: Low for most embedded systems but would be a critical need if - NuttX were used in a secure system. - - Title: SIGNAL ACTION VULNERABILITY - Description: When a signal action is performed, the user stack is used. - Unlike in Linux, applications do not have separate user and - supervisor stacks; everything is done on the user stack. - - In the implementation of up_sigdeliver(), a copy of the - register contents that will be restored is present on the - stack and could be modified by the user application. Thus, - if the user mucks with the return stack, problems could - occur when the user task returns to supervisor mode from - the signal handler. - - A recent commit (3 Feb 2019) does protect the status register - and return address so that a malicious task cannot change the - return address or switch to supervisor mode. Other registers - are still modifiable so there is other possible mayhem that - could be done. - - A better solution, in lieu of a kernel stack, would be to - eliminate the stack-based register save area altogether and, - instead, save the registers in another, dedicated state save - area in the TCB. The only hesitation to this option is that - it would significantly increase the size of the TCB structure - and, hence, the per-thread memory overhead. - Status: Open - Priority: Medium-ish if you are attempting to make a secure environment that - may host malicious code. Very low for the typical FLAT build, - however. - -o C++ Support - ^^^^^^^^^^^ - - Title: STATIC CONSTRUCTORS AND MULTITASKING - Description: The logic that calls static constructors operates on the main - thread of the initial user application task.
Any static - constructors that cache task/thread specific information such - as C streams or file descriptors will not work in other tasks. - See also UCLIBC++ AND STATIC CONSTRUCTORS below. - Status: Open - Priority: Low and probably will not be changed. In these cases, there will - need to be an application specific solution. - - Title: UCLIBC++ AND STATIC CONSTRUCTORS - Description: uClibc++ was designed to work in a Unix environment with - processes and with separately linked executables. Each process - has its own, separate uClibc++ state. uClibc++ would be - instantiated like this in Linux: - - 1) When the program is built, a tiny start-up function is - included at the beginning of the program. Each program has - its own, separate list of C++ constructors. - - 2) When the program is loaded into memory, space is set aside - for uClibc++'s static objects and then this special start-up - routine is called. It initializes the C library, calls all - of the constructors, and calls atexit() so that the destructors - will be called when the process exits. - - In this way, you get a per-process uClibc++ state since there - is per-process storage of uClibc++ global state and per-process - initialization of uClibc++ state. - - Compare this to how NuttX (and most embedded RTOSs) would work: - - 1) The entire FLASH image is built as one big blob. All of the - constructors are lumped together and all called together at - one time. - - This, of course, does not have to be so. We could segregate - constructors by some criteria and we could use a task start - up routine to call constructors separately. We could even - use ELF executables that are separately linked and already - have their constructors separately called when the ELF - executable starts. - - But this would not do you very much good in the case of - uClibc++ because: - - 2) NuttX does not support processes, i.e., separate address - environments for each task. As a result, the scope of global - data is all tasks.
Any change to the global state made by - one task can affect another task. There can be only one - uClibc++ state and it will be shared by all tasks. Since uClibc++ - apparently relies on global instances (at least for cin and - cout), there is no way to have any unique state for any - "task group". - - [NuttX does not support processes because in order to have - true processes, your hardware must support a memory management - unit (MMU) and I am not aware of any mainstream MCU that has - an MMU (or, at least an MMU that is capable enough to support - processes).] - - NuttX does not have processes, but it does have "task groups". - See https://cwiki.apache.org/confluence/display/NUTTX/Tasks+vs.+Threads+FAQ. - A task group is the task plus all of the pthreads created by - the task via pthread_create(). Resources like FILE streams - are shared within a task group. Task groups are like a poor - man's process. - - This means that if the uClibc++ static classes are initialized - by one member of a task group, then cin/cout should work - correctly with all threads that are members of the task group. The - destructors would be called when the final member of the task - group exits (if registered via atexit()). - - So if you use only pthreads, uClibc++ should work very much like - it does in Linux. If your NuttX usage model is like one process - with many threads then you have Linux compatibility. - - If you wanted to have uClibc++ work across task groups, then - uClibc++ and NuttX would need some extensions. I am thinking - along the lines of the following: - - 1) There is a per-task group storage area within the RTOS (see - include/nuttx/sched.h). If we add some new, non-standard APIs - then uClibc++ could get access to per-task group storage (in - the spirit of pthread_getspecific() which gives you access to - per-thread storage).
- - 2) Then move all of uClibc++'s global state into per-task group - storage and add a uClibc++ initialization function that would: - a) allocate per-task group storage, b) call all of the static - constructors, and c) register with atexit() to perform clean- - up when the task group exits. - - That would be a fair amount of effort. I don't really know what - the scope of such an effort would be. I suspect that it is not - large but probably complex. - - NOTES: - - 1) See STATIC CONSTRUCTORS AND MULTITASKING - - 2) To my knowledge, only some uClibc++ ofstream logic is - sensitive to this. All other statically initialized classes - seem to work OK across different task groups. - Status: Open - Priority: Low. I have no plan to change this logic now unless there is - some strong demand to do so. - -o Binary loaders (binfmt/) - ^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: NXFLAT TESTS - Description: Not all of the NXFLAT tests under apps/examples/nxflat are working. - Most simply do not compile yet. tests/mutex runs okay but - outputs garbage on completion. - - Update: 13-27-1, tests/mutex crashed with a memory corruption - problem the last time that I ran it. - Status: Open - Priority: High - - Title: ARM UP_GETPICBASE() - Description: The ARM up_getpicbase() does not seem to work. This means - that some features like wdogs might not work in NXFLAT modules. - Status: Open - Priority: Medium-High - - Title: NXFLAT READ-ONLY DATA IN RAM - Description: At present, all .rodata must be put into RAM. There is a - tentative design change that might allow .rodata to be placed - in FLASH (see Documentation/NuttXNxFlat.html). - Status: Open - Priority: Medium - - Title: GOT-RELATIVE FUNCTION POINTERS - Description: If the function pointer to a statically defined function is - taken, then GCC generates a relocation that cannot be handled - by NXFLAT. There is a solution described in Documentation/NuttXNxFlat.html, - but that would require a compiler change (which we want to avoid).
- The simple workaround is to make such functions global in scope. - Status: Open - Priority: Low (probably will not fix) - - Title: USE A HASH INSTEAD OF A STRING IN SYMBOL TABLES - Description: In the NXFLAT symbol tables... Using a 32-bit hash value instead - of a string to identify a symbol should result in a smaller footprint. - Status: Open - Priority: Low - - Title: WINDOWS-BASED TOOLCHAIN BUILD - Description: Windows build issue. Some of the configurations that use NXFLAT have - the linker script specified like this: - - NXFLATLDFLAGS2 = $(NXFLATLDFLAGS1) -T$(TOPDIR)/binfmt/libnxflat/gnu-nxflat-gotoff.ld -no-check-sections - - That will not work for Windows-based tools because they require Windows - style paths. The solution is to do something like this: - - ifeq ($(CONFIG_CYGWIN_WINTOOL),y) - NXFLATLDSCRIPT=${shell cygpath -w $(TOPDIR)/binfmt/libnxflat/gnu-nxflat-gotoff.ld} - else - NXFLATLDSCRIPT=$(TOPDIR)/binfmt/libnxflat/gnu-nxflat-gotoff.ld - endif - - Then use - - NXFLATLDFLAGS2 = $(NXFLATLDFLAGS1) -T"$(NXFLATLDSCRIPT)" -no-check-sections - - Status: Open - Priority: There are too many references like the above. They will have - to get fixed as needed for Windows native tool builds. - -o Network (net/, drivers/net) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: LISTENING FOR UDP BROADCASTS - Description: Incoming UDP broadcast should only be accepted if listening on - INADDR_ANY(?) - Status: Open - Priority: Low - - Title: CONCURRENT, UNBUFFERED TCP SEND OPERATIONS - Description: At present, there cannot be two concurrent active TCP send - operations in progress using the same socket *unless* - CONFIG_TCP_WRITE_BUFFER is selected. This is because the uIP ACK logic - will support only one transfer at a time. - - Such a situation could occur if explicit TCP send operations - are performed using the same socket (or dup's of the same - socket) on two different threads. It can also occur implicitly - when you execute more than one thread over an NSH Telnet - session.
- - There are two possible solutions: - - 1. Remove the option to build the network without write buffering - enabled. This is the simplest and perhaps the best option. - Certainly a system can be produced with a smaller RAM - footprint without write buffering. However, that probably - does not justify permitting a crippled system. - - 2. Another option is to serialize the non-buffered writes for - a socket with a mutex, i.e., add a mutex to make sure that - each send that is started is able to be the exclusive - sender until all of the data to be sent has been ACKed. - That can be a very significant delay involving the send, - waiting for the ACK or a timeout and possible retransmissions! - - Although it uses more memory, I believe that option 1 is the - better solution and will avoid difficult TCP bugs in the future. - - Status: Open. - Priority: Medium-Low. This is only an important issue for people who - use multi-threaded, unbuffered TCP networking without a full - understanding of the issues. - - Title: POLL/SELECT ON TCP/UDP SOCKETS NEEDS READ-AHEAD - Description: poll()/select() only works for availability of buffered TCP/UDP - read data (when read-ahead is enabled). The way writing is - handled in the network layer, either (1) CONFIG_UDP/TCP_WRITE_BUFFERS=y - and we never have to wait to send, or (2) we always have - to wait to send. So it is impossible to notify the caller - when it can send without waiting. - - An exception to "never having to wait" is the case where we are - out of memory for use in write buffering. In that case, the - blocking send()/sendto() would have to wait for the memory - to become available. - Status: Open, probably will not be fixed. - Priority: Medium... this does affect porting of applications that expect - different behavior from poll()/select() - - Title: INTERFACES TO LEAVE/JOIN IGMP MULTICAST GROUP - Description: The interfaces used to leave/join IGMP multicast groups are non-standard.
- RFC3678 (IGMPv3) suggests ioctl() commands to do this (SIOCSIPMSFILTER) but - also states that those APIs are historic. NuttX implements these ioctl - commands, but is non-standard because: (1) It does not support IGMPv3, and - (2) it looks up drivers by their device name (e.g., "eth0") vs IP address. - - Linux uses setsockopt() to control multicast group membership using the - IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP options. It also looks up drivers - using IP addresses (It would require additional logic in NuttX to look up - drivers by IP address). See http://tldp.org/HOWTO/Multicast-HOWTO-6.html - Status: Open - Priority: Medium. All standards compatibility is important to NuttX. However, most - of the mechanism for leaving and joining groups is hidden behind a wrapper - function so that little of this incompatibility need be exposed. - - Title: CLOSED CONNECTIONS IN THE BACKLOG - Description: If a connection is backlogged but accept() is not called quickly, then - that connection may time out. How should this be handled? Should the - connection be removed from the backlog if it times out or is closed? - Or should it remain in the backlog with a status indication so that accept() - can fail when it encounters the invalid connection? - Status: Open - Priority: Medium. Important on slow applications that will not accept - connections promptly. - - Title: IPv6 REQUIRES ADDRESS FILTER SUPPORT - Description: IPv6 requires that the Ethernet driver support NuttX address - filter interfaces. Several Ethernet drivers do not support these, - however. Others support the address filtering interfaces but - have never been verified: - - C5471, LM3S, ez80, DM0x90 NIC, PIC, LPC54: Do not support - address filtering. - Kinetis, LPC17xx, LPC43xx: Untested address filter support - - Status: Open - Priority: Pretty high if you want to use IPv6 on these platforms.
- - Title: UDP MULTICAST RECEPTION - Description: The logic in udp_input() expects either a single receive socket or - none at all. However, multiple sockets should be capable of - receiving a UDP datagram (multicast reception). This could be - handled easily by something like: - - for (conn = NULL; (conn = udp_active(pbuf, conn)) != NULL; ) - - If the callback logic that receives a packet responds with an - outgoing packet, then it will over-write the received buffer. - recvfrom() will not do that, however. We would have - to make that the rule: Recipients of a UDP packet must treat - the packet as read-only. - Status: Open - Priority: Low, unless your logic depends on that behavior. - - Title: NETWORK WON'T STAY DOWN - Description: If you enable the NSH network monitor (CONFIG_NSH_NETINIT_MONITOR) - then the NSH 'ifdown' command is broken. Doing 'nsh> ifdown eth0' - will, indeed, bring the network down. However, the network monitor - notices the change in the link status and will bring the network - back up. There needs to be some kind of interlock between - cmd_ifdown() and the network monitor thread to prevent this. - Status: Open - Priority: Low, this is just a nuisance in most cases. - - Title: FIFO CLEAN-UP AFTER CLOSING UNIX DOMAIN DATAGRAM SOCKET - Description: FIFOs are used as the IPC underlying all local Unix domain - sockets. In NuttX, FIFOs are implemented as device drivers - (not as special FIFO files). The FIFO device driver is - instantiated when the Unix domain socket communications begin - and will automatically be released when (1) the driver is - unlinked and (2) all open references to the driver have been - closed. But there is no mechanism in place now to unlink the - FIFO when the Unix domain datagram socket is no longer used. - The primary issue is timing: the FIFO should persist until - it is no longer needed. Perhaps there should be a delayed - call to unlink() (using a watchdog or the work queue).
If - the driver is re-opened, the delayed unlink could be - canceled? Needs more thought. - NOTE: This is not an issue for Unix domain stream sockets: - The end-of-life of the FIFO is well determined when sockets - are disconnected and support for that case is fully implemented. - Status: Open - Priority: Low for now because I don't have a situation where this is a - problem for me. If you use the same Unix domain paths, then - it is not an issue; in fact it is more efficient if the FIFO - devices persist. But this would be a serious problem if, - for example, you create new Unix domain paths dynamically. - In that case you would effectively have a memory leak and the - number of FIFO instances would grow. - - Title: TCP IPv4-MAPPED IPv6 ADDRESSES - Description: The UDP implementation in net/udp contains support for hybrid - dual-stack IPv6/IPv4 implementations that utilize a special - class of addresses, the IPv4-mapped IPv6 addresses. You can - see that UDP implementation in: - - udp_callback.c: - ip6_map_ipv4addr(ipv4addr, - udp_send.c: - ip6_is_ipv4addr((FAR struct in6_addr*)conn->u.ipv6.raddr))) - ip6_is_ipv4addr((FAR struct in6_addr*)conn->u.ipv6.raddr)) - in_addr_t raddr = ip6_get_ipv4addr((FAR struct in6_addr*)conn->u.ipv6.raddr); - - There is no corresponding support for TCP sockets. - Status: Open - Priority: Low. I don't know of any issues now, but I am sure that - someone will encounter this in the future. - - Title: MISSING netdb INTERFACES - Description: There is no implementation for many netdb interfaces such as - getnetbyname(), getprotobyname(), getnameinfo(), etc. - Status: Open - Priority: Low - - Title: ETHERNET WITH MULTIPLE LPWORK THREADS - Description: Recently, Ethernet drivers were modified to support multiple - work queue structures. The question was raised: "My only - reservation would be, how would this interact in the case of - having CONFIG_STM32_ETHMAC_LPWORK and CONFIG_SCHED_LPNTHREADS - > 1?
Can it be guaranteed that one work item won't be - interrupted and execution switched to another? I think so but - am not 100% confident." - - I suspect that you are right. There are probably vulnerabilities - in the CONFIG_STM32_ETHMAC_LPWORK with CONFIG_SCHED_LPNTHREADS - > 1 case. But that really doesn't depend entirely upon the - change to add more work queue structures. Certainly, even with only - one work queue structure you could have concurrent Ethernet - operations with multiple LP threads; just because the work - structure is available does not mean that there is no dequeued - work in progress. The multiple structures probably widen the - window for that concurrency, but do not create it. - - The current Ethernet designs depend upon a single work queue to - serialize data. In the case of multiple LP threads, some - additional mechanism would have to be added to enforce that - serialization. - - NOTE: Most drivers will call net_lock() and net_unlock() around - the critical portions of the driver work. In that case, all work - will be properly serialized. This issue only applies to drivers - that may perform operations that require protection outside of - the net_lock'ed region. Sometimes, this may require extending - the net_lock() to the beginning of the driver work function. - - Status: Open - Priority: High if you happen to be using Ethernet in this configuration. - - Title: NETWORK DRIVERS USING HIGH PRIORITY WORK QUEUE - Description: Many network drivers run the network on the high priority work - queue thread (or support an option to do so). Networking should - not be done on the high priority work thread because it interferes - with real-time behavior. Fix by forcing all network drivers to - run on the low priority work queue. - Status: Open - Priority: Low. Not such a big deal for network test and demo - configurations except that it provides a bad example for a product - OS configuration.
- - Title: REPARTITION DRIVER FUNCTIONALITY - Description: Every network driver performs the first level of packet decoding. - It examines the packet header and calls ipv4_input(), ipv6_input(), - icmp_input(), etc. as appropriate. This is a maintenance problem - because it means that any change to the network input interfaces - affects all drivers. - - A better, more maintainable solution would use a single net_input() - function that would receive all incoming packets. This function - would then perform the common packet decoding logic that is - currently implemented in every network driver. - Status: Open - Priority: Low. Really just an aesthetic maintainability issue. - - Title: BROADCAST WITH MULTIPLE NETWORK INTERFACES - Description: There is currently no mechanism to send a broadcast packet - out through several network interfaces. Currently packets - can be sent to only one device. Logic in netdev_findby_ipvXaddr() - currently just selects the first device in the list of - devices; only that device will receive broadcast packets. - Status: Open - Priority: High if you require broadcast on multiple networks. There is - no simple solution known at this time, however. Perhaps - netdev_findby_ipvXaddr() should return a list of devices rather - than a single device? All upstream logic would then have to - deal with a list of devices. That would be a huge effort and - certainly doesn't count as a "simple solution". - - Title: ICMPv6 FOR 6LoWPAN - Description: The current ICMPv6 and neighbor-related logic only works with - Ethernet MAC. For 6LoWPAN, a new more conservative IPv6 - neighbor discovery is provided by RFC 6775. This RFC needs to - be supported in order to support ping6 on a 6LoWPAN network. - If RFC 6775 were implemented, then arbitrary IPv6 addresses, - including addresses from DHCPv6, could be used. - - UPDATE: With IPv6 neighbor discovery, any IPv6 address may - be associated with any short or extended address.
In fact, - that is the whole purpose of the neighbor discovery logic: It - plays the same role as ARP in IPv4; it ultimately just manages - a neighbor table that, like the ARP table, provides the - mapping between IP addresses and node addresses. - - The NuttX, Contiki-based 6LoWPAN implementation circumvented - the need for the neighbor discovery logic by using only MAC- - based addressing, i.e., the lower two or eight bytes of the - IP address are the node address. - - Most of the 6LoWPAN compression algorithms exploit this to - compress the IPv6 address to nothing but a bit indicating - that the IP address derives from the node address. So I - think IPv6 neighbor discovery is useless in the current - implementation. - - If we want to use IPv6 neighbor discovery, we could dispense - with all of the MAC-based addressing. But if we want to retain - the more compact MAC-based addressing, then we don't need - IPv6 neighbor discovery. - - So, the full neighbor discovery logic is not currently useful, - but it would still be nice to have enough in place to support - ping6. Full neighbor support would probably be necessary if we - wanted to route 6LoWPAN frames outside of the WPAN. - - Status: Open - Priority: Low for now. I don't plan on implementing this. It would - only be relevant if we were to decide to abandon the use of - MAC-based addressing in the 6LoWPAN implementation. - - Title: ETHERNET LOCAL BROADCAST DOES NOT WORK - Description: In case of "local broadcast" the system still sends an ARP - request to the destination, but it shouldn't; it should - broadcast.
For example, if the system is on a network with IP - 10.0.0.88 and a netmask of 255.255.255.0, it should send - messages for 10.0.0.255 as broadcast, and not send ARP - for 10.0.0.255. - - To make networking easier, the next line should give - me the broadcast address of the network, but it doesn't: - - ioctl(_socket_fd, SIOCGIFBRDADDR, &bc_addr); - Status: Open - Priority: Medium - - Title: TCP ISSUES WITH QUICK CLOSE - Description: This failure has been reported in the accept() logic: - - - psock_tcp_accept() waits on net_lockedwait() below - - The accept operation completes, the socket is in the connected - state and psock_accept() is awakened. It cannot run, - however, because its priority is low and so it is blocked - from execution. - - In the meantime, the remote host sends a - packet which is presumably caught in the read-ahead buffer. - - Then the remote host closes the socket. Nothing happens on - the target side because net_start_monitor() has not yet been - called. - - Then accept() finally runs, but not with a connected but - rather with a disconnected socket. This fails when it - attempts to start the network monitor on the disconnected - socket below. - - It is also impossible to read the buffered TCP data from a - disconnected socket. The TCP recvfrom() logic would also - need to permit reading buffered data from a disconnected - socket. - - This problem was reported when the target hosted an FTP server - and files were being accessed by FileZilla. - - connect() most likely has this same issue. - - A work-around might be to raise the priority of the thread - that calls accept(). accept() might also need to check the - tcpstateflags in the connection structure before returning - in order to ensure that the socket truly is connected. - Status: Open - Priority: Medium. I have never heard of this problem being reported - before, so I suspect it might not be so prevalent as one - might expect.
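The two checks suggested in the TCP ISSUES WITH QUICK CLOSE entry above (verify the connection state before accept() returns, and let the receive path drain buffered data after a disconnect) might look roughly like the sketch below. This is only an illustration: struct ex_tcp_conn_s, the EX_ constants and both functions are hypothetical stand-ins, not the actual NuttX definitions, and the choice of ECONNABORTED as the error is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TCP state bits.  These are NOT the real
 * NuttX definitions; they only illustrate the check.
 */

#define EX_TCP_STATE_MASK   0x0f
#define EX_TCP_ESTABLISHED  0x04

struct ex_tcp_conn_s
{
  uint8_t  tcpstateflags;  /* Connection state in the low bits */
  uint16_t readahead;      /* Bytes held in the read-ahead buffer */
};

/* Decide what accept() should report once the calling thread finally runs.
 * Returns 0 if the connection is still established, or a negated errno
 * value if the peer closed before accept() had a chance to run.
 */

int ex_accept_check(const struct ex_tcp_conn_s *conn)
{
  assert(conn != NULL);

  if ((conn->tcpstateflags & EX_TCP_STATE_MASK) == EX_TCP_ESTABLISHED)
    {
      return 0;
    }

  return -ECONNABORTED;
}

/* Receive-side rule: reading is allowed while connected, and also after
 * a disconnect for as long as buffered data remains.  Returns 1 or 0.
 */

int ex_recv_allowed(const struct ex_tcp_conn_s *conn)
{
  assert(conn != NULL);

  return (conn->tcpstateflags & EX_TCP_STATE_MASK) == EX_TCP_ESTABLISHED ||
         conn->readahead > 0;
}
```

Whether accept() should fail outright or hand back a socket that can still drain its read-ahead data is a design decision that this sketch leaves open.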
- - Title: LOCAL DATAGRAM RECVFROM RETURNS WRONG SENDER ADDRESS - Description: The recvfrom logic for local datagram sockets returns the - incorrect sender "from" address. Instead, it returns the - receiver's "to" address. This means that returning a reply - to the "from" address results in the receiver sending a packet to itself. - Status: Open - Priority: Medium High. This makes using local datagram sockets in - anything but a well-known point-to-point configuration - impossible. - -o USB (drivers/usbdev, drivers/usbhost) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: USB STORAGE DRIVER DELAYS - Description: There is a workaround for a bug in drivers/usbdev/usbdev_storage.c - that involves delays. This needs to be redesigned to eliminate these - delays. See logic conditioned on CONFIG_USBMSC_RACEWAR. - - If queuing of stall requests is supported by the DCD then this workaround - is not required. In this case, (1) the stall is not sent until all - write requests preceding the stall request are sent, (2) the stall is - sent, and then after the stall is cleared, (3) all write requests - queued after the stall are sent. - - See, for example, the queuing of pending stall requests in the SAM3/4 - UDP driver at arch/arm/src/sam34/sam_udp.c. There the logic to do this - is implemented with a normal request queue, a pending request queue, a - stall flag and a stall pending flag: - - 1) If the normal request queue is not empty when the STALL request is - received, the stall pending flag is set. - 2) If additional write requests are received while the stall pending flag - is set (or while waiting for the stall to be sent), those write requests - go into the pending queue. - 3) When the normal request queue empties successfully and all of the write - transfers complete, the STALL is sent. The stall pending flag is - cleared and the stall flag is set. Now the endpoint is really stalled.
- 4) After the STALL is cleared (via the Clear Feature SETUP), the pending - request queue is copied to the normal request queue, the stall flag is - cleared, and normal write request processing resumes. - - Status: Open - Priority: Medium - - Title: EP0 OUT CLASS DATA - Description: There is no mechanism in place to handle EP0 OUT data transfers. - There are two aspects to this problem, neither of which is easy to fix - (only because of the number of drivers that would be impacted): - - 1. The class drivers only send EP0 write requests and these are - only queued on EP0 IN by these drivers. There is never a read - request queued on EP0 OUT. - 2. But EP0 OUT data could be buffered in a buffer in the driver - data structure. However, there is no method currently - defined in the USB device interface to obtain the EP0 data. - - Updates: (1) The USB device-to-class interface has been extended so - that EP0 OUT data can accompany the SETUP request sent to the - class drivers. (2) The logic in the STM32 F4 OTG FS device driver - has been extended to provide this data. Updates are still needed - to other drivers. - - Here is an overview of the required changes: - Two buffers in the driver structure: - - 1. The existing EP0 setup request buffer (ctrlreq, 8 bytes) - 2. A new EP0 data buffer in the driver state structure (ep0data, - max packetsize) - - Add a new state: - - 3. Waiting for EP0 setup OUT data (EP0STATE_SETUP_OUT) - - General logic flow: - - 1. When an EP0 SETUP packet is received: - - Read the request into the EP0 setup request buffer (ctrlreq, - 8 bytes) - - If this is an OUT request with a data length, set the EP0 - state to EP0STATE_SETUP_OUT and wait to receive data on - EP0. - - Otherwise, the SETUP request may be processed now (or, - in the case of the F4 driver, at the conclusion of the - SETUP phase). - 2.
When the EP0 OUT DATA packet is received: - - Verify state is EP0STATE_SETUP_OUT - - Read the data into the EP0 data buffer (ep0data, max - packet size) - - Now process the previously buffered SETUP request along - with the OUT data. - 3. When the setup packet is dispatched to the class driver, - the OUT data must be passed as the final parameter in the - call. - - Update 2013-9-2: The new USB device-side driver for the SAMA5D3 - correctly supports OUT SETUP data following the same design as - per above. - - Update 2013-11-7: David Sidrane has fixed this issue with the - STM32 F1 USB device driver. Still a few more to go before this - can be closed out. - - Status: Open - Priority: High for class drivers that need EP0 data. For example, the - CDC/ACM serial driver might need the line coding data (that - data is not used currently, but it might be). - - Title: IMPROVED USAGE of STM32 USB RESOURCES - Description: The STM32 platforms use a non-standard USB host peripheral - that uses "channels" to implement data transfers. The current - logic associates each channel with a pipe/endpoint (with two - channels for bi-directional control endpoints). The OTGFS - peripheral has 8 channels and the OTGHS peripheral has 12 - channels. - - This works okay until you add a hub and try to connect multiple - devices. A typical device will require 3-4 pipes and, hence, - 4-5 channels. This effectively prevents using a hub with the - STM32 devices. This also applies to the EFM32 which uses the - same IP. - - It should be possible to redesign the STM32 F4 OTGHS/OTGFS and - EFM32 host driver so that channels are dynamically assigned to - pipes as needed for individual transfers. Then you could have - more "apparent" pipes and make better use of channels. - Although there are only 8 or 12 channels, transfers are not - active all of the time on all channels so it ought to be - possible to have an unlimited number of "pipes" but with no - more than 8 or 12 active transfers.
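The dynamic assignment proposed above could start from a small channel pool that is claimed per transfer instead of per pipe. The sketch below is a hedged illustration only: struct ex_chanpool_s, ex_chan_alloc() and ex_chan_free() are hypothetical names, not part of the existing STM32 host driver, and the count of 8 assumes the OTGFS case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_NCHANNELS 8  /* Assumes OTGFS; the OTGHS peripheral has 12 */

struct ex_chanpool_s
{
  uint16_t inuse;  /* Bit n set: channel n is carrying a transfer */
};

/* Claim a free channel for the duration of one transfer.  Returns the
 * channel index, or -1 if all channels are busy (the caller would then
 * have to wait for some active transfer to complete).
 */

int ex_chan_alloc(struct ex_chanpool_s *pool)
{
  int chan;

  assert(pool != NULL);

  for (chan = 0; chan < EX_NCHANNELS; chan++)
    {
      if ((pool->inuse & (1u << chan)) == 0)
        {
          pool->inuse |= (uint16_t)(1u << chan);
          return chan;
        }
    }

  return -1;
}

/* Return a channel to the pool when its transfer completes */

void ex_chan_free(struct ex_chanpool_s *pool, int chan)
{
  assert(pool != NULL && chan >= 0 && chan < EX_NCHANNELS);

  pool->inuse &= (uint16_t)~(1u << chan);
}
```

With a pool like this, the number of pipes is bounded only by memory, while the number of simultaneously active transfers stays bounded by the hardware channel count.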
- Status: Open - Priority: Medium-Low - - Title: USB CDC/ACM HOST CLASS DRIVER - Description: A CDC/ACM host class driver has been added. This has been - tested by running the USB CDC/ACM host on an Olimex - LPC1766STK and using the - boards/arm/stm32/stm3210e-eval/configs/usbserial - configuration (using the CDC/ACM device side driver). There - are several unresolved issues that prevent the host driver - from being usable: - - - The driver works fine when configured for reduced or bulk- - only protocol on the Olimex LPC1766STK. - - - Testing has not been performed with the interrupt IN channel - enabled (i.e., I have not enabled FLOW control nor do I have - a test case that used the interrupt IN channel). I can see - that the polling for interrupt IN data is occurring - initially. - - - I test for incoming data by doing 'nsh> cat /dev/ttyACM0' on - the Olimex LPC1766STK host. The bulk data reception still - works okay whether or not the interrupt IN channel is enabled. - If the interrupt IN channel is enabled, then polling of that - channel appears to stop when the bulk IN channel becomes - active. - - - The RX reception logic uses the low priority work queue. - However, that logic never returns and so blocks other use of - the work queue thread. This is probably okay but means that - the RX reception logic probably should be moved to its own - dedicated thread. - - - I get crashes when I run with the STM32 OTGHS host driver. - Apparently the host driver is trashing memory on receipt - of data. - - UPDATE: This behavior needs to be retested with: - commit ce2845c5c3c257d081f624857949a6afd4a4668a - Author: Janne Rosberg - Date: Tue Mar 7 06:58:32 2017 -0600 - - usbhost_cdcacm: fix tx outbuffer overflow and remove now - invalid assert - - commit 3331e9c49aaaa6dcc3aefa6a9e2c80422ffedcd3 - Author: Janne Rosberg - Date: Tue Mar 7 06:57:06 2017 -0600 - - STM32 OTGHS host: stm32_in_transfer() fails and returns NAK - if a short transfer is received.
This causes problems from - class drivers like CDC/ACM where short packets are expected. - In those protocols, any transfer may be terminated by sending - short or NUL packet. - - commit 0631c1aafa76dbaa41b4c37e18db98be47b60481 - Author: Gregory Nutt - Date: Tue Mar 7 07:17:24 2017 -0600 - - STM32 OTGFS, STM32 L4 and F7: Adapt Janne Rosberg's patch to - STM32 OTGHS host to OTGFS host, and to similar implements for - L4 and F7. - - - The SAMA5D EHCI and the LPC31 EHCI drivers both take semaphores - in the cancel method. The current CDC/ACM class driver calls - the cancel() method from an interrupt handler. This will - cause a crash. Those EHCI drivers should be redesigned to - permit cancellation from the interrupt level. - - Most of these problems are unique to the Olimex LPC1766STK - DCD; some are probably design problems in the CDC/ACM host - driver. The bottom line is that the host CDC/ACM driver is - still immature and you could experience issues in some - configurations if you use it. - - That all being said, I know of no issues with the current - CDC/ACM driver on the Olimex LPC1766STK platform if the interrupt - IN endpoint is not used, i.e., in "reduced" mode. The only loss - of functionality is output flow control. - - UPDATE: The CDC/ACM class driver may also now be functional on - the STM32. That needs to be verified. - - Status: Open - Priority: Medium-Low unless you really need host CDC/ACM support. - -o Libraries (libs/libc/, libs/libm/) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: SIGNED time_t - Description: The NuttX time_t is type uint32_t. I think this is consistent - with all standards and with normal usage of time_t. However, - according to Wikipedia, time_t is usually implemented as a - signed 32-bit value. - Status: Open - Priority: Very low unless there is some compelling issue that I do not - know about. - - Title: ENVIRON - Description: The definition of environ in stdlib.h is bogus and will not - work as it should. 
This is because the underlying - representation of the environment is not an array of pointers. - Status: Open - Priority: Medium - - Title: TERMIOS - Description: Need some minimal termios support... at a minimum, enough to - switch between raw and "normal" modes to support behavior like - that needed for readline(). - UPDATE: There is growing functionality in libs/libc/termios/ - and in the ioctl methods of several MCU serial drivers (stm32, - lpc43, lpc17, pic32, and others). However, as phrased, this - bug cannot yet be closed since this "growing functionality" - does not address all termios.h functionality and not all - serial drivers support termios. - Status: Open - Priority: Low - - Title: CONCURRENT STREAM READ/WRITE - Description: NuttX only supports a single file pointer so reads and writes - must be from the same position. This prohibits implementation - of behavior like that required for fopen() with the "a+" mode. - According to the fopen man page: - - "a+ Open for reading and appending (writing at end of file). - The file is created if it does not exist. The initial file - position for reading is at the beginning of the file, but - output is always appended to the end of the file." - - At present, the single NuttX file pointer is positioned to the - end of the file for both reading and writing. - Status: Open - Priority: Medium. This kind of operation is probably not very common in - deeply embedded systems but is required by standards. - - Title: DIVIDE BY ZERO - Description: This is bug 3468949 on the SourceForge website (submitted by - Philipp Klaus Krause): - "lib_strtod.c does contain divisions by zero in lines 70 and 96. - AFAIK, unlike for Java, division by zero is not a reliable way to - get infinity in C. AFAIK compilers are allowed e.g. give a compile- - time error, and some, such as sdcc, do. AFAIK, C implementations - are not even required to support infinity. In C99 the macro isinf() - could replace the first use of division by zero. 
Unfortunately, the - macro INFINITY from math.h probably can't replace the second division - by zero, since it will result in a compile-time diagnostic, if the - implementation does not support infinity." - Status: Open - Priority: - - Title: OLD dtoa NEEDS TO BE UPDATED - Description: This implementation of dtoa in libs/libc/stdio is old and will not - work with some newer compilers. See - http://patrakov.blogspot.com/2009/03/dont-use-old-dtoac.html - Update: A new dtoa version is now available and enabled with - CONFIG_NANO_PRINTF. However, the old version of dtoa is still in - place and lib_libvsprintf() has been duplicated. I think this - issue should remain open until the implementations have been - unified. - Status: Open - Priority: ?? - - Title: FLOATING POINT FORMATS - Description: Only the %f floating point format is supported. Others are - accepted but treated like %f. - Update: %g is supported with CONFIG_NANO_PRINTF. - Status: Open - Priority: Medium (this might be important to someone). - - Title: LIBM INACCURACIES - Description: "..if you are writing something like robot control or - inertial navigation system for aircraft, I have found - that using the toolchain libmath is only safe option. - I ported some code for converting quaternions to Euler - angles to NuttX for my project and only got it working - after switching to newlib math library. - - "NuttX does not fully implement IEC 60559 floating point - from C99 (sections marked [MX] in OpenGroup specs) so if - your code assumes that some function, say pow(), actually - behaves right for all the twenty or so odd corner cases - that the standards committees have recently specified, - you might get surprises. I'd expect pow(0.0, 1.0) to - return 0.0 (as zero raised to any positive power is - well-defined in mathematics) but I get +Inf. - - "NuttX atan2(-0.0, -1.0) returns +M_PI instead of correct - -M_PI.
If we expect [MX] functionality, then atan2(Inf, Inf) - should return M_PI/4, instead NuttX gives NaN. - - "asin(2.0) does not set domain error or return NaN. In fact - it does not return at all as the loop in it does not - converge, hanging your app. - - "There are likely many other issues like these as the Rhombus - OS code has not been tested or used that much. Sorry for not - providing patches, but we found it easier just to switch the - math library." - - UPDATE: 2015-09-01: A fix for the noted problems with asin() - has been applied. - 2016-07-30: Numerous fixes and performance improvements from - David Alessio. - - Status: Open - Priority: Low for casual users but clearly high if you care about - these incorrect corner case behaviors in the math libraries. - - Title: REPARTITION LIBC FUNCTIONALITY - Description: There are many things implemented within the kernel (for example - under sched/pthread) that probably should be migrated into the - C library where they belong. - - I would really like to see a little flavor of a micro-kernel - at the OS interface: I would like to see more primitive OS - system calls with more of the higher level logic in the C library. - - One awkward thing is the incompatibility of KERNEL vs FLAT - builds: In the kernel build, it would be nice to move many - of the thread-specific data items out of the TCB and into - the process address environment where they belong. It is - difficult to make this compatible with the FLAT build, - however. - Status: Open - Priority: Low - -o File system / Generic drivers (fs/, drivers/) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - NOTE: The NXFFS file system has its own TODO list at nuttx/fs/nxffs/README.txt - - Title: MISSING FILE SYSTEM FEATURES - Description: Implement missing file system features: - - chmod() is probably not relevant since file modes are not - currently supported. - - File privileges would also be good to support. But this is - really a small part of a much larger feature.
NuttX has no - user IDs, there are no groups, there are no privileges - associated with either. Users don't need credentials. - This is really a system-wide issue of which chmod is only - a small part. - - User privileges never seemed important to me since NuttX is - intended for deeply embedded environments where there are - not multiple users with varying levels of trust. - - link, unlink, softlink, readlink - For symbolic links. Only - the ROMFS file system currently supports hard and soft links, - so this is not too important. The top-level, pseudo-file - system supports soft links. - - File locking - - Special files - NuttX supports special files only in the top- - level pseudo file system. Unix systems support many - different special files via mknod(). This would be - important only if it is an objective of NuttX to become a - true Unix OS. Again only supported by ROMFS. - - True inodes - Standard Unix inodes. Currently only supported - by ROMFS. - - File times, for example as set by utimes(). - - The primary obstacle to all these is that each would require - changes to all existing file systems. That number is pretty - large. As of this writing, the file system implementations that - would need to be reviewed and modified include - binfs, fat, hostfs, nfs, nxffs, procfs, romfs, - tmpfs, unionfs, plus pseudo-file system support. - - Status: Open - Priority: Low - - Title: ROMFS CHECKSUMS - Description: The ROMFS file system does not verify checksums on either - the volume header or the individual files. - Status: Open - Priority: Low. I have mixed feelings about whether NuttX should pay a - performance penalty for better data integrity. - - Title: SPI-BASED SD MULTIPLE BLOCK TRANSFERS - Description: The simple SPI based MMC/SD driver in fs/mmcsd does not - yet handle multiple block transfers.
- Status: Open - Priority: Medium-Low - - Title: SDIO-BASED SD READ-AHEAD/WRITE BUFFERING INCOMPLETE - Description: The drivers/mmcsd/mmcsd_sdio.c driver has hooks in place to - support read-ahead buffering and write buffering, but the logic - is incomplete and untested. - Status: Open - Priority: Low - - Title: POLLHUP SUPPORT - Description: All drivers that support the poll method should also report - POLLHUP event when the driver is closed. - Status: Open - Priority: Medium-Low - - Title: DUPLICATE FAT FILE NAMES - Description: "The NSH and POSIX API interpretations about sensitivity or - insensitivity to upper/lowercase file names seem to be not - consistent in our usage - which can result in creating two - directories with the same name..." - - Example using NSH: - - nsh> echo "Test1" >/tmp/AtEsT.tXt - nsh> echo "Test2" >/tmp/aTeSt.TxT - nsh> ls /tmp - /tmp: - AtEsT.tXt - aTeSt.TxT - nsh> cat /tmp/aTeSt.TxT - Test2 - nsh> cat /tmp/AtEsT.tXt - Test1 - - Status: Open - Priority: Low - - Title: MISSING FILES IN NSH 'LS' OF A DIRECTORY - Description: I have seen cases where (1) long file names are enabled, - but (2) a short file name is created like: - - nsh> echo "This is another test" >/mnt/sdcard/another.txt - - But then on subsequent 'ls' operations, the file does not appear: - - nsh> ls -l /mnt/sdcard - - I have determined that the problem is because, for some as- - of-yet-unknown reason the short file name is treated as a long - file name. The name then fails the long filename checksum - test and is skipped. - - readdir() (and fat_readdir()) is the logic underlying the - failure and the problem appears to be something unique to the - fat_readdir() implementation. Why? Because the file is - visible when you put the SD card on a PC and because this - works fine: - - nsh> ls -l /mnt/sdcard/another.txt - - The failure does not happen on all short file names. I do - not understand the pattern. But I have not had the opportunity - to dig into this deeply. 
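For reference, the long file name checksum test mentioned above compares a one-byte checksum stored in each long file name entry against a checksum computed over the 11-byte short (8.3) name that follows. The sketch below uses the rotate-and-add algorithm from the Microsoft FAT specification; it is not copied from fat_readdir(), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_SHORTNAME_LEN 11  /* 8 name bytes + 3 extension bytes */

/* Compute the checksum of a space-padded 8.3 short name, as stored in
 * each long file name directory entry.
 */

uint8_t ex_lfn_checksum(const uint8_t *shortname)
{
  uint8_t sum = 0;
  int i;

  assert(shortname != NULL);

  for (i = 0; i < EX_SHORTNAME_LEN; i++)
    {
      /* Rotate right by one bit, then add the next name byte */

      sum = (uint8_t)(((sum & 1) << 7) + (sum >> 1) + shortname[i]);
    }

  return sum;
}

/* Nonzero if a long file name entry carrying 'lfn_checksum' belongs to
 * the short name entry 'shortname'.
 */

int ex_lfn_belongs_to(uint8_t lfn_checksum, const uint8_t *shortname)
{
  return lfn_checksum == ex_lfn_checksum(shortname);
}
```

A short name entry that does not match the checksum of the long name entries preceding it is exactly the mismatch that causes a directory sequence to be skipped.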
- Status: Open - Priority: Perhaps not a problem??? I have analyzed this problem and - I am not sure what to do about it. I suspect that a - FAT file system was used with a version of NuttX that does - not support long file name entries. Here is the failure - scenario: - - 1) A file with a long file name is created under Windows. - 2) Then the file is deleted. I am not sure if Windows or - NuttX deleted the file, but the resulting directory - content is not compatible with NuttX with long file - name support. - - The file deletion left the full sequence of long - file name entries intact but apparently deleted only - the following short file name entry. I am thinking - that this might have happened because a version of NuttX - with only short file name support was used to delete - the file. - - 3) When a new file with a short file name was created, it - re-used the short file name entry that was previously - deleted. This makes the new short file name entry - look like a part of the long file name. - - 4) When comparing the checksum in the long file name - entry with the checksum of the short file name, the - checksum fails and the entire directory sequence is - ignored by readdir() logic. This is why the file does - not appear in the 'ls'. - - Title: SILENT SPIFFS FILE TRUNCATION - Description: Under certain corner case conditions, SPIFFS will truncate - files. All of the writes to the file will claim that the - data has been written but after the file is closed, it may - be a little shorter than expected. - - This is due to how the caching is implemented in SPIFFS: - - 1. On each write, the data is not written to the FLASH but - rather to an internal cache in memory. - 2. When a write causes the cache to become full, the - content of the cache is flushed to FLASH. If that flush - fails because the FLASH has become full, write will - return the file system full error (ENOSPC). - 3.
The cache is also flushed when the file is closed (or - when fsync() is called). These will also fail if the - file system becomes full. - - The problem is that, when the file is closed, the final file - size could be smaller than the number of bytes successfully written - to the file. - - This error is probably not so significant in real world - file system usage: It requires that you write continuously - to SPIFFS, never deleting files or freeing FLASH resources - in any way. And it requires the unlikely circumstance that - the final file written has its last few hundred bytes in - cache when the file is closed but there are even fewer bytes - available on the FLASH. That would be rare with a cache - size of a few hundred bytes and very large serial FLASH. - - This issue does cause the test at apps/testing/fstest to - fail. That test fails with a "Partial Read" because the - file being read is smaller than the number of bytes written to the - file. That test does write small files continuously until - the file system is full and even then the error is rare. The - boards/sim/sim/sim/configs/spiffs test can be used to - demonstrate the error. - Status: Open - Priority: Medium. It is certainly a file system failure, but I think that - the exposure in real world use cases is very small. - - Title: FAT: CAN'T SEEK TO END OF FILE IF READ-ONLY - Description: If the size of the underlying file is an exact multiple of the - FAT cluster size, then you cannot seek to the end of the file - if the file was opened read-only. In that case, the FAT lseek - logic will return ENOSPC. - - This is because seeking to the end of the file involves seeking - to an offset that is the size of the file (the offset of the last - allocated byte + 1). In order to seek to a position, the - current FAT implementation insists that there be allocated file - space at the seek position. Seeking beyond the end of the file - has the side effect of extending the file.
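The arithmetic behind this can be sketched as follows. This is a simplified model of the condition described above, not the actual FAT lseek logic; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of clusters needed to hold 'filesize' bytes */

uint32_t ex_clusters_allocated(uint32_t filesize, uint32_t clustersize)
{
  assert(clustersize > 0);

  return filesize / clustersize + (filesize % clustersize != 0);
}

/* Nonzero if byte position 'pos' falls outside the allocated cluster
 * chain, i.e. if seeking there would require extending the chain.
 */

int ex_seek_needs_extension(uint32_t pos, uint32_t filesize,
                            uint32_t clustersize)
{
  assert(clustersize > 0);

  return pos / clustersize >= ex_clusters_allocated(filesize, clustersize);
}
```

With 4096-byte clusters, an 8192-byte file occupies exactly two clusters, so ex_seek_needs_extension(8192, 8192, 4096) is nonzero: the end-of-file position lands in a third, unallocated cluster. For an 8000-byte file the end-of-file position still lies inside the second cluster and no extension is needed.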
- - [NOTE: This automatic extension of the file cluster allocation - is probably unnecessary and another issue of its own.] - - For example, suppose you have a cluster size that is 4096 bytes - and a file that is 8192 bytes long. Then the file will consist - of 2 allocated clusters at offsets 0 through 8191. - - If the file is opened O_RDWR or O_WRONLY, then the statement: - - offset = lseek(fd, 0, SEEK_END); - - will seek to offset 8192 which is beyond the end of the file so a - new (empty) cluster will be added. Now the file consists of - three clusters and the file position refers to the first byte of - the third cluster. - - If the file is opened O_RDONLY, however, then that same lseek - statement will fail. It is not possible to seek to position - 8192. That is beyond the end of the allocated cluster chain - and since the file is read-only, it is not permitted to extend - the cluster chain. Hence, the error ENOSPC is returned. - - This code snippet will duplicate the problem. It assumes a - cluster size of 512 and that /tmp is a mounted FAT file system: - - #include <nuttx/config.h> - - #include <sys/types.h> - #include <stdio.h> - #include <fcntl.h> - #include <unistd.h> - #include <errno.h> - - #define BUFSIZE 1024 //8192, depends on cluster size - static char buffer[BUFSIZE]; - - #if defined(BUILD_MODULE) - int main(int argc, FAR char *argv[]) - #else - int hello_main(int argc, char *argv[]) - #endif - { - ssize_t nwritten; - off_t pos; - int fd; - int ch; - int i; - - for (i = 0, ch = ' '; i < BUFSIZE; i++) - { - buffer[i] = ch; - - if (++ch == 0x7f) - { - ch = ' '; - } - } - - fd = open("/tmp/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644); - if (fd < 0) - { - printf("open failed: %d\n", errno); - return 1; - } - - nwritten = write(fd, buffer, BUFSIZE); - if (nwritten < 0) - { - printf("write failed: %d\n", errno); - return 1; - } - - close(fd); - - fd = open("/tmp/testfile", O_RDONLY); - if (fd < 0) - { - printf("open failed: %d\n", errno); - return 1; - } - - pos = lseek(fd, 0, SEEK_END); - if (pos < 0) - { - printf("lseek failed: %d\n", errno); - return 1; - } - else if (pos != BUFSIZE) - { -
printf("lseek returned unexpected position: %ld\n", (long)pos); - return 1; - } - - close(fd); - return 0; - } - - Status: Open - Priority: Medium. Although this is a significant design error, the problem - has existed for 11 years without being previously reported. I - conclude, then, that the exposure from this problem is not great. - - Why would you seek to the end of a file using a read-only file - descriptor anyway? Only one reason I can think of: To get the - size of the file. The alternative (and much more efficient) way - to do that is via stat(). - -o Graphics Subsystem (graphics/) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - See also the NxWidgets TODO list file for related issues. - - Title: UNTESTED GRAPHICS APIS - Description: Testing of all APIs is not complete. See - http://nuttx.sourceforge.net/NXGraphicsSubsystem.html#testcoverage - Status: Open - Priority: Medium - - Title: ITALIC FONTS / NEGATIVE FONT OFFSETS - Description: Font metric structure (in include/nuttx/nx/nxfont.h) should allow - negative X offsets. Negative x-offsets are necessary for certain - glyphs (and are very common in italic fonts). - For example Eth, icircumflex, idieresis, and oslash should have - offset=1 in the 40x49b font (these missing negative offsets are - NOTE'ed in the font header files). - Status: Open. The problem is that the x-offset is an unsigned bitfield - in the current structure. - Priority: Low. - - Title: RAW WINDOW AUTORAISE - Description: Auto-raise only applies to NXTK windows. Shouldn't it also apply - to raw windows as well? - Status: Open - Priority: Low - - Title: AUTO-RAISE DISABLED - Description: Auto-raise is currently disabled. The reason is complex: - - Most touchscreen controllers send touch data at high rates - - In multi-server mode, touch events get queued in a message - queue. - - The logic that receives the messages performs the auto-raise. - But it can do stupid things after the first auto-raise as - it operates on the stale data in the message queue.
- I am thinking that auto-raise ought to be removed from NuttX - and moved out into a graphics layer (like NxWM) that knows - more about the appropriate context to do the autoraise. - Status: Open - Priority: Medium-Low - - Title: NxTERM VT100 SUPPORT - Description: If the NxTerm will be used with the Emacs-like command line - editor (CLE), then it will need to support VT100 cursor control - commands. - Status: Open - Priority: Low, the need has not yet arisen. - - Title: VERTICAL ANTI-ALIASING - Description: Anti-aliasing is implemented along the horizontal raster line - with fractional pixels at the ends of each line. There is no - accounting for fractional pixels in the vertical direction. - As a result lines closer to vertical receive better anti- - aliasing than lines closer to horizontal. - Status: Open - Priority: Low, not a serious issue but worth noting. There is no plan - to change this behavior. - - Title: WIDE-FONT SUPPORT - Description: Wide fonts are not currently supported by the NuttX graphics sub- - system. - Status: Open - Priority: Low for many, but I imagine higher in countries that use wide fonts - - Title: LOW-RES FRAMEBUFFER RENDERING - Description: There are obvious issues in the low-res, < 8 BPP, implementation of - the framebuffer rendering logic of graphics/nxglib/fb. I see two - obvious problems in reviewing nxglib_copyrectangle(): - - 1. The masking logic might work for 1 BPP, but is insufficient for other - resolutions like 2-BPP and 4-BPP. - 2. The use of lnlen will not handle multiple bits per pixel. It - would need to be converted to a byte count. - - The function PDC_copy_glyph() in the file apps/graphics/pdcurs34/nuttx/pdcdisp.c - derives from nxglib_copyrectangle() and all of those issues have been - resolved in that file. - - Other framebuffer rendering functions probably have similar issues. - Status: Open - Priority: Low.
It is not surprising that there would be bugs in this logic: - I have never encountered a hardware framebuffer with sub-byte pixel - depth. If such a beast ever shows up, then this priority would be - higher. - - Title: INCOMPLETE PLANAR COLOR SUPPORT - Description: The original NX design included support for planar colors, - i.e., for devices that provide separate framebuffers for each - color component. Planar graphics hardware was common some years - back but is rarely encountered today. In fact, I am not aware - of any MCU that implements planar framebuffers. - - Support for planar colors is, however, unverified and - incomplete. In fact, many recent changes explicitly assume a - single color plane: Planar colors are specified by an array - of components; some recent logic uses only component [0], - ignoring the possible existence of other color component frames. - - Completely removing planar color support is one reasonable - option; it is not likely that NuttX will encounter planar - color hardware and this would greatly simplify the logic and - eliminate inconsistencies in the implementation. - Status: Open - Priority: Low. There is no problem other than one of aesthetics. - -o Build system - ^^^^^^^^^^^^ - - Title: MAKE EXPORT LIMITATIONS - Description: The top-level Makefile has an 'export' target that will bundle up all of the - NuttX libraries, header files, and the startup object into an export-able - tarball. This target uses the tools/mkexport.sh script. Issues: - - 1. This script assumes the host archiver ar, which may not be appropriate for - non-GCC toolchains - 2. For the kernel build, the user libraries should be built into some - libuser.a. The list of user libraries would have to be accepted with - some new argument, perhaps -u. - Status: Open - Priority: Low. - -o Other drivers (drivers/) - ^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: SYSLOG OUTPUT LOST ON A CRASH - Description: Flush syslog output on crash.
I don't know how to do this in the - character driver case with interrupts disabled. It would be - easy to flush the interrupt buffer, but not the - data buffered within a character driver (such as the serial - driver). - - Perhaps there could be a crash dump IOCTL command to flush - that buffered data with interrupts disabled? - Status: Open - Priority: Low. It would be a convenience and would simplify crash - debug if you could see all of the SYSLOG output up to the - time of the crash. But not essential. - - Title: SERIAL DRIVER WITH DMA DOES NOT DISCARD OOB CHARACTERS - Description: If Ctrl-Z or Ctrl-C actions are enabled, the OOB - character that generates the signal action must not be placed - in the serial driver Rx buffer. This behavior is correct for - the non-DMA case (serial_io.c), but not for the DMA case - (serial_dma.c). In the DMA case, the OOB character is left - in the Rx buffer and will be received as normal Rx data by - the application. It should not work that way. - - Perhaps in the DMA case, the OOB characters could be filtered - out later, just before returning the Rx data to the application? - Status: Open - Priority: Low, provided that the application can handle these characters - in the data stream. - -o Linux/Cygwin simulation (arch/sim) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: SIMULATOR HAS NO INTERRUPTS (NON-PREEMPTIBLE) - Description: The current simulator implementation has no interrupts and, hence, - is non-preemptible. Also, without simulated interrupts, there can - be no high-fidelity simulated device drivers. - - Currently, all timing and serial input is simulated in the IDLE loop: - When nothing is going on in the simulation, the IDLE loop runs and - fakes timer and UART events. - Status: Open - Priority: Low, unless there is a need for developing a higher fidelity simulation - I have been thinking about how to implement simulated interrupts in - the simulation.
I think a solution would work like this: - https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Simulation - - Title: ROUND-ROBIN SCHEDULING IN THE SIMULATOR - Description: Since the simulation is not pre-emptible, you can't use round-robin - scheduling (no time slicing). Currently, the timer interrupts are - "faked" during IDLE loop processing and, as a result, there is no - task pre-emption because there are no asynchronous events. This could - probably be fixed if the "timer interrupt" were driven by Linux - signals. NOTE: You would also have to implement up_irq_save() and - up_irq_restore() to block and (conditionally) unblock the signal. - Status: Open - Priority: Low - -o ARM (arch/arm/) - ^^^^^^^^^^^^^^^ - - Title: IMPROVED ARM INTERRUPT HANDLING - Description: ARM interrupt handling performance could be improved in some - ways. One easy way is to use a pointer to the context save - area in g_current_regs instead of using up_copystate so much. - - This approach is already implemented for the ARM Cortex-M0, - Cortex-M3, Cortex-M4, and Cortex-A5 families. But it still needs - to be back-ported to the ARM7 and ARM9 (which are nearly - identical to the Cortex-A5 in this regard). The change is - *very* simple for this architecture, but not implemented. - Status: Open. But complete on all ARM platforms except ARM7 and ARM9. - Priority: Low. - - Title: IMPROVED ARM INTERRUPT HANDLING - Description: The ARM and Cortex-M3 interrupt handlers restore all registers - upon return. This could be improved as well: If there is no - context switch, then the static registers need not be restored - because they will not be modified by the called C code.
- (see arch/renesas/src/sh1/sh1_vector.S for example) - Status: Open - Priority: Low - - Title: CORTEX-M3 STACK OVERFLOW - Description: There is a bit of logic in up_fullcontextrestore() that executes on - return from interrupts (and other context switches) that looks like: - - ldr r1, [r0, #(4*REG_CPSR)] /* Fetch the stored CPSR value */ - msr cpsr, r1 /* Set the CPSR */ - - /* Now recover r0 and r1 */ - - ldr r0, [sp] - ldr r1, [sp, #4] - add sp, sp, #(2*4) - - /* Then return to the address at the top of the stack, - * destroying the stack frame - */ - - ldr pc, [sp], #4 - - Under conditions of excessively high interrupt load, many - nested interrupts can occur just after the 'msr cpsr' instruction. - At that time, there are 4 bytes on the stack and, with each - interrupt, the stack pointer may increment and possibly overflow. - - This can happen only under conditions of continuous interrupts. - One suggested change is: - - ldr r1, [r0, #(4*REG_CPSR)] /* Fetch the stored CPSR value */ - msr spsr_cxsf, r1 /* Set the SPSR */ - ldmia r0, {r0-r15}^ - - But this has not been proven to be a solution. - - UPDATE: Other ARM architectures have a similar issue. - - Status: Open - Priority: Low. The condition of continuous interrupts is really the problem. - If your design needs continuous interrupts like this, please try - the above change and, please, submit a patch with the working fix. - - Title: IMPROVED TASK START-UP AND SYSCALL RETURN - Description: Couldn't up_start_task and up_start_pthread syscalls be - eliminated? Wouldn't this work to get us from kernel- - to user-mode with a system trap: - - lda r13, #address - str rn, [r13] - msr spsr_SVC, rm - ld r13,{r15}^ - - Would also need to set r13_USER and r14_USER. For new - SYS_context_switch... couldn't we do the same thing? - - Also... System calls use traps to get from user- to kernel- - mode to perform OS services. That is necessary to get from - user- to kernel-mode.
But then another trap is used to get - from kernel- back to user-mode. It seems like this second - trap should be unnecessary. We should be able to do the - same kind of logic to do this. - Status: Open - Priority: Low-ish, but a good opportunity for performance improvement. - - Title: USE COMMON VECTOR LOGIC IN ALL ARM ARCHITECTURES. - Description: Originally, each ARMv7-M MCU architecture had its own - private implementation for interrupt vectors and interrupt - handling logic. This was superseded by common interrupt - vector logic but these private implementations were never - removed from older MCU architectures. This is turning into - a maintenance issue because any improvements to the common - vector handling must also be re-implemented for each of the - older MCU architectures. - Status: Open - Priority: Low. A pain in the ass and an annoying implementation, but - not really an issue otherwise. - -o Network Utilities (apps/netutils/) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: UNVERIFIED THTTPD FEATURES - Description: Not all THTTPD features/options have been verified. In - particular, there is no test case of a CGI program receiving - POST input. Only the configuration of apps/examples/thttpd - has been tested. - Status: Open - Priority: Medium - -o NuttShell (NSH) (apps/nshlib) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - See some NSH issues under "Kernel/Protected Build" as well. - - Title: IFCONFIG AND MULTIPLE NETWORK INTERFACES - Description: The ifconfig command will not behave correctly if an interface - is provided and there are multiple interfaces. It should only - show status for the single interface on the command line; instead, it - still shows status for all interfaces. - Status: Open - Priority: Low - -o System libraries apps/system (apps/system) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: READLINE IMPLEMENTATION - Description: readline implementation does not use C-buffered I/O, but rather - talks to serial driver directly via read().
It includes VT-100 - specific editing commands. A more generic readline() should be - implemented using termios' tcsetattr() to put the serial driver - into a "raw" mode. - Status: Open - Priority: Low (unless you are using mixed C-buffered I/O with readline and - fgetc, for example). - - Title: apps/system PARTITIONING - Description: Several of the USB device helper applications in apps/system - violate OS/application partitioning and will fail on a kernel - or protected build. Many of these have been fixed by adding - the BOARDIOC_USBDEV_CONTROL boardctl() command. But there are - still issues. - - These functions still call directly into operating system - functions: - - - usbmsc_configure - Called from apps/system/usbmsc and - apps/system/composite - - usbmsc_bindlun - Called from apps/system/usbmsc - - usbmsc_exportluns - Called from apps/system/usbmsc. - - Status: Open - Priority: Medium/High -- the kernel build configuration is not fully fielded - yet. - -o Modbus (apps/modbus) - ^^^^^^^^^^^^^^^^^^^^ - - Title: MODBUS NOT USABLE WITH USB SERIAL - Description: Modbus can be used with USB serial; however, if the USB - serial connection is lost, Modbus will hang in an infinite - loop. - - This is a problem in the handling of select() and read() - and could probably be resolved by studying the Modbus error - handling. - - A more USB-friendly solution would be to: (1) Re-connect and - (2) re-open the serial drivers. That is what is done in NSH. - When the serial USB device is removed, this terminates the - session and NSH will then try to re-open the USB device. See - the function nsh_waitusbready() in the file - apps/nshlib/nsh_usbconsole.c. When the USB serial is - reconnected the open() in the function will succeed and a new - session will be started. - Status: Open - Priority: Low. This is really an enhancement request: Modbus was never - designed to work with removable serial devices.
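The re-connect and re-open idea from the Modbus entry above might be sketched roughly as follows. This is only an illustration using plain POSIX open(), read(), close() and sleep(); the helper names serial_reopen() and serial_read_reconnect() are hypothetical and are not part of Modbus, NSH, or NuttX, and the retry policy is an assumption rather than what nsh_waitusbready() actually does.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: retry open() until the device node reappears or
 * the retry budget is used up.  Returns a file descriptor or -1.
 */

int serial_reopen(const char *devpath, int max_tries, unsigned int delay_sec)
{
  int i;

  for (i = 0; i < max_tries; i++)
    {
      int fd = open(devpath, O_RDWR);
      if (fd >= 0)
        {
          return fd;          /* Device is back; start a new session */
        }

      sleep(delay_sec);       /* Wait before polling for the device again */
    }

  return -1;                  /* Device did not come back in time */
}

/* Hypothetical read wrapper: on a hard read error, close the stale
 * descriptor and try to re-open the device instead of looping forever.
 * Returns the byte count from read(), 0 after a successful re-open
 * (no data yet), or -1 if the device could not be re-opened.
 */

ssize_t serial_read_reconnect(int *fd, const char *devpath,
                              void *buf, size_t len,
                              int max_tries, unsigned int delay_sec)
{
  ssize_t nread = read(*fd, buf, len);

  if (nread < 0 && errno != EINTR && errno != EAGAIN)
    {
      close(*fd);             /* Drop the stale descriptor */
      *fd = serial_reopen(devpath, max_tries, delay_sec);
      return (*fd >= 0) ? 0 : -1;
    }

  return nread;
}
```

A caller would keep the descriptor in a variable, pass its address to serial_read_reconnect(), and treat a return of -1 as "device gone for good" instead of spinning on read().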
- -o Other Applications & Tests (apps/examples/) - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - Title: EXAMPLES/PIPE ON CYGWIN - Description: The redirection test (part of examples/pipe) terminates - incorrectly on the Cygwin-based simulation platform (but works - fine on the Linux-based simulation platform). - Status: Open - Priority: Low - - Title: EXAMPLES/SENDMAIL UNTESTED - Description: examples/sendmail is untested on the target (it has been tested - on the host, but not on the target). - Status: Open - Priority: Med - - Title: EXAMPLES/NX FONT CACHING - Description: The font caching logic in examples/nx is incomplete. Fonts are - added to the cache, but never removed. When the cache is full - it stops rendering. This is not a problem for the examples/nx - code because it uses so few fonts, but if the logic were - leveraged for more general purposes, it would be a problem. - - Update: see examples/nxtext for some improved font cache handling. - Update: The NXTERM font cache has been generalized and is now - offered as the standard, common font cache for all applications. - Both the nx and nxtext examples should be modified to use this - common font cache. See interfaces defined in nxfonts.h. - Status: Open - Priority: Low. This is not really a problem because examples/nx works - fine with its bogus font caching. - - Title: EXAMPLES/NXTEXT ARTIFACTS - Description: examples/nxtext. Artifacts when the pop-up window is opened. - There are some artifacts that appear in the upper left hand - corner. These seem to be related to window creation. A - tiny artifact would not be surprising (the initial window - should lie at (0,0) and be of size (1,1)), but sometimes - the artifact is larger. - Status: Open - Priority: Medium. - - Title: ILLEGAL CALLS TO romdisk_register() - Description: Several examples (and other things under apps/) make illegal - calls to romdisk_register().
This both violates the portable - POSIX OS interface and makes these applications un-usable in - PROTECTED and KERNEL build modes. - - Non-compliant examples include: - - examples/bastest, examples/elf, examples/module, - examples/nxflat, examples/posix_spawn, examples/romfs, - examples/sotest, examples/thttpd, examples/unionfs - - These examples are simple demos and, hence, you could argue that - it is not so bad that they violate the interface for the purpose - of demonstration (although they do set a bad example because of - this). - - These examples should, of course, use boardctl(BOARDIOC_ROMDISK) - to create the ROM disk instead of calling romdisk_register() - directly. - Status: Open - Priority: Medium.