Decouple the semcount and the work queue length.
Previous Problem:
If a work is queued and cancelled in high priority threads (or queued
by timer and cancelled by another high priority thread) before
work_thread runs, the queue operation will mark work_thread as ready to
run, but the cancel operation minus the semcount back to -1 and makes
wqueue->q empty. Then the work_thread still runs, found empty queue,
and wait sem again, then semcount becomes -2 (being minused by 1)
This can be done multiple times, then semcount can become very small
value. Test case to produce incorrect semcount:
high_priority_task()
{
for (int i = 0; i < 10000; i++)
{
work_queue(LPWORK, &work, worker, NULL, 0);
work_cancel(LPWORK, &work);
usleep(1);
}
/* Now the g_lpwork.sem.semcount is a value near -10000 */
}
With incorrect semcount, any queue operation when the work_thread is
busy, will only increase semcount and push work into queue, but cannot
trigger work_thread (semcount is negative but work_thread is not
waiting), then there will be more and more works left in queue while
the work_thread is waiting sem and cannot call them.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
The number of work entries will be inconsistent with semaphore count
if the work is canceled, in extreme case, semaphore count will overflow
and fallback to 0 the workqueue will stop scheduling the enqueue work.
Signed-off-by: chao an <anchao@xiaomi.com>
Error: module/mod_insmod.c:203:3: error: 'strncpy' specified bound 16 equals destination size [-Werror=stringop-truncation]
203 | strncpy(modp->modname, modname, MODLIB_NAMEMAX);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
wqueue/kwork_thread.c: In function 'work_start_lowpri':
Error: wqueue/kwork_thread.c:212:22: error: '%lx' directive output may be truncated writing between 1 and 16 bytes into a region of size 14 [-Werror=format-truncation=]
212 | snprintf(args, 16, "0x%" PRIxPTR, (uintptr_t)wqueue);
local/local_sockif.c: In function 'local_getsockname':
Error: local/local_sockif.c:392:11: error: 'strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
392 | strncpy(unaddr->sun_path, conn->lc_path, namelen);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
chip/esp32_wifi_utils.c: In function 'esp_wifi_scan_event_parse':
Error: chip/esp32_wifi_utils.c:373:37: error: argument to 'sizeof' in 'memset' call is the same expression as the destination; did you mean to dereference it? [-Werror=sizeof-pointer-memaccess]
memset(ap_list_buffer, 0x0, sizeof(ap_list_buffer));
^
stdio/lib_fputs.c: In function 'fputs':
Error: stdio/lib_fputs.c:99:9: error: nonnull argument 's' compared to NULL [-Werror=nonnull-compare]
if (s == NULL || stream == NULL)
^
Error: stdio/lib_fputs.c:99:27: error: nonnull argument 'stream' compared to NULL [-Werror=nonnull-compare]
if (s == NULL || stream == NULL)
^
stdio/lib_vfprintf.c: In function 'vfprintf':
Error: stdio/lib_vfprintf.c:40:6: error: nonnull argument 'stream' compared to NULL [-Werror=nonnull-compare]
if (stream)
^
string/lib_strdup.c: In function 'strdup':
Error: string/lib_strdup.c:39:6: error: nonnull argument 's' compared to NULL [-Werror=nonnull-compare]
if (s)
^
string/lib_strndup.c: In function 'strndup':
Error: string/lib_strndup.c:56:6: error: nonnull argument 's' compared to NULL [-Werror=nonnull-compare]
if (s)
^
string/lib_strpbrk.c: In function 'strpbrk':
Error: string/lib_strpbrk.c:39:7: error: nonnull argument 'str' compared to NULL [-Werror=nonnull-compare]
if (!str || !charset)
^~~~
Error: string/lib_strpbrk.c:39:15: error: nonnull argument 'charset' compared to NULL [-Werror=nonnull-compare]
if (!str || !charset)
^~~~~~~~
string/lib_strrchr.c: In function 'strrchr':
Error: string/lib_strrchr.c:40:6: error: nonnull argument 's' compared to NULL [-Werror=nonnull-compare]
if (s)
^
Error: time/lib_asctimer.c:73:50: error: '%d' directive output may be truncated writing between 1 and 11 bytes into a region of size between 0 and 12 [-Werror=format-truncation=]
snprintf(buf, 26, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
^~
time/lib_asctimer.c:73:21: note: directive argument in the range [-2147481748, 2147483647]
snprintf(buf, 26, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
time/lib_asctimer.c:73:3: note: 'snprintf' output between 17 and 68 bytes into a destination of size 26
snprintf(buf, 26, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
g_wday_name[tp->tm_wday], g_mon_name[tp->tm_mon],
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tp->tm_mday, tp->tm_hour, tp->tm_min, tp->tm_sec,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1900 + tp->tm_year);
~~~~~~~~~~~~~~~~~~~
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
include/nuttx/wqueue.h:
libs/libc/wqueue/work_cancel.c:
libs/libc/wqueue/work_queue.c:
sched/wqueue/kwork_cancel.c:
sched/wqueue/kwork_queue.c:
* Fix spelling, grammar, and typos.
* Improve wording in a few areas.
* These changes affect comments only. No functional changes.
Resolution of Issue 619 will require multiple steps, this part of the first step in that resolution: Every call to nxsem_wait_uninterruptible() must handle the return value from nxsem_wait_uninterruptible properly. This commit is only for those files under graphics/, mm/, net/, sched/, wireless/bluetooth.
Still to do: Files under fs/, drivers/, and arch. The last is 116 files and will take some effort.
kworker dqueue corruption if the garbage in work->worker handler
Change-Id: Ice5e66e8ed080a7539d5fe02f946a66794cbda7d
Signed-off-by: chao.an <anchao@xiaomi.com>
the width of all block comments. Includes a check to assure that all block
comments use the same line width.
Verified against all .c files under /sched. There were a few cosmetic changes to the coding style under /sched to account to new, correctly detected problems in the /sched files.
* Simplify EINTR/ECANCEL error handling
1. Add semaphore uninterruptible wait function
2 .Replace semaphore wait loop with a single uninterruptible wait
3. Replace all sem_xxx to nxsem_xxx
* Unify the void cast usage
1. Remove void cast for function because many place ignore the returned value witout cast
2. Replace void cast for variable with UNUSED macro
Author: Gregory Nutt <gnutt@nuttx.org>
Run all .h and .c files modified in last PR through nxstyle.
Author: Xiang Xiao <xiaoxiang@xiaomi.com>
Net cleanup (#17)
* Fix the semaphore usage issue found in tcp/udp
1. The count semaphore need disable priority inheritance
2. Loop again if net_lockedwait return -EINTR
3. Call nxsem_trywait to avoid the race condition
4. Call nxsem_post instead of sem_post
* Put the work notifier into free list to avoid the heap fragment in the long run. Since the allocation strategy is encapsulated internally, we can even refine the implementation later.
* Network stack shouldn't allocate memory in the poll implementation to avoid the heap fragment in the long run, other modification include:
1. Select MM_IOB automatically since ICMP[v6] socket can't work without the read ahead buffer
2. Remove the net lock since xxx_callback_free already do the same thing
3. TCP/UDP poll should work even the read ahead buffer isn't enabled at all
* Add NET_ prefix for UDP_NOTIFIER and TCP_NOTIFIER option to align with other UDP/TCP option convention
* Remove the unused _SF_[IDLE|ACCEPT|SEND|RECV|MASK] flags since there are code to set/clear these flags, but nobody check them.
sched/wqueue/kwork_notifier.c: Redesign some data structures. struct works_s must appear at the beginning of the notifier entry structure. That is because it contains the work queue indices. This solves a harfault issue.
net/tcp/tcp_netpoll.c: tcp_iob_work() needs to free the allocated argument when it is finished.
net/tcp/tcp_send_buffered.c: Extend psock_tcp_cansend() so that it also requires that at least on IOB is also avaialble.
mm/iob: iob_navail() was returning the number of free IOB chain queue entries, not the number of free IOBs. Completely misnamed.
net/tcp/tcp_netpoll.c: Add logic to receive notifications when IOBs are freed (Needs CONFIG_NET_TCP_WRITE_BUFFERS and CONFIG_IOB_NOTIFIER). At present, does nothing because the logic in in psock_tcp_cansend() does not check for the availability of IOBs. That will change.