| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
media: v4l2-core: Fix a potential resource leak in v4l2_fwnode_parse_link()
If fwnode_graph_get_remote_endpoint() fails, 'fwnode' is known to be NULL,
so fwnode_handle_put() is a no-op.
Release the reference taken from a previous fwnode_graph_get_port_parent()
call instead.
Also handle fwnode_graph_get_port_parent() failures.
In order to fix these issues, add an error handling path to the function
and the needed gotos. |
| In the Linux kernel, the following vulnerability has been resolved:
usb: typec: altmodes/displayport: fix pin_assignment_show
This patch fixes negative indexing of buf array in pin_assignment_show
when get_current_pin_assignments returns 0 i.e. no compatible pin
assignments are found.
BUG: KASAN: use-after-free in pin_assignment_show+0x26c/0x33c
...
Call trace:
dump_backtrace+0x110/0x204
dump_stack_lvl+0x84/0xbc
print_report+0x358/0x974
kasan_report+0x9c/0xfc
__do_kernel_fault+0xd4/0x2d4
do_bad_area+0x48/0x168
do_tag_check_fault+0x24/0x38
do_mem_abort+0x6c/0x14c
el1_abort+0x44/0x68
el1h_64_sync_handler+0x64/0xa4
el1h_64_sync+0x78/0x7c
pin_assignment_show+0x26c/0x33c
dev_attr_show+0x50/0xc0 |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: mt7996: fix memory leak in mt7996_mcu_exit
Always purge mcu skb queues in mt7996_mcu_exit routine even if
mt7996_firmware_state fails. |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix null pointer panic in tracepoint in __replace_atomic_write_block
We got a kernel panic if old_addr is NULL.
https://bugzilla.kernel.org/show_bug.cgi?id=217266
BUG: kernel NULL pointer dereference, address: 0000000000000000
Call Trace:
<TASK>
f2fs_commit_atomic_write+0x619/0x990 [f2fs a1b985b80f5babd6f3ea778384908880812bfa43]
__f2fs_ioctl+0xd8e/0x4080 [f2fs a1b985b80f5babd6f3ea778384908880812bfa43]
? vfs_write+0x2ae/0x3f0
? vfs_write+0x2ae/0x3f0
__x64_sys_ioctl+0x91/0xd0
do_syscall_64+0x5c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f69095fe53f |
| In the Linux kernel, the following vulnerability has been resolved:
exfat: use kvmalloc_array/kvfree instead of kmalloc_array/kfree
The call stack shown below is a scenario in the Linux 4.19 kernel.
Allocating memory failed where exfat fs use kmalloc_array due to
system memory fragmentation, while the u-disk was inserted without
recognition.
Devices such as u-disk using the exfat file system are pluggable and
may be insert into the system at any time.
However, long-term running systems cannot guarantee the continuity of
physical memory. Therefore, it's necessary to address this issue.
Binder:2632_6: page allocation failure: order:4,
mode:0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
Call trace:
[242178.097582] dump_backtrace+0x0/0x4
[242178.097589] dump_stack+0xf4/0x134
[242178.097598] warn_alloc+0xd8/0x144
[242178.097603] __alloc_pages_nodemask+0x1364/0x1384
[242178.097608] kmalloc_order+0x2c/0x510
[242178.097612] kmalloc_order_trace+0x40/0x16c
[242178.097618] __kmalloc+0x360/0x408
[242178.097624] load_alloc_bitmap+0x160/0x284
[242178.097628] exfat_fill_super+0xa3c/0xe7c
[242178.097635] mount_bdev+0x2e8/0x3a0
[242178.097638] exfat_fs_mount+0x40/0x50
[242178.097643] mount_fs+0x138/0x2e8
[242178.097649] vfs_kern_mount+0x90/0x270
[242178.097655] do_mount+0x798/0x173c
[242178.097659] ksys_mount+0x114/0x1ac
[242178.097665] __arm64_sys_mount+0x24/0x34
[242178.097671] el0_svc_common+0xb8/0x1b8
[242178.097676] el0_svc_handler+0x74/0x90
[242178.097681] el0_svc+0x8/0x340
By analyzing the exfat code,we found that continuous physical memory
is not required here,so kvmalloc_array is used can solve this problem. |
| In the Linux kernel, the following vulnerability has been resolved:
rxrpc: Fix timeout of a call that hasn't yet been granted a channel
afs_make_call() calls rxrpc_kernel_begin_call() to begin a call (which may
get stalled in the background waiting for a connection to become
available); it then calls rxrpc_kernel_set_max_life() to set the timeouts -
but that starts the call timer so the call timer might then expire before
we get a connection assigned - leading to the following oops if the call
stalled:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
CPU: 1 PID: 5111 Comm: krxrpcio/0 Not tainted 6.3.0-rc7-build3+ #701
RIP: 0010:rxrpc_alloc_txbuf+0xc0/0x157
...
Call Trace:
<TASK>
rxrpc_send_ACK+0x50/0x13b
rxrpc_input_call_event+0x16a/0x67d
rxrpc_io_thread+0x1b6/0x45f
? _raw_spin_unlock_irqrestore+0x1f/0x35
? rxrpc_input_packet+0x519/0x519
kthread+0xe7/0xef
? kthread_complete_and_exit+0x1b/0x1b
ret_from_fork+0x22/0x30
Fix this by noting the timeouts in struct rxrpc_call when the call is
created. The timer will be started when the first packet is transmitted.
It shouldn't be possible to trigger this directly from userspace through
AF_RXRPC as sendmsg() will return EBUSY if the call is in the
waiting-for-conn state if it dropped out of the wait due to a signal. |
| In the Linux kernel, the following vulnerability has been resolved:
Revert "Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work"
This reverts commit 1e9ac114c4428fdb7ff4635b45d4f46017e8916f.
This patch introduces a possible null-ptr-def problem. Revert it. And the
fixed bug by this patch have resolved by commit 73f7b171b7c0 ("Bluetooth:
btsdio: fix use after free bug in btsdio_remove due to race condition"). |
| In the Linux kernel, the following vulnerability has been resolved:
tty: fix out-of-bounds access in tty_driver_lookup_tty()
When specifying an invalid console= device like console=tty3270,
tty_driver_lookup_tty() returns the tty struct without checking
whether index is a valid number.
To reproduce:
qemu-system-x86_64 -enable-kvm -nographic -serial mon:stdio \
-kernel ../linux-build-x86/arch/x86/boot/bzImage \
-append "console=ttyS0 console=tty3270"
This crashes with:
[ 0.770599] BUG: kernel NULL pointer dereference, address: 00000000000000ef
[ 0.771265] #PF: supervisor read access in kernel mode
[ 0.771773] #PF: error_code(0x0000) - not-present page
[ 0.772609] Oops: 0000 [#1] PREEMPT SMP PTI
[ 0.774878] RIP: 0010:tty_open+0x268/0x6f0
[ 0.784013] chrdev_open+0xbd/0x230
[ 0.784444] ? cdev_device_add+0x80/0x80
[ 0.784920] do_dentry_open+0x1e0/0x410
[ 0.785389] path_openat+0xca9/0x1050
[ 0.785813] do_filp_open+0xaa/0x150
[ 0.786240] file_open_name+0x133/0x1b0
[ 0.786746] filp_open+0x27/0x50
[ 0.787244] console_on_rootfs+0x14/0x4d
[ 0.787800] kernel_init_freeable+0x1e4/0x20d
[ 0.788383] ? rest_init+0xc0/0xc0
[ 0.788881] kernel_init+0x11/0x120
[ 0.789356] ret_from_fork+0x22/0x30 |
| In the Linux kernel, the following vulnerability has been resolved:
drm/msm/adreno: Fix null ptr access in adreno_gpu_cleanup()
Fix the below kernel panic due to null pointer access:
[ 18.504431] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
[ 18.513464] Mem abort info:
[ 18.516346] ESR = 0x0000000096000005
[ 18.520204] EC = 0x25: DABT (current EL), IL = 32 bits
[ 18.525706] SET = 0, FnV = 0
[ 18.528878] EA = 0, S1PTW = 0
[ 18.532117] FSC = 0x05: level 1 translation fault
[ 18.537138] Data abort info:
[ 18.540110] ISV = 0, ISS = 0x00000005
[ 18.544060] CM = 0, WnR = 0
[ 18.547109] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000112826000
[ 18.553738] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 18.562690] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
**Snip**
[ 18.696758] Call trace:
[ 18.699278] adreno_gpu_cleanup+0x30/0x88
[ 18.703396] a6xx_destroy+0xc0/0x130
[ 18.707066] a6xx_gpu_init+0x308/0x424
[ 18.710921] adreno_bind+0x178/0x288
[ 18.714590] component_bind_all+0xe0/0x214
[ 18.718797] msm_drm_bind+0x1d4/0x614
[ 18.722566] try_to_bring_up_aggregate_device+0x16c/0x1b8
[ 18.728105] __component_add+0xa0/0x158
[ 18.732048] component_add+0x20/0x2c
[ 18.735719] adreno_probe+0x40/0xc0
[ 18.739300] platform_probe+0xb4/0xd4
[ 18.743068] really_probe+0xfc/0x284
[ 18.746738] __driver_probe_device+0xc0/0xec
[ 18.751129] driver_probe_device+0x48/0x110
[ 18.755421] __device_attach_driver+0xa8/0xd0
[ 18.759900] bus_for_each_drv+0x90/0xdc
[ 18.763843] __device_attach+0xfc/0x174
[ 18.767786] device_initial_probe+0x20/0x2c
[ 18.772090] bus_probe_device+0x40/0xa0
[ 18.776032] deferred_probe_work_func+0x94/0xd0
[ 18.780686] process_one_work+0x190/0x3d0
[ 18.784805] worker_thread+0x280/0x3d4
[ 18.788659] kthread+0x104/0x1c0
[ 18.791981] ret_from_fork+0x10/0x20
[ 18.795654] Code: f9400408 aa0003f3 aa1f03f4 91142015 (f9402516)
[ 18.801913] ---[ end trace 0000000000000000 ]---
[ 18.809039] Kernel panic - not syncing: Oops: Fatal exception
Patchwork: https://patchwork.freedesktop.org/patch/515605/ |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: avoid out of bounds access in decode_preauth_ctxt()
Confirm that the accessed pneg_ctxt->HashAlgorithms address sits within
the SMB request boundary; deassemble_neg_contexts() only checks that the
eight byte smb2_neg_context header + (client controlled) DataLength are
within the packet boundary, which is insufficient.
Checking for sizeof(struct smb2_preauth_neg_context) is overkill given
that the type currently assumes SMB311_SALT_SIZE bytes of trailing Salt. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: always release netdev hooks from notifier
This reverts "netfilter: nf_tables: skip netdev events generated on netns removal".
The problem is that when a veth device is released, the veth release
callback will also queue the peer netns device for removal.
Its possible that the peer netns is also slated for removal. In this
case, the device memory is already released before the pre_exit hook of
the peer netns runs:
BUG: KASAN: slab-use-after-free in nf_hook_entry_head+0x1b8/0x1d0
Read of size 8 at addr ffff88812c0124f0 by task kworker/u8:1/45
Workqueue: netns cleanup_net
Call Trace:
nf_hook_entry_head+0x1b8/0x1d0
__nf_unregister_net_hook+0x76/0x510
nft_netdev_unregister_hooks+0xa0/0x220
__nft_release_hook+0x184/0x490
nf_tables_pre_exit_net+0x12f/0x1b0
..
Order is:
1. First netns is released, veth_dellink() queues peer netns device
for removal
2. peer netns is queued for removal
3. peer netns device is released, unreg event is triggered
4. unreg event is ignored because netns is going down
5. pre_exit hook calls nft_netdev_unregister_hooks but device memory
might be free'd already. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/i915: fix race condition UAF in i915_perf_add_config_ioctl
Userspace can guess the id value and try to race oa_config object creation
with config remove, resulting in a use-after-free if we dereference the
object after unlocking the metrics_lock. For that reason, unlocking the
metrics_lock must be done after we are done dereferencing the object.
[tursulin: Manually added stable tag.]
(cherry picked from commit 49f6f6483b652108bcb73accd0204a464b922395) |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr
When smb1 mount fails, KASAN detect slab-out-of-bounds in
init_smb2_rsp_hdr like the following one.
For smb1 negotiate(56bytes) , init_smb2_rsp_hdr() for smb2 is called.
The issue occurs while handling smb1 negotiate as smb2 server operations.
Add smb server operations for smb1 (get_cmd_val, init_rsp_hdr,
allocate_rsp_buf, check_user_session) to handle smb1 negotiate so that
smb2 server operation does not handle it.
[ 411.400423] CIFS: VFS: Use of the less secure dialect vers=1.0 is
not recommended unless required for access to very old servers
[ 411.400452] CIFS: Attempting to mount \\192.168.45.139\homes
[ 411.479312] ksmbd: init_smb2_rsp_hdr : 492
[ 411.479323] ==================================================================
[ 411.479327] BUG: KASAN: slab-out-of-bounds in
init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[ 411.479369] Read of size 16 at addr ffff888488ed0734 by task kworker/14:1/199
[ 411.479379] CPU: 14 PID: 199 Comm: kworker/14:1 Tainted: G
OE 6.1.21 #3
[ 411.479386] Hardware name: ASUSTeK COMPUTER INC. Z10PA-D8
Series/Z10PA-D8 Series, BIOS 3801 08/23/2019
[ 411.479390] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
[ 411.479425] Call Trace:
[ 411.479428] <TASK>
[ 411.479432] dump_stack_lvl+0x49/0x63
[ 411.479444] print_report+0x171/0x4a8
[ 411.479452] ? kasan_complete_mode_report_info+0x3c/0x200
[ 411.479463] ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[ 411.479497] kasan_report+0xb4/0x130
[ 411.479503] ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[ 411.479537] kasan_check_range+0x149/0x1e0
[ 411.479543] memcpy+0x24/0x70
[ 411.479550] init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[ 411.479585] handle_ksmbd_work+0x109/0x760 [ksmbd]
[ 411.479616] ? _raw_spin_unlock_irqrestore+0x50/0x50
[ 411.479624] ? smb3_encrypt_resp+0x340/0x340 [ksmbd]
[ 411.479656] process_one_work+0x49c/0x790
[ 411.479667] worker_thread+0x2b1/0x6e0
[ 411.479674] ? process_one_work+0x790/0x790
[ 411.479680] kthread+0x177/0x1b0
[ 411.479686] ? kthread_complete_and_exit+0x30/0x30
[ 411.479692] ret_from_fork+0x22/0x30
[ 411.479702] </TASK> |
| In the Linux kernel, the following vulnerability has been resolved:
media: ov5675: Fix memleak in ov5675_init_controls()
There is a kmemleak when testing the media/i2c/ov5675.c with bpf mock
device:
AssertionError: unreferenced object 0xffff888107362160 (size 16):
comm "python3", pid 277, jiffies 4294832798 (age 20.722s)
hex dump (first 16 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000abe7d67c>] __kmalloc_node+0x44/0x1b0
[<000000008a725aac>] kvmalloc_node+0x34/0x180
[<000000009a53cd11>] v4l2_ctrl_handler_init_class+0x11d/0x180
[videodev]
[<0000000055b46db0>] ov5675_probe+0x38b/0x897 [ov5675]
[<00000000153d886c>] i2c_device_probe+0x28d/0x680
[<000000004afb7e8f>] really_probe+0x17c/0x3f0
[<00000000ff2f18e4>] __driver_probe_device+0xe3/0x170
[<000000000a001029>] driver_probe_device+0x49/0x120
[<00000000e39743c7>] __device_attach_driver+0xf7/0x150
[<00000000d32fd070>] bus_for_each_drv+0x114/0x180
[<000000009083ac41>] __device_attach+0x1e5/0x2d0
[<0000000015b4a830>] bus_probe_device+0x126/0x140
[<000000007813deaf>] device_add+0x810/0x1130
[<000000007becb867>] i2c_new_client_device+0x386/0x540
[<000000007f9cf4b4>] of_i2c_register_device+0xf1/0x110
[<00000000ebfdd032>] of_i2c_notify+0xfc/0x1f0
ov5675_init_controls() won't clean all the allocated resources in fail
path, which may causes the memleaks. Add v4l2_ctrl_handler_free() to
prevent memleak. |
| In the Linux kernel, the following vulnerability has been resolved:
block: fix blktrace debugfs entries leakage
Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
blk_unregister_queue") moves blk_trace_shutdown() from
blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
is created through sysfs, however, there is a regression in corner
case.
blktrace can still be enabled after del_gendisk() through ioctl if
the disk is opened before del_gendisk(), and if blktrace is not shutdown
through ioctl before closing the disk, debugfs entries will be leaked.
Fix this problem by shutdown blktrace in disk_release(), this is safe
because blk_trace_remove() is reentrant. |
| In the Linux kernel, the following vulnerability has been resolved:
tracing: Fix warning in trace_buffered_event_disable()
Warning happened in trace_buffered_event_disable() at
WARN_ON_ONCE(!trace_buffered_event_ref)
Call Trace:
? __warn+0xa5/0x1b0
? trace_buffered_event_disable+0x189/0x1b0
__ftrace_event_enable_disable+0x19e/0x3e0
free_probe_data+0x3b/0xa0
unregister_ftrace_function_probe_func+0x6b8/0x800
event_enable_func+0x2f0/0x3d0
ftrace_process_regex.isra.0+0x12d/0x1b0
ftrace_filter_write+0xe6/0x140
vfs_write+0x1c9/0x6f0
[...]
The cause of the warning is in __ftrace_event_enable_disable(),
trace_buffered_event_enable() was called once while
trace_buffered_event_disable() was called twice.
Reproduction script show as below, for analysis, see the comments:
```
#!/bin/bash
cd /sys/kernel/tracing/
# 1. Register a 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was set;
# 2) trace_buffered_event_enable() was called first time;
echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
# 2. Enable the event registered, then:
# 1) SOFT_DISABLED_BIT was cleared;
# 2) trace_buffered_event_disable() was called first time;
echo 1 > events/initcall/initcall_finish/enable
# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
# set again!!!
cat /proc/cmdline
# 4. Unregister the 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was cleared again;
# 2) trace_buffered_event_disable() was called second time!!!
echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
```
To fix it, IIUC, we can change to call trace_buffered_event_enable() at
fist time soft-mode enabled, and call trace_buffered_event_disable() at
last time soft-mode disabled. |
| In the Linux kernel, the following vulnerability has been resolved:
clocksource/drivers/cadence-ttc: Fix memory leak in ttc_timer_probe
Smatch reports:
drivers/clocksource/timer-cadence-ttc.c:529 ttc_timer_probe()
warn: 'timer_baseaddr' from of_iomap() not released on lines: 498,508,516.
timer_baseaddr may have the problem of not being released after use,
I replaced it with the devm_of_iomap() function and added the clk_put()
function to cleanup the "clk_ce" and "clk_cs". |
| In the Linux kernel, the following vulnerability has been resolved:
posix-timers: Ensure timer ID search-loop limit is valid
posix_timer_add() tries to allocate a posix timer ID by starting from the
cached ID which was stored by the last successful allocation.
This is done in a loop searching the ID space for a free slot one by
one. The loop has to terminate when the search wrapped around to the
starting point.
But that's racy vs. establishing the starting point. That is read out
lockless, which leads to the following problem:
CPU0 CPU1
posix_timer_add()
start = sig->posix_timer_id;
lock(hash_lock);
... posix_timer_add()
if (++sig->posix_timer_id < 0)
start = sig->posix_timer_id;
sig->posix_timer_id = 0;
So CPU1 can observe a negative start value, i.e. -1, and the loop break
never happens because the condition can never be true:
if (sig->posix_timer_id == start)
break;
While this is unlikely to ever turn into an endless loop as the ID space is
huge (INT_MAX), the racy read of the start value caught the attention of
KCSAN and Dmitry unearthed that incorrectness.
Rewrite it so that all id operations are under the hash lock. |
| In the Linux kernel, the following vulnerability has been resolved:
virtio_pmem: add the missing REQ_OP_WRITE for flush bio
When doing mkfs.xfs on a pmem device, the following warning was
------------[ cut here ]------------
WARNING: CPU: 2 PID: 384 at block/blk-core.c:751 submit_bio_noacct
Modules linked in:
CPU: 2 PID: 384 Comm: mkfs.xfs Not tainted 6.4.0-rc7+ #154
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:submit_bio_noacct+0x340/0x520
......
Call Trace:
<TASK>
? submit_bio_noacct+0xd5/0x520
submit_bio+0x37/0x60
async_pmem_flush+0x79/0xa0
nvdimm_flush+0x17/0x40
pmem_submit_bio+0x370/0x390
__submit_bio+0xbc/0x190
submit_bio_noacct_nocheck+0x14d/0x370
submit_bio_noacct+0x1ef/0x520
submit_bio+0x55/0x60
submit_bio_wait+0x5a/0xc0
blkdev_issue_flush+0x44/0x60
The root cause is that submit_bio_noacct() needs bio_op() is either
WRITE or ZONE_APPEND for flush bio and async_pmem_flush() doesn't assign
REQ_OP_WRITE when allocating flush bio, so submit_bio_noacct just fail
the flush bio.
Simply fix it by adding the missing REQ_OP_WRITE for flush bio. And we
could fix the flush order issue and do flush optimization later. |
| In the Linux kernel, the following vulnerability has been resolved:
ixgbe: Fix panic during XDP_TX with > 64 CPUs
Commit 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
adds support to allow XDP programs to run on systems with more than
64 CPUs by locking the XDP TX rings and indexing them using cpu % 64
(IXGBE_MAX_XDP_QS).
Upon trying this out patch on a system with more than 64 cores,
the kernel paniced with an array-index-out-of-bounds at the return in
ixgbe_determine_xdp_ring in ixgbe.h, which means ixgbe_determine_xdp_q_idx
was just returning the cpu instead of cpu % IXGBE_MAX_XDP_QS. An example
splat:
==========================================================================
UBSAN: array-index-out-of-bounds in
/var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
index 65 is out of range for type 'ixgbe_ring *[64]'
==========================================================================
BUG: kernel NULL pointer dereference, address: 0000000000000058
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 65 PID: 408 Comm: ksoftirqd/65
Tainted: G IOE 5.15.0-48-generic #54~20.04.1-Ubuntu
Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
FS: 0000000000000000(0000) GS:ffff92b9dfc00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
ixgbe_poll+0x103e/0x1280 [ixgbe]
? sched_clock_cpu+0x12/0xe0
__napi_poll+0x30/0x160
net_rx_action+0x11c/0x270
__do_softirq+0xda/0x2ee
run_ksoftirqd+0x2f/0x50
smpboot_thread_fn+0xb7/0x150
? sort_range+0x30/0x30
kthread+0x127/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
I think this is how it happens:
Upon loading the first XDP program on a system with more than 64 CPUs,
ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup. However,
immediately after this, the rings are reconfigured by ixgbe_setup_tc.
ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
it is non-zero. Commenting out the decrement in ixgbe_free_q_vector
stopped my system from panicing.
I suspect to make the original patch work, I would need to load an XDP
program and then replace it in order to get ixgbe_xdp_locking_key back
above 0 since ixgbe_setup_tc is only called when transitioning between
XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
incremented every time ixgbe_xdp_setup is called.
Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
with more than 64 CPUs.
Since ixgbe_xdp_locking_key only protects the XDP_TX path and is tied
to the number of CPUs present, there is no reason to disable it upon
unloading an XDP program. To avoid confusion, I have moved enabling
ixgbe_xdp_locking_key into ixgbe_sw_init, which is part of the probe path. |