| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray
Patch series "mm/filemap: Limit page cache size to that supported by
xarray", v2.
Currently, xarray can't support arbitrary page cache size. More details
can be found from the WARN_ON() statement in xas_split_alloc(). In our
test whose code is attached below, we hit the WARN_ON() on ARM64 system
where the base page size is 64KB and huge page size is 512MB. The issue
was reported long time ago and some discussions on it can be found here
[1].
[1] https://www.spinics.net/lists/linux-xfs/msg75404.html
In order to fix the issue, we need to adjust MAX_PAGECACHE_ORDER to one
supported by xarray and avoid PMD-sized page cache if needed. The code
changes are suggested by David Hildenbrand.
PATCH[1] adjusts MAX_PAGECACHE_ORDER to that supported by xarray
PATCH[2-3] avoids PMD-sized page cache in the synchronous readahead path
PATCH[4] avoids PMD-sized page cache for shmem files if needed
Test program
============
# cat test.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#define TEST_XFS_FILENAME "/tmp/data"
#define TEST_SHMEM_FILENAME "/dev/shm/data"
#define TEST_MEM_SIZE 0x20000000
int main(int argc, char **argv)
{
const char *filename;
int fd = 0;
void *buf = (void *)-1, *p;
int pgsize = getpagesize();
int ret;
if (pgsize != 0x10000) {
fprintf(stderr, "64KB base page size is required\n");
return -EPERM;
}
system("echo force > /sys/kernel/mm/transparent_hugepage/shmem_enabled");
system("rm -fr /tmp/data");
system("rm -fr /dev/shm/data");
system("echo 1 > /proc/sys/vm/drop_caches");
/* Open xfs or shmem file */
filename = TEST_XFS_FILENAME;
if (argc > 1 && !strcmp(argv[1], "shmem"))
filename = TEST_SHMEM_FILENAME;
fd = open(filename, O_CREAT | O_RDWR | O_TRUNC);
if (fd < 0) {
fprintf(stderr, "Unable to open <%s>\n", filename);
return -EIO;
}
/* Extend file size */
ret = ftruncate(fd, TEST_MEM_SIZE);
if (ret) {
fprintf(stderr, "Error %d to ftruncate()\n", ret);
goto cleanup;
}
/* Create VMA */
buf = mmap(NULL, TEST_MEM_SIZE,
PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (buf == (void *)-1) {
fprintf(stderr, "Unable to mmap <%s>\n", filename);
goto cleanup;
}
fprintf(stdout, "mapped buffer at 0x%p\n", buf);
ret = madvise(buf, TEST_MEM_SIZE, MADV_HUGEPAGE);
if (ret) {
fprintf(stderr, "Unable to madvise(MADV_HUGEPAGE)\n");
goto cleanup;
}
/* Populate VMA */
ret = madvise(buf, TEST_MEM_SIZE, MADV_POPULATE_WRITE);
if (ret) {
fprintf(stderr, "Error %d to madvise(MADV_POPULATE_WRITE)\n", ret);
goto cleanup;
}
/* Punch the file to enforce xarray split */
ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
TEST_MEM_SIZE - pgsize, pgsize);
if (ret)
fprintf(stderr, "Error %d to fallocate()\n", ret);
cleanup:
if (buf != (void *)-1)
munmap(buf, TEST_MEM_SIZE);
if (fd > 0)
close(fd);
return 0;
}
# gcc test.c -o test
# cat /proc/1/smaps | grep KernelPageSize | head -n 1
KernelPageSize: 64 kB
# ./test shmem
:
------------[ cut here ]------------
WARNING: CPU: 17 PID: 5253 at lib/xarray.c:1025 xas_split_alloc+0xf8/0x128
Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib \
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct \
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 \
ip_set nf_tables rfkill nfnetlink vfat fat virtio_balloon \
drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 \
virtio_net sha1_ce net_failover failover virtio_console virtio_blk \
dimlib virtio_mmio
CPU: 17 PID: 5253 Comm: test Kdump: loaded Tainted: G W 6.10.0-rc5-gavin+ #12
Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-1.el9 05/24/2024
pstate: 83400005 (Nzcv daif +PAN -UAO +TC
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
mm/shmem: disable PMD-sized page cache if needed
For shmem files, it's possible that PMD-sized page cache can't be
supported by xarray. For example, 512MB page cache on ARM64 when the base
page size is 64KB can't be supported by xarray. It leads to errors as the
following messages indicate when this sort of xarray entry is split.
WARNING: CPU: 34 PID: 7578 at lib/xarray.c:1025 xas_split_alloc+0xf8/0x128
Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 \
nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject \
nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 \
ip_set rfkill nf_tables nfnetlink vfat fat virtio_balloon drm fuse xfs \
libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_net \
net_failover virtio_console virtio_blk failover dimlib virtio_mmio
CPU: 34 PID: 7578 Comm: test Kdump: loaded Tainted: G W 6.10.0-rc5-gavin+ #9
Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-1.el9 05/24/2024
pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : xas_split_alloc+0xf8/0x128
lr : split_huge_page_to_list_to_order+0x1c4/0x720
sp : ffff8000882af5f0
x29: ffff8000882af5f0 x28: ffff8000882af650 x27: ffff8000882af768
x26: 0000000000000cc0 x25: 000000000000000d x24: ffff00010625b858
x23: ffff8000882af650 x22: ffffffdfc0900000 x21: 0000000000000000
x20: 0000000000000000 x19: ffffffdfc0900000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000018000000000 x15: 52f8004000000000
x14: 0000e00000000000 x13: 0000000000002000 x12: 0000000000000020
x11: 52f8000000000000 x10: 52f8e1c0ffff6000 x9 : ffffbeb9619a681c
x8 : 0000000000000003 x7 : 0000000000000000 x6 : ffff00010b02ddb0
x5 : ffffbeb96395e378 x4 : 0000000000000000 x3 : 0000000000000cc0
x2 : 000000000000000d x1 : 000000000000000c x0 : 0000000000000000
Call trace:
xas_split_alloc+0xf8/0x128
split_huge_page_to_list_to_order+0x1c4/0x720
truncate_inode_partial_folio+0xdc/0x160
shmem_undo_range+0x2bc/0x6a8
shmem_fallocate+0x134/0x430
vfs_fallocate+0x124/0x2e8
ksys_fallocate+0x4c/0xa0
__arm64_sys_fallocate+0x24/0x38
invoke_syscall.constprop.0+0x7c/0xd8
do_el0_svc+0xb4/0xd0
el0_svc+0x44/0x1d8
el0t_64_sync_handler+0x134/0x150
el0t_64_sync+0x17c/0x180
Fix it by disabling PMD-sized page cache when HPAGE_PMD_ORDER is larger
than MAX_PAGECACHE_ORDER. As Matthew Wilcox pointed, the page cache in a
shmem file isn't represented by a multi-index entry and doesn't have this
limitation when the xarry entry is split until commit 6b24ca4a1a8d ("mm:
Use multi-index entries in the page cache"). |
| In the Linux kernel, the following vulnerability has been resolved:
ice: Fix improper extts handling
Extts events are disabled and enabled by the application ts2phc.
However, in case where the driver is removed when the application is
running, a specific extts event remains enabled and can cause a kernel
crash.
As a side effect, when the driver is reloaded and application is started
again, remaining extts event for the channel from a previous run will
keep firing and the message "extts on unexpected channel" might be
printed to the user.
To avoid that, extts events shall be disabled when PTP is released. |
| In the Linux kernel, the following vulnerability has been resolved:
ice: Don't process extts if PTP is disabled
The ice_ptp_extts_event() function can race with ice_ptp_release() and
result in a NULL pointer dereference which leads to a kernel panic.
Panic occurs because the ice_ptp_extts_event() function calls
ptp_clock_event() with a NULL pointer. The ice driver has already
released the PTP clock by the time the interrupt for the next external
timestamp event occurs.
To fix this, modify the ice_ptp_extts_event() function to check the
PTP state and bail early if PTP is not ready. |
| In the Linux kernel, the following vulnerability has been resolved:
cpufreq: amd-pstate: fix memory leak on CPU EPP exit
The cpudata memory from kzalloc() in amd_pstate_epp_cpu_init() is
not freed in the analogous exit function, so fix that.
[ rjw: Subject and changelog edits ] |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix netif state handling
mlx5e_suspend cleans resources only if netif_device_present() returns
true. However, mlx5e_resume changes the state of netif, via
mlx5e_nic_enable, only if reg_state == NETREG_REGISTERED.
In the below case, the above leads to NULL-ptr Oops[1] and memory
leaks:
mlx5e_probe
_mlx5e_resume
mlx5e_attach_netdev
mlx5e_nic_enable <-- netdev not reg, not calling netif_device_attach()
register_netdev <-- failed for some reason.
ERROR_FLOW:
_mlx5e_suspend <-- netif_device_present return false, resources aren't freed :(
Hence, clean resources in this case as well.
[1]
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0010 [#1] SMP
CPU: 2 PID: 9345 Comm: test-ovs-ct-gen Not tainted 6.5.0_for_upstream_min_debug_2023_09_05_16_01 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at0xffffffffffffffd6.
RSP: 0018:ffff888178aaf758 EFLAGS: 00010246
Call Trace:
<TASK>
? __die+0x20/0x60
? page_fault_oops+0x14c/0x3c0
? exc_page_fault+0x75/0x140
? asm_exc_page_fault+0x22/0x30
notifier_call_chain+0x35/0xb0
blocking_notifier_call_chain+0x3d/0x60
mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core]
mlx5_core_uplink_netdev_event_replay+0x3e/0x60 [mlx5_core]
mlx5_mdev_netdev_track+0x53/0x60 [mlx5_ib]
mlx5_ib_roce_init+0xc3/0x340 [mlx5_ib]
__mlx5_ib_add+0x34/0xd0 [mlx5_ib]
mlx5r_probe+0xe1/0x210 [mlx5_ib]
? auxiliary_match_id+0x6a/0x90
auxiliary_bus_probe+0x38/0x80
? driver_sysfs_add+0x51/0x80
really_probe+0xc9/0x3e0
? driver_probe_device+0x90/0x90
__driver_probe_device+0x80/0x160
driver_probe_device+0x1e/0x90
__device_attach_driver+0x7d/0x100
bus_for_each_drv+0x80/0xd0
__device_attach+0xbc/0x1f0
bus_probe_device+0x86/0xa0
device_add+0x637/0x840
__auxiliary_device_add+0x3b/0xa0
add_adev+0xc9/0x140 [mlx5_core]
mlx5_rescan_drivers_locked+0x22a/0x310 [mlx5_core]
mlx5_register_device+0x53/0xa0 [mlx5_core]
mlx5_init_one_devl_locked+0x5c4/0x9c0 [mlx5_core]
mlx5_init_one+0x3b/0x60 [mlx5_core]
probe_one+0x44c/0x730 [mlx5_core]
local_pci_probe+0x3e/0x90
pci_device_probe+0xbf/0x210
? kernfs_create_link+0x5d/0xa0
? sysfs_do_create_link_sd+0x60/0xc0
really_probe+0xc9/0x3e0
? driver_probe_device+0x90/0x90
__driver_probe_device+0x80/0x160
driver_probe_device+0x1e/0x90
__device_attach_driver+0x7d/0x100
bus_for_each_drv+0x80/0xd0
__device_attach+0xbc/0x1f0
pci_bus_add_device+0x54/0x80
pci_iov_add_virtfn+0x2e6/0x320
sriov_enable+0x208/0x420
mlx5_core_sriov_configure+0x9e/0x200 [mlx5_core]
sriov_numvfs_store+0xae/0x1a0
kernfs_fop_write_iter+0x10c/0x1a0
vfs_write+0x291/0x3c0
ksys_write+0x5f/0xe0
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
CR2: 0000000000000000
---[ end trace 0000000000000000 ]--- |
| In the Linux kernel, the following vulnerability has been resolved:
cppc_cpufreq: Fix possible null pointer dereference
cppc_cpufreq_get_rate() and hisi_cppc_cpufreq_get_rate() can be called from
different places with various parameters. So cpufreq_cpu_get() can return
null as 'policy' in some circumstances.
Fix this bug by adding null return check.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |
| In the Linux kernel, the following vulnerability has been resolved:
gfs2: Fix potential glock use-after-free on unmount
When a DLM lockspace is released and there ares still locks in that
lockspace, DLM will unlock those locks automatically. Commit
fb6791d100d1b started exploiting this behavior to speed up filesystem
unmount: gfs2 would simply free glocks it didn't want to unlock and then
release the lockspace. This didn't take the bast callbacks for
asynchronous lock contention notifications into account, which remain
active until until a lock is unlocked or its lockspace is released.
To prevent those callbacks from accessing deallocated objects, put the
glocks that should not be unlocked on the sd_dead_glocks list, release
the lockspace, and only then free those glocks.
As an additional measure, ignore unexpected ast and bast callbacks if
the receiving glock is dead. |
| In the Linux kernel, the following vulnerability has been resolved:
blk-cgroup: fix list corruption from reorder of WRITE ->lqueued
__blkcg_rstat_flush() can be run anytime, especially when blk_cgroup_bio_start
is being executed.
If WRITE of `->lqueued` is re-ordered with READ of 'bisc->lnode.next' in
the loop of __blkcg_rstat_flush(), `next_bisc` can be assigned with one
stat instance being added in blk_cgroup_bio_start(), then the local
list in __blkcg_rstat_flush() could be corrupted.
Fix the issue by adding one barrier. |
| In the Linux kernel, the following vulnerability has been resolved:
net: bridge: mst: fix vlan use-after-free
syzbot reported a suspicious rcu usage[1] in bridge's mst code. While
fixing it I noticed that nothing prevents a vlan to be freed while
walking the list from the same path (br forward delay timer). Fix the rcu
usage and also make sure we are not accessing freed memory by making
br_mst_vlan_set_state use rcu read lock.
[1]
WARNING: suspicious RCU usage
6.9.0-rc6-syzkaller #0 Not tainted
-----------------------------
net/bridge/br_private.h:1599 suspicious rcu_dereference_protected() usage!
...
stack backtrace:
CPU: 1 PID: 8017 Comm: syz-executor.1 Not tainted 6.9.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
nbp_vlan_group net/bridge/br_private.h:1599 [inline]
br_mst_set_state+0x1ea/0x650 net/bridge/br_mst.c:105
br_set_state+0x28a/0x7b0 net/bridge/br_stp.c:47
br_forward_delay_timer_expired+0x176/0x440 net/bridge/br_stp_timer.c:88
call_timer_fn+0x18e/0x650 kernel/time/timer.c:1793
expire_timers kernel/time/timer.c:1844 [inline]
__run_timers kernel/time/timer.c:2418 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2429
run_timer_base kernel/time/timer.c:2438 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2448
__do_softirq+0x2c6/0x980 kernel/softirq.c:554
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5758
Code: 2b 00 74 08 4c 89 f7 e8 ba d1 84 00 f6 44 24 61 02 0f 85 85 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25
RSP: 0018:ffffc90013657100 EFLAGS: 00000206
RAX: 0000000000000001 RBX: 1ffff920026cae2c RCX: 0000000000000001
RDX: dffffc0000000000 RSI: ffffffff8bcaca00 RDI: ffffffff8c1eaa60
RBP: ffffc90013657260 R08: ffffffff92efe507 R09: 1ffffffff25dfca0
R10: dffffc0000000000 R11: fffffbfff25dfca1 R12: 1ffff920026cae28
R13: dffffc0000000000 R14: ffffc90013657160 R15: 0000000000000246 |
| In the Linux kernel, the following vulnerability has been resolved:
drm/vmwgfx: Fix invalid reads in fence signaled events
Correctly set the length of the drm_event to the size of the structure
that's actually used.
The length of the drm_event was set to the parent structure instead of
to the drm_vmw_event_fence which is supposed to be read. drm_read
uses the length parameter to copy the event to the user space thus
resuling in oob reads. |
| In the Linux kernel, the following vulnerability has been resolved:
pinctrl: core: delete incorrect free in pinctrl_enable()
The "pctldev" struct is allocated in devm_pinctrl_register_and_init().
It's a devm_ managed pointer that is freed by devm_pinctrl_dev_release(),
so freeing it in pinctrl_enable() will lead to a double free.
The devm_pinctrl_dev_release() function frees the pindescs and destroys
the mutex as well. |
| In the Linux kernel, the following vulnerability has been resolved:
ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
syzbot is able to trigger the following crash [1],
caused by unsafe ip6_dst_idev() use.
Indeed ip6_dst_idev() can return NULL, and must always be checked.
[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline]
RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267
Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c
RSP: 0018:ffffc9000fc1f2f0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1a772f98c8186700
RDX: 0000000000000003 RSI: ffffffff8bcac4e0 RDI: ffffffff8c1f9760
RBP: ffff8880673fb980 R08: ffffffff8fac15ef R09: 1ffffffff1f582bd
R10: dffffc0000000000 R11: fffffbfff1f582be R12: dffffc0000000000
R13: 0000000000000080 R14: ffff888076509000 R15: ffff88807a029a00
FS: 00007f55e82ca6c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b31d23000 CR3: 0000000022b66000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317
fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108
ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline]
ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649
ip6_route_output include/net/ip6_route.h:93 [inline]
ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120
ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250
sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326
sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455
sctp_assoc_add_peer+0x614/0x15c0 net/sctp/associola.c:662
sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099
__sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197
sctp_connect net/sctp/socket.c:4819 [inline]
sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
__sys_connect_file net/socket.c:2048 [inline]
__sys_connect+0x2df/0x310 net/socket.c:2065
__do_sys_connect net/socket.c:2075 [inline]
__se_sys_connect net/socket.c:2072 [inline]
__x64_sys_connect+0x7a/0x90 net/socket.c:2072
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f |
| In the Linux kernel, the following vulnerability has been resolved:
ipv6: prevent NULL dereference in ip6_output()
According to syzbot, there is a chance that ip6_dst_idev()
returns NULL in ip6_output(). Most places in IPv6 stack
deal with a NULL idev just fine, but not here.
syzbot reported:
general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7]
CPU: 0 PID: 9775 Comm: syz-executor.4 Not tainted 6.9.0-rc5-syzkaller-00157-g6a30653b604a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:ip6_output+0x231/0x3f0 net/ipv6/ip6_output.c:237
Code: 3c 1e 00 49 89 df 74 08 4c 89 ef e8 19 58 db f7 48 8b 44 24 20 49 89 45 00 49 89 c5 48 8d 9d e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 4c 8b 74 24 28 0f 85 61 01 00 00 8b 1b 31 ff
RSP: 0018:ffffc9000927f0d8 EFLAGS: 00010202
RAX: 00000000000000bc RBX: 00000000000005e0 RCX: 0000000000040000
RDX: ffffc900131f9000 RSI: 0000000000004f47 RDI: 0000000000004f48
RBP: 0000000000000000 R08: ffffffff8a1f0b9a R09: 1ffffffff1f51fad
R10: dffffc0000000000 R11: fffffbfff1f51fae R12: ffff8880293ec8c0
R13: ffff88805d7fc000 R14: 1ffff1100527d91a R15: dffffc0000000000
FS: 00007f135c6856c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000080 CR3: 0000000064096000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
NF_HOOK include/linux/netfilter.h:314 [inline]
ip6_xmit+0xefe/0x17f0 net/ipv6/ip6_output.c:358
sctp_v6_xmit+0x9f2/0x13f0 net/sctp/ipv6.c:248
sctp_packet_transmit+0x26ad/0x2ca0 net/sctp/output.c:653
sctp_packet_singleton+0x22c/0x320 net/sctp/outqueue.c:783
sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline]
sctp_outq_flush+0x6d5/0x3e20 net/sctp/outqueue.c:1212
sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
sctp_do_sm+0x59cc/0x60c0 net/sctp/sm_sideeffect.c:1169
sctp_primitive_ASSOCIATE+0x95/0xc0 net/sctp/primitive.c:73
__sctp_connect+0x9cd/0xe30 net/sctp/socket.c:1234
sctp_connect net/sctp/socket.c:4819 [inline]
sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
__sys_connect_file net/socket.c:2048 [inline]
__sys_connect+0x2df/0x310 net/socket.c:2065
__do_sys_connect net/socket.c:2075 [inline]
__se_sys_connect net/socket.c:2072 [inline]
__x64_sys_connect+0x7a/0x90 net/socket.c:2072
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f |
| In the Linux kernel, the following vulnerability has been resolved:
tls: fix missing memory barrier in tls_init
In tls_init(), a write memory barrier is missing, and store-store
reordering may cause NULL dereference in tls_{setsockopt,getsockopt}.
CPU0 CPU1
----- -----
// In tls_init()
// In tls_ctx_create()
ctx = kzalloc()
ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1)
// In update_sk_prot()
WRITE_ONCE(sk->sk_prot, tls_prots) -(2)
// In sock_common_setsockopt()
READ_ONCE(sk->sk_prot)->setsockopt()
// In tls_{setsockopt,getsockopt}()
ctx->sk_proto->setsockopt() -(3)
In the above scenario, when (1) and (2) are reordered, (3) can observe
the NULL value of ctx->sk_proto, causing NULL dereference.
To fix it, we rely on rcu_assign_pointer() which implies the release
barrier semantic. By moving rcu_assign_pointer() after ctx->sk_proto is
initialized, we can ensure that ctx->sk_proto are visible when
changing sk->sk_prot. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: tproxy: bail out if IP has been disabled on the device
syzbot reports:
general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
[..]
RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62
Call Trace:
nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline]
nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168
__in_dev_get_rcu() can return NULL, so check for this. |
| In the Linux kernel, the following vulnerability has been resolved:
ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it
still means hlist_for_each_entry_rcu can return an item that got removed
from the list. The memory itself of such item is not freed thanks to RCU
but nothing guarantees the actual content of the memory is sane.
In particular, the reference count can be zero. This can happen if
ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry
from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all
references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough
timing, this can happen:
1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry.
2. Then, the whole ipv6_del_addr is executed for the given entry. The
reference count drops to zero and kfree_rcu is scheduled.
3. ipv6_get_ifaddr continues and tries to increments the reference count
(in6_ifa_hold).
4. The rcu is unlocked and the entry is freed.
5. The freed entry is returned.
Prevent increasing of the reference count in such case. The name
in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe.
[ 41.506330] refcount_t: addition on 0; use-after-free.
[ 41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130
[ 41.507413] Modules linked in: veth bridge stp llc
[ 41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa #14
[ 41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
[ 41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130
[ 41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff
[ 41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282
[ 41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000
[ 41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900
[ 41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff
[ 41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000
[ 41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48
[ 41.514086] FS: 00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000
[ 41.514726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0
[ 41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 41.516799] Call Trace:
[ 41.517037] <TASK>
[ 41.517249] ? __warn+0x7b/0x120
[ 41.517535] ? refcount_warn_saturate+0xa5/0x130
[ 41.517923] ? report_bug+0x164/0x190
[ 41.518240] ? handle_bug+0x3d/0x70
[ 41.518541] ? exc_invalid_op+0x17/0x70
[ 41.520972] ? asm_exc_invalid_op+0x1a/0x20
[ 41.521325] ? refcount_warn_saturate+0xa5/0x130
[ 41.521708] ipv6_get_ifaddr+0xda/0xe0
[ 41.522035] inet6_rtm_getaddr+0x342/0x3f0
[ 41.522376] ? __pfx_inet6_rtm_getaddr+0x10/0x10
[ 41.522758] rtnetlink_rcv_msg+0x334/0x3d0
[ 41.523102] ? netlink_unicast+0x30f/0x390
[ 41.523445] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 41.523832] netlink_rcv_skb+0x53/0x100
[ 41.524157] netlink_unicast+0x23b/0x390
[ 41.524484] netlink_sendmsg+0x1f2/0x440
[ 41.524826] __sys_sendto+0x1d8/0x1f0
[ 41.525145] __x64_sys_sendto+0x1f/0x30
[ 41.525467] do_syscall_64+0xa5/0x1b0
[ 41.525794] entry_SYSCALL_64_after_hwframe+0x72/0x7a
[ 41.526213] RIP: 0033:0x7fbc4cfcea9a
[ 41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 41.527942] RSP: 002b:00007f
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Properly link new fs rules into the tree
Previously, add_rule_fg would only add newly created rules from the
handle into the tree when they had a refcount of 1. On the other hand,
create_flow_handle tries hard to find and reference already existing
identical rules instead of creating new ones.
These two behaviors can result in a situation where create_flow_handle
1) creates a new rule and references it, then
2) in a subsequent step during the same handle creation references it
again,
resulting in a rule with a refcount of 2 that is not linked into the
tree, will have a NULL parent and root and will result in a crash when
the flow group is deleted because del_sw_hw_rule, invoked on rule
deletion, assumes node->parent is != NULL.
This happened in the wild, due to another bug related to incorrect
handling of duplicate pkt_reformat ids, which lead to the code in
create_flow_handle incorrectly referencing a just-added rule in the same
flow handle, resulting in the problem described above. Full details are
at [1].
This patch changes add_rule_fg to add new rules without parents into
the tree, properly initializing them and avoiding the crash. This makes
it more consistent with how rules are added to an FTE in
create_flow_handle. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw89: fix null pointer access when abort scan
During cancel scan we might use vif that weren't scanning.
Fix this by using the actual scanning vif. |
| In the Linux kernel, the following vulnerability has been resolved:
mlxbf_gige: call request_irq() after NAPI initialized
The mlxbf_gige driver encounters a NULL pointer exception in
mlxbf_gige_open() when kdump is enabled. The sequence to reproduce
the exception is as follows:
a) enable kdump
b) trigger kdump via "echo c > /proc/sysrq-trigger"
c) kdump kernel executes
d) kdump kernel loads mlxbf_gige module
e) the mlxbf_gige module runs its open() as the
the "oob_net0" interface is brought up
f) mlxbf_gige module will experience an exception
during its open(), something like:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000086000004
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000086000004 [#1] SMP
CPU: 0 PID: 812 Comm: NetworkManager Tainted: G OE 5.15.0-1035-bluefield #37-Ubuntu
Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : 0x0
lr : __napi_poll+0x40/0x230
sp : ffff800008003e00
x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff
x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8
x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000
x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000
x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0
x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398
x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2
x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238
Call trace:
0x0
net_rx_action+0x178/0x360
__do_softirq+0x15c/0x428
__irq_exit_rcu+0xac/0xec
irq_exit+0x18/0x2c
handle_domain_irq+0x6c/0xa0
gic_handle_irq+0xec/0x1b0
call_on_irq_stack+0x20/0x2c
do_interrupt_handler+0x5c/0x70
el1_interrupt+0x30/0x50
el1h_64_irq_handler+0x18/0x2c
el1h_64_irq+0x7c/0x80
__setup_irq+0x4c0/0x950
request_threaded_irq+0xf4/0x1bc
mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige]
mlxbf_gige_open+0x5c/0x170 [mlxbf_gige]
__dev_open+0x100/0x220
__dev_change_flags+0x16c/0x1f0
dev_change_flags+0x2c/0x70
do_setlink+0x220/0xa40
__rtnl_newlink+0x56c/0x8a0
rtnl_newlink+0x58/0x84
rtnetlink_rcv_msg+0x138/0x3c4
netlink_rcv_skb+0x64/0x130
rtnetlink_rcv+0x20/0x30
netlink_unicast+0x2ec/0x360
netlink_sendmsg+0x278/0x490
__sock_sendmsg+0x5c/0x6c
____sys_sendmsg+0x290/0x2d4
___sys_sendmsg+0x84/0xd0
__sys_sendmsg+0x70/0xd0
__arm64_sys_sendmsg+0x2c/0x40
invoke_syscall+0x78/0x100
el0_svc_common.constprop.0+0x54/0x184
do_el0_svc+0x30/0xac
el0_svc+0x48/0x160
el0t_64_sync_handler+0xa4/0x12c
el0t_64_sync+0x1a4/0x1a8
Code: bad PC value
---[ end trace 7d1c3f3bf9d81885 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
Kernel Offset: 0x2870a7a00000 from 0xffff800008000000
PHYS_OFFSET: 0x80000000
CPU features: 0x0,000005c1,a3332a5a
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
The exception happens because there is a pending RX interrupt before the
call to request_irq(RX IRQ) executes. Then, the RX IRQ handler fires
immediately after this request_irq() completes. The
---truncated--- |