Skip to content

Conversation

WereCatf
Copy link

@WereCatf WereCatf commented Nov 1, 2016

At line 539
reg = <0x01c2b000 0x400>;
should be
reg = <0x01c2b400 0x400>;

At line 539
reg = <0x01c2b000 0x400>;
should be
reg = <0x01c2b400 0x400>;
@@ -519,7 +519,7 @@

i2c2: i2c@01c2b400 {
compatible = "allwinner,sun6i-a31-i2c";
reg = <0x01c2b000 0x400>;
reg = <0x01c2b400 0x400>;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add tests

@WereCatf WereCatf closed this Apr 14, 2017
tobetter added a commit to tobetter/linux that referenced this pull request Jul 2, 2018
Upgrade to v2.10.00 Realtek RTL8152/RTL8153 Based USB Ethernet Adapte…
Mic92 pushed a commit to Mic92/linux that referenced this pull request Feb 4, 2019
borkmann added a commit to cilium/linux that referenced this pull request May 23, 2024
  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [    1.140780] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
  [    1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
  [    1.290384] clocksource: Switched to clocksource tsc
  torvalds#345     tc_netkit_basic:OK
  torvalds#346     tc_netkit_device:OK
  torvalds#347     tc_netkit_multi_links:OK
  torvalds#348     tc_netkit_multi_opts:OK
  torvalds#349     tc_netkit_neigh_links:OK
  torvalds#350     tc_netkit_pkt_type:OK
  Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
borkmann added a commit to cilium/linux that referenced this pull request May 24, 2024
Add a test case to assert that the skb->pkt_type which was set from the BPF
program is retained from the netkit xmit side to the peer's device at tcx
ingress location.

  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [    1.140780] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
  [    1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
  [    1.290384] clocksource: Switched to clocksource tsc
  torvalds#345     tc_netkit_basic:OK
  torvalds#346     tc_netkit_device:OK
  torvalds#347     tc_netkit_multi_links:OK
  torvalds#348     tc_netkit_multi_opts:OK
  torvalds#349     tc_netkit_neigh_links:OK
  torvalds#350     tc_netkit_pkt_type:OK
  Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
borkmann added a commit to cilium/linux that referenced this pull request May 24, 2024
Add a test case to assert that the skb->pkt_type which was set from the BPF
program is retained from the netkit xmit side to the peer's device at tcx
ingress location.

  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [    1.140780] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
  [    1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
  [    1.290384] clocksource: Switched to clocksource tsc
  torvalds#345     tc_netkit_basic:OK
  torvalds#346     tc_netkit_device:OK
  torvalds#347     tc_netkit_multi_links:OK
  torvalds#348     tc_netkit_multi_opts:OK
  torvalds#349     tc_netkit_neigh_links:OK
  torvalds#350     tc_netkit_pkt_type:OK
  Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request May 24, 2024
Add a test case to assert that the skb->pkt_type which was set from the BPF
program is retained from the netkit xmit side to the peer's device at tcx
ingress location.

  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [    1.140780] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
  [    1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
  [    1.290384] clocksource: Switched to clocksource tsc
  torvalds#345     tc_netkit_basic:OK
  torvalds#346     tc_netkit_device:OK
  torvalds#347     tc_netkit_multi_links:OK
  torvalds#348     tc_netkit_multi_opts:OK
  torvalds#349     tc_netkit_neigh_links:OK
  torvalds#350     tc_netkit_pkt_type:OK
  Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request May 26, 2024
Add a test case to assert that the skb->pkt_type which was set from the BPF
program is retained from the netkit xmit side to the peer's device at tcx
ingress location.

  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [    1.140780] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
  [    1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
  [    1.290384] clocksource: Switched to clocksource tsc
  torvalds#345     tc_netkit_basic:OK
  torvalds#346     tc_netkit_device:OK
  torvalds#347     tc_netkit_multi_links:OK
  torvalds#348     tc_netkit_multi_opts:OK
  torvalds#349     tc_netkit_neigh_links:OK
  torvalds#350     tc_netkit_pkt_type:OK
  Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240524163619.26001-4-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
adeepn pushed a commit to jethome-iot/linux-kernel that referenced this pull request Jun 10, 2024
Add a test case to assert that the skb->pkt_type which was set from the BPF
program is retained from the netkit xmit side to the peer's device at tcx
ingress location.

  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [    1.140780] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
  [    1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
  [    1.290384] clocksource: Switched to clocksource tsc
  torvalds#345     tc_netkit_basic:OK
  torvalds#346     tc_netkit_device:OK
  torvalds#347     tc_netkit_multi_links:OK
  torvalds#348     tc_netkit_multi_opts:OK
  torvalds#349     tc_netkit_neigh_links:OK
  torvalds#350     tc_netkit_pkt_type:OK
  Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240524163619.26001-4-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 21, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
bjackman pushed a commit to bjackman/linux that referenced this pull request Aug 22, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 23, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
klarasm pushed a commit to klarasm/linux that referenced this pull request Aug 25, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
bjackman pushed a commit to bjackman/linux that referenced this pull request Aug 26, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 26, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 27, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 28, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 28, 2025
While working on the lazy MMU mode enablement for s390 I hit pretty
curious issues in the kasan code.

The first is related to a custom kasan-based sanitizer aimed at catching
invalid accesses to PTEs and is inspired by [1] conversation.  The kasan
complains on valid PTE accesses, while the shadow memory is reported as
unpoisoned:

[  102.783993] ==================================================================
[  102.784008] BUG: KASAN: out-of-bounds in set_pte_range+0x36c/0x390
[  102.784016] Read of size 8 at addr 0000780084cf9608 by task vmalloc_test/0/5542
[  102.784019] 
[  102.784040] CPU: 1 UID: 0 PID: 5542 Comm: vmalloc_test/0 Kdump: loaded Tainted: G           OE       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#340 PREEMPT 
[  102.784047] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  102.784049] Hardware name: IBM 8561 T01 703 (LPAR)
[  102.784052] Call Trace:
[  102.784054]  [<00007fffe0147ac0>] dump_stack_lvl+0xe8/0x140 
[  102.784059]  [<00007fffe0112484>] print_address_description.constprop.0+0x34/0x2d0 
[  102.784066]  [<00007fffe011282c>] print_report+0x10c/0x1f8 
[  102.784071]  [<00007fffe090785a>] kasan_report+0xfa/0x220 
[  102.784078]  [<00007fffe01d3dec>] set_pte_range+0x36c/0x390 
[  102.784083]  [<00007fffe01d41c2>] leave_ipte_batch+0x3b2/0xb10 
[  102.784088]  [<00007fffe07d3650>] apply_to_pte_range+0x2f0/0x4e0 
[  102.784094]  [<00007fffe07e62e4>] apply_to_pmd_range+0x194/0x3e0 
[  102.784099]  [<00007fffe07e820e>] __apply_to_page_range+0x2fe/0x7a0 
[  102.784104]  [<00007fffe07e86d8>] apply_to_page_range+0x28/0x40 
[  102.784109]  [<00007fffe090a3ec>] __kasan_populate_vmalloc+0xec/0x310 
[  102.784114]  [<00007fffe090aa36>] kasan_populate_vmalloc+0x96/0x130 
[  102.784118]  [<00007fffe0833a04>] alloc_vmap_area+0x3d4/0xf30 
[  102.784123]  [<00007fffe083a8ba>] __get_vm_area_node+0x1aa/0x4c0 
[  102.784127]  [<00007fffe083c4f6>] __vmalloc_node_range_noprof+0x126/0x4e0 
[  102.784131]  [<00007fffe083c980>] __vmalloc_node_noprof+0xd0/0x110 
[  102.784135]  [<00007fffe083ca32>] vmalloc_noprof+0x32/0x40 
[  102.784139]  [<00007fff608aa336>] fix_size_alloc_test+0x66/0x150 [test_vmalloc] 
[  102.784147]  [<00007fff608aa710>] test_func+0x2f0/0x430 [test_vmalloc] 
[  102.784153]  [<00007fffe02841f8>] kthread+0x3f8/0x7a0 
[  102.784159]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[  102.784164]  [<00007fffe299c00a>] ret_from_fork+0xa/0x30 
[  102.784173] no locks held by vmalloc_test/0/5542.
[  102.784176] 
[  102.784178] The buggy address belongs to the physical page:
[  102.784186] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x84cf9
[  102.784198] flags: 0x3ffff00000000000(node=0|zone=1|lastcpupid=0x1ffff)
[  102.784212] page_type: f2(table)
[  102.784225] raw: 3ffff00000000000 0000000000000000 0000000000000122 0000000000000000
[  102.784234] raw: 0000000000000000 0000000000000000 f200000000000001 0000000000000000
[  102.784248] page dumped because: kasan: bad access detected
[  102.784250] 
[  102.784252] Memory state around the buggy address:
[  102.784260]  0000780084cf9500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784274]  0000780084cf9580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784277] >0000780084cf9600: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784290]                          ^
[  102.784293]  0000780084cf9680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784303]  0000780084cf9700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  102.784306] ==================================================================

The second issue hits when the custom sanitizer above is not implemented,
but the kasan itself is still active:

[ 1554.438028] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1554.438065] Failing address: 001c0ff0066f0000 TEID: 001c0ff0066f0403
[ 1554.438076] Fault in home space mode while using kernel ASCE.
[ 1554.438103] AS:00000000059d400b R2:0000000ffec5c00b R3:00000000c6c9c007 S:0000000314470001 P:00000000d0ab413d 
[ 1554.438158] Oops: 0011 ilc:2 [#1]SMP 
[ 1554.438175] Modules linked in: test_vmalloc(E+) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) sunrpc(E) pkey_pckmo(E) uvdevice(E) s390_trng(E) rng_core(E) eadm_sch(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) loop(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) ctcm(E) fsm(E) zfcp(E) scsi_transport_fc(E) diag288_wdt(E) watchdog(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha1_s390(E) sha_common(E) pkey(E) autofs4(E)
[ 1554.438319] Unloaded tainted modules: pkey_uv(E):1 hmac_s390(E):2
[ 1554.438354] CPU: 1 UID: 0 PID: 1715 Comm: vmalloc_test/0 Kdump: loaded Tainted: G            E       6.16.0-gcc-ipte-kasan-11657-gb2d930c4950e torvalds#350 PREEMPT 
[ 1554.438368] Tainted: [E]=UNSIGNED_MODULE
[ 1554.438374] Hardware name: IBM 8561 T01 703 (LPAR)
[ 1554.438381] Krnl PSW : 0704e00180000000 00007fffe1d3d6ae (memset+0x5e/0x98)
[ 1554.438396]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 1554.438409] Krnl GPRS: 0000000000000001 001c0ff0066f0000 001c0ff0066f0000 00000000000000f8
[ 1554.438418]            00000000000009fe 0000000000000009 0000000000000000 0000000000000002
[ 1554.438426]            0000000000005000 000078031ae655c8 00000feffdcf9f59 0000780258672a20
[ 1554.438433]            0000780243153500 00007f8033780000 00007fffe083a510 00007f7fee7cfa00
[ 1554.438452] Krnl Code: 00007fffe1d3d6a0: eb540008000c	srlg	%r5,%r4,8
           00007fffe1d3d6a6: b9020055		ltgr	%r5,%r5
          #00007fffe1d3d6aa: a784000b		brc	8,00007fffe1d3d6c0
          >00007fffe1d3d6ae: 42301000		stc	%r3,0(%r1)
           00007fffe1d3d6b2: d2fe10011000	mvc	1(255,%r1),0(%r1)
           00007fffe1d3d6b8: 41101100		la	%r1,256(%r1)
           00007fffe1d3d6bc: a757fff9		brctg	%r5,00007fffe1d3d6ae
           00007fffe1d3d6c0: 42301000		stc	%r3,0(%r1)
[ 1554.438539] Call Trace:
[ 1554.438545]  [<00007fffe1d3d6ae>] memset+0x5e/0x98 
[ 1554.438552] ([<00007fffe083a510>] remove_vm_area+0x220/0x400)
[ 1554.438562]  [<00007fffe083a9d6>] vfree.part.0+0x26/0x810 
[ 1554.438569]  [<00007fff6073bd50>] fix_align_alloc_test+0x50/0x90 [test_vmalloc] 
[ 1554.438583]  [<00007fff6073c73a>] test_func+0x46a/0x6c0 [test_vmalloc] 
[ 1554.438593]  [<00007fffe0283ac8>] kthread+0x3f8/0x7a0 
[ 1554.438603]  [<00007fffe014d8b4>] __ret_from_fork+0xd4/0x7d0 
[ 1554.438613]  [<00007fffe299ac0a>] ret_from_fork+0xa/0x30 
[ 1554.438622] INFO: lockdep is turned off.
[ 1554.438627] Last Breaking-Event-Address:
[ 1554.438632]  [<00007fffe1d3d65c>] memset+0xc/0x98
[ 1554.438644] Kernel panic - not syncing: Fatal exception: panic_on_oops

This series fixes the above issues and is a pre-requisite for the s390
lazy MMU mode implementation.

test_vmalloc was used to stress-test the fixes.


This patch (of 2):

When vmalloc shadow memory is established the modification of the
corresponding page tables is not protected by any locks.  Instead, the
locking is done per-PTE.  This scheme however has defects.

kasan_populate_vmalloc_pte() - while ptep_get() read is atomic the
sequence pte_none(ptep_get()) is not.  Doing that outside of the lock
might lead to a concurrent PTE update and what could be seen as a shadow
memory corruption as result.

kasan_depopulate_vmalloc_pte() - by the time a page whose address was
extracted from ptep_get() read and cached in a local variable outside of
the lock is attempted to get free, could actually be freed already.

To avoid these put ptep_get() itself and the code that manipulates the
result of the read under lock.  In addition, move freeing of the page out
of the atomic context.

Link: https://lkml.kernel.org/r/cover.1755528662.git.agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/adb258634194593db294c0d1fb35646e894d6ead.1755528662.git.agordeev@linux.ibm.com
Link: https://lore.kernel.org/linux-mm/5b0609c9-95ee-4e48-bb6d-98f57c5d2c31@arm.com/ [1]
Fixes: 3c5c3cf ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants