HID: lenovo: Do not access driver data in PM reset_resume #1

rcj4747 · 2015-11-03T18:12:32Z

Accessing driver data (hid_get_drvdata) during reset_resume result
in the user experiencing a system hang. This patch removes the
feature set operations during reset_resume which depend on driver
data.

Signed-off-by: Robert C Jennings rcj4747@gmail.com

Accessing driver data (hid_get_drvdata) during reset_resume result in the user experiencing a system hang. This patch removes the feature set operations during reset_resume which depend on driver data. Signed-off-by: Robert C Jennings <rcj4747@gmail.com>

We noticed this panic while enabling SR-IOV in sparc. mlx4_core: Mellanox ConnectX core driver v2.2-1 (Jan 1 2015) mlx4_core: Initializing 0007:01:00.0 mlx4_core 0007:01:00.0: Enabling SR-IOV with 5 VFs mlx4_core: Initializing 0007:01:00.1 Unable to handle kernel NULL pointer dereference insmod(10010): Oops [#1] CPU: 391 PID: 10010 Comm: insmod Not tainted 4.1.12-32.el6uek.kdump2.sparc64 #1 TPC: <dma_supported+0x20/0x80> I7: <__mlx4_init_one+0x324/0x500 [mlx4_core]> Call Trace: [00000000104c5ea4] __mlx4_init_one+0x324/0x500 [mlx4_core] [00000000104c613c] mlx4_init_one+0xbc/0x120 [mlx4_core] [0000000000725f14] local_pci_probe+0x34/0xa0 [0000000000726028] pci_call_probe+0xa8/0xe0 [0000000000726310] pci_device_probe+0x50/0x80 [000000000079f700] really_probe+0x140/0x420 [000000000079fa24] driver_probe_device+0x44/0xa0 [000000000079fb5c] __device_attach+0x3c/0x60 [000000000079d85c] bus_for_each_drv+0x5c/0xa0 [000000000079f588] device_attach+0x88/0xc0 [000000000071acd0] pci_bus_add_device+0x30/0x80 [0000000000736090] virtfn_add.clone.1+0x210/0x360 [00000000007364a4] sriov_enable+0x2c4/0x520 [000000000073672c] pci_enable_sriov+0x2c/0x40 [00000000104c2d58] mlx4_enable_sriov+0xf8/0x180 [mlx4_core] [00000000104c49ac] mlx4_load_one+0x42c/0xd40 [mlx4_core] Disabling lock debugging due to kernel taint Caller[00000000104c5ea4]: __mlx4_init_one+0x324/0x500 [mlx4_core] Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core] Caller[0000000000725f14]: local_pci_probe+0x34/0xa0 Caller[0000000000726028]: pci_call_probe+0xa8/0xe0 Caller[0000000000726310]: pci_device_probe+0x50/0x80 Caller[000000000079f700]: really_probe+0x140/0x420 Caller[000000000079fa24]: driver_probe_device+0x44/0xa0 Caller[000000000079fb5c]: __device_attach+0x3c/0x60 Caller[000000000079d85c]: bus_for_each_drv+0x5c/0xa0 Caller[000000000079f588]: device_attach+0x88/0xc0 Caller[000000000071acd0]: pci_bus_add_device+0x30/0x80 Caller[0000000000736090]: virtfn_add.clone.1+0x210/0x360 Caller[00000000007364a4]: sriov_enable+0x2c4/0x520 Caller[000000000073672c]: pci_enable_sriov+0x2c/0x40 Caller[00000000104c2d58]: mlx4_enable_sriov+0xf8/0x180 [mlx4_core] Caller[00000000104c49ac]: mlx4_load_one+0x42c/0xd40 [mlx4_core] Caller[00000000104c5f90]: __mlx4_init_one+0x410/0x500 [mlx4_core] Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core] Caller[0000000000725f14]: local_pci_probe+0x34/0xa0 Caller[0000000000726028]: pci_call_probe+0xa8/0xe0 Caller[0000000000726310]: pci_device_probe+0x50/0x80 Caller[000000000079f700]: really_probe+0x140/0x420 Caller[000000000079fa24]: driver_probe_device+0x44/0xa0 Caller[000000000079fb08]: __driver_attach+0x88/0xa0 Caller[000000000079d90c]: bus_for_each_dev+0x6c/0xa0 Caller[000000000079f29c]: driver_attach+0x1c/0x40 Caller[000000000079e35c]: bus_add_driver+0x17c/0x220 Caller[00000000007a02d4]: driver_register+0x74/0x120 Caller[00000000007263fc]: __pci_register_driver+0x3c/0x60 Caller[00000000104f62bc]: mlx4_init+0x60/0xcc [mlx4_core] Kernel panic - not syncing: Fatal exception Press Stop-A (L1-A) to return to the boot prom ---[ end Kernel panic - not syncing: Fatal exception Details: Here is the call sequence virtfn_add->__mlx4_init_one->dma_set_mask->dma_supported The panic happened at line 760(file arch/sparc/kernel/iommu.c) 758 int dma_supported(struct device *dev, u64 device_mask) 759 { 760 struct iommu *iommu = dev->archdata.iommu; 761 u64 dma_addr_mask = iommu->dma_addr_mask; 762 763 if (device_mask >= (1UL << 32UL)) 764 return 0; 765 766 if ((device_mask & dma_addr_mask) == dma_addr_mask) 767 return 1; 768 769 #ifdef CONFIG_PCI 770 if (dev_is_pci(dev)) 771 return pci64_dma_supported(to_pci_dev(dev), device_mask); 772 #endif 773 774 return 0; 775 } 776 EXPORT_SYMBOL(dma_supported); Same panic happened with Intel ixgbe driver also. SR-IOV code looks for arch specific data while enabling VFs. When VF device is added, driver probe function makes set of calls to initialize the pci device. Because the VF device is added different way than the normal PF device(which happens via of_create_pci_dev for sparc), some of the arch specific initialization does not happen for VF device. That causes panic when archdata is accessed. To fix this, I have used already defined weak function pcibios_setup_device to copy archdata from PF to VF. Also verified the fix. Signed-off-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>

In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b3800 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Cc: <stable@vger.kernel.org> Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Gregor Boirie <gregor.boirie@parrot.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org>

The driver never calls platform_set_drvdata() , so platform_get_drvdata() in .remove returns NULL and thus $indio_dev variable in .remove is NULL. Then it's only a matter of dereferencing the indio_dev variable to make the kernel blow as seen below. This patch adds the platform_set_drvdata() call to fix the problem. root@armhf:~# rmmod at91-sama5d2_adc Unable to handle kernel NULL pointer dereference at virtual address 000001d4 pgd = dd57c000 [000001d4] *pgd=00000000 Internal error: Oops: 5 [#1] ARM Modules linked in: at91_sama5d2_adc(-) CPU: 0 PID: 1334 Comm: rmmod Not tainted 4.6.0-rc3-next-20160418+ #3 Hardware name: Atmel SAMA5 task: dd4fcc40 ti: de910000 task.ti: de910000 PC is at mutex_lock+0x4/0x24 LR is at iio_device_unregister+0x14/0x6c pc : [<c05f4624>] lr : [<c0471f74>] psr: a00d0013 sp : de911f00 ip : 00000000 fp : be898bd8 r10: 00000000 r9 : de910000 r8 : c0107724 r7 : 00000081 r6 : bf001048 r5 : 000001d4 r4 : 00000000 r3 : bf000000 r2 : 00000000 r1 : 00000004 r0 : 000001d4 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c53c7d Table: 3d57c059 DAC: 00000051 Process rmmod (pid: 1334, stack limit = 0xde910208) Stack: (0xde911f00 to 0xde912000) 1f00: bf000000 00000000 df5c7e10 bf000010 bf000000 df5c7e10 df5c7e10 c0351ca8 1f20: c0351c84 df5c7e10 bf001048 c0350734 bf001048 df5c7e10 df5c7e44 c035087c 1f40: bf001048 7f62dd4c 00000800 c034fb30 bf0010c0 c0158ee8 de910000 31397461 1f60: 6d61735f 32643561 6364615f 00000000 de911f90 de910000 de910000 00000000 1f80: de911fb0 10c53c7d de911f9c c05f33d8 de911fa0 00910000 be898ecb 7f62dd10 1fa0: 00000000 c0107560 be898ecb 7f62dd10 7f62dd4c 00000800 6f844800 6f844800 1fc0: be898ecb 7f62dd10 00000000 00000081 00000000 7f62dd10 be898bd8 be898bd8 1fe0: b6eedab1 be898b6c 7f61056b b6eedab6 000d0030 7f62dd4c 00000000 00000000 [<c05f4624>] (mutex_lock) from [<c0471f74>] (iio_device_unregister+0x14/0x6c) [<c0471f74>] (iio_device_unregister) from [<bf000010>] (at91_adc_remove+0x10/0x3c [at91_sama5d2_adc]) [<bf000010>] (at91_adc_remove [at91_sama5d2_adc]) from [<c0351ca8>] (platform_drv_remove+0x24/0x3c) [<c0351ca8>] (platform_drv_remove) from [<c0350734>] (__device_release_driver+0x84/0x110) [<c0350734>] (__device_release_driver) from [<c035087c>] (driver_detach+0x8c/0x90) [<c035087c>] (driver_detach) from [<c034fb30>] (bus_remove_driver+0x4c/0xa0) [<c034fb30>] (bus_remove_driver) from [<c0158ee8>] (SyS_delete_module+0x110/0x1d0) [<c0158ee8>] (SyS_delete_module) from [<c0107560>] (ret_fast_syscall+0x0/0x3c) Code: e3520001 1affffd5 eafffff4 f5d0f000 (e1902f9f) ---[ end trace 86914d7ad3696fca ]--- Signed-off-by: Marek Vasut <marex@denx.de> Cc: Ludovic Desroches <ludovic.desroches@atmel.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>

Conversion of talitos driver to the new AEAD interface hasn't been properly tested. AEAD algorithms crash in talitos_cra_init as follows: [...] [ 1.141095] talitos ffe30000.crypto: hwrng [ 1.145381] Unable to handle kernel paging request for data at address 0x00000058 [ 1.152913] Faulting instruction address: 0xc02accc0 [ 1.157910] Oops: Kernel access of bad area, sig: 11 [#1] [ 1.163315] SMP NR_CPUS=2 P1020 RDB [ 1.166810] Modules linked in: [ 1.169875] CPU: 0 PID: 1007 Comm: cryptomgr_test Not tainted 4.4.6 #1 [ 1.176415] task: db5ec200 ti: db4d6000 task.ti: db4d6000 [ 1.181821] NIP: c02accc0 LR: c02acd18 CTR: c02acd04 [ 1.186793] REGS: db4d7d30 TRAP: 0300 Not tainted (4.4.6) [ 1.192457] MSR: 00029000 <CE,EE,ME> CR: 95009359 XER: e0000000 [ 1.198585] DEAR: 00000058 ESR: 00000000 GPR00: c017bdc0 db4d7de0 db5ec200 df424b48 00000000 00000000 df424bfc db75a600 GPR08: df424b48 00000000 db75a628 db4d6000 00000149 00000000 c0044cac db5acda0 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000400 df424940 GPR24: df424900 00003083 00000400 c0180000 db75a640 c03e9f84 df424b40 df424b48 [ 1.230978] NIP [c02accc0] talitos_cra_init+0x28/0x6c [ 1.236039] LR [c02acd18] talitos_cra_init_aead+0x14/0x28 [ 1.241443] Call Trace: [ 1.243894] [db4d7de0] [c03e9f84] 0xc03e9f84 (unreliable) [ 1.249322] [db4d7df0] [c017bdc0] crypto_create_tfm+0x5c/0xf0 [ 1.255083] [db4d7e10] [c017beec] crypto_alloc_tfm+0x98/0xf8 [ 1.260769] [db4d7e40] [c0186a20] alg_test_aead+0x28/0xc8 [ 1.266181] [db4d7e60] [c0186718] alg_test+0x260/0x2e0 [ 1.271333] [db4d7ee0] [c0183860] cryptomgr_test+0x30/0x54 [ 1.276843] [db4d7ef0] [c0044d80] kthread+0xd4/0xd8 [ 1.281741] [db4d7f40] [c000e4a4] ret_from_kernel_thread+0x5c/0x64 [ 1.287930] Instruction dump: [ 1.290902] 38600000 4e800020 81230028 7c681b78 81490010 38e9ffc0 3929ffe8 554a073e [ 1.298691] 2b8a000a 7d474f9e 812a0008 91230030 <80e90058> 39270060 7c0004ac 7cc04828 Cc: <stable@vger.kernel.org> # 4.3+ Fixes: aeb4c13 ("crypto: talitos - Convert to new AEAD interface") Signed-off-by: Jonas Eymann <J.Eymann@gmx.net> Fix typo - replaced parameter of __crypto_ahash_alg(): s/tfm/alg Remove checkpatch warnings. Add commit message. Signed-off-by: Horia Geant? <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

The following commit: 1fb3a8b ("xen/spinlock: Fix locking path engaging too soon under PVHVM.") ... moved the initalization of the kicker interrupt until after native_cpu_up() is called. However, when using qspinlocks, a CPU may try to kick another CPU that is spinning (because it has not yet initialized its kicker interrupt), resulting in the following crash during boot: kernel BUG at /build/linux-Ay7j_C/linux-4.4.0/drivers/xen/events/events_base.c:1210! invalid opcode: 0000 [#1] SMP ... RIP: 0010:[<ffffffff814c97c9>] [<ffffffff814c97c9>] xen_send_IPI_one+0x59/0x60 ... Call Trace: [<ffffffff8102be9e>] xen_qlock_kick+0xe/0x10 [<ffffffff810cabc2>] __pv_queued_spin_unlock+0xb2/0xf0 [<ffffffff810ca6d1>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [<ffffffff81052936>] ? check_tsc_warp+0x76/0x150 [<ffffffff81052aa6>] check_tsc_sync_source+0x96/0x160 [<ffffffff81051e28>] native_cpu_up+0x3d8/0x9f0 [<ffffffff8102b315>] xen_hvm_cpu_up+0x35/0x80 [<ffffffff8108198c>] _cpu_up+0x13c/0x180 [<ffffffff81081a4a>] cpu_up+0x7a/0xa0 [<ffffffff81f80dfc>] smp_init+0x7f/0x81 [<ffffffff81f5a121>] kernel_init_freeable+0xef/0x212 [<ffffffff81817f30>] ? rest_init+0x80/0x80 [<ffffffff81817f3e>] kernel_init+0xe/0xe0 [<ffffffff8182488f>] ret_from_fork+0x3f/0x70 [<ffffffff81817f30>] ? rest_init+0x80/0x80 To fix this, only send the kick if the target CPU's interrupt has been initialized. This check isn't racy, because the target is waiting for the spinlock, so it won't have initialized the interrupt in the meantime. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: Ingo Molnar <mingo@kernel.org>

Starting the kernel client with cephx disabled and then enabling cephx and restarting userspace daemons can result in a crash: [262671.478162] BUG: unable to handle kernel paging request at ffffebe000000000 [262671.531460] IP: [<ffffffff811cd04a>] kfree+0x5a/0x130 [262671.584334] PGD 0 [262671.635847] Oops: 0000 [#1] SMP [262672.055841] CPU: 22 PID: 2961272 Comm: kworker/22:2 Not tainted 4.2.0-34-generic torvalds#39~14.04.1-Ubuntu [262672.162338] Hardware name: Dell Inc. PowerEdge R720/068CDY, BIOS 2.4.3 07/09/2014 [262672.268937] Workqueue: ceph-msgr con_work [libceph] [262672.322290] task: ffff88081c2d0dc0 ti: ffff880149ae8000 task.ti: ffff880149ae8000 [262672.428330] RIP: 0010:[<ffffffff811cd04a>] [<ffffffff811cd04a>] kfree+0x5a/0x130 [262672.535880] RSP: 0018:ffff880149aeba58 EFLAGS: 00010286 [262672.589486] RAX: 000001e000000000 RBX: 0000000000000012 RCX: ffff8807e7461018 [262672.695980] RDX: 000077ff80000000 RSI: ffff88081af2be04 RDI: 0000000000000012 [262672.803668] RBP: ffff880149aeba78 R08: 0000000000000000 R09: 0000000000000000 [262672.912299] R10: ffffebe000000000 R11: ffff880819a60e78 R12: ffff8800aec8df40 [262673.021769] R13: ffffffffc035f70f R14: ffff8807e5b138e0 R15: ffff880da9785840 [262673.131722] FS: 0000000000000000(0000) GS:ffff88081fac0000(0000) knlGS:0000000000000000 [262673.245377] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [262673.303281] CR2: ffffebe000000000 CR3: 0000000001c0d000 CR4: 00000000001406e0 [262673.417556] Stack: [262673.472943] ffff880149aeba88 ffff88081af2be04 ffff8800aec8df40 ffff88081af2be04 [262673.583767] ffff880149aeba98 ffffffffc035f70f ffff880149aebac8 ffff8800aec8df00 [262673.694546] ffff880149aebac8 ffffffffc035c89e ffff8807e5b138e0 ffff8805b047f800 [262673.805230] Call Trace: [262673.859116] [<ffffffffc035f70f>] ceph_x_destroy_authorizer+0x1f/0x50 [libceph] [262673.968705] [<ffffffffc035c89e>] ceph_auth_destroy_authorizer+0x3e/0x60 [libceph] [262674.078852] [<ffffffffc0352805>] put_osd+0x45/0x80 [libceph] [262674.134249] [<ffffffffc035290e>] remove_osd+0xae/0x140 [libceph] [262674.189124] [<ffffffffc0352aa3>] __reset_osd+0x103/0x150 [libceph] [262674.243749] [<ffffffffc0354703>] kick_requests+0x223/0x460 [libceph] [262674.297485] [<ffffffffc03559e2>] ceph_osdc_handle_map+0x282/0x5e0 [libceph] [262674.350813] [<ffffffffc035022e>] dispatch+0x4e/0x720 [libceph] [262674.403312] [<ffffffffc034bd91>] try_read+0x3d1/0x1090 [libceph] [262674.454712] [<ffffffff810ab7c2>] ? dequeue_entity+0x152/0x690 [262674.505096] [<ffffffffc034cb1b>] con_work+0xcb/0x1300 [libceph] [262674.555104] [<ffffffff8108fb3e>] process_one_work+0x14e/0x3d0 [262674.604072] [<ffffffff810901ea>] worker_thread+0x11a/0x470 [262674.652187] [<ffffffff810900d0>] ? rescuer_thread+0x310/0x310 [262674.699022] [<ffffffff810957a2>] kthread+0xd2/0xf0 [262674.744494] [<ffffffff810956d0>] ? kthread_create_on_node+0x1c0/0x1c0 [262674.789543] [<ffffffff817bd81f>] ret_from_fork+0x3f/0x70 [262674.834094] [<ffffffff810956d0>] ? kthread_create_on_node+0x1c0/0x1c0 What happens is the following: (1) new MON session is established (2) old "none" ac is destroyed (3) new "cephx" ac is constructed ... (4) old OSD session (w/ "none" authorizer) is put ceph_auth_destroy_authorizer(ac, osd->o_auth.authorizer) osd->o_auth.authorizer in the "none" case is just a bare pointer into ac, which contains a single static copy for all services. By the time we get to (4), "none" ac, freed in (2), is long gone. On top of that, a new vtable installed in (3) points us at ceph_x_destroy_authorizer(), so we end up trying to destroy a "none" authorizer with a "cephx" destructor operating on invalid memory! To fix this, decouple authorizer destruction from ac and do away with a single static "none" authorizer by making a copy for each OSD or MDS session. Authorizers themselves are independent of ac and so there is no reason for destroy_authorizer() to be an ac op. Make it an op on the authorizer itself by turning ceph_authorizer into a real struct. Fixes: http://tracker.ceph.com/issues/15447 Reported-by: Alan Zhang <alan.zhang@linux.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>

For T4, kernel mode qps don't use the user doorbell. User mode qps during flow control db ringing are forced into kernel, where user doorbell is treated as kernel doorbell and proper bar2 offset in bar2 virtual space is calculated, which incase of T4 is a bogus address, causing a kernel panic due to illegal write during doorbell ringing. In case of T4, kernel mode qp bar2 virtual address should be 0. Added T4 check during bar2 virtual address calculation to return 0. Fixed Bar2 range checks based on bar2 physical address. The below oops will be fixed <1>BUG: unable to handle kernel paging request at 000000000002aa08 <1>IP: [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4>PGD 1416a8067 PUD 15bf35067 PMD 0 <4>Oops: 0002 [#1] SMP <4>last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.4/infiniband/cxgb4_0/node_guid <4>CPU 5 <4>Modules linked in: rdma_ucm rdma_cm ib_cm ib_sa ib_mad ib_uverbs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge autofs4 target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf vhost_net macvtap macvlan tun kvm uinput microcode iTCO_wdt iTCO_vendor_support sg joydev serio_raw i2c_i801 i2c_core lpc_ich mfd_core e1000e ptp pps_core ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif pata_acpi ata_generic ata_piix iw_cxgb4 iw_cm ib_core ib_addr cxgb4 ipv6 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] <4> Supermicro X8ST3/X8ST3 <4>RIP: 0010:[<ffffffffa011d800>] [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4>RSP: 0000:ffff880155a03db0 EFLAGS: 00010006 <4>RAX: 000000000000001d RBX: ffff88013ae5fc00 RCX: ffff880155adb180 <4>RDX: 000000000002aa00 RSI: 0000000000000001 RDI: ffff88013ae5fdf8 <4>RBP: ffff880155a03e10 R08: 0000000000000000 R09: 0000000000000001 <4>R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 <4>R13: 000000000000001d R14: ffff880156414ab0 R15: ffffe8ffffc05b88 <4>FS: 0000000000000000(0000) GS:ffff8800282a0000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>CR2: 000000000002aa08 CR3: 000000015bd0e000 CR4: 00000000000007e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process cxgb4 (pid: 394, threadinfo ffff880155a00000, task ffff880156414ab0) <4>Stack: <4> ffff880156415068 ffff880155adb180 ffff880155a03df0 ffffffffa00a344b <4><d> 00000000000003e8 ffff880155920000 0000000000000004 ffff880155920000 <4><d> ffff88015592d438 ffffffffa00a3860 ffff880155a03fd8 ffffe8ffffc05b88 <4>Call Trace: <4> [<ffffffffa00a344b>] ? enable_txq_db+0x2b/0x80 [cxgb4] <4> [<ffffffffa00a3860>] ? process_db_full+0x0/0xa0 [cxgb4] <4> [<ffffffffa00a38a6>] process_db_full+0x46/0xa0 [cxgb4] <4> [<ffffffff8109fda0>] worker_thread+0x170/0x2a0 <4> [<ffffffff810a6aa0>] ? autoremove_wake_function+0x0/0x40 <4> [<ffffffff8109fc30>] ? worker_thread+0x0/0x2a0 <4> [<ffffffff810a660e>] kthread+0x9e/0xc0 <4> [<ffffffff8100c28a>] child_rip+0xa/0x20 <4> [<ffffffff810a6570>] ? kthread+0x0/0xc0 <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20 <4>Code: e9 ba 00 00 00 66 0f 1f 44 00 00 44 8b 05 29 07 02 00 45 85 c0 0f 85 71 02 00 00 8b 83 70 01 00 00 45 0f b7 ed c1 e0 0f 44 09 e8 <89> 42 08 0f ae f8 66 c7 83 82 01 00 00 00 00 44 0f b7 ab dc 01 <1>RIP [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4> RSP <ffff880155a03db0> <4>CR2: 000000000002aa08` Based on original work by Bharat Potnuri <bharat@chelsio.com> Fixes: 74217d4 ("iw_cxgb4: support for bar2 qid densities exceeding the page size") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Reviewed-by: Leon Romanovsky <leon@leon.nu> Signed-off-by: Doug Ledford <dledford@redhat.com>

The locking around the interval RB tree is designed to prevent access to the tree while it's being modified. The locking in its current form is too overzealous, which is causing a deadlock in certain cases with the following backtrace: Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0 CPU: 0 PID: 5836 Comm: IMB-MPI1 Tainted: G O 3.12.18-wfr+ #1 0000000000000000 ffff88087f206c50 ffffffff814f1caa ffffffff817b53f0 ffff88087f206cc8 ffffffff814ecd56 0000000000000010 ffff88087f206cd8 ffff88087f206c78 0000000000000000 0000000000000000 0000000000001662 Call Trace: <NMI> [<ffffffff814f1caa>] dump_stack+0x45/0x56 [<ffffffff814ecd56>] panic+0xc2/0x1cb [<ffffffff810d4370>] ? restart_watchdog_hrtimer+0x50/0x50 [<ffffffff810d4432>] watchdog_overflow_callback+0xc2/0xd0 [<ffffffff81109b4e>] __perf_event_overflow+0x8e/0x2b0 [<ffffffff8110a714>] perf_event_overflow+0x14/0x20 [<ffffffff8101c906>] intel_pmu_handle_irq+0x1b6/0x390 [<ffffffff814f927b>] perf_event_nmi_handler+0x2b/0x50 [<ffffffff814f8ad8>] nmi_handle.isra.3+0x88/0x180 [<ffffffff814f8d39>] do_nmi+0x169/0x310 [<ffffffff814f8177>] end_repeat_nmi+0x1e/0x2e [<ffffffff81272600>] ? unmap_single+0x30/0x30 [<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40 [<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40 [<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40 <<EOE>> <IRQ> [<ffffffffa056c4a8>] hfi1_mmu_rb_search+0x38/0x70 [hfi1] [<ffffffffa05919cb>] user_sdma_free_request+0xcb/0x120 [hfi1] [<ffffffffa0593393>] user_sdma_txreq_cb+0x263/0x350 [hfi1] [<ffffffffa057fad7>] ? sdma_txclean+0x27/0x1c0 [hfi1] [<ffffffffa0593130>] ? user_sdma_send_pkts+0x1710/0x1710 [hfi1] [<ffffffffa057fdd6>] sdma_make_progress+0x166/0x480 [hfi1] [<ffffffff810762c9>] ? ttwu_do_wakeup+0x19/0xd0 [<ffffffffa0581c7e>] sdma_engine_interrupt+0x8e/0x100 [hfi1] [<ffffffffa0546bdd>] sdma_interrupt+0x5d/0xa0 [hfi1] [<ffffffff81097e57>] handle_irq_event_percpu+0x47/0x1d0 [<ffffffff81098017>] handle_irq_event+0x37/0x60 [<ffffffff8109aa5f>] handle_edge_irq+0x6f/0x120 [<ffffffff810044af>] handle_irq+0xbf/0x150 [<ffffffff8104c9b7>] ? irq_enter+0x17/0x80 [<ffffffff8150168d>] do_IRQ+0x4d/0xc0 [<ffffffff814f7c6a>] common_interrupt+0x6a/0x6a <EOI> [<ffffffff81073524>] ? finish_task_switch+0x54/0xe0 [<ffffffff814f56c6>] __schedule+0x3b6/0x7e0 [<ffffffff810763a6>] __cond_resched+0x26/0x30 [<ffffffff814f5eda>] _cond_resched+0x3a/0x50 [<ffffffff814f4f82>] down_write+0x12/0x30 [<ffffffffa0591619>] hfi1_release_user_pages+0x69/0x90 [hfi1] [<ffffffffa059173a>] sdma_rb_remove+0x9a/0xc0 [hfi1] [<ffffffffa056c00d>] __mmu_rb_remove.isra.5+0x5d/0x70 [hfi1] [<ffffffffa056c536>] hfi1_mmu_rb_remove+0x56/0x70 [hfi1] [<ffffffffa059427b>] hfi1_user_sdma_process_request+0x74b/0x1160 [hfi1] [<ffffffffa055c763>] hfi1_aio_write+0xc3/0x100 [hfi1] [<ffffffff8116a14c>] do_sync_readv_writev+0x4c/0x80 [<ffffffff8116b58b>] do_readv_writev+0xbb/0x230 [<ffffffff811a9da1>] ? fsnotify+0x241/0x320 [<ffffffff81073524>] ? finish_task_switch+0x54/0xe0 [<ffffffff8116b795>] vfs_writev+0x35/0x60 [<ffffffff8116b8c9>] SyS_writev+0x49/0xc0 [<ffffffff810cd876>] ? __audit_syscall_exit+0x1f6/0x2a0 [<ffffffff814ff992>] system_call_fastpath+0x16/0x1b As evident from the backtrace above, the process was being put to sleep while holding the lock. Limiting the scope of the lock only to the RB tree operation fixes the above error allowing for proper locking and the process being put to sleep when needed. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>

Attempting to free resources which have not been allocated and initialized properly led to the following kernel backtrace: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffa09658fe>] unlock_exp_tids.isra.8+0x2e/0x120 [hfi1] PGD 852a43067 PUD 85d4a6067 PMD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 2831 Comm: osu_bw Tainted: G IO 3.12.18-wfr+ #1 task: ffff88085b15b540 ti: ffff8808588fe000 task.ti: ffff8808588fe000 RIP: 0010:[<ffffffffa09658fe>] [<ffffffffa09658fe>] unlock_exp_tids.isra.8+0x2e/0x120 [hfi1] RSP: 0018:ffff8808588ffde0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff880858a31800 RCX: 0000000000000000 RDX: ffff88085d971bc0 RSI: ffff880858a318f8 RDI: ffff880858a318c0 RBP: ffff8808588ffe20 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88087ffd6f40 R11: 0000000001100348 R12: ffff880852900000 R13: ffff880858a318c0 R14: 0000000000000000 R15: ffff88085d971be8 FS: 00007f4674e83740(0000) GS:ffff88087f400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000085c377000 CR4: 00000000001407f0 Stack: ffffffffa0941a71 ffff880858a318f8 ffff88085d971bc0 ffff880858a31800 ffff880852900000 ffff880858a31800 00000000003ffff7 ffff88085d971bc0 ffff8808588ffe60 ffffffffa09663fc ffff8808588ffe60 ffff880858a31800 Call Trace: [<ffffffffa0941a71>] ? find_mmu_handler+0x51/0x70 [hfi1] [<ffffffffa09663fc>] hfi1_user_exp_rcv_free+0x6c/0x120 [hfi1] [<ffffffffa0932809>] hfi1_file_close+0x1a9/0x340 [hfi1] [<ffffffff8116c189>] __fput+0xe9/0x270 [<ffffffff8116c35e>] ____fput+0xe/0x10 [<ffffffff81065707>] task_work_run+0xa7/0xe0 [<ffffffff81002969>] do_notify_resume+0x59/0x80 [<ffffffff814ffc1a>] int_signal+0x12/0x17 This commit re-arranges the context initialization code in a way that would allow for context event flags to be used to determine whether the context has been successfully initialized. In turn, this can be used to skip the resource de-allocation if they were never allocated in the first place. Fixes: 3abb33a ("staging/hfi1: Add TID cache receive init and free funcs") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com. Signed-off-by: Doug Ledford <dledford@redhat.com>

Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page)) in page_swap_info. The reason is that page_endio in rw_page unlocks the page if read I/O is completed so we need to hold a PG_lock again to check PageSwapCache. Otherwise, the page can be removed from swapcache. Kernel BUG at c00f9040 [verbose debug info unavailable] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 4 PID: 13446 Comm: RenderThread Tainted: G W 3.10.84-g9f14aec-dirty torvalds#73 task: c3b73200 ti: dd192000 task.ti: dd192000 PC is at page_swap_info+0x10/0x2c LR is at swap_slot_free_notify+0x18/0x6c pc : [<c00f9040>] lr : [<c00f5560>] psr: 400f0113 sp : dd193d78 ip : c2deb1e4 fp : da015180 r10: 00000000 r9 : 000200da r8 : c120fe08 r7 : 00000000 r6 : 00000000 r5 : c249a6c0 r4 : = c249a6c0 r3 : 00000000 r2 : 40080009 r1 : 200f0113 r0 : = c249a6c0 ..<snip> .. Call Trace: page_swap_info+0x10/0x2c swap_slot_free_notify+0x18/0x6c swap_readpage+0x90/0x11c read_swap_cache_async+0x134/0x1ac swapin_readahead+0x70/0xb0 handle_pte_fault+0x320/0x6fc handle_mm_fault+0xc0/0xf0 do_page_fault+0x11c/0x36c do_DataAbort+0x34/0x118 Fixes: 3f2b1a0 ("zram: revive swap_slot_free_notify") Signed-off-by: Minchan Kim <minchan@kernel.org> Tested-by: Kyeongdon Kim <kyeongdon.kim@lge.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

…p list Calling gpiod_get() from a module and then unloading the module leads to an oops due to acpi_can_fallback_to_crs() storing the pointer to the passed 'con_id' string onto acpi_crs_lookup_list. The next guy to come along will then try to access the string but the memory may now be gone with the module. Make a copy of the passed string instead, and store the copy on the list. BUG: unable to handle kernel paging request at ffffffffa03e7855 IP: [<ffffffff81338322>] strcmp+0x12/0x30 PGD 2a07067 PUD 2a08063 PMD 74720067 PTE 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: i915(+) drm_kms_helper drm intel_gtt snd_hda_codec snd_hda_core i2c_algo_bit syscopya rea sysfillrect sysimgblt fb_sys_fops agpgart snd_soc_sst_bytcr_rt5640 coretemp hwmon intel_rapl intel_soc_dts_thermal punit_atom_debug snd_soc_rt5640 snd_soc_rl6231 serio snd_intel_sst_acpi snd_intel_sst_core video snd_soc_sst_mfld_platf orm snd_soc_sst_match backlight int3402_thermal processor_thermal_device int3403_thermal int3400_thermal acpi_thermal_r el snd_soc_core intel_soc_dts_iosf int340x_thermal_zone snd_compress i2c_hid hid snd_pcm snd_timer snd soundcore evdev sch_fq_codel efivarfs ipv6 autofs4 [last unloaded: drm] CPU: 2 PID: 3064 Comm: modprobe Tainted: G U W 4.6.0-rc3-ffrd-ipvr+ torvalds#302 Hardware name: Intel Corp. VALLEYVIEW C0 PLATFORM/BYT-T FFD8, BIOS BLAKFF81.X64.0088.R10.1403240443 FFD8 _X64_R_2014_13_1_00 03/24/2014 task: ffff8800701cd200 ti: ffff880070034000 task.ti: ffff880070034000 RIP: 0010:[<ffffffff81338322>] [<ffffffff81338322>] strcmp+0x12/0x30 RSP: 0000:ffff880070037748 EFLAGS: 00010286 RAX: 0000000080000000 RBX: ffff88007a342800 RCX: 0000000000000006 RDX: 0000000000000006 RSI: ffffffffa054f856 RDI: ffffffffa03e7856 RBP: ffff880070037748 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa054f855 R13: ffff88007281cae0 R14: 0000000000000010 R15: ffffffffffffffea FS: 00007faa51447700(0000) GS:ffff880079300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa03e7855 CR3: 0000000041eba000 CR4: 00000000001006e0 Stack: ffff880070037770 ffffffff8136ad28 ffffffffa054f855 0000000000000000 ffff88007a0a2098 ffff8800700377e8 ffffffff8136852e ffff88007a342800 00000007700377a0 ffff8800700377a0 ffffffff81412442 70672d6c656e6170 Call Trace: [<ffffffff8136ad28>] acpi_can_fallback_to_crs+0x88/0x100 [<ffffffff8136852e>] gpiod_get_index+0x25e/0x310 [<ffffffff81412442>] ? mipi_dsi_attach+0x22/0x30 [<ffffffff813685f2>] gpiod_get+0x12/0x20 [<ffffffffa04fcf41>] intel_dsi_init+0x421/0x480 [i915] [<ffffffffa04d3783>] intel_modeset_init+0x853/0x16b0 [i915] [<ffffffffa0504864>] ? intel_setup_gmbus+0x214/0x260 [i915] [<ffffffffa0510158>] i915_driver_load+0xdc8/0x19b0 [i915] [<ffffffff8160fb53>] ? _raw_spin_unlock_irqrestore+0x43/0x70 [<ffffffffa026b13b>] drm_dev_register+0xab/0xc0 [drm] [<ffffffffa026d7b3>] drm_get_pci_dev+0x93/0x1f0 [drm] [<ffffffff8160fb53>] ? _raw_spin_unlock_irqrestore+0x43/0x70 [<ffffffffa043f1f4>] i915_pci_probe+0x34/0x50 [i915] [<ffffffff81379751>] pci_device_probe+0x91/0x100 [<ffffffff8141a75a>] driver_probe_device+0x20a/0x2d0 [<ffffffff8141a8be>] __driver_attach+0x9e/0xb0 [<ffffffff8141a820>] ? driver_probe_device+0x2d0/0x2d0 [<ffffffff81418439>] bus_for_each_dev+0x69/0xa0 [<ffffffff8141a04e>] driver_attach+0x1e/0x20 [<ffffffff81419c20>] bus_add_driver+0x1c0/0x240 [<ffffffff8141b6d0>] driver_register+0x60/0xe0 [<ffffffff81377d20>] __pci_register_driver+0x60/0x70 [<ffffffffa026d9f4>] drm_pci_init+0xe4/0x110 [drm] [<ffffffff810ce04e>] ? trace_hardirqs_on+0xe/0x10 [<ffffffffa02f1000>] ? 0xffffffffa02f1000 [<ffffffffa02f1094>] i915_init+0x94/0x9b [i915] [<ffffffff810003bb>] do_one_initcall+0x8b/0x1c0 [<ffffffff810eb616>] ? rcu_read_lock_sched_held+0x86/0x90 [<ffffffff811de6d6>] ? kmem_cache_alloc_trace+0x1f6/0x270 [<ffffffff81183826>] do_init_module+0x60/0x1dc [<ffffffff81115a8d>] load_module+0x1d0d/0x2390 [<ffffffff811120b0>] ? __symbol_put+0x70/0x70 [<ffffffff811f41b2>] ? kernel_read_file+0x92/0x120 [<ffffffff811162f4>] SYSC_finit_module+0xa4/0xb0 [<ffffffff8111631e>] SyS_finit_module+0xe/0x10 [<ffffffff81001ff3>] do_syscall_64+0x63/0x350 [<ffffffff816103da>] entry_SYSCALL64_slow_path+0x25/0x25 Code: f7 48 8d 76 01 48 8d 52 01 0f b6 4e ff 84 c9 88 4a ff 75 ed 5d c3 0f 1f 00 55 48 89 e5 eb 04 84 c0 74 18 48 8d 7f 01 48 8d 76 01 <0f> b6 47 ff 3a 46 ff 74 eb 19 c0 83 c8 01 5d c3 31 c0 5d c3 66 RIP [<ffffffff81338322>] strcmp+0x12/0x30 RSP <ffff880070037748> CR2: ffffffffa03e7855 v2: Make the copied con_id const Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: stable@vger.kernel.org Fixes: 10cf489 ("gpiolib: tighten up ACPI legacy gpio lookups") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

This was recently reported to me, and reproduced on the latest net kernel, when attempting to run netperf from a host that had a netem qdisc attached to the egress interface: [ 788.073771] ---------------------[ cut here ]--------------------------- [ 788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda() [ 788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962 data_len=0 gso_size=1448 gso_type=1 ip_summed=3 [ 788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log dm_mod [ 788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G W ------------ 3.10.0-327.el7.x86_64 #1 [ 788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012 [ 788.542260] ffff880437c036b8 f7afc56532a53db9 ffff880437c03670 ffffffff816351f1 [ 788.576332] ffff880437c036a8 ffffffff8107b200 ffff880633e74200 ffff880231674000 [ 788.611943] 0000000000000001 0000000000000003 0000000000000000 ffff880437c03710 [ 788.647241] Call Trace: [ 788.658817] <IRQ> [<ffffffff816351f1>] dump_stack+0x19/0x1b [ 788.686193] [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0 [ 788.713803] [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80 [ 788.741314] [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100 [ 788.767018] [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda [ 788.796117] [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190 [ 788.823392] [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem] [ 788.854487] [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570 [ 788.880870] [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0 ... The problem occurs because netem is not prepared to handle GSO packets (as it uses skb_checksum_help in its enqueue path, which cannot manipulate these frames). The solution I think is to simply segment the skb in a simmilar fashion to the way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes. When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt the first segment, and enqueue the remaining ones. tested successfully by myself on the latest net kernel, to which this applies Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> CC: "David S. Miller" <davem@davemloft.net> CC: netem@lists.linux-foundation.org CC: eric.dumazet@gmail.com CC: stephen@networkplumber.org Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>

When the first propgated copy was a slave the following oops would result: > BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 > IP: [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0 > PGD bacd4067 PUD bac66067 PMD 0 > Oops: 0000 [#1] SMP > Modules linked in: > CPU: 1 PID: 824 Comm: mount Not tainted 4.6.0-rc5userns+ #1523 > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 > task: ffff8800bb0a8000 ti: ffff8800bac3c000 task.ti: ffff8800bac3c000 > RIP: 0010:[<ffffffff811fba4e>] [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0 > RSP: 0018:ffff8800bac3fd38 EFLAGS: 00010283 > RAX: 0000000000000000 RBX: ffff8800bb77ec00 RCX: 0000000000000010 > RDX: 0000000000000000 RSI: ffff8800bb58c000 RDI: ffff8800bb58c480 > RBP: ffff8800bac3fd48 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000001ca1 R11: 0000000000001c9d R12: 0000000000000000 > R13: ffff8800ba713800 R14: ffff8800bac3fda0 R15: ffff8800bb77ec00 > FS: 00007f3c0cd9b7e0(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000010 CR3: 00000000bb79d000 CR4: 00000000000006e0 > Stack: > ffff8800bb77ec00 0000000000000000 ffff8800bac3fd88 ffffffff811fbf85 > ffff8800bac3fd98 ffff8800bb77f080 ffff8800ba713800 ffff8800bb262b40 > 0000000000000000 0000000000000000 ffff8800bac3fdd8 ffffffff811f1da0 > Call Trace: > [<ffffffff811fbf85>] propagate_mnt+0x105/0x140 > [<ffffffff811f1da0>] attach_recursive_mnt+0x120/0x1e0 > [<ffffffff811f1ec3>] graft_tree+0x63/0x70 > [<ffffffff811f1f6b>] do_add_mount+0x9b/0x100 > [<ffffffff811f2c1a>] do_mount+0x2aa/0xdf0 > [<ffffffff8117efbe>] ? strndup_user+0x4e/0x70 > [<ffffffff811f3a45>] SyS_mount+0x75/0xc0 > [<ffffffff8100242b>] do_syscall_64+0x4b/0xa0 > [<ffffffff81988f3c>] entry_SYSCALL64_slow_path+0x25/0x25 > Code: 00 00 75 ec 48 89 0d 02 22 22 01 8b 89 10 01 00 00 48 89 05 fd 21 22 01 39 8e 10 01 00 00 0f 84 e0 00 00 00 48 8b 80 d8 00 00 00 <48> 8b 50 10 48 89 05 df 21 22 01 48 89 15 d0 21 22 01 8b 53 30 > RIP [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0 > RSP <ffff8800bac3fd38> > CR2: 0000000000000010 > ---[ end trace 2725ecd95164f217 ]--- This oops happens with the namespace_sem held and can be triggered by non-root users. An all around not pleasant experience. To avoid this scenario when finding the appropriate source mount to copy stop the walk up the mnt_master chain when the first source mount is encountered. Further rewrite the walk up the last_source mnt_master chain so that it is clear what is going on. The reason why the first source mount is special is that it it's mnt_parent is not a mount in the dest_mnt propagation tree, and as such termination conditions based up on the dest_mnt mount propgation tree do not make sense. To avoid other kinds of confusion last_dest is not changed when computing last_source. last_dest is only used once in propagate_one and that is above the point of the code being modified, so changing the global variable is meaningless and confusing. Cc: stable@vger.kernel.org fixes: f2ebb3a ("smarter propagate_mnt()") Reported-by: Tycho Andersen <tycho.andersen@canonical.com> Reviewed-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

NULL pointer derefence happens when booting with DTB because the platform data for haptic device is not set in supplied data from parent MFD device. The MFD device creates only platform data (from Device Tree) for itself, not for haptic child. Unable to handle kernel NULL pointer dereference at virtual address 0000009c pgd = c0004000 [0000009c] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM (max8997_haptic_probe) from [<c03f9cec>] (platform_drv_probe+0x4c/0xb0) (platform_drv_probe) from [<c03f8440>] (driver_probe_device+0x214/0x2c0) (driver_probe_device) from [<c03f8598>] (__driver_attach+0xac/0xb0) (__driver_attach) from [<c03f67ac>] (bus_for_each_dev+0x68/0x9c) (bus_for_each_dev) from [<c03f7a38>] (bus_add_driver+0x1a0/0x218) (bus_add_driver) from [<c03f8db0>] (driver_register+0x78/0xf8) (driver_register) from [<c0101774>] (do_one_initcall+0x90/0x1d8) (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) (kernel_init_freeable) from [<c06bb5b4>] (kernel_init+0x8/0x114) (kernel_init) from [<c0107938>] (ret_from_fork+0x14/0x3c) Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: <stable@vger.kernel.org> Fixes: 104594b ("Input: add driver support for MAX8997-haptic") [k.kozlowski: Write commit message, add CC-stable] Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

…it/nferre/linux-at91 into fixes Merge "at91: fixes for 4.6 #1" from Nicolas Ferre: Here is a late fix for AT91. Sorry to have figure it out so late in the development cycle but we had to confirm it was an error with the documentation of two products. So, as the compatibility string is in since 4.6-rc1 and that the previous one works okay, it's a good opportunity to switch back to the one that works without introducing a intermediary bug. The revert on driver code and the removal of the useless additional compatibility string will be queued for 4.7 through NAND/MTD. * tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91: ARM: dts: at91: sama5d2: use "atmel,sama5d3-nfc" compatible for nfc

…ing" This patch causes a Kernel panic when called on a DVB driver. This was also reported by David R <david@unsolicited.net>: May 7 14:47:35 server kernel: [ 501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 May 7 14:47:35 server kernel: [ 501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.247354] PGD cae6f067 PUD ca99c067 PMD 0 May 7 14:47:35 server kernel: [ 501.247426] Oops: 0000 [#1] SMP May 7 14:47:35 server kernel: [ 501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport May 7 14:47:35 server kernel: [ 501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3 May 7 14:47:35 server kernel: [ 501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211 07/08/2009 May 7 14:47:35 server kernel: [ 501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000 May 7 14:47:35 server kernel: [ 501.249861] RIP: 0010:[<ffffffffa0222c71>] [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.250002] RSP: 0018:ffff8801e07a3de8 EFLAGS: 00010086 May 7 14:47:35 server kernel: [ 501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283 May 7 14:47:35 server kernel: [ 501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014 May 7 14:47:35 server kernel: [ 501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000 May 7 14:47:35 server kernel: [ 501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8 May 7 14:47:35 server kernel: [ 501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828 May 7 14:47:35 server kernel: [ 501.250528] FS: 00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40 May 7 14:47:35 server kernel: [ 501.250631] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b May 7 14:47:35 server kernel: [ 501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0 May 7 14:47:35 server kernel: [ 501.250794] Stack: May 7 14:47:35 server kernel: [ 501.250822] ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb May 7 14:47:35 server kernel: [ 501.250937] 0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000 May 7 14:47:35 server kernel: [ 501.251051] ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38 May 7 14:47:35 server kernel: [ 501.251165] Call Trace: May 7 14:47:35 server kernel: [ 501.251200] [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.251306] [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251398] [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100 May 7 14:47:35 server kernel: [ 501.251482] [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251569] [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251662] [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.255982] [<ffffffff8106f984>] kthread+0xc4/0xe0 May 7 14:47:35 server kernel: [ 501.260292] [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50 May 7 14:47:35 server kernel: [ 501.264615] [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70 May 7 14:47:35 server kernel: [ 501.268962] [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50 May 7 14:47:35 server kernel: [ 501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e May 7 14:47:35 server kernel: [ 501.282146] RIP [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.286391] RSP <ffff8801e07a3de8> May 7 14:47:35 server kernel: [ 501.290619] CR2: 0000000000000004 May 7 14:47:35 server kernel: [ 501.294786] ---[ end trace b2b354153ccad110 ]--- This reverts commit 2c1f695. Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: stable@vger.kernel.org Fixes: 2c1f695 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing") Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

A basic perf callgraph record operation causes an immediate panic on a 32-bit kernel compiled with CONFIG_CC_STACKPROTECTOR=y: $ perf record -g ls Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c0404fbd CPU: 0 PID: 998 Comm: ls Not tainted 4.7.0-rc5+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014 c0dd5967 ff7afe1c 00000086 f41dbc2c c07445a0 464c457f f41dbca8 f41dbc44 c05646f4 f41dbca8 464c457f f41dbca8 464c457f f41dbc54 c04625be c0ce56fc c0404fbd f41dbc88 c0404fbd b74668f0 f41dc000 00000000 c0000000 00000000 Call Trace: [<c07445a0>] dump_stack+0x58/0x78 [<c05646f4>] panic+0x8e/0x1c6 [<c04625be>] __stack_chk_fail+0x1e/0x30 [<c0404fbd>] ? perf_callchain_user+0x22d/0x230 [<c0404fbd>] perf_callchain_user+0x22d/0x230 [<c055f89f>] get_perf_callchain+0x1ff/0x270 [<c055f988>] perf_callchain+0x78/0x90 [<c055c7eb>] perf_prepare_sample+0x24b/0x370 [<c055c934>] perf_event_output_forward+0x24/0x70 [<c05531c0>] __perf_event_overflow+0xa0/0x210 [<c0550a93>] ? cpu_clock_event_read+0x43/0x50 [<c0553431>] perf_swevent_hrtimer+0x101/0x180 [<c0456235>] ? kmap_atomic_prot+0x35/0x140 [<c056dc69>] ? get_page_from_freelist+0x279/0x950 [<c058fdd8>] ? vma_interval_tree_remove+0x158/0x230 [<c05939f4>] ? wp_page_copy.isra.82+0x2f4/0x630 [<c05a050d>] ? page_add_file_rmap+0x1d/0x50 [<c0565611>] ? unlock_page+0x61/0x80 [<c0566755>] ? filemap_map_pages+0x305/0x320 [<c059769f>] ? handle_mm_fault+0xb7f/0x1560 [<c074cbeb>] ? timerqueue_del+0x1b/0x70 [<c04cfefe>] ? __remove_hrtimer+0x2e/0x60 [<c04d017b>] __hrtimer_run_queues+0xcb/0x2a0 [<c0553330>] ? __perf_event_overflow+0x210/0x210 [<c04d0a2a>] hrtimer_interrupt+0x8a/0x180 [<c043ecc2>] local_apic_timer_interrupt+0x32/0x60 [<c043f643>] smp_apic_timer_interrupt+0x33/0x50 [<c0b0cd38>] apic_timer_interrupt+0x34/0x3c Kernel Offset: disabled ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c0404fbd The panic is caused by the fact that perf_callchain_user() mistakenly assumes it's 64-bit only and ends up corrupting the stack. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: stable@vger.kernel.org # v4.5+ Fixes: 75925e1 ("perf/x86: Optimize stack walk user accesses") Link: http://lkml.kernel.org/r/1a547f5077ec30f75f9b57074837c3c80df86e5e.1467432113.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>

Commit fbde2d7 ("MIPS: Add generic SMP IPI support") introduced code which calls irq_find_matching_host with a NULL node parameter in order to discover IPI IRQ domains which are not associated with the DT root node's interrupt parent. This suggests that implementations of IPI IRQ domains should effectively ignore the node parameter if it is NULL and search purely based upon the bus token. Commit 2af70a9 ("irqchip/mips-gic: Add a IPI hierarchy domain") did not do this when implementing the GIC IPI IRQ domain, and on MIPS Boston boards this leads to no IPI domain being discovered and a NULL pointer dereference when attempting to send an IPI: CPU 0 Unable to handle kernel paging request at virtual address 0000000000000040, epc == ffffffff8016e70c, ra == ffffffff8010ff5c Oops[#1]: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc6-00223-gad0d1b6 torvalds#945 task: a8000000ff066fc0 ti: a8000000ff068000 task.ti: a8000000ff068000 $ 0 : 0000000000000000 0000000000000001 ffffffff80730000 0000000000000003 $ 4 : 0000000000000000 ffffffff8057e5b0 a800000001e3ee00 0000000000000000 $ 8 : 0000000000000000 0000000000000023 0000000000000001 0000000000000001 $12 : 0000000000000000 ffffffff803323d0 0000000000000000 0000000000000000 $16 : 0000000000000000 0000000000000000 0000000000000001 ffffffff801108fc $20 : 0000000000000000 ffffffff8057e5b0 0000000000000001 0000000000000000 $24 : 0000000000000000 ffffffff8012de28 $28 : a8000000ff068000 a8000000ff06fbc0 0000000000000000 ffffffff8010ff5c Hi : ffffffff8014c174 Lo : a800000001e1e140 epc : ffffffff8016e70c __ipi_send_mask+0x24/0x11c ra : ffffffff8010ff5c mips_smp_send_ipi_mask+0x68/0x178 Status: 140084e2 KX SX UX KERNEL EXL Cause : 00800008 (ExcCode 02) BadVA : 0000000000000040 PrId : 0001a920 (MIPS I6400) Process swapper/0 (pid: 1, threadinfo=a8000000ff068000, task=a8000000ff066fc0, tls=0000000000000000) Stack : 0000000000000000 0000000000000000 0000000000000001 ffffffff801108fc 0000000000000000 ffffffff8057e5b0 0000000000000001 ffffffff8010ff5c 0000000000000001 0000000000000020 0000000000000000 0000000000000000 0000000000000000 ffffffff801108fc 0000000000000000 0000000000000001 0000000000000001 0000000000000000 0000000000000000 ffffffff801865e8 a8000000ff0c7500 a8000000ff06fc90 0000000000000001 0000000000000002 ffffffff801108fc ffffffff801868b8 0000000000000000 ffffffff801108fc 0000000000000000 0000000000000003 ffffffff8068c700 0000000000000001 ffffffff80730000 0000000000000001 a8000000ff00a290 ffffffff80110c50 0000000000000003 a800000001e48308 0000000000000003 0000000000000008 ... Call Trace: [<ffffffff8016e70c>] __ipi_send_mask+0x24/0x11c [<ffffffff8010ff5c>] mips_smp_send_ipi_mask+0x68/0x178 [<ffffffff801865e8>] generic_exec_single+0x150/0x170 [<ffffffff801868b8>] smp_call_function_single+0x108/0x160 [<ffffffff80110c50>] cps_boot_secondary+0x328/0x394 [<ffffffff80110534>] __cpu_up+0x38/0x90 [<ffffffff8012de4c>] bringup_cpu+0x24/0xac [<ffffffff8012df40>] cpuhp_up_callbacks+0x58/0xdc [<ffffffff8012e648>] cpu_up+0x118/0x18c [<ffffffff806dc158>] smp_init+0xbc/0xe8 [<ffffffff806d4c18>] kernel_init_freeable+0xa0/0x228 [<ffffffff8056c908>] kernel_init+0x10/0xf0 [<ffffffff80105098>] ret_from_kernel_thread+0x14/0x1c Fix this by allowing the GIC IPI IRQ domain to match purely based upon the bus token if the node provided is NULL. Fixes: 2af70a9 ("irqchip/mips-gic: Add a IPI hierarchy domain") Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Jason Cooper <jason@lakedaemon.net> Cc: Qais Yousef <qsyousef@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20160705132600.27730-2-paul.burton@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

In qla24xx_process_response_queue() rsp->msix->cpuid may trigger NULL pointer dereference when rsp->msix is NULL: [ 5.622457] NULL pointer dereference at 0000000000000050 [ 5.622457] IP: [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0 [ 5.622457] PGD 0 [ 5.622457] Oops: 0000 [#1] SMP [ 5.622457] Modules linked in: [ 5.622457] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.6.3-x86_64 #1 [ 5.622457] Hardware name: HP ProLiant DL360 G5, BIOS P58 05/02/2011 [ 5.622457] task: ffff8801a88f3740 ti: ffff8801a8954000 task.ti: ffff8801a8954000 [ 5.622457] RIP: 0010:[<ffffffff8155e614>] [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0 [ 5.622457] RSP: 0000:ffff8801afb03de8 EFLAGS: 00010002 [ 5.622457] RAX: 0000000000000000 RBX: 0000000000000032 RCX: 00000000ffffffff [ 5.622457] RDX: 0000000000000002 RSI: ffff8801a79bf8c8 RDI: ffff8800c8f7e7c0 [ 5.622457] RBP: ffff8801afb03e68 R08: 0000000000000000 R09: 0000000000000000 [ 5.622457] R10: 00000000ffff8c47 R11: 0000000000000002 R12: ffff8801a79bf8c8 [ 5.622457] R13: ffff8800c8f7e7c0 R14: ffff8800c8f60000 R15: 0000000000018013 [ 5.622457] FS: 0000000000000000(0000) GS:ffff8801afb00000(0000) knlGS:0000000000000000 [ 5.622457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.622457] CR2: 0000000000000050 CR3: 0000000001e07000 CR4: 00000000000006e0 [ 5.622457] Stack: [ 5.622457] ffff8801afb03e30 ffffffff810c0f2d 0000000000000086 0000000000000002 [ 5.622457] ffff8801afb03e28 ffffffff816570e1 ffff8800c8994628 0000000000000002 [ 5.622457] ffff8801afb03e60 ffffffff816772d4 b47c472ad6955e68 0000000000000032 [ 5.622457] Call Trace: [ 5.622457] <IRQ> [ 5.622457] [<ffffffff810c0f2d>] ? __wake_up_common+0x4d/0x80 [ 5.622457] [<ffffffff816570e1>] ? usb_hcd_resume_root_hub+0x51/0x60 [ 5.622457] [<ffffffff816772d4>] ? uhci_hub_status_data+0x64/0x240 [ 5.622457] [<ffffffff81560d00>] qla24xx_intr_handler+0xf0/0x2e0 [ 5.622457] [<ffffffff810d569e>] ? get_next_timer_interrupt+0xce/0x200 [ 5.622457] [<ffffffff810c89b4>] handle_irq_event_percpu+0x64/0x100 [ 5.622457] [<ffffffff810c8a77>] handle_irq_event+0x27/0x50 [ 5.622457] [<ffffffff810cb965>] handle_edge_irq+0x65/0x140 [ 5.622457] [<ffffffff8101a498>] handle_irq+0x18/0x30 [ 5.622457] [<ffffffff8101a276>] do_IRQ+0x46/0xd0 [ 5.622457] [<ffffffff817f8fff>] common_interrupt+0x7f/0x7f [ 5.622457] <EOI> [ 5.622457] [<ffffffff81020d38>] ? mwait_idle+0x68/0x80 [ 5.622457] [<ffffffff8102114a>] arch_cpu_idle+0xa/0x10 [ 5.622457] [<ffffffff810c1b97>] default_idle_call+0x27/0x30 [ 5.622457] [<ffffffff810c1d3b>] cpu_startup_entry+0x19b/0x230 [ 5.622457] [<ffffffff810324c6>] start_secondary+0x136/0x140 [ 5.622457] Code: 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 8b 47 58 a8 02 0f 84 c5 00 00 00 48 8b 46 50 49 89 f4 65 8b 15 34 bb aa 7e <39> 50 50 74 11 89 50 50 48 8b 46 50 8b 40 50 41 89 86 60 8b 00 [ 5.622457] RIP [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0 [ 5.622457] RSP <ffff8801afb03de8> [ 5.622457] CR2: 0000000000000050 [ 5.622457] ---[ end trace fa2b19c25106d42b ]--- [ 5.622457] Kernel panic - not syncing: Fatal exception in interrupt The affected code was introduced by commit cdb898c (qla2xxx: Add irq affinity notification). Only dereference rsp->msix when it has been set so the machine can boot fine. Possibly rsp->msix is unset because: [ 3.479679] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.33-k. [ 3.481839] qla2xxx [0000:13:00.0]-001d: : Found an ISP2432 irq 17 iobase 0xffffc90000038000. [ 3.484081] qla2xxx [0000:13:00.0]-0035:0: MSI-X; Unsupported ISP2432 (0x2, 0x3). [ 3.485804] qla2xxx [0000:13:00.0]-0037:0: Falling back-to MSI mode -258. [ 3.890145] scsi host0: qla2xxx [ 3.891956] qla2xxx [0000:13:00.0]-00fb:0: QLogic QLE2460 - PCI-Express Single Channel 4Gb Fibre Channel HBA. [ 3.894207] qla2xxx [0000:13:00.0]-00fc:0: ISP2432: PCIe (2.5GT/s x4) @ 0000:13:00.0 hdma+ host#=0 fw=7.03.00 (9496). [ 5.714774] qla2xxx [0000:13:00.0]-500a:0: LOOP UP detected (4 Gbps). Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Acked-by: Quinn Tran <quinn.tran@qlogic.com> CC: <stable@vger.kernel.org> # 4.5+ Fixes: cdb898c Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>

X722 hardware requires using the admin queue to configure RSS. This function was previously re-written in commit e69ff81 ("i40e: rework the functions to configure RSS with similar parameters"). However, the previous refactor did not work correctly for a few reasons (a) it does not check whether seed is NULL before using it, resulting in a NULL pointer dereference [ 402.954721] BUG: unable to handle kernel NULL pointer dereference at (null) [ 402.955568] IP: [<ffffffffa0090ccf>] i40e_config_rss_aq.constprop.65+0x2f/0x1c0 [i40e] [ 402.956402] PGD ad610067 PUD accc0067 PMD 0 [ 402.957235] Oops: 0000 [#1] SMP [ 402.958064] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_filter ebtable_ broute bridge stp llc ebtable_nat ebtables ip6table_mangle ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv 6 ip6table_security ip6table_filter ip6_tables iptable_mangle iptable_raw iptable_nat nf_conntrack_ipv4_ nf_defrag_ipv4_ nf_nat_ip v4_ nf_nat nf_conntrack iptable_security intel_rapl i86_kg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_clMl crc32_ pclMl ghash_clMlni_intel iTCO_wdt iTCO_vendor_support shpchp sb_edac dcdbas pcspkr joydev ipmi_devintf wmi edac_core ipmi_ssif acpi_ad acpi_ower_meter ipmi_si ipmi_msghandler mei_me nfsd lpc_ich mei ioatdma tpm_tis auth_rpcgss tpm nfs_acl lockd grace s unrpc ifs nngag200 i2c_algo_bit drm_kms_helper ttm drm iigbe bnx2x i40e dca mdio ptp pps_core libcrc32c fjes crc32c_intel [ 402.965563] CPU: 22 PID: 2461 Conm: ethtool Not tainted 4.6.0-rc7_1.2-ABNidQ+ torvalds#20 [ 402.966719] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 2.5.2 01/28/2015 [ 402.967862] task: ffff880219b51dc0 ti: ffff8800b3408000 task.ti: ffff8800b3408000 [ 402.969046] RIP: 0010:[<ffffffffa0090ccf>] [<ffffffffa0090ccf>] i40e_config_rss_aq.constprop.65+0x2f/0x1c0 [i40e] [ 402.970339] RSP: 0018:ffff8800b340ba90 EFLAGS: 00010246 [ 402.971616] RAX: 0000000000000000 RBX: ffff88042ec14000 RCX: 0000000000000200 [ 402.972961] RDX: ffff880428eb9200 RSI: 0000000000000000 RDI: ffff88042ec14000 [ 402.974312] RBP: ffff8800b340baf8 R08: ffff880237ada8f0 R09: ffff880428eb9200 [ 402.975709] R10: ffff880428eb9200 R11: 0000000000000000 R12: ffff88042ec2e000 [ 402.977104] R13: ffff88042ec2e000 R14: ffff88042ec14000 R15: ffff88022ea00800 [ 402.978541] FS: 00007f84fd054700(0000) GS:ffff880237ac0000(0000) knlGS:0000000000000000 [ 402.980003] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 402.981508] CR2: 0000000000000000 CR3: 000000003289e000 CR4: 00000000000406e0 [ 402.983028] Stack: [ 402.984578] 0000000002000200 0000000000000000 ffff88023ffeda68 ffff88023ffef000 [ 402.986187] 0000000000000268 ffff8800b340bbf8 ffff88023ffedd80 0000000088ce4f1d [ 402.987844] ffff88042ec14000 ffff88022ea00800 ffff88042ec2e000 ffff88042ec14000 [ 402.989509] Call Trace: [ 402.991200] [<ffffffffa009636f>] i40e_config_rss+0x11f/0x1c0 [i40e] [ 402.992924] [<ffffffffa00a1ae0>] i40e_set_rifh+0ic0/0x130 [i40e] [ 402.994684] [<ffffffff816d54b7>] ethtool_set_rifh+0x1f7/0x300 [ 402.996446] [<ffffffff8136d02b>] ? cred_has_capability+0io6b/0x100 [ 402.998203] [<ffffffff8136d102>] ? selinux_capable+0x12/0x20 [ 402.999968] [<ffffffff8136277b>] ? security_capable+0x4b/0x70 [ 403.001707] [<ffffffff816d6da3>] dev_ethtool+0x1423/0x2290 [ 403.003461] [<ffffffff816eab41>] dev_ioctl+0x191/0io630 [ 403.005186] [<ffffffff811cf80a>] ? lru_cache_add+0x3a/0i80 [ 403.006942] [<ffffffff817f2a8e>] ? _raw_spin_unlock+0ie/0x20 [ 403.008691] [<ffffffff816adb95>] sock_do_ioctl+0x45/0i50 [ 403.010421] [<ffffffff816ae229>] sock_ioctl+0x209/0x2d0 [ 403.012173] [<ffffffff81262194>] do_vfs_ioctl+0u4/0io6c0 [ 403.013911] [<ffffffff81262829>] SyS_ioctl+0x79/0x90 [ 403.015710] [<ffffffff817f2e72>] entry_SYSCALL_64_fastpath+0x1a/0u4 [ 403.017500] Code: 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 89 fb 48 83 ec 40 4c 8b a7 e0 05 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 <48> 8b 06 41 0f b7 bc 24 f2 0f 00 00 48 89 45 9c 48 8b 46 08 48 [ 403.021454] RIP [<ffffffffa0090ccf>] i40e_config_rss_aq.constprop.65+0x2f/0x1c0 [i40e] [ 403.023395] RSP <ffff8800b340ba90> [ 403.025271] CR2: 0000000000000000 [ 403.027169] ---[ end trace 64561b528cf61cf0 ]--- (b) it does not even bother to use the passed in *lut parameter which defines the requested lookup table. Instead it uses its own round robin table. Fix these issues by re-writing it to be similar to i40e_config_rss_reg and i40e_get_rss_aq. Fixes: e69ff81 ("i40e: rework the functions to configure RSS with similar parameters", 2015-10-21) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

rcj4747 closed this Apr 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

HID: lenovo: Do not access driver data in PM reset_resume #1

HID: lenovo: Do not access driver data in PM reset_resume #1

Uh oh!

rcj4747 commented Nov 3, 2015

Uh oh!

Uh oh!

HID: lenovo: Do not access driver data in PM reset_resume #1

HID: lenovo: Do not access driver data in PM reset_resume #1

Uh oh!

Conversation

rcj4747 commented Nov 3, 2015

Uh oh!

Uh oh!