forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 0
DECEMBER update #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Acer AIO Veriton Z4860G/Z6860G with the same ALC286 codec has issues with the input from external microphone. The issue can be fixed by the fixup ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE for Veriton Z4660G. Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Chris Chiu <chiu@endlessm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
…el/git/lee/mfd Pull mfd bugfix from Lee Jones: "Replace release function in cros_ec_dev" * tag 'mfd-fixes-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: Revert "mfd: cros_ec: Use devm_kzalloc for private data"
…git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Revert a problematic recent commit that attempted to fix a system-wide suspend issue related to the freezer" * tag 'pm-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "exec: make de_thread() freezable"
…rnel/git/kdave/linux Pull btrfs fix from David Sterba: "A patch in 4.19 introduced a sanity check that was too strict and a filesystem cannot be mounted. This happens for filesystems with more than 10 devices and has been reported by a few users so we need the fix to propagate to stable" * tag 'for-4.20-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: tree-checker: Don't check max block group size as current max chunk size limit is unreliable
The MPEG2 state controls for the cedrus stateless MPEG2 driver are not yet stable. Move them out of the public headers into media/mpeg2-ctrls.h. Eventually, once this has stabilized, they will be moved back to the public headers. Unfortunately I had to cast the control type to a u32 in two switch statements to prevent a compiler warning about a control type define not being part of the enum. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Add a note mentioning that these two controls are not part of the public API while they still stabilizing. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
The Request API is now merged to the kernel but the confidence on the stability of that API is not great, especially regarding the interaction with V4L2. Add a Kconfig option for the API, with a scary-looking warning. The patch itself disables request creation as well as does not advertise them as buffer flags. The driver requiring requests (cedrus) now depends on the Kconfig option as well. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Replace vcn_v1_0_stop with vcn_v1_0_set_powergating_state during suspend, to keep adev->vcn.cur_state update. It will fix VCN S3 hung issue. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
…/scm/linux/kernel/git/jberg/mac80211 Johannes Berg: ==================== As it's been a while, we have various fixes for * hwsim * AP mode (client powersave related) * CSA/FTM interaction * a busy loop in IE handling * and similar ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
When reading an extra descriptor, we need to properly check the minimum and maximum size allowed, to prevent from invalid data being sent by a device. Reported-by: Hui Peng <benquike@gmail.com> Reported-by: Mathias Payer <mathias.payer@nebelwelt.net> Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hui Peng <benquike@gmail.com> Signed-off-by: Mathias Payer <mathias.payer@nebelwelt.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull block fixes from Jens Axboe: "A bit earlier in the week as usual, but there's a fix here that should go in sooner rather than later. Under a combination of circumstance, the direct issue path in blk-mq could corrupt data. This wasn't easy to hit, but the ones that are affected by it, seem to hit it pretty easily. Full explanation in the patch. None of the regular filesystem and storage testing has triggered it, even though it's been around since 4.19-rc1. Outside of that, whitelist trim tweak for certain Samsung devices for libata" * tag 'for-linus-20181205' of git://git.kernel.dk/linux-block: blk-mq: fix corruption with direct issue libata: whitelist all SAMSUNG MZ7KM* solid-state disks
In preparation for libnvdimm growing new restrictions to detect section conflicts between persistent memory regions, enable nfit_test to allocate aligned resources. Use a gen_pool to allocate nfit_test's fake resources in a separate address space from the virtual translation of the same. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit cfe30b8 "libnvdimm, pmem: adjust for section collisions with 'System RAM'" enabled Linux to workaround occasions where platform firmware arranges for "System RAM" and "Persistent Memory" to collide within a single section boundary. Unfortunately, as reported in this issue [1], platform firmware can inflict the same collision between persistent memory regions. The approach of interrogating iomem_resource does not work in this case because platform firmware may merge multiple regions into a single iomem_resource range. Instead provide a method to interrogate regions that share the same parent bus. This is a stop-gap until the core-MM can grow support for hotplug on sub-section boundaries. [1]: pmem/ndctl#76 Fixes: cfe30b8 ("libnvdimm, pmem: adjust for section collisions with...") Cc: <stable@vger.kernel.org> Reported-by: Patrick Geary <patrickg@supermicro.com> Tested-by: Patrick Geary <patrickg@supermicro.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
…hort" A "short" ARS (address range scrub) instructs the platform firmware to return known errors. In contrast, a "long" ARS instructs platform firmware to arrange every data address on the DIMM to be read / checked for poisoned data. The conversion of the flags in commit d3abaf4 "acpi, nfit: Fix Address Range Scrub completion tracking", changed the meaning of passing '0' to acpi_nfit_ars_rescan(). Previously '0' meant "not short", now '0' is ARS_REQ_SHORT. Pass ARS_REQ_LONG to restore the expected scrub-type behavior of user-initiated ARS sessions. Fixes: d3abaf4 ("acpi, nfit: Fix Address Range Scrub completion tracking") Reported-by: Jacek Zloch <jacek.zloch@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is a full revert of ac5b2c1 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings") and a partial revert of 89c83fb ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"). By not setting __GFP_THISNODE, applications can allocate remote hugepages when the local node is fragmented or low on memory when either the thp defrag setting is "always" or the vma has been madvised with MADV_HUGEPAGE. Remote access to hugepages often has much higher latency than local pages of the native page size. On Haswell, ac5b2c1 was shown to have a 13.9% access regression after this commit for binaries that remap their text segment to be backed by transparent hugepages. The intent of ac5b2c1 is to address an issue where a local node is low on memory or fragmented such that a hugepage cannot be allocated. In every scenario where this was described as a fix, there is abundant and unfragmented remote memory available to allocate from, even with a greater access latency. If remote memory is also low or fragmented, not setting __GFP_THISNODE was also measured on Haswell to have a 40% regression in allocation latency. Restore __GFP_THISNODE for thp allocations. Fixes: ac5b2c1 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings") Fixes: 89c83fb ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…/git/vgupta/arc Pull ARC fixes/updates from Vineet Gupta - Missing reads{x}()/writes{x}() getting in the way of some drivers [Jose Abreu] - Builds defaulting to ARCv2 ISA based configsa [Kevin Hilman] - Misc fixes * tag 'arc-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: io.h: Implement reads{x}()/writes{x}() ARC: change defconfig defaults to ARCv2 arc: [devboards] Add support of NFSv3 ACL ARC: mm: fix uninitialised signal code in do_page_fault ARC: [plat-hsdk] Enable DW APB GPIO support ARCv2: boot log unaligned access in use ARC: IOC: panic if kernel was started with previously enabled IOC ARC: remove redundant 'default n' from Kconfig
list_del() leaves the skb->next pointer poisoned, which can then lead to a crash in e.g. OVS forwarding. For example, setting up an OVS VXLAN forwarding bridge on sfc as per: ======== $ ovs-vsctl show 5dfd9c47-f04b-4aaa-aa96-4fbb0a522a30 Bridge "br0" Port "br0" Interface "br0" type: internal Port "enp6s0f0" Interface "enp6s0f0" Port "vxlan0" Interface "vxlan0" type: vxlan options: {key="1", local_ip="10.0.0.5", remote_ip="10.0.0.4"} ovs_version: "2.5.0" ======== (where 10.0.0.5 is an address on enp6s0f1) and sending traffic across it will lead to the following panic: ======== general protection fault: 0000 [#1] SMP PTI CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.20.0-rc3-ehc+ #701 Hardware name: Dell Inc. PowerEdge R710/0M233H, BIOS 6.4.0 07/23/2013 RIP: 0010:dev_hard_start_xmit+0x38/0x200 Code: 53 48 89 fb 48 83 ec 20 48 85 ff 48 89 54 24 08 48 89 4c 24 18 0f 84 ab 01 00 00 48 8d 86 90 00 00 00 48 89 f5 48 89 44 24 10 <4c> 8b 33 48 c7 03 00 00 00 00 48 8b 05 c7 d1 b3 00 4d 85 f6 0f 95 RSP: 0018:ffff888627b437e0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff88862279c000 RDX: ffff888614a342c0 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff888618a88000 R08: 0000000000000001 R09: 00000000000003e8 R10: 0000000000000000 R11: ffff888614a34140 R12: 0000000000000000 R13: 0000000000000062 R14: dead000000000100 R15: ffff888616430000 FS: 0000000000000000(0000) GS:ffff888627b40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6d2bc6d000 CR3: 000000000200a000 CR4: 00000000000006e0 Call Trace: <IRQ> __dev_queue_xmit+0x623/0x870 ? masked_flow_lookup+0xf7/0x220 [openvswitch] ? ep_poll_callback+0x101/0x310 do_execute_actions+0xaba/0xaf0 [openvswitch] ? __wake_up_common+0x8a/0x150 ? __wake_up_common_lock+0x87/0xc0 ? queue_userspace_packet+0x31c/0x5b0 [openvswitch] ovs_execute_actions+0x47/0x120 [openvswitch] ovs_dp_process_packet+0x7d/0x110 [openvswitch] ovs_vport_receive+0x6e/0xd0 [openvswitch] ? dst_alloc+0x64/0x90 ? rt_dst_alloc+0x50/0xd0 ? ip_route_input_slow+0x19a/0x9a0 ? __udp_enqueue_schedule_skb+0x198/0x1b0 ? __udp4_lib_rcv+0x856/0xa30 ? __udp4_lib_rcv+0x856/0xa30 ? cpumask_next_and+0x19/0x20 ? find_busiest_group+0x12d/0xcd0 netdev_frame_hook+0xce/0x150 [openvswitch] __netif_receive_skb_core+0x205/0xae0 __netif_receive_skb_list_core+0x11e/0x220 netif_receive_skb_list+0x203/0x460 ? __efx_rx_packet+0x335/0x5e0 [sfc] efx_poll+0x182/0x320 [sfc] net_rx_action+0x294/0x3c0 __do_softirq+0xca/0x297 irq_exit+0xa6/0xb0 do_IRQ+0x54/0xd0 common_interrupt+0xf/0xf </IRQ> ======== So, in all listified-receive handling, instead pull skbs off the lists with skb_list_del_init(). Fixes: 9af86f9 ("net: core: fix use-after-free in __netif_receive_skb_list_core") Fixes: 7da517a ("net: core: Another step of skb receive list processing") Fixes: a4ca8b7 ("net: ipv4: fix drop handling in ip_list_rcv() and ip_list_rcv_finish()") Fixes: d8269e2 ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says: ==================== pull-request: bpf 2018-12-05 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) fix bpf uapi pointers for 32-bit architectures, from Daniel. 2) improve verifer ability to handle progs with a lot of branches, from Alexei. 3) strict btf checks, from Yonghong. 4) bpf_sk_lookup api cleanup, from Joe. 5) other misc fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
If available rwnd is too small, tcp_tso_should_defer() can decide it is worth waiting before splitting a TSO packet. This really means we are rwnd limited. Fixes: 5615f88 ("tcp: instrument how long TCP is limited by receive window") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
TCP loss probe timer may fire when the retranmission queue is empty but has a non-zero tp->packets_out counter. tcp_send_loss_probe will call tcp_rearm_rto which triggers NULL pointer reference by fetching the retranmission queue head in its sub-routines. Add a more detailed warning to help catch the root cause of the inflight accounting inconsistency. Reported-by: Rafael Tinoco <rafael.tinoco@linaro.org> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
…it/jejb/scsi Pull SCSI fixes from James Bottomley: "Four obvious bug fixes. The vmw_pscsi is so old that it's amazing no-one noticed before now" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: storvsc: Fix a race in sub-channel creation that can cause panic scsi: vmw_pscsi: Rearrange code to avoid multiple calls to free_irq during unload scsi: libiscsi: Fix NULL pointer dereference in iscsi_eh_session_reset scsi: lpfc: fix block guard enablement on SLI3 adapters
The sw2iso count should cover ARM LDO ramp-up time, the MAX ARM LDO ramp-up time may be up to more than 100us on some boards, this patch sets sw2iso to 0xf (~384us) which is the reset value, and it is much more safe to cover different boards, since we have observed that some customer boards failed with current setting of 0x2. Fixes: 05136f0 ("ARM: imx: support arm power off in cpuidle for i.mx6sx") Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Function graph tracing recurses into itself when stackleak is enabled, causing the ftrace graph selftest to run for up to 90 seconds and trigger the softlockup watchdog. Breakpoint 2, ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:200 200 mcount_get_lr_addr x0 // pointer to function's saved lr (gdb) bt \#0 ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:200 \#1 0xffffff80081d5280 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:153 \#2 0xffffff8008555484 in stackleak_track_stack () at ../kernel/stackleak.c:106 \#3 0xffffff8008421ff8 in ftrace_ops_test (ops=0xffffff8009eaa840 <graph_ops>, ip=18446743524091297036, regs=<optimized out>) at ../kernel/trace/ftrace.c:1507 \#4 0xffffff8008428770 in __ftrace_ops_list_func (regs=<optimized out>, ignored=<optimized out>, parent_ip=<optimized out>, ip=<optimized out>) at ../kernel/trace/ftrace.c:6286 \#5 ftrace_ops_no_ops (ip=18446743524091297036, parent_ip=18446743524091242824) at ../kernel/trace/ftrace.c:6321 \#6 0xffffff80081d5280 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:153 \#7 0xffffff800832fd10 in irq_find_mapping (domain=0xffffffc03fc4bc80, hwirq=27) at ../kernel/irq/irqdomain.c:876 \#8 0xffffff800832294c in __handle_domain_irq (domain=0xffffffc03fc4bc80, hwirq=27, lookup=true, regs=0xffffff800814b840) at ../kernel/irq/irqdesc.c:650 \#9 0xffffff80081d52b4 in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:205 Rework so we mark stackleak_track_stack as notrace Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Kees Cook <keescook@chromium.org>
There could be a race between task exit and probe unregister: exit_mm() mmput() __mmput() uprobe_unregister() uprobe_clear_state() put_uprobe() delayed_uprobe_remove() delayed_uprobe_remove() put_uprobe() is calling delayed_uprobe_remove() without taking delayed_uprobe_lock and thus the race sometimes results in a kernel crash. Fix this by taking delayed_uprobe_lock before calling delayed_uprobe_remove() from put_uprobe(). Detailed crash log can be found at: Link: http://lkml.kernel.org/r/000000000000140c370577db5ece@google.com Link: http://lkml.kernel.org/r/20181205033423.26242-1-ravi.bangoria@linux.ibm.com Acked-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: syzbot+cb1fb754b771caca0a88@syzkaller.appspotmail.com Fixes: 1cc3316 ("uprobes: Support SDT markers having reference count (semaphore)") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
…anpaul/dpu-staging into drm-fixes - Several related to incorrect error checking/handling (Various) - Prevent IRQ storm on MDP5 HDMI hotplug (Todor) - Don't capture crash state if unsupported (Sharat) - Properly grab vblank reference in atomic wait for commit done (Sean) Cc: Sharat Masetty <smasetty@codeaurora.org> Cc: Todor Tomov <todor.tomov@linaro.org> Cc: Sean Paul <seanpaul@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20181205194207.GY154160@art_vandelay
…linux into drm-fixes Fixes for 4.20: - Fix banding regression on 6 bpc panels - Vega20 fix for six 4k displays - Fix LRU handling in ttm_buffer_object_transfer - Use proper MC firmware for newer polaris variants - Vega20 powerplay fixes - VCN suspend/resume fix for PCO - Misc other fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181205192934.2857-1-alexander.deucher@amd.com
…g/drm/drm-misc into drm-fixes UAPI: - Distinguish lease events from hotplug (Daniel) Other: - omap: Restore panel-dpi bus flags (Tomi) - omap: Fix a couple of dsi issues (Sebastian) Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20181205201428.GA35447@art_vandelay
When unloading the ast driver, a warning message is printed by drm_mode_config_cleanup() because a reference is still held to one of the drm_connector structs. Correct this by calling drm_crtc_force_disable_all() in ast_fbdev_destroy(). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/1e613f3c630c7bbc72e04a44b178259b9164d2f6.1543798395.git.sbobroff@linux.ibm.com
If for some reason an association's fragmentation point is zero, sctp_datamsg_from_user will try to endlessly try to divide a message into zero-sized chunks. This eventually causes kernel panic due to running out of memory. Although this situation is quite unlikely, it has occurred before as reported. I propose to add this simple last-ditch sanity check due to the severity of the potential consequences. Signed-off-by: Jakub Audykowicz <jakub.audykowicz@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The *_frag_reasm() functions are susceptible to miscalculating the byte count of packet fragments in case the truesize of a head buffer changes. The truesize member may be changed by the call to skb_unclone(), leaving the fragment memory limit counter unbalanced even if all fragments are processed. This miscalculation goes unnoticed as long as the network namespace which holds the counter is not destroyed. Should an attempt be made to destroy a network namespace that holds an unbalanced fragment memory limit counter the cleanup of the namespace never finishes. The thread handling the cleanup gets stuck in inet_frags_exit_net() waiting for the percpu counter to reach zero. The thread is usually in running state with a stacktrace similar to: PID: 1073 TASK: ffff880626711440 CPU: 1 COMMAND: "kworker/u48:4" #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480 #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856 #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0 #10 [ffff880621563e38] process_one_work at ffffffff81096f14 It is not possible to create new network namespaces, and processes that call unshare() end up being stuck in uninterruptible sleep state waiting to acquire the net_mutex. The bug was observed in the IPv6 netfilter code by Per Sundstrom. I thank him for his analysis of the problem. The parts of this patch that apply to IPv4 and IPv6 fragment reassembly are preemptive measures. Signed-off-by: Jiri Wiesner <jwiesner@suse.com> Reported-by: Per Sundstrom <per.sundstrom@redqube.se> Acked-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Found warning: WARNING: EXPORT symbol "gsi_write_channel_scratch" [vmlinux] version generation failed, symbol will not be versioned. WARNING: vmlinux.o(.text+0x1e0a0): Section mismatch in reference from the function valid_phys_addr_range() to the function .init.text:memblock_is_reserved() The function valid_phys_addr_range() references the function __init memblock_is_reserved(). This is often because valid_phys_addr_range lacks a __init annotation or the annotation of memblock_is_reserved is wrong. Use __init_memblock instead of __init. Link: http://lkml.kernel.org/r/BLUPR13MB02893411BF12EACB61888E80DFAE0@BLUPR13MB0289.namprd13.prod.outlook.com Signed-off-by: Yueyi Li <liyueyi@live.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A stack trace was triggered by VM_BUG_ON_PAGE(page_mapcount(page), page) in free_huge_page(). Unfortunately, the page->mapping field was set to NULL before this test. This made it more difficult to determine the root cause of the problem. Move the VM_BUG_ON_PAGE tests earlier in the function so that if they do trigger more information is present in the page struct. Link: http://lkml.kernel.org/r/1543491843-23438-1-git-send-email-nic_w@163.com Signed-off-by: Yongkai Wu <nic_w@163.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
migrate_page_move_mapping() expects pages with private data set to have a page_count elevated by 1. This is what used to happen for xfs through the buffer_heads code before the switch to iomap in commit 82cb141 ("xfs: add support for sub-pagesize writeback without buffer_heads"). Not having the count elevated causes move_pages() to fail on memory mapped files coming from xfs. Make iomap compatible with the migrate_page_move_mapping() assumption by elevating the page count as part of iomap_page_create() and lowering it in iomap_page_release(). It causes the move_pages() syscall to misbehave on memory mapped files from xfs. It does not not move any pages, which I suppose is "just" a perf issue, but it also ends up returning a positive number which is out of spec for the syscall. Talking to Michal Hocko, it sounds like returning positive numbers might be a necessary update to move_pages() anyway though (https://lkml.kernel.org/r/20181116114955.GJ14706@dhcp22.suse.cz). I only hit this in tests that verify that move_pages() actually moved the pages. The test also got confused by the positive return from move_pages() (it got treated as a success as positive numbers were not expected and not handled) making it a bit harder to track down what's going on. Link: http://lkml.kernel.org/r/20181115184140.1388751-1-pjaroszynski@nvidia.com Fixes: 82cb141 ("xfs: add support for sub-pagesize writeback without buffer_heads") Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Brian Foster <bfoster@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…gistered Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd could trigger an harmless false positive WARN_ON. Check the vma is already registered before checking VM_MAYWRITE to shut off the false positive warning. Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com Cc: <stable@vger.kernel.org> Fixes: 29ec906 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is actually a space after "sp," like this, ffff2000080813c8: a9bb7bfd stp x29, x30, [sp, #-80]! Right now, checkstack.pl isn't able to print anything on aarch64, because it won't be able to match the stating objdump line of a function due to this missing space. Hence, it displays every stack as zero-size. After this patch, checkpatch.pl is able to match the start of a function's objdump, and is then able to calculate each function's stack correctly. Link: http://lkml.kernel.org/r/20181207195843.38528-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The spdxcheck script currently falls over when confronted with a binary file (such as Documentation/logo.gif). To avoid that, always open files in binary mode and decode line-by-line, ignoring encoding errors. One tricky case is when piping data into the script and reading it from standard input. By default, standard input will be opened in text mode, so we need to reopen it in binary mode. The breakage only happens with python3 and results in a UnicodeDecodeError (according to Uwe). Link: http://lkml.kernel.org/r/20181212131210.28024-1-thierry.reding@gmail.com Fixes: 6f4d29d ("scripts/spdxcheck.py: make python3 compliant") Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jeremy Cline <jcline@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joe Perches <joe@perches.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge misc fixes from Andrew Morton: "11 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: scripts/spdxcheck.py: always open files in binary mode checkstack.pl: fix for aarch64 userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered fs/iomap.c: get/put the page in iomap_page_create/release() hugetlbfs: call VM_BUG_ON_PAGE earlier in free_huge_page() memblock: annotate memblock_is_reserved() with __init_memblock psi: fix reference to kernel commandline enable arch/sh/include/asm/io.h: provide prototypes for PCI I/O mapping in asm/io.h mm/sparse: add common helper to mark all memblocks present mm: introduce common STRUCT_PAGE_MAX_SHIFT define alpha: fix hang caused by the bootmem removal
According to the documentation in include/uapi/asm-generic/ioctl.h, _IOW means userspace is writing and kernel is reading, and _IOR means userspace is reading and kernel is writing. In case of these two ioctls, kernel is writing and userspace is reading, so they have to be _IOR instead of _IOW. Fixes: 72cd875 ("block: Introduce BLKGETZONESZ ioctl") Fixes: 65e4e3e ("block: Introduce BLKGETNRZONES ioctl") Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
The dma_direct_supported() function intends to check the DMA mask against specific values. However, the phys_to_dma() function includes the SME encryption mask, which defeats the intended purpose of the check. This results in drivers that support less than 48-bit DMA (SME encryption mask is bit 47) from being able to set the DMA mask successfully when SME is active, which results in the driver failing to initialize. Change the function used to check the mask from phys_to_dma() to __phys_to_dma() so that the SME encryption mask is not part of the check. Fixes: c1d0af1 ("kernel/dma/direct: take DMA offset into account in dma_direct_supported") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
The code uses a bitmap to check for duplicate tokens during parsing, and that doesn't work at all for the negative Opt_err token case. There is absolutely no reason to make Opt_err be negative, and in fact it only confuses things, since some of the affected functions actually return a positive Opt_xyz enum _or_ a regular negative error code (eg -EINVAL), and using -1 for Opt_err makes no sense. There are similar problems in ima_policy.c and key encryption, but they don't have the immediate bug wrt bitmap handing, and ima_policy.c in particular needs a different patch to make the enum values match the token array index. Mimi is sending that separately. Reported-by: syzbot+a22e0dc07567662c50bc@syzkaller.appspotmail.com Reported-by: Eric Biggers <ebiggers@kernel.org> Fixes: 5208cc8 ("keys, trusted: fix: *do not* allow duplicate key options") Fixes: 00d60fd ("KEYS: Provide keyctls to drive the new key type ops for asymmetric keys [ver #2]") Cc: James Morris James Morris <jmorris@namei.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Start the policy_tokens and the associated enumeration from zero, simplifying the pt macro. Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the socket is closed, we need to call xprt_disconnect_done() in order to clean up the XPRT_WRITE_SPACE flag, and wake up the sleeping tasks. However, we also want to ensure that we don't wake them up before the socket is closed, since that would cause thundering herd issues with everyone piling up to retransmit before the TCP shutdown dance has completed. Only the task that holds XPRT_LOCKED needs to wake up early in order to allow the close to complete. Reported-by: Dave Wysochanski <dwysocha@redhat.com> Reported-by: Scott Mayhew <smayhew@redhat.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that we clear XPRT_CONNECTING before releasing the XPRT_LOCK so that we don't have races between the (asynchronous) socket setup code and tasks in xprt_connect(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
Over the years, xprt_connect_status() has been superseded by call_connect_status(), which now handles all the errors that xprt_connect_status() does and more. Since the latter converts all errors that it doesn't recognise to EIO, then it is time for it to be retired. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
…it/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes: The t10-pi one is a regression from the 4.19 release, the qla2xxx one is a 4.20 merge window regression and the bnx2fc is a very old bug" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: t10-pi: Return correct ref tag when queue has no integrity profile scsi: bnx2fc: Fix NULL dereference in error handling Revert "scsi: qla2xxx: Fix NVMe Target discovery"
If you register a kvm_coalesced_mmio_zone with '.pio = 0' but then unregister it with '.pio = 1', KVM_UNREGISTER_COALESCED_MMIO will try to unregister it from KVM_PIO_BUS rather than KVM_MMIO_BUS, which is a no-op. But it frees the kvm_coalesced_mmio_dev anyway, causing a use-after-free. Fix it by only unregistering and freeing the zone if the correct value of 'pio' is provided. Reported-by: syzbot+f87f60bb6f13f39b54e3@syzkaller.appspotmail.com Fixes: 0804c84 ("kvm/x86 : add coalesced pio support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nested_get_vmcs12_pages() processes the posted_intr address in vmcs12. It caches the kmap()ed page object and pointer, however, it doesn't handle errors correctly: it's possible to cache a valid pointer, then release the page and later dereference the dangling pointer. I was able to reproduce with the following steps: 1. Call vmlaunch with valid posted_intr_desc_addr but an invalid MSR_EFER. This causes nested_get_vmcs12_pages() to cache the kmap()ed pi_desc_page and pi_desc. Later the invalid EFER value fails check_vmentry_postreqs() which fails the first vmlaunch. 2. Call vmlanuch with a valid EFER but an invalid posted_intr_desc_addr (I set it to 2G - 0x80). The second time we call nested_get_vmcs12_pages pi_desc_page is unmapped and released and pi_desc_page is set to NULL (the "shouldn't happen" clause). Due to the invalid posted_intr_desc_addr, kvm_vcpu_gpa_to_page() fails and nested_get_vmcs12_pages() returns. It doesn't return an error value so vmlaunch proceeds. Note that at this time we have a dangling pointer in vmx->nested.pi_desc and POSTED_INTR_DESC_ADDR in L0's vmcs. 3. Issue an IPI in L2 guest code. This triggers a call to vmx_complete_nested_posted_interrupt() and pi_test_and_clear_on() which dereferences the dangling pointer. Vulnerable code requires nested and enable_apicv variables to be set to true. The host CPU must also support posted interrupts. Fixes: 5e2f30b "KVM: nVMX: get rid of nested_get_page()" Cc: stable@vger.kernel.org Reviewed-by: Andy Honig <ahonig@google.com> Signed-off-by: Cfir Cohen <cfir@google.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported by syzkaller: CPU: 1 PID: 5962 Comm: syz-executor118 Not tainted 4.20.0-rc6+ #374 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline] RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline] RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline] RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline] RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074 Call Trace: kvm_vcpu_ioctl+0x5c8/0x1150 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2596 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT14 msr and triggers scan ioapic logic to load synic vectors into EOI exit bitmap. However, irqchip is not initialized by this simple testcase, ioapic/apic objects should not be accessed. This patch fixes it by also considering whether or not apic is present. Reported-by: syzbot+39810e6c400efadfef71@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some guests OSes (including Windows 10) write to MSR 0xc001102c on some cases (possibly while trying to apply a CPU errata). Make KVM ignore reads and writes to that MSR, so the guest won't crash. The MSR is documented as "Execution Unit Configuration (EX_CFG)", at AMD's "BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h Models 00h-0Fh Processors". Cc: stable@vger.kernel.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
…ernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Fix the ACPI APEI error path, which previously queued several uninitialized events (Yanjiang Jin)" * tag 'pci-v4.20-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/AER: Queue one GHES event, not several uninitialized ones
Pull block fix from Jens Axboe: "Correct an ioctl direction for the zoned ioctls" * tag 'for-linus-20181218' of git://git.kernel.dk/linux-block: uapi: linux/blkzoned.h: fix BLKGETZONESZ and BLKGETNRZONES definitions
Fixes: d384995 ("fs: decouple READ and WRITE from the block layer ops") Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
…ma-mapping Pull dma-mapping fix from Christoph Hellwig: "Fix a regression in dma-direct that didn't take account the magic AMD memory encryption mask in the DMA address" * tag 'dma-mapping-4.20-4' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: do not include SME mask in the DMA supported check
Pull kvm fixes from Paolo Bonzini: - One nasty use-after-free bugfix, from this merge window however - A less nasty use-after-free that can only zero some words at the beginning of the page, and hence is not really exploitable - A NULL pointer dereference - A dummy implementation of an AMD chicken bit MSR that Windows uses for some unknown reason * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs KVM: X86: Fix NULL deref in vcpu_scan_ioapic KVM: Fix UAF in nested posted interrupt processing KVM: fix unregistering coalesced mmio zone from wrong bus
…y/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Fix TCP socket disconnection races by ensuring we always call xprt_disconnect_done() after releasing the socket. - Fix a race when clearing both XPRT_CONNECTING and XPRT_LOCKED - Remove xprt_connect_status() so it does not mask errors that should be handled by call_connect_status() * tag 'nfs-for-4.20-6' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Remove xprt_connect_status() SUNRPC: Fix a race with XPRT_CONNECTING SUNRPC: Fix disconnection races
…t/mst/vhost Pull virtio fix from Michael Tsirkin: "A last-minute fix for a test build" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: fix test build after uio.h change
rahrawat
pushed a commit
that referenced
this pull request
Dec 20, 2018
Seeing as we use this registermap in the context of our IRQ handlers, we need to be using spinlocks for reading/writing registers so that we can still read them from IRQ handlers without having to grab any mutexes and accidentally sleep. We don't currently do this, as pointed out by lockdep: [ 18.403770] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 [ 18.406744] in_atomic(): 1, irqs_disabled(): 128, pid: 68, name: kworker/u17:0 [ 18.413864] INFO: lockdep is turned off. [ 18.417675] irq event stamp: 12 [ 18.420778] hardirqs last enabled at (11): [<ffff000008a4f57c>] _raw_spin_unlock_irq+0x2c/0x60 [ 18.429510] hardirqs last disabled at (12): [<ffff000008a48914>] __schedule+0xc4/0xa60 [ 18.437345] softirqs last enabled at (0): [<ffff0000080b55e0>] copy_process.isra.4.part.5+0x4d8/0x1c50 [ 18.446684] softirqs last disabled at (0): [<0000000000000000>] (null) [ 18.453979] CPU: 0 PID: 68 Comm: kworker/u17:0 Tainted: G W O 4.20.0-rc3Lyude-Test+ #9 [ 18.469839] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018 [ 18.480037] Workqueue: hci0 hci_power_on [bluetooth] [ 18.487138] Call trace: [ 18.494192] dump_backtrace+0x0/0x1b8 [ 18.501280] show_stack+0x14/0x20 [ 18.508361] dump_stack+0xbc/0xf4 [ 18.515427] ___might_sleep+0x140/0x1d8 [ 18.522515] __might_sleep+0x50/0x88 [ 18.529582] __mutex_lock+0x60/0x870 [ 18.536621] mutex_lock_nested+0x1c/0x28 [ 18.543660] regmap_lock_mutex+0x10/0x18 [ 18.550696] regmap_read+0x38/0x70 [ 18.557727] dw_hdmi_hardirq+0x58/0x138 [dw_hdmi] [ 18.564804] __handle_irq_event_percpu+0xac/0x410 [ 18.571891] handle_irq_event_percpu+0x34/0x88 [ 18.578982] handle_irq_event+0x48/0x78 [ 18.586051] handle_fasteoi_irq+0xac/0x160 [ 18.593061] generic_handle_irq+0x24/0x38 [ 18.599989] __handle_domain_irq+0x60/0xb8 [ 18.606857] gic_handle_irq+0x50/0xa0 [ 18.613659] el1_irq+0xb4/0x130 [ 18.620394] debug_lockdep_rcu_enabled+0x2c/0x30 [ 18.627111] schedule+0x38/0xa0 [ 18.633781] schedule_timeout+0x3a8/0x510 [ 18.640389] wait_for_common+0x15c/0x180 [ 18.646905] wait_for_completion+0x14/0x20 [ 18.653319] mmc_wait_for_req_done+0x28/0x168 [ 18.659693] mmc_wait_for_req+0xa8/0xe8 [ 18.665978] mmc_wait_for_cmd+0x64/0x98 [ 18.672180] mmc_io_rw_direct_host+0x94/0x130 [ 18.678385] mmc_io_rw_direct+0x10/0x18 [ 18.684516] sdio_enable_func+0xe8/0x1d0 [ 18.690627] btsdio_open+0x24/0xc0 [btsdio] [ 18.696821] hci_dev_do_open+0x64/0x598 [bluetooth] [ 18.703025] hci_power_on+0x50/0x270 [bluetooth] [ 18.709163] process_one_work+0x2a0/0x6e0 [ 18.715252] worker_thread+0x40/0x448 [ 18.721310] kthread+0x12c/0x130 [ 18.727326] ret_from_fork+0x10/0x1c [ 18.735555] ------------[ cut here ]------------ [ 18.741430] do not call blocking ops when !TASK_RUNNING; state=2 set at [<000000006265ec59>] wait_for_common+0x140/0x180 [ 18.752417] WARNING: CPU: 0 PID: 68 at kernel/sched/core.c:6096 __might_sleep+0x7c/0x88 [ 18.760553] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod btsdio bluetooth snd_soc_hdmi_codec dw_hdmi_i2s_audio ecdh_generic brcmfmac brcmutil cfg80211 rfkill ir_nec_decoder meson_dw_hdmi(O) dw_hdmi rc_geekbox meson_rng meson_ir ao_cec rng_core rc_core cec leds_pwm efivars nfsd ip_tables x_tables crc32_generic f2fs uas meson_gxbb_wdt pwm_meson efivarfs ipv6 [ 18.799469] CPU: 0 PID: 68 Comm: kworker/u17:0 Tainted: G W O 4.20.0-rc3Lyude-Test+ #9 [ 18.808858] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018 [ 18.818045] Workqueue: hci0 hci_power_on [bluetooth] [ 18.824088] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 18.829891] pc : __might_sleep+0x7c/0x88 [ 18.835722] lr : __might_sleep+0x7c/0x88 [ 18.841256] sp : ffff000008003cb0 [ 18.846751] x29: ffff000008003cb0 x28: 0000000000000000 [ 18.852269] x27: ffff00000938e000 x26: ffff800010283000 [ 18.857726] x25: ffff800010353280 x24: ffff00000868ef50 [ 18.863166] x23: 0000000000000000 x22: 0000000000000000 [ 18.868551] x21: 0000000000000000 x20: 000000000000038c [ 18.873850] x19: ffff000008cd08c0 x18: 0000000000000010 [ 18.879081] x17: ffff000008a68cb0 x16: 0000000000000000 [ 18.884197] x15: 0000000000aaaaaa x14: 0e200e200e200e20 [ 18.889239] x13: 0000000000000001 x12: 00000000ffffffff [ 18.894261] x11: ffff000008adfa48 x10: 0000000000000001 [ 18.899517] x9 : ffff0000092a0158 x8 : 0000000000000000 [ 18.904674] x7 : ffff00000812136c x6 : 0000000000000000 [ 18.909895] x5 : 0000000000000000 x4 : 0000000000000001 [ 18.915080] x3 : 0000000000000007 x2 : 0000000000000007 [ 18.920269] x1 : 99ab8e9ebb6c8500 x0 : 0000000000000000 [ 18.925443] Call trace: [ 18.929904] __might_sleep+0x7c/0x88 [ 18.934311] __mutex_lock+0x60/0x870 [ 18.938687] mutex_lock_nested+0x1c/0x28 [ 18.943076] regmap_lock_mutex+0x10/0x18 [ 18.947453] regmap_read+0x38/0x70 [ 18.951842] dw_hdmi_hardirq+0x58/0x138 [dw_hdmi] [ 18.956269] __handle_irq_event_percpu+0xac/0x410 [ 18.960712] handle_irq_event_percpu+0x34/0x88 [ 18.965176] handle_irq_event+0x48/0x78 [ 18.969612] handle_fasteoi_irq+0xac/0x160 [ 18.974058] generic_handle_irq+0x24/0x38 [ 18.978501] __handle_domain_irq+0x60/0xb8 [ 18.982938] gic_handle_irq+0x50/0xa0 [ 18.987351] el1_irq+0xb4/0x130 [ 18.991734] debug_lockdep_rcu_enabled+0x2c/0x30 [ 18.996180] schedule+0x38/0xa0 [ 19.000609] schedule_timeout+0x3a8/0x510 [ 19.005064] wait_for_common+0x15c/0x180 [ 19.009513] wait_for_completion+0x14/0x20 [ 19.013951] mmc_wait_for_req_done+0x28/0x168 [ 19.018402] mmc_wait_for_req+0xa8/0xe8 [ 19.022809] mmc_wait_for_cmd+0x64/0x98 [ 19.027177] mmc_io_rw_direct_host+0x94/0x130 [ 19.031563] mmc_io_rw_direct+0x10/0x18 [ 19.035922] sdio_enable_func+0xe8/0x1d0 [ 19.040294] btsdio_open+0x24/0xc0 [btsdio] [ 19.044742] hci_dev_do_open+0x64/0x598 [bluetooth] [ 19.049228] hci_power_on+0x50/0x270 [bluetooth] [ 19.053687] process_one_work+0x2a0/0x6e0 [ 19.058143] worker_thread+0x40/0x448 [ 19.062608] kthread+0x12c/0x130 [ 19.067064] ret_from_fork+0x10/0x1c [ 19.071513] irq event stamp: 12 [ 19.075937] hardirqs last enabled at (11): [<ffff000008a4f57c>] _raw_spin_unlock_irq+0x2c/0x60 [ 19.083560] hardirqs last disabled at (12): [<ffff000008a48914>] __schedule+0xc4/0xa60 [ 19.091401] softirqs last enabled at (0): [<ffff0000080b55e0>] copy_process.isra.4.part.5+0x4d8/0x1c50 [ 19.100801] softirqs last disabled at (0): [<0000000000000000>] (null) [ 19.108135] ---[ end trace 38c4920787b88c75 ]--- So, fix this by enabling the fast_io option in our regmap config so that regmap uses spinlocks for locking instead of mutexes. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 3f68be7 ("drm/meson: Add support for HDMI encoder and DW-HDMI bridge + PHY") Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Carlo Caione <carlo@caione.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-amlogic@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: <stable@vger.kernel.org> # v4.12+ Acked-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181124191238.28276-1-lyude@redhat.com Signed-off-by: Sean Paul <seanpaul@chromium.org>
rahrawat
pushed a commit
that referenced
this pull request
Dec 20, 2018
It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] torvalds#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 torvalds#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa torvalds#12 [ffff881ff43eff18] sys_read at ffffffff811fda62 torvalds#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David Howells <dhowells@redhat.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.