linux-xiaomi-chiron

Author	SHA1	Message	Date
Joakim Zhang	a6597121d6	can: flexcan: initialize all flexcan memory for ECC function One issue was reported at a baremetal environment, which is used for FPGA verification. "The first transfer will fail for extended ID format(for both 2.0B and FD format), following frames can be transmitted and received successfully for extended format, and standard format don't have this issue. This issue occurred randomly with high possiblity, when it occurs, the transmitter will detect a BIT1 error, the receiver a CRC error. According to the spec, a non-correctable error may cause this transfer failure." With FLEXCAN_QUIRK_DISABLE_MECR quirk, it supports correctable errors, disable non-correctable errors interrupt and freeze mode. Platform has ECC hardware support, but select this quirk, this issue may not come to light. Initialize all FlexCAN memory before accessing them, at least it can avoid non-correctable errors detected due to memory uninitialized. The internal region can't be initialized when the hardware doesn't support ECC. According to IMX8MPRM, Rev.C, 04/2020. There is a NOTE at the section 11.8.3.13 Detection and correction of memory errors: "All FlexCAN memory must be initialized before starting its operation in order to have the parity bits in memory properly updated. CTRL2[WRMFRZ] grants write access to all memory positions that require initialization, ranging from 0x080 to 0xADF and from 0xF28 to 0xFFF when the CAN FD feature is enabled. The RXMGMASK, RX14MASK, RX15MASK, and RXFGMASK registers need to be initialized as well. MCR[RFEN] must not be set during memory initialization." Memory range from 0x080 to 0xADF, there are reserved memory (unimplemented by hardware, e.g. only configure 64 MBs), these memory can be initialized or not. In this patch, initialize all flexcan memory which includes reserved memory. In this patch, create FLEXCAN_QUIRK_SUPPORT_ECC for platforms which has ECC feature. If you have a ECC platform in your hand, please select this qurik to initialize all flexcan memory firstly, then you can select FLEXCAN_QUIRK_DISABLE_MECR to only enable correctable errors. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20200929211557.14153-2-qiangqing.zhang@nxp.com [mkl: wrap long lines] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 21:56:57 +02:00
Marc Kleine-Budde	eb79a267c9	can: mcp251xfd: rename all remaining occurrence to mcp251xfd In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver, which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming, but incompatible chips. In the previous patch the autodetect DT compatbile has been renamed to "microchip,mcp251xfd", this patch changes all non user facing occurrence of "mcp25xxfd" to "mcp251xfd" and "MCP25XXFD" to "MCP251XFD". [1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Link: https://lore.kernel.org/r/20200930091424.792165-10-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 21:55:28 +02:00
Marc Kleine-Budde	f4f77366f2	can: mcp251xfd: rename all user facing strings to mcp251xfd In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver, which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming, but incompatible chips. In the previous patch the autodetect DT compatbile has been renamed to "microchip,mcp251xfd", this patch changes all user facing strings from "mcp25xxfd" to "mcp251xfd" and "MCP25XXFD" to "MCP251XFD", including: - kconfig symbols - name of kernel module - DT and SPI compatible [1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Link: https://lore.kernel.org/r/20200930091424.792165-9-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 21:55:04 +02:00
Marc Kleine-Budde	1f0e21a0c0	can: mcp251xfd: rename driver files and subdir to mcp251xfd In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver, which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming, but incompatible chips. In the previous patch the autodetect DT compatbile has been renamed to "microchip,mcp251xfd", this patch changes the name of the driver subdir and the individual files accordinly. [1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Link: https://lore.kernel.org/r/20200930091424.792165-8-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 21:54:30 +02:00
Pujin Shi	047248cab1	MIPS: process: include exec.h header in process.c arch/mips/kernel/process.c:696:15: error: no previous prototype for 'arch_align_stack' [-Werror=missing-prototypes] Signed-off-by: Pujin Shi <shipujin.t@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>	2020-09-30 21:53:00 +02:00
Pujin Shi	99419c310e	MIPS: process: Add prototype for function arch_dup_task_struct This commit adds a prototype to fix warning at W=1: arch/mips/kernel/process.c:95:5: error: no previous prototype for 'arch_dup_task_struct' [-Werror=missing-prototypes] Signed-off-by: Pujin Shi <shipujin.t@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>	2020-09-30 21:52:42 +02:00
Mike Snitzer	0c2915b8c6	dm: fix missing imposition of queue_limits from dm_wq_work() thread If a DM device was suspended when bios were issued to it, those bios would be deferred using queue_io(). Once the DM device was resumed dm_process_bio() could be called by dm_wq_work() for original bio that still needs splitting. dm_process_bio()'s check for current->bio_list (meaning call chain is within ->submit_bio) as a prerequisite for calling blk_queue_split() for "abnormal IO" would result in dm_process_bio() never imposing corresponding queue_limits (e.g. discard_granularity, discard_max_bytes, etc). Fix this by always having dm_wq_work() resubmit deferred bios using submit_bio_noacct(). Side-effect is blk_queue_split() is always called for "abnormal IO" from ->submit_bio, be it from application thread or dm_wq_work() workqueue, so proper bio splitting and depth-first bio submission is performed. For sake of clarity, remove current->bio_list check before call to blk_queue_split(). Also, remove dm_wq_work()'s use of dm_{get,put}_live_table() -- no longer needed since IO will be reissued in terms of ->submit_bio. And rename bio variable from 'c' to 'bio'. Fixes: `cf9c378655` ("dm: fix comment in dm_process_bio()") Reported-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2020-09-30 15:34:38 -04:00
Andrii Nakryiko	f4d385e4d5	selftests/bpf: Test "incremental" btf_dump in C format Add test validating that btf_dump works fine with BTFs that are modified and incrementally generated. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929232843.1249318-5-andriin@fb.com	2020-09-30 12:30:46 -07:00
Andrii Nakryiko	9c6c5c48d7	libbpf: Make btf_dump work with modifiable BTF Ensure that btf_dump can accommodate new BTF types being appended to BTF instance after struct btf_dump was created. This came up during attemp to use btf_dump for raw type dumping in selftests, but given changes are not excessive, it's good to not have any gotchas in API usage, so I decided to support such use case in general. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929232843.1249318-2-andriin@fb.com	2020-09-30 12:30:22 -07:00
Ramesh Errabolu	f2fa07b39f	drm/amd/amdkfd: Surface files in Sysfs to allow users to get number of compute units that are in use. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Surface files in Sysfs that allow user to determine the number of compute units that are in use for a given process. One Sysfs file is used per device. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 15:26:27 -04:00
Ramesh Errabolu	43a4bc828c	drm/amd/amdgpu: Define and implement a function that collects number of waves that are in flight. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Read registers of SQ that give number of waves that are in flight of various queues. Use this information to determine number of CU's in use. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 15:26:27 -04:00
Alexei Starovoitov	ea7da1d563	Merge branch 'Various BPF helper improvements' Daniel Borkmann says: ==================== This series adds two BPF helpers, that is, one for retrieving the classid of an skb and another one to redirect via the neigh subsystem, and improves also the cookie helpers by removing the atomic counter. I've also added the bpf_tail_call_static() helper to the libbpf API that we've been using in Cilium for a while now, and last but not least the series adds a few selftests. For details, please check individual patches, thanks! v3 -> v4: - Removed out_rec error path (Martin) - Integrate BPF_F_NEIGH flag into rejecting invalid flags (Martin) - I think this way it's better to avoid bit overlaps given it's right in the place that would need to be extended on new flags v2 -> v3: - Removed double skb->dev = dev assignment (David) - Added headroom check for v6 path (David) - Set set flowi4_proto for ip_route_output_flow (David) - Rebased onto latest bpf-next v1 -> v2: - Rework cookie generator to support nested contexts (Eric) - Use ip_neigh_gw6() and container_of() (David) - Rename __throw_build_bug() and improve comments (Andrii) - Use bpf_tail_call_static() also in BPF samples (Maciej) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-30 11:50:35 -07:00
Daniel Borkmann	eef4a011f3	bpf, selftests: Add redirect_neigh selftest Add a small test that exercises the new redirect_neigh() helper for the IPv4 and IPv6 case. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/0fc7d9c5f9a6cc1c65b0d3be83b44b1ec9889f43.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:35 -07:00
Daniel Borkmann	faef26fa44	bpf, selftests: Use bpf_tail_call_static where appropriate For those locations where we use an immediate tail call map index use the newly added bpf_tail_call_static() helper. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/3cfb2b799a62d22c6e7ae5897c23940bdcc24cbc.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:35 -07:00
Daniel Borkmann	0e9f6841f6	bpf, libbpf: Add bpf_tail_call_static helper for bpf programs Port of tail_call_static() helper function from Cilium's BPF code base [0] to libbpf, so others can easily consume it as well. We've been using this in production code for some time now. The main idea is that we guarantee that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the JITed BPF insns with direct jumps instead of having to fall back to using expensive retpolines. By using inline asm, we guarantee that the compiler won't merge the call from different paths with potentially different content of r2/r3. We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable()) in different places as a neat trick to trigger compilation errors when compiler does not remove code at compilation time. This works for the BPF back end as it does not implement the __builtin_trap(). [0] `f5537c2602` Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1656a082e077552eb46642d513b4a6bde9a7dd01.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:35 -07:00
Daniel Borkmann	b4ab314149	bpf: Add redirect_neigh helper as redirect drop-in Add a redirect_neigh() helper as redirect() drop-in replacement for the xmit side. Main idea for the helper is to be very similar in semantics to the latter just that the skb gets injected into the neighboring subsystem in order to let the stack do the work it knows best anyway to populate the L2 addresses of the packet and then hand over to dev_queue_xmit() as redirect() does. This solves two bigger items: i) skbs don't need to go up to the stack on the host facing veth ingress side for traffic egressing the container to achieve the same for populating L2 which also has the huge advantage that ii) the skb->sk won't get orphaned in ip_rcv_core() when entering the IP routing layer on the host stack. Given that skb->sk neither gets orphaned when crossing the netns as per `9c4c325252` ("skbuff: preserve sock reference when scrubbing the skb.") the helper can then push the skbs directly to the phys device where FQ scheduler can do its work and TCP stack gets proper backpressure given we hold on to skb->sk as long as skb is still residing in queues. With the helper used in BPF data path to then push the skb to the phys device, I observed a stable/consistent TCP_STREAM improvement on veth devices for traffic going container -> host -> host -> container from ~10Gbps to ~15Gbps for a single stream in my test environment. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:35 -07:00
Daniel Borkmann	92acdc58ab	bpf, net: Rework cookie generator as per-cpu one With its use in BPF, the cookie generator can be called very frequently in particular when used out of cgroup v2 hooks (e.g. connect / sendmsg) and attached to the root cgroup, for example, when used in v1/v2 mixed environments. In particular, when there's a high churn on sockets in the system there can be many parallel requests to the bpf_get_socket_cookie() and bpf_get_netns_cookie() helpers which then cause contention on the atomic counter. As similarly done in `f991bd2e14` ("fs: introduce a per-cpu last_ino allocator"), add a small helper library that both can use for the 64 bit counters. Given this can be called from different contexts, we also need to deal with potential nested calls even though in practice they are considered extremely rare. One idea as suggested by Eric Dumazet was to use a reverse counter for this situation since we don't expect 64 bit overflows anyways; that way, we can avoid bigger gaps in the 64 bit counter space compared to just batch-wise increase. Even on machines with small number of cores (e.g. 4) the cookie generation shrinks from min/max/med/avg (ns) of 22/50/40/38.9 down to 10/35/14/17.3 when run in parallel from multiple CPUs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Link: https://lore.kernel.org/bpf/8a80b8d27d3c49f9a14e1d5213c19d8be87d1dc8.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:35 -07:00
Daniel Borkmann	b426ce83ba	bpf: Add classid helper only based on skb->sk Similarly to `5a52ae4e32` ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks"), add a helper to retrieve cgroup v1 classid solely based on the skb->sk, so it can be used as key as part of BPF map lookups out of tc from host ns, in particular given the skb->sk is retained these days when crossing net ns thanks to `9c4c325252` ("skbuff: preserve sock reference when scrubbing the skb."). This is similar to bpf_skb_cgroup_id() which implements the same for v2. Kubernetes ecosystem is still operating on v1 however, hence net_cls needs to be used there until this can be dropped in with the v2 helper of bpf_skb_cgroup_id(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:34 -07:00
Jason Gunthorpe	2ee9bf346f	RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel() This three thread race can result in the work being run once the callback becomes NULL: CPU1 CPU2 CPU3 netevent_callback() process_one_req() rdma_addr_cancel() [..] spin_lock_bh() set_timeout() spin_unlock_bh() spin_lock_bh() list_del_init(&req->list); spin_unlock_bh() req->callback = NULL spin_lock_bh() if (!list_empty(&req->list)) // Skipped! // cancel_delayed_work(&req->work); spin_unlock_bh() process_one_req() // again req->callback() // BOOM cancel_delayed_work_sync() The solution is to always cancel the work once it is completed so any in between set_timeout() does not result in it running again. Cc: stable@vger.kernel.org Fixes: `44e75052bc` ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence") Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org Reported-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-30 15:29:05 -03:00
Jason Gunthorpe	a6f0b08dba	RDMA/core: Remove ucontext->closing Nothing reads this any more, and the reason for its existence has passed due to the deferred fput() scheme. Fixes: `8ea1f989aa` ("drivers/IB,usnic: reduce scope of mmap_sem") Link: https://lore.kernel.org/r/0-v1-df64ff042436+42-uctx_closing_jgg@nvidia.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-30 15:27:19 -03:00
Chris Wilson	c60b93cd48	drm/i915: Avoid mixing integer types during batch copies Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: `ed13033f02` ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit `b7eeb2b413`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:54 -04:00
Chris Wilson	651dabe27f	drm/i915/gem: Always test execution status on closing the context Verify that if a context is active at the time it is closed, that it is either persistent and preemptible (with hangcheck running) or it shall be removed from execution. Fixes: `9a40bddd47` ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-close Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk (cherry picked from commit `d3bb2f9b5e`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:51 -04:00
Chris Wilson	ca65fc0d8e	drm/i915/gt: Always send a pulse down the engine after disabling heartbeat Currently, we check we can send a pulse prior to disabling the heartbeat to verify that we can change the heartbeat, but since we may re-evaluate execution upon changing the heartbeat interval we need another pulse afterwards to refresh execution. v2: Tvrtko asked if we could reduce the double pulse to a single, which opened up a discussion of how we should handle the pulse-error after attempting to change the property, and the desire to serialise adjustment of the property with its validating pulse, and unwind upon failure. Fixes: `9a40bddd47` ("drm/i915/gt: Expose heartbeat interval via sysfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk (cherry picked from commit `3dd66a94de`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:48 -04:00
Chris Wilson	7d442ea7c5	drm/i915: Cancel outstanding work after disabling heartbeats on an engine We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck. Fixes: `9a40bddd47` ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-stop Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk (cherry picked from commit `7a991cd3e3`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:46 -04:00
Chris Wilson	3cfea8c97c	drm/i915/gem: Hold request reference for canceling an active context We have to be very careful while walking the timeline->requests list under the RCU guard, as the requests (and so rq->link) use SLAB_TYPESAFE_BY_RCU and so the requests may be reallocated within an rcu grace period. As the requests are reallocated, they are removed from one list and placed on another, and if we are iterating over that request at that moment, the list iteration jumps from one list to the next and promptly gets confused. Verify we hold the request reference to ensure that the request is not added to a new list behind our backs. <4> [582.745252] general protection fault, probably for non-canonical address 0xcccccccccccccd5c: 0000 [#1] PREEMPT SMP PTI <4> [582.745297] CPU: 0 PID: 1475 Comm: gem_ctx_persist Not tainted 5.9.0-rc1-CI-CI_DRM_8908+ #1 <4> [582.745304] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018 <4> [582.745317] RIP: 0010:__lock_acquire+0x2c3/0x1f40 <4> [582.745323] Code: 00 65 8b 05 c7 8a ef 7e 85 c0 0f 85 b4 07 00 00 44 8b 9d c4 08 00 00 45 85 db 0f 84 0f 01 00 00 ba 05 00 00 00 e9 c8 06 00 00 <48> 81 3f c0 89 c7 82 b8 00 00 00 00 41 0f 45 c0 83 fe 01 41 89 c3 <4> [582.745334] RSP: 0018:ffffc9000461bc40 EFLAGS: 00010002 <4> [582.745340] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 <4> [582.745345] RDX: 0000000000000000 RSI: 0000000000000000 RDI: cccccccccccccd5c <4> [582.745350] RBP: ffff8881ec4a2880 R08: 0000000000000001 R09: 0000000000000001 <4> [582.745356] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 <4> [582.745361] R13: 0000000000000000 R14: 0000000000000000 R15: cccccccccccccd5c <4> [582.745367] FS: 00007fb44da78e40(0000) GS:ffff888278000000(0000) knlGS:0000000000000000 <4> [582.745373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [582.745378] CR2: 00007fb44daad040 CR3: 0000000268428000 CR4: 0000000000350ef0 <4> [582.745383] Call Trace: <4> [582.745390] ? __lock_acquire+0x913/0x1f40 <4> [582.745397] lock_acquire+0xb5/0x3c0 <4> [582.745526] ? kill_engines+0x19a/0x4b0 [i915] <4> [582.745533] ? find_held_lock+0x2d/0x90 <4> [582.745541] _raw_spin_lock_irq+0x30/0x40 <4> [582.745635] ? kill_engines+0x19a/0x4b0 [i915] <4> [582.745727] kill_engines+0x19a/0x4b0 [i915] <4> [582.745820] context_close+0x195/0x410 [i915] <4> [582.745912] i915_gem_context_close+0x5b/0x160 [i915] <4> [582.745994] i915_driver_postclose+0x14/0x40 [i915] <4> [582.746003] drm_file_free.part.13+0x240/0x290 <4> [582.746009] drm_release_noglobal+0x16/0x50 <4> [582.746016] __fput+0xa5/0x250 <4> [582.746021] task_work_run+0x6e/0xb0 <4> [582.746028] exit_to_user_mode_prepare+0x178/0x180 <4> [582.746034] syscall_exit_to_user_mode+0x36/0x220 <4> [582.746040] entry_SYSCALL_64_after_hwframe+0x44/0xa9 <4> [582.746045] RIP: 0033:0x7fb44d1dc421 <4> [582.746050] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 ea cf 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f f3 c3 0f 1f 44 00 00 53 89 fb 48 83 ec 10 <4> [582.746062] RSP: 002b:00007ffed2e83818 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 <4> [582.746069] RAX: 0000000000000000 RBX: 0000556410bfe840 RCX: 00007fb44d1dc421 <4> [582.746075] RDX: 000000000000000a RSI: 00000000c0406469 RDI: 0000000000000008 <4> [582.746080] RBP: 0000000000000008 R08: 00007fb44d1c51cc R09: 00007fb44d1c5240 <4> [582.746086] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000fffffffb <4> [582.746091] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000000000a <4> [582.746099] Modules linked in: vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio btusb btrtl btbcm btintel x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul bluetooth ghash_clmulni_intel ecdh_generic ecc i915 r8169 realtek mei_me mei snd_hda_intel i2c_hid snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_geminilake pinctrl_intel prime_numbers [last unloaded: test_drm_mm] Fixes: `736e785f9b` ("drm/i915/gem: Reduce context termination list iteration guard to RCU") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-2-chris@chris-wilson.co.uk (cherry picked from commit `badef44def`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:42 -04:00
Chris Wilson	5701a66edb	drm/i915: Redo "Remove i915_request.lock requirement for execution callbacks" The reordering and rebasing of commit `2e4c6c1a9d` ("drm/i915: Remove i915_request.lock requirement for execution callbacks") caused it to revert an earlier correction. Let us restore commit 99f0a640d464 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs") Fixes: `2e4c6c1a9d` ("drm/i915: Remove i915_request.lock requirement for execution callbacks") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk (cherry picked from commit `35faeb7de9`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:40 -04:00
Chris Wilson	4fe9af8e88	drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutex Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: CQ Tang <cq.tang@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk (cherry picked from commit `102f5aa491`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:37 -04:00
Matthew Auld	cef8ce5528	drm/i915: check i915_vm_alloc_pt_stash for errors If we are really unlucky and encounter an error during i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the way down into the low-level ppgtt alloc code, leading to explosions, since it expects at least the required number of pt/pd for the va range. [ 211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 211.981421] #PF: supervisor read access in kernel mode [ 211.981422] #PF: error_code(0x0000) - not-present page [ 211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0 [ 211.981427] Oops: 0000 [#1] SMP PTI [ 211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G U I 5.9.0-rc5+ #3 [ 211.981430] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017 [ 211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915] [ 211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa [ 211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202 [ 211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004 [ 211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300 [ 211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000 [ 211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000 [ 211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8 [ 211.981534] FS: 00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000 [ 211.981536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0 [ 211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 211.981542] Call Trace: [ 211.981609] gen8_ppgtt_alloc+0x79/0x90 [i915] [ 211.981678] ppgtt_bind_vma+0x36/0x80 [i915] [ 211.981756] __vma_bind+0x39/0x40 [i915] [ 211.981818] fence_work+0x21/0x98 [i915] [ 211.981879] fence_notify+0x8d/0x128 [i915] [ 211.981939] __i915_sw_fence_complete+0x62/0x240 [i915] [ 211.982018] i915_vma_pin_ww+0x1ee/0x9c0 [i915] Fixes: `cd0452aa2a` ("drm/i915: Preallocate stashes for vma page-directories") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com (cherry picked from commit `1604cb2aa7`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:35 -04:00
Maarten Lankhorst	159ace7ffe	drm/i915: Fix uninitialised variable in intel_context_create_request. In case backoff fails with an error, we return an undefined rq, assign err to rq correctly. Fixes: `8a929c9eb1` ("drm/i915: Use ww pinning for intel_context_create_request()") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `4316b19dee`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:32 -04:00
Chris Wilson	7d55531476	drm/i915: Break up error capture compression loops with cond_resched() As the error capture will compress user buffers as directed to by the user, it can take an arbitrary amount of time and space. Break up the compression loops with a call to cond_resched(), that will allow other processes to schedule (avoiding the soft lockups) and also serve as a warning should we try to make this loop atomic in the future. Testcase: igt/gem_exec_capture/many-* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk (cherry picked from commit `293f43c80c`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:30 -04:00
Dan Carpenter	eb2a27086a	drm/i915: Fix an error code i915_gem_object_copy_blt() This code should use "vma[1]" instead of "vma". The "vma" variable is a valid pointer. Fixes: `6b05030496` ("drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200911075243.GG12635@kadam (cherry picked from commit `68ba71e3ae`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:28 -04:00
Chris Wilson	922d369b29	drm/i915/gt: Clear the buffer pool age before use If we create a new node, it is possible for the slab allocator to return us a recently freed node. If that node was just retired, it will retain the current jiffy as its node->age. There is then a miniscule window, where as that node is retired, it will appear on the free list with an incorrect age and be eligible for reuse by one thread, and then by a second thread as the correct node->age is written. Fixes: `06b73c2d0b` ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk (cherry picked from commit `9bb34ff25c`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:23 -04:00
Chris Wilson	ba2ebf605d	drm/i915/gem: Prevent using pgprot_writecombine() if PAT is not supported Let's not try and use PAT attributes for I915_MAP_WC if the CPU doesn't support PAT. Fixes: `6056e50033` ("drm/i915/gem: Support discontiguous lmem object maps") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-2-chris@chris-wilson.co.uk (cherry picked from commit `121ba69ffd`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:20 -04:00
Chris Wilson	4caf017ee9	drm/i915/gem: Avoid implicit vmap for highmem on x86-32 On 32b, highmem using a finite set of indirect PTE (i.e. vmap) to provide virtual mappings of the high pages. As these are finite, map_new_virtual() must wait for some other kmap() to finish when it runs out. If we map a large number of objects, there is no method for it to tell us to release the mappings, and we deadlock. However, if we make an explicit vmap of the page, that uses a larger vmalloc arena, and also has the ability to tell us to release unwanted mappings. Most importantly, it will fail and propagate an error instead of waiting forever. Fixes: `fb8621d3be` ("drm/i915: Avoid allocating a vmap arena for a single page") #x86-32 References: `e87666b52f` ("drm/i915/shrinker: Hook up vmap allocation failure notifier") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v4.7+ Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-1-chris@chris-wilson.co.uk (cherry picked from commit `060bb115c2`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-09-30 14:24:17 -04:00
Gioh Kim	220aee3021	RDMA/rtrs: Remove unused field of rtrs_iu list field is not used anywhere Link: https://lore.kernel.org/r/20200930131407.6438-1-gi-oh.kim@clous.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-30 15:21:16 -03:00
Anup Patel	aa9887608e	RISC-V: Check clint_time_val before use The NoMMU kernel is broken for QEMU virt machine from Linux-5.9-rc6 because clint_time_val is used even before CLINT driver is probed at following places: 1. rand_initialize() calls get_cycles() which in-turn uses clint_time_val 2. boot_init_stack_canary() calls get_cycles() which in-turn uses clint_time_val The issue#1 (above) is fixed by providing custom random_get_entropy() for RISC-V NoMMU kernel. For issue#2 (above), we remove dependency of boot_init_stack_canary() on get_cycles() and this is aligned with the boot_init_stack_canary() implementations of ARM, ARM64 and MIPS kernel. Fixes: `d5be89a8d1` ("RISC-V: Resurrect the MMIO timer implementation for M-mode systems") Signed-off-by: Anup Patel <anup.patel@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-09-30 11:05:14 -07:00
Jiansong Chen	39ad082459	drm/amdgpu: disable gfxoff temporarily for navy_flounder gfxoff is temporarily disabled for navy_flounder, since at present the feature caused some tdr when performing display operations. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:53:21 -04:00
Guchun Chen	4a20300bc2	drm/amdgpu: drop duplicated ecc check for vega10 (v5) The same ECC check has been executed in amdgpu_ras_init for vega10, prior to gmc_v9_0_late_init. v2: drop all atombios helper callings v3: use bit operation v4: correct inline comment, remove parity check statement v5: squash in build fix Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:53:21 -04:00
Jouni Roivas	65026da59c	cgroup: Zero sized write should be no-op Do not report failure on zero sized writes, and handle them as no-op. There's issues for example in case of writev() when there's iovec containing zero buffer as a first one. It's expected writev() on below example to successfully perform the write to specified writable cgroup file expecting integer value, and to return 2. For now it's returning value -1, and skipping the write: int writetest(int fd) { const char buf1 = ""; const char buf2 = "1\n"; struct iovec iov[2] = { { .iov_base = (void)buf1, .iov_len = 0 }, { .iov_base = (void)buf2, .iov_len = 2 } }; return writev(fd, iov, 2); } This patch fixes the issue by checking if there's nothing to write, and handling the write as no-op by just returning 0. Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2020-09-30 13:52:06 -04:00
Dmytro Laktyushkin	d3768874e5	drm/amd/display: add pipe reassignment prevention code to dcn3 Add code to gracefuly handle any pipe reassignment occuring on dcn3 hardware. This should only happen when new surfaces are used for an update rather than old ones updated. Fixes: `69fc1f4b97` ("amd/drm/display: avoid dcn3 on flip opp change for slave pipes") Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:50:22 -04:00
Oak Zeng	8ffff9b449	drm/amdgpu: use function pointer for gfxhub functions gfxhub functions are now called from function pointers, instead of from asic-specific functions. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:50:13 -04:00
Ramesh Errabolu	825c91d090	drm/amd/amdgpu: Prepare implementation to support reporting of CU usage [Why] Allow user to know number of compute units (CU) that are in use at any given moment. [How] Read registers of SQ that give number of waves that are in flight of various queues. Use this information to determine number of CU's in use. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:50:06 -04:00
Ramesh Errabolu	b8810a142a	drm/amd/amdgpu: Clean up header file of symbols that are defined to be static [Why] Header file exports functions get_gpu_clock_counter(), get_cu_info() and select_se_sh() that are defined to be static Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:49:44 -04:00
Filipe Manana	4c8f353272	btrfs: fix filesystem corruption after a device replace We use a device's allocation state tree to track ranges in a device used for allocated chunks, and we set ranges in this tree when allocating a new chunk. However after a device replace operation, we were not setting the allocated ranges in the new device's allocation state tree, so that tree is empty after a device replace. This means that a fitrim operation after a device replace will trim the device ranges that have allocated chunks and extents, as we trim every range for which there is not a range marked in the device's allocation state tree. It is also important during chunk allocation, since the device's allocation state is used to determine if a range is already allocated when allocating a new chunk. This is trivial to reproduce and the following script triggers the bug: $ cat reproducer.sh #!/bin/bash DEV1="/dev/sdg" DEV2="/dev/sdh" DEV3="/dev/sdi" wipefs -a $DEV1 $DEV2 $DEV3 &> /dev/null # Create a raid1 test fs on 2 devices. mkfs.btrfs -f -m raid1 -d raid1 $DEV1 $DEV2 > /dev/null mount $DEV1 /mnt/btrfs xfs_io -f -c "pwrite -S 0xab 0 10M" /mnt/btrfs/foo echo "Starting to replace $DEV1 with $DEV3" btrfs replace start -B $DEV1 $DEV3 /mnt/btrfs echo echo "Running fstrim" fstrim /mnt/btrfs echo echo "Unmounting filesystem" umount /mnt/btrfs echo "Mounting filesystem in degraded mode using $DEV3 only" wipefs -a $DEV1 $DEV2 &> /dev/null mount -o degraded $DEV3 /mnt/btrfs if [ $? -ne 0 ]; then dmesg \| tail echo echo "Failed to mount in degraded mode" exit 1 fi echo echo "File foo data (expected all bytes = 0xab):" od -A d -t x1 /mnt/btrfs/foo umount /mnt/btrfs When running the reproducer: $ ./replace-test.sh wrote 10485760/10485760 bytes at offset 0 10 MiB, 2560 ops; 0.0901 sec (110.877 MiB/sec and 28384.5216 ops/sec) Starting to replace /dev/sdg with /dev/sdi Running fstrim Unmounting filesystem Mounting filesystem in degraded mode using /dev/sdi only mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/sdi, missing codepage or helper program, or other error. [19581.748641] BTRFS info (device sdg): dev_replace from /dev/sdg (devid 1) to /dev/sdi started [19581.803842] BTRFS info (device sdg): dev_replace from /dev/sdg (devid 1) to /dev/sdi finished [19582.208293] BTRFS info (device sdi): allowing degraded mounts [19582.208298] BTRFS info (device sdi): disk space caching is enabled [19582.208301] BTRFS info (device sdi): has skinny extents [19582.212853] BTRFS warning (device sdi): devid 2 uuid 1f731f47-e1bb-4f00-bfbb-9e5a0cb4ba9f is missing [19582.213904] btree_readpage_end_io_hook: 25839 callbacks suppressed [19582.213907] BTRFS error (device sdi): bad tree block start, want 30490624 have 0 [19582.214780] BTRFS warning (device sdi): failed to read root (objectid=7): -5 [19582.231576] BTRFS error (device sdi): open_ctree failed Failed to mount in degraded mode So fix by setting all allocated ranges in the replace target device when the replace operation is finishing, when we are holding the chunk mutex and we can not race with new chunk allocations. A test case for fstests follows soon. Fixes: `1c11b63eff` ("btrfs: replace pending/pinned chunks lists with io tree") CC: stable@vger.kernel.org # 5.2+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-09-30 19:40:51 +02:00
Josef Bacik	a466c85edc	btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks When closing and freeing the source device we could end up doing our final blkdev_put() on the bdev, which will grab the bd_mutex. As such we want to be holding as few locks as possible, so move this call outside of the dev_replace->lock_finishing_cancel_unmount lock. Since we're modifying the fs_devices we need to make sure we're holding the uuid_mutex here, so take that as well. There's a report from syzbot probably hitting one of the cases where the bd_mutex and device_list_mutex are taken in the wrong order, however it's not with device replace, like this patch fixes. As there's no reproducer available so far, we can't verify the fix. https://lore.kernel.org/lkml/000000000000fc04d105afcf86d7@google.com/ dashboard link: https://syzkaller.appspot.com/bug?extid=84a0634dc5d21d488419 WARNING: possible circular locking dependency detected 5.9.0-rc5-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.0/6878 is trying to acquire lock: ffff88804c17d780 (&bdev->bd_mutex){+.+.}-{3:3}, at: blkdev_put+0x30/0x520 fs/block_dev.c:1804 but task is already holding lock: ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&fs_devs->device_list_mutex){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 btrfs_finish_chunk_alloc+0x281/0xf90 fs/btrfs/volumes.c:5255 btrfs_create_pending_block_groups+0x2f3/0x700 fs/btrfs/block-group.c:2109 __btrfs_end_transaction+0xf5/0x690 fs/btrfs/transaction.c:916 find_free_extent_update_loop fs/btrfs/extent-tree.c:3807 [inline] find_free_extent+0x23b7/0x2e60 fs/btrfs/extent-tree.c:4127 btrfs_reserve_extent+0x166/0x460 fs/btrfs/extent-tree.c:4206 cow_file_range+0x3de/0x9b0 fs/btrfs/inode.c:1063 btrfs_run_delalloc_range+0x2cf/0x1410 fs/btrfs/inode.c:1838 writepage_delalloc+0x150/0x460 fs/btrfs/extent_io.c:3439 __extent_writepage+0x441/0xd00 fs/btrfs/extent_io.c:3653 extent_write_cache_pages.constprop.0+0x69d/0x1040 fs/btrfs/extent_io.c:4249 extent_writepages+0xcd/0x2b0 fs/btrfs/extent_io.c:4370 do_writepages+0xec/0x290 mm/page-writeback.c:2352 __writeback_single_inode+0x125/0x1400 fs/fs-writeback.c:1461 writeback_sb_inodes+0x53d/0xf40 fs/fs-writeback.c:1721 wb_writeback+0x2ad/0xd40 fs/fs-writeback.c:1894 wb_do_writeback fs/fs-writeback.c:2039 [inline] wb_workfn+0x2dc/0x13e0 fs/fs-writeback.c:2080 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 -> #3 (sb_internal#2){.+.+}-{0:0}: percpu_down_read include/linux/percpu-rwsem.h:51 [inline] __sb_start_write+0x234/0x470 fs/super.c:1672 sb_start_intwrite include/linux/fs.h:1690 [inline] start_transaction+0xbe7/0x1170 fs/btrfs/transaction.c:624 find_free_extent_update_loop fs/btrfs/extent-tree.c:3789 [inline] find_free_extent+0x25e1/0x2e60 fs/btrfs/extent-tree.c:4127 btrfs_reserve_extent+0x166/0x460 fs/btrfs/extent-tree.c:4206 cow_file_range+0x3de/0x9b0 fs/btrfs/inode.c:1063 btrfs_run_delalloc_range+0x2cf/0x1410 fs/btrfs/inode.c:1838 writepage_delalloc+0x150/0x460 fs/btrfs/extent_io.c:3439 __extent_writepage+0x441/0xd00 fs/btrfs/extent_io.c:3653 extent_write_cache_pages.constprop.0+0x69d/0x1040 fs/btrfs/extent_io.c:4249 extent_writepages+0xcd/0x2b0 fs/btrfs/extent_io.c:4370 do_writepages+0xec/0x290 mm/page-writeback.c:2352 __writeback_single_inode+0x125/0x1400 fs/fs-writeback.c:1461 writeback_sb_inodes+0x53d/0xf40 fs/fs-writeback.c:1721 wb_writeback+0x2ad/0xd40 fs/fs-writeback.c:1894 wb_do_writeback fs/fs-writeback.c:2039 [inline] wb_workfn+0x2dc/0x13e0 fs/fs-writeback.c:2080 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 -> #2 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}: __flush_work+0x60e/0xac0 kernel/workqueue.c:3041 wb_shutdown+0x180/0x220 mm/backing-dev.c:355 bdi_unregister+0x174/0x590 mm/backing-dev.c:872 del_gendisk+0x820/0xa10 block/genhd.c:933 loop_remove drivers/block/loop.c:2192 [inline] loop_control_ioctl drivers/block/loop.c:2291 [inline] loop_control_ioctl+0x3b1/0x480 drivers/block/loop.c:2257 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #1 (loop_ctl_mutex){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 lo_open+0x19/0xd0 drivers/block/loop.c:1893 __blkdev_get+0x759/0x1aa0 fs/block_dev.c:1507 blkdev_get fs/block_dev.c:1639 [inline] blkdev_open+0x227/0x300 fs/block_dev.c:1753 do_dentry_open+0x4b9/0x11b0 fs/open.c:817 do_open fs/namei.c:3251 [inline] path_openat+0x1b9a/0x2730 fs/namei.c:3368 do_filp_open+0x17e/0x3c0 fs/namei.c:3395 do_sys_openat2+0x16d/0x420 fs/open.c:1168 do_sys_open fs/open.c:1184 [inline] __do_sys_open fs/open.c:1192 [inline] __se_sys_open fs/open.c:1188 [inline] __x64_sys_open+0x119/0x1c0 fs/open.c:1188 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&bdev->bd_mutex){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:2496 [inline] check_prevs_add kernel/locking/lockdep.c:2601 [inline] validate_chain kernel/locking/lockdep.c:3218 [inline] __lock_acquire+0x2a96/0x5780 kernel/locking/lockdep.c:4426 lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006 __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 blkdev_put+0x30/0x520 fs/block_dev.c:1804 btrfs_close_bdev fs/btrfs/volumes.c:1117 [inline] btrfs_close_bdev fs/btrfs/volumes.c:1107 [inline] btrfs_close_one_device fs/btrfs/volumes.c:1133 [inline] close_fs_devices.part.0+0x1a4/0x800 fs/btrfs/volumes.c:1161 close_fs_devices fs/btrfs/volumes.c:1193 [inline] btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179 close_ctree+0x688/0x6cb fs/btrfs/disk-io.c:4149 generic_shutdown_super+0x144/0x370 fs/super.c:464 kill_anon_super+0x36/0x60 fs/super.c:1108 btrfs_kill_super+0x38/0x50 fs/btrfs/super.c:2265 deactivate_locked_super+0x94/0x160 fs/super.c:335 deactivate_super+0xad/0xd0 fs/super.c:366 cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118 task_work_run+0xdd/0x190 kernel/task_work.c:141 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:163 [inline] exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265 entry_SYSCALL_64_after_hwframe+0x44/0xa9 other info that might help us debug this: Chain exists of: &bdev->bd_mutex --> sb_internal#2 --> &fs_devs->device_list_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&fs_devs->device_list_mutex); lock(sb_internal#2); lock(&fs_devs->device_list_mutex); lock(&bdev->bd_mutex); * DEADLOCK * 3 locks held by syz-executor.0/6878: #0: ffff88809070c0e0 (&type->s_umount_key#70){++++}-{3:3}, at: deactivate_super+0xa5/0xd0 fs/super.c:365 #1: ffffffff8a5b37a8 (uuid_mutex){+.+.}-{3:3}, at: btrfs_close_devices+0x23/0x1f0 fs/btrfs/volumes.c:1178 #2: ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159 stack backtrace: CPU: 0 PID: 6878 Comm: syz-executor.0 Not tainted 5.9.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827 check_prev_add kernel/locking/lockdep.c:2496 [inline] check_prevs_add kernel/locking/lockdep.c:2601 [inline] validate_chain kernel/locking/lockdep.c:3218 [inline] __lock_acquire+0x2a96/0x5780 kernel/locking/lockdep.c:4426 lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006 __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 blkdev_put+0x30/0x520 fs/block_dev.c:1804 btrfs_close_bdev fs/btrfs/volumes.c:1117 [inline] btrfs_close_bdev fs/btrfs/volumes.c:1107 [inline] btrfs_close_one_device fs/btrfs/volumes.c:1133 [inline] close_fs_devices.part.0+0x1a4/0x800 fs/btrfs/volumes.c:1161 close_fs_devices fs/btrfs/volumes.c:1193 [inline] btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179 close_ctree+0x688/0x6cb fs/btrfs/disk-io.c:4149 generic_shutdown_super+0x144/0x370 fs/super.c:464 kill_anon_super+0x36/0x60 fs/super.c:1108 btrfs_kill_super+0x38/0x50 fs/btrfs/super.c:2265 deactivate_locked_super+0x94/0x160 fs/super.c:335 deactivate_super+0xad/0xd0 fs/super.c:366 cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118 task_work_run+0xdd/0x190 kernel/task_work.c:141 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:163 [inline] exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x460027 RSP: 002b:00007fff59216328 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 0000000000076035 RCX: 0000000000460027 RDX: 0000000000403188 RSI: 0000000000000002 RDI: 00007fff592163d0 RBP: 0000000000000333 R08: 0000000000000000 R09: 000000000000000b R10: 0000000000000005 R11: 0000000000000246 R12: 00007fff59217460 R13: 0000000002df2a60 R14: 0000000000000000 R15: 00007fff59217460 Signed-off-by: Josef Bacik <josef@toxicpanda.com> [ add syzbot reference ] Signed-off-by: David Sterba <dsterba@suse.com>	2020-09-30 19:34:24 +02:00
Marek Behún	8fd8f94235	leds: ns2: do not guard OF match pointer with of_match_ptr Do not match OF match pointer with of_match_ptr, so that even if CONFIG_OF is disabled, the driver can still be bound via another method. Move definition of of_ns2_leds_match just before ns2_led_driver definition, since it is not needed sooner. Signed-off-by: Marek Behún <kabel@kernel.org> Tested-by: Simon Guinot <simon.guinot@sequanux.org> Signed-off-by: Pavel Machek <pavel@ucw.cz>	2020-09-30 19:22:58 +02:00
Marek Behún	940cca1ab5	leds: ns2: convert to fwnode API Convert from OF api to fwnode API, so that it is possible to bind this driver without device-tree. The fwnode API does not expose a function to read a specific element of an array. We therefore change the types of the ns2_led_modval structure so that we can read the whole modval array with one fwnode call. Signed-off-by: Marek Behún <kabel@kernel.org> Tested-by: Simon Guinot <simon.guinot@sequanux.org> Signed-off-by: Pavel Machek <pavel@ucw.cz>	2020-09-30 19:22:58 +02:00
Tobias Jordan	108f4664e3	leds: tlc591xx: fix leak of device node iterator In one of the error paths of the for_each_child_of_node loop in tlc591xx_probe, add missing call to of_node_put. Fixes: `1ab4531ad1` ("leds: tlc591xx: simplify driver by using the managed led API") Signed-off-by: Tobias Jordan <kernel@cdqe.de> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Pavel Machek <pavel@ucw.cz>	2020-09-30 19:20:46 +02:00
Marek Behún	564ead1280	leds: pca963x: use struct led_init_data when registering By using struct led_init_data when registering we do not need to parse `label` DT property. Moreover `label` is deprecated and if it is not present but `color` and `function` are, LED core will compose a name from these properties instead. Previously if the `label` DT property was not present, the code composed name for the LED in the form "pca963x:%d:%.2x:%u" For backwards compatibility we therefore set init_data->default_label to this value so that the LED will not get a different name if `label` property is not present, nor are `color` and `function`. Signed-off-by: Marek Behún <marek.behun@nic.cz> Cc: Peter Meerwald <p.meerwald@bct-electronic.com> Cc: Ricardo Ribalda <ribalda@kernel.org> Cc: Zahari Petkov <zahari@balena.io> Signed-off-by: Pavel Machek <pavel@ucw.cz>	2020-09-30 19:15:42 +02:00
Marek Behún	85fc8efe85	leds: pca963x: register LEDs immediately after parsing, get rid of platdata Register LEDs immediately after parsing their properties. This allows us to get rid of platdata, and since no one in tree uses header linux/platform_data/leds-pca963x.h, remove it. Signed-off-by: Marek Behún <marek.behun@nic.cz> Cc: Peter Meerwald <p.meerwald@bct-electronic.com> Cc: Ricardo Ribalda <ribalda@kernel.org> Cc: Zahari Petkov <zahari@balena.io> Signed-off-by: Pavel Machek <pavel@ucw.cz>	2020-09-30 19:15:41 +02:00

... 75 76 77 78 79 ...

964751 commits