Commit graph

1105317 commits

Author SHA1 Message Date
Abhinav Kumar
cf961a5e67 drm/msm/dpu: add DRM_MODE_ROTATE_180 back to supported rotations
DRM_MODE_ROTATE_180 was previously marked as supported even
for devices not supporting inline rotation.

This is true because the SSPPs can always flip the image.

After inline rotation support changes, this bit was removed
and kms_rotation_crc IGT test starts skipping now whereas
it was previously passing.

Restore DRM_MODE_ROTATE_180 bit to the supported rotations
list.

Fixes: dabfdd89ea ("add inline rotation support for sc7280")
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # Trogdor (SC8170)
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/485928/
Link: https://lore.kernel.org/r/20220511222710.22394-1-quic_abhinavk@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-05-18 15:59:28 -07:00
Dmitry Baryshkov
577e2a9dfc drm/msm: don't free the IRQ if it was not requested
As msm_drm_uninit() is called from the msm_drm_init() error path,
additional care should be necessary as not to call the free_irq() for
the IRQ that was not requested before (because an error occured earlier
than the request_irq() call).

This fixed the issue reported with the following backtrace:

[    8.571329] Trying to free already-free IRQ 187
[    8.571339] WARNING: CPU: 0 PID: 76 at kernel/irq/manage.c:1895 free_irq+0x1e0/0x35c
[    8.588746] Modules linked in: pmic_glink pdr_interface fastrpc qrtr_smd snd_soc_hdmi_codec msm fsa4480 gpu_sched drm_dp_aux_bus qrtr i2c_qcom_geni crct10dif_ce qcom_stats qcom_q6v5_pas drm_display_helper gpi qcom_pil_info drm_kms_helper qcom_q6v5 qcom_sysmon qcom_common qcom_glink_smem qcom_rng mdt_loader qmi_helpers phy_qcom_qmp ufs_qcom typec qnoc_sm8350 socinfo rmtfs_mem fuse drm ipv6
[    8.624154] CPU: 0 PID: 76 Comm: kworker/u16:2 Not tainted 5.18.0-rc5-next-20220506-00033-g6cee8cab6089-dirty #419
[    8.624161] Hardware name: Qualcomm Technologies, Inc. SM8350 HDK (DT)
[    8.641496] Workqueue: events_unbound deferred_probe_work_func
[    8.647510] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    8.654681] pc : free_irq+0x1e0/0x35c
[    8.658454] lr : free_irq+0x1e0/0x35c
[    8.662228] sp : ffff800008ab3950
[    8.665642] x29: ffff800008ab3950 x28: 0000000000000000 x27: ffff16350f56a700
[    8.672994] x26: ffff1635025df080 x25: ffff16350251badc x24: ffff16350251bb90
[    8.680343] x23: 0000000000000000 x22: 00000000000000bb x21: ffff16350e8f9800
[    8.687690] x20: ffff16350251ba00 x19: ffff16350cbd5880 x18: ffffffffffffffff
[    8.695039] x17: 0000000000000000 x16: ffffa2dd12179434 x15: ffffa2dd1431d02d
[    8.702391] x14: 0000000000000000 x13: ffffa2dd1431d028 x12: 662d79646165726c
[    8.709740] x11: ffffa2dd13fd2438 x10: 000000000000000a x9 : 00000000000000bb
[    8.717111] x8 : ffffa2dd13fd23f0 x7 : ffff800008ab3750 x6 : 00000000fffff202
[    8.724487] x5 : ffff16377e870a18 x4 : 00000000fffff202 x3 : ffff735a6ae1b000
[    8.731851] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff1635015f8000
[    8.739217] Call trace:
[    8.741755]  free_irq+0x1e0/0x35c
[    8.745198]  msm_drm_uninit.isra.0+0x14c/0x294 [msm]
[    8.750548]  msm_drm_bind+0x28c/0x5d0 [msm]
[    8.755081]  try_to_bring_up_aggregate_device+0x164/0x1d0
[    8.760657]  __component_add+0xa0/0x170
[    8.764626]  component_add+0x14/0x20
[    8.768337]  dp_display_probe+0x2a4/0x464 [msm]
[    8.773242]  platform_probe+0x68/0xe0
[    8.777043]  really_probe.part.0+0x9c/0x28c
[    8.781368]  __driver_probe_device+0x98/0x144
[    8.785871]  driver_probe_device+0x40/0x140
[    8.790191]  __device_attach_driver+0xb4/0x120
[    8.794788]  bus_for_each_drv+0x78/0xd0
[    8.798751]  __device_attach+0xdc/0x184
[    8.802713]  device_initial_probe+0x14/0x20
[    8.807031]  bus_probe_device+0x9c/0xa4
[    8.810991]  deferred_probe_work_func+0x88/0xc0
[    8.815667]  process_one_work+0x1d0/0x320
[    8.819809]  worker_thread+0x14c/0x444
[    8.823688]  kthread+0x10c/0x110
[    8.827036]  ret_from_fork+0x10/0x20

Reported-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Fixes: f026e431cf ("drm/msm: Convert to Linux IRQ interfaces")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/485422/
Link: https://lore.kernel.org/r/20220507010021.1667700-1-dmitry.baryshkov@linaro.org
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-05-18 15:55:46 -07:00
Abhinav Kumar
e67dcecda0 drm/msm/dpu: limit writeback modes according to max_linewidth
Writeback modes were being added according to mode_config.max_width
but this is assigned to double of max_mixer_width.

For compositors/clients using a single SSPP, this will fail
the dpu_plane's atomic check as it checks for max_linewidth.

Limit writeback modes according to max_linewidth to allow
even compositors/clients which use only a single SSPP to
use writeback.

Fixes: 77b001acdc ("drm/msm/dpu: add the writeback connector layer")
Reported-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # Trogdor (SC8170)
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/486176/
Link: https://lore.kernel.org/r/20220513225959.19004-1-quic_abhinavk@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-05-18 15:43:08 -07:00
Chao Yu
677a82b44e f2fs: fix to do sanity check for inline inode
Yanming reported a kernel bug in Bugzilla kernel [1], which can be
reproduced. The bug message is:

The kernel message is shown below:

kernel BUG at fs/inode.c:611!
Call Trace:
 evict+0x282/0x4e0
 __dentry_kill+0x2b2/0x4d0
 dput+0x2dd/0x720
 do_renameat2+0x596/0x970
 __x64_sys_rename+0x78/0x90
 do_syscall_64+0x3b/0x90

[1] https://bugzilla.kernel.org/show_bug.cgi?id=215895

The bug is due to fuzzed inode has both inline_data and encrypted flags.
During f2fs_evict_inode(), as the inode was deleted by rename(), it
will cause inline data conversion due to conflicting flags. The page
cache will be polluted and the panic will be triggered in clear_inode().

Try fixing the bug by doing more sanity checks for inline data inode in
sanity_check_inode().

Cc: stable@vger.kernel.org
Reported-by: Ming Yan <yanming@tju.edu.cn>
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-05-18 15:36:11 -07:00
Chao Yu
958ed92922 f2fs: fix fallocate to use file_modified to update permissions consistently
This patch tries to fix permission consistency issue as all other
mainline filesystems.

Since the initial introduction of (posix) fallocate back at the turn of
the century, it has been possible to use this syscall to change the
user-visible contents of files.  This can happen by extending the file
size during a preallocation, or through any of the newer modes (punch,
zero, collapse, insert range).  Because the call can be used to change
file contents, we should treat it like we do any other modification to a
file -- update the mtime, and drop set[ug]id privileges/capabilities.

The VFS function file_modified() does all this for us if pass it a
locked inode, so let's make fallocate drop permissions correctly.

Cc: stable@kernel.org
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-05-18 15:36:09 -07:00
Jens Axboe
1305e2c9d9 blk-cgroup: delete rcu_read_lock_held() WARN_ON_ONCE()
A previous commit got rid of unnecessary rcu_read_lock() inside the
IRQ disabling queue_lock, but this debug statement was left. It's now
firing since we are indeed not inside a RCU read lock, but we don't
need to be as we're still preempt safe.

Get rid of the check, as we have a lockdep assert for holding the
queue lock right after it anyway.

Link: https://lore.kernel.org/linux-block/46253c48-81cb-0787-20ad-9133afdd9e21@samsung.com/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 77c570a1ea ("blk-cgroup: Remove unnecessary rcu_read_lock/unlock()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-18 16:32:00 -06:00
Jens Axboe
2fcabce2d7 io_uring: disallow mixed provided buffer group registrations
It's nonsensical to register a provided buffer ring, if a classic
provided buffer group with the same ID exists. Depending on the order of
which we decide what type to pick, the other type will never get used.
Explicitly disallow it and return an error if this is attempted.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-18 16:21:30 -06:00
Jens Axboe
1d0dbbfa28 io_uring: initialize io_buffer_list head when shared ring is unregistered
We use ->buf_pages != 0 to tell if this is a shared buffer ring or a
classic provided buffer group. If we unregister the shared ring and
then attempt to use it, buf_pages is zero yet the classic list head
isn't properly initialized. This causes io_buffer_select() to think
that we have classic buffers available, but then we crash when we try
and get one from the list.

Just initialize the list if we unregister a shared buffer ring, leaving
it in a sane state for either re-registration or for attempting to use
it. And do the same for the initial setup from the classic path.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-18 16:21:11 -06:00
Dmitry Torokhov
038c4bf85b Merge branch 'ib/5.17-cros-ec-keyb' into next
Merge changes to ChromeOS EC Keyboard driver.
2022-05-18 15:02:27 -07:00
Zongmin Zhou
c853246539 Input: vmmouse - disable vmmouse before entering suspend mode
Currently, when trying to suspend and resume with VirtualPS/2 VMMouse
there is an error message after resuming:

	psmouse serio1: vmmouse: Unable to re-enable mouse when reconnecting, err: -6

and the mouse will no longer be operable, requiring full rescan to find a
another driver to use for the port.

This error is due to QEMU still generating PS2 events which the kernel is
not consuming until resume time, where they interfere with mouse
identification and ultimately resulting in an error getting
VMMOUSE_VERSION_ID.

Test scenario:

1) start virtual machine with qemu command "vmport=on"
2) click suspend botton to enter suspend mode
3) resume and observe the error message in the kernel logs

Let's fix this by disabling the vmmouse in its reset handler. This will
notify qemu to stop vmmouse and remove the handler.

Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Link: https://lore.kernel.org/r/20220322021046.1087954-1-zhouzongmin@kylinos.cn
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-05-18 15:02:13 -07:00
Stephen Boyd
d95bca4fbd dt-bindings: google,cros-ec-keyb: Fixup bad compatible match
This uses anyOf which is wrong. Use oneOf and move the items under the
description. Also drop allOf for $ref.

Reported-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/CAE-0n50KE9bkqZvCOLtCGiq3g1jYhK7zpVcVFBzinaguNhNaPw@mail.gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-05-18 15:01:06 -07:00
Marek Vasut
b26ff91371 Input: ili210x - use one common reset implementation
Rename ili251x_hardware_reset() to ili210x_hardware_reset(), change its
parameter from struct device * to struct gpio_desc *, and use it as one
single consistent reset implementation all over the driver. Also increase
the minimum reset duration to 12ms, to make sure the reset is really
within the spec.

Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220518210423.106555-1-marex@denx.de
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-05-18 14:31:31 -07:00
Marek Vasut
e4920d42ce Input: ili210x - fix reset timing
According to Ilitek "231x & ILI251x Programming Guide" Version: 2.30
"2.1. Power Sequence", "T4 Chip Reset and discharge time" is minimum
10ms and "T2 Chip initial time" is maximum 150ms. Adjust the reset
timings such that T4 is 12ms and T2 is 160ms to fit those figures.

This prevents sporadic touch controller start up failures when some
systems with at least ILI251x controller boot, without this patch
the systems sometimes fail to communicate with the touch controller.

Fixes: 201f3c8035 ("Input: ili210x - add reset GPIO support")
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220518204901.93534-1-marex@denx.de
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-05-18 14:31:30 -07:00
Aidan MacDonald
2b0f3d70ce mips: ingenic: Do not manually reference the CPU clock
It isn't necessary to manually walk the device tree and enable
the CPU clock anymore. The CPU and other necessary clocks are
now flagged as critical in the clock driver, which accomplishes
the same thing in a more declarative fashion.

Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20220428164454.17908-4-aidanmacdonald.0x0@gmail.com
Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> # On X1000 and X1830
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-18 13:56:26 -07:00
Aidan MacDonald
ca54d06fca clk: ingenic: Mark critical clocks in Ingenic SoCs
Consider CPU, L2 cache, and memory clocks as critical to prevent
them -- and the parent clocks -- from being automatically gated,
since nothing calls clk_get() on these clocks.

Gating the CPU clock hangs the processor, and gating memory makes
external DRAM inaccessible. Normal kernel code can't hope to deal
with either situation so those clocks have to be critical.

The L2 cache is required only if caches are running, and could be
gated if the kernel takes care to flush and disable caches before
gating the clock. There's no mechanism to do this, and probably no
reason to do it, so it's simpler to mark the L2 cache as critical.

Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20220428164454.17908-3-aidanmacdonald.0x0@gmail.com
Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> # On X1000 and X1830
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-18 13:56:22 -07:00
Aidan MacDonald
bacf743e92 clk: ingenic: Allow specifying common clock flags
Provide a flags field for clocks under the ingenic-cgu driver,
which can be used to set generic common clock framework flags
on the created clocks. For example, the CLK_IS_CRITICAL flag
is needed for some clocks (such as CPU or memory) to stop them
being automatically disabled.

Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20220428164454.17908-2-aidanmacdonald.0x0@gmail.com
Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> # On X1000 and X1830
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-18 13:56:16 -07:00
Hangyu Hua
bea0b66efa clk: ux500: fix a possible off-by-one in u8500_prcc_reset_base()
Off-by-one will happen when index == ARRAY_SIZE(ur->base).

Fixes: b14cbdfd46 ("clk: ux500: Add driver for the reset portions of PRCC")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20220518062537.17933-1-hbh25y@gmail.com
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-18 13:34:03 -07:00
Mario Limonciello
7123d39dc2 drm/amd: Don't reset dGPUs if the system is going to s2idle
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de0874 ("drm/amdgpu:
always reset the asic in suspend (v2)").  A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-05-18 15:24:35 -04:00
Ilya Dryomov
d0bb883c63 libceph: fix misleading ceph_osdc_cancel_request() comment
cancel_request() never guaranteed that after its return the OSD
client would be completely done with the OSD request.  The callback
(if specified) can still be invoked and a ref can still be held.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
2022-05-18 21:21:29 +02:00
Ilya Dryomov
75dbb685f4 libceph: fix potential use-after-free on linger ping and resends
request_reinit() is not only ugly as the comment rightfully suggests,
but also unsafe.  Even though it is called with osdc->lock held for
write in all cases, resetting the OSD request refcount can still race
with handle_reply() and result in use-after-free.  Taking linger ping
as an example:

    handle_timeout thread                     handle_reply thread

                                              down_read(&osdc->lock)
                                              req = lookup_request(...)
                                              ...
                                              finish_request(req)  # unregisters
                                              up_read(&osdc->lock)
                                              __complete_request(req)
                                                linger_ping_cb(req)

      # req->r_kref == 2 because handle_reply still holds its ref

    down_write(&osdc->lock)
    send_linger_ping(lreq)
      req = lreq->ping_req  # same req
      # cancel_linger_request is NOT
      # called - handle_reply already
      # unregistered
      request_reinit(req)
        WARN_ON(req->r_kref != 1)  # fires
        request_init(req)
          kref_init(req->r_kref)

                   # req->r_kref == 1 after kref_init

                                              ceph_osdc_put_request(req)
                                                kref_put(req->r_kref)

            # req->r_kref == 0 after kref_put, req is freed

        <further req initialization/use> !!!

This happens because send_linger_ping() always (re)uses the same OSD
request for watch ping requests, relying on cancel_linger_request() to
unregister it from the OSD client and rip its messages out from the
messenger.  send_linger() does the same for watch/notify registration
and watch reconnect requests.  Unfortunately cancel_request() doesn't
guarantee that after it returns the OSD client would be completely done
with the OSD request -- a ref could still be held and the callback (if
specified) could still be invoked too.

The original motivation for request_reinit() was inability to deal with
allocation failures in send_linger() and send_linger_ping().  Switching
to using osdc->req_mempool (currently only used by CephFS) respects that
and allows us to get rid of request_reinit().

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
2022-05-18 21:21:05 +02:00
Mario Limonciello
0223e51647 drm/amd: Don't reset dGPUs if the system is going to s2idle
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de0874 ("drm/amdgpu:
always reset the asic in suspend (v2)").  A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Luben Tuikov
5ad25ace7c drm/amdgpu: Unmap legacy queue when MES is enabled
This fixes a kernel oops when MES is not enabled.

Reported-by: Kenny Ho <Kenny.Ho@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Fixes: 18ee4ce63e ("drm/amdgpu: add mes unmap legacy queue routine")
Fixes: 3d879e81f0 ("drm/amdgpu: add init support for GFX11 (v2)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Rafael J. Wysocki
d44d6c4a3a Update devfreq next for v5.19
Detailed description for this pull request:
 1. Update devfreq core
 - Add cpu based scaling support to passive governor. Some device like
 cache might require the dynamic frequency scaling. But, it has very
 tightly to cpu frequency. So that use passive governor to scale
 the frequency according to current cpu frequency.
 
 To decide the frequency of the device, the governor does one of the following:
 : Derives the optimal devfreq device opp from required-opps property of
   the parent cpu opp_table.
 
 : Scales the device frequency in proportion to the CPU frequency. So, if
   the CPUs are running at their max frequency, the device runs at its
   max frequency. If the CPUs are running at their min frequency, the
   device runs at its min frequency. It is interpolated for frequencies
   in between.
 
 2. Update devfreq driver
 - Update rk3399_dmc.c as following:
 : Convert dt-binding document to YAML and deprecate unused properties.
 
 : Use Hz units for the device-tree properties of rk3399_dmc.
 
 : rk3399_dmc is able to set the idle time before changing the dmc clock.
   Specify idle time parameters by using nano-second unit on dt bidning.
 
 : Add new disable-freq properties to optimize the power-saving feature
   of rk3399_dmc.
 
 : Disable devfreq-event device on remove() to fix unbalanced
   enable-disable count.
 
 : Use devm_pm_opp_of_add_table()
 
 : Block PMU (Power-Management Unit) transitions when scaling frequency
   by ARM Trust Firmware in order to fix the conflict between PMU and DMC
   (Dynamic Memory Controller).
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEsSpuqBtbWtRe4rLGnM3fLN7rz1MFAmKDddYWHGN3MDAuY2hv
 aUBzYW1zdW5nLmNvbQAKCRCczd8s3uvPU1o0EACGE7FV/pA/sbOFhhLXQlI1XMQf
 C2XVvjigP5M7Y9qgM6qLBKCSmRPbBjuO4VLKkIycmo2GRcUs+Sk5CqALjWZF0tDR
 aiAV+P3wIC2pCZNtAXJG6BhP5GNzGYYdv2zJfNmVUsYUYVC3ezppuE1Yp2Sscq5t
 K43eAOq0c9+TTeLfdDKdi07QtevMHwbOxjNUhwnzN5+yD9cknK1Ht5FVjDc8eYTG
 yBJPgMy9qhNqEZ4gl9zpKJkpAJMOhquT4ZtytNioIfnmKCrpyiOy+yApvW5WiU1d
 /8KMiG42C47rUBYCiEGEdn6PaTOpuOK0hnuoxlxE07dqJP349SbozFTqZxfbCJhI
 OjZSdLTolPFq82zuAafBJmijmeAiYdI0FKrhXmQo4doeleIQG9z0h8YpkQSnF47L
 xsj7QswD0LxtMQbYv5zds1NoGnUn/U+TpoJ8mbpg5gPWKaBTNdTZFJQj+HXtrDyy
 1ouR41u6kOoLLDHu2I5f9dTMptJRg3jS+kQfDv17K8u5mAmcwFidpOiyac6m2xad
 Nnn85NAE9JaxpzVe3+S3uQdscEe2FBWEZCkbrCcPn2T88BZwrfcby02YRBw22WkS
 p4xPAOncw4lhZ6z3pedcH/1+kk6uoTube5QK8ddSRxYrRavtOsqchAaec+hYRNif
 CbjJOqsdbDLa7ANLxw==
 =/3W8
 -----END PGP SIGNATURE-----

Merge tag 'devfreq-next-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux

Pull devfreq changes for 5.19-rc1 from Chanwoo Choi:

"1. Update devfreq core

 - Add cpu based scaling support to passive governor. Some device like
   cache might require the dynamic frequency scaling. But, it has very
   tightly to cpu frequency. So that use passive governor to scale
   the frequency according to current cpu frequency.

  To decide the frequency of the device, the governor does one of the following:
  : Derives the optimal devfreq device opp from required-opps property of
    the parent cpu opp_table.

  : Scales the device frequency in proportion to the CPU frequency. So, if
    the CPUs are running at their max frequency, the device runs at its
    max frequency. If the CPUs are running at their min frequency, the
    device runs at its min frequency. It is interpolated for frequencies
    in between.

 2. Update devfreq drivers

  - Update rk3399_dmc.c as following:

   : Convert dt-binding document to YAML and deprecate unused properties.

   : Use Hz units for the device-tree properties of rk3399_dmc.

   : rk3399_dmc is able to set the idle time before changing the dmc clock.
     Specify idle time parameters by using nano-second unit on dt bidning.

   : Add new disable-freq properties to optimize the power-saving feature
     of rk3399_dmc.

   : Disable devfreq-event device on remove() to fix unbalanced
     enable-disable count.

   : Use devm_pm_opp_of_add_table()

   : Block PMU (Power-Management Unit) transitions when scaling frequency
     by ARM Trust Firmware in order to fix the conflict between PMU and DMC
     (Dynamic Memory Controller)."

* tag 'devfreq-next-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: passive: Keep cpufreq_policy for possible cpus
  PM / devfreq: passive: Reduce duplicate code when passive_devfreq case
  PM / devfreq: Add cpu based scaling support to passive governor
  PM / devfreq: Export devfreq_get_freq_range symbol within devfreq
  PM / devfreq: rk3399_dmc: Block PMU during transitions
  soc: rockchip: power-domain: Manage resource conflicts with firmware
  PM / devfreq: rk3399_dmc: Avoid static (reused) profile
  PM / devfreq: rk3399_dmc: Use devm_pm_opp_of_add_table()
  PM / devfreq: rk3399_dmc: Disable edev on remove()
  PM / devfreq: rk3399_dmc: Support new *-ns properties
  PM / devfreq: rk3399_dmc: Support new disable-freq properties
  PM / devfreq: rk3399_dmc: Use bitfield macro definitions for ODT_PD
  PM / devfreq: rk3399_dmc: Drop excess timing properties
  PM / devfreq: rk3399_dmc: Drop undocumented ondemand DT props
  dt-bindings: devfreq: rk3399_dmc: Add more disable-freq properties
  dt-bindings: devfreq: rk3399_dmc: Specify idle params in nanoseconds
  dt-bindings: devfreq: rk3399_dmc: Fix Hz units
  dt-bindings: devfreq: rk3399_dmc: Deprecate unused/redundant properties
  dt-bindings: devfreq: rk3399_dmc: Convert to YAML
2022-05-18 20:55:34 +02:00
Haowen Bai
5a66bfb277 thermal: intel: hfi: remove NULL check after container_of() call
container_of() will never return NULL, so remove useless code.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-18 20:53:05 +02:00
Pavel Begunkov
0184f08e65 io_uring: add fully sparse buffer registration
Honour IORING_RSRC_REGISTER_SPARSE not only for direct files but fixed
buffers as well. It makes the rsrc API more consistent.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/66f429e4912fe39fb3318217ff33a2853d4544be.1652879898.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-18 12:53:04 -06:00
Zhang Rui
f125bdbdd6 powercap: intel_rapl: add support for ALDERLAKE_N
Add ALDERLAKE_N to the list of supported processor models in the Intel
RAPL power capping driver.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-18 20:40:50 +02:00
Lai Jiangshan
c42b145181 x86/sev: Annotate stack change in the #VC handler
In idtentry_vc(), vc_switch_off_ist() determines a safe stack to
switch to, off of the IST stack. Annotate the new stack switch with
ENCODE_FRAME_POINTER in case UNWINDER_FRAME_POINTER is used.

A stack walk before looks like this:

  CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc7+ #2
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Call Trace:
   <TASK>
   dump_stack_lvl
   dump_stack
   kernel_exc_vmm_communication
   asm_exc_vmm_communication
   ? native_read_msr
   ? __x2apic_disable.part.0
   ? x2apic_setup
   ? cpu_init
   ? trap_init
   ? start_kernel
   ? x86_64_start_reservations
   ? x86_64_start_kernel
   ? secondary_startup_64_no_verify
   </TASK>

and with the fix, the stack dump is exact:

  CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc7+ #3
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Call Trace:
   <TASK>
   dump_stack_lvl
   dump_stack
   kernel_exc_vmm_communication
   asm_exc_vmm_communication
  RIP: 0010:native_read_msr
  Code: ...
  < snipped regs >
   ? __x2apic_disable.part.0
   x2apic_setup
   cpu_init
   trap_init
   start_kernel
   x86_64_start_reservations
   x86_64_start_kernel
   secondary_startup_64_no_verify
   </TASK>

  [ bp: Test in a SEV-ES guest and rewrite the commit message to
    explain what exactly this does. ]

Fixes: a13644f3a5 ("x86/entry/64: Add entry code for #VC handler")
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220316041612.71357-1-jiangshanlai@gmail.com
2022-05-18 20:36:03 +02:00
Hangyu Hua
947a844bb3 drm: msm: fix possible memory leak in mdp5_crtc_cursor_set()
drm_gem_object_lookup will call drm_gem_object_get inside. So cursor_bo
needs to be put when msm_gem_get_and_pin_iova fails.

Fixes: e172d10a9c ("drm/msm/mdp5: Add hardware cursor support")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20220509061125.18585-1-hbh25y@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-18 11:05:21 -07:00
Zhang Jianhua
b0487ede1f fs-verity: remove unused parameter desc_size in fsverity_create_info()
The parameter desc_size in fsverity_create_info() is useless and it is
not referenced anywhere. The greatest meaning of desc_size here is to
indecate the size of struct fsverity_descriptor and futher calculate the
size of signature. However, the desc->sig_size can do it also and it is
indeed, so remove it.

Therefore, it is no need to acquire desc_size by fsverity_get_descriptor()
in ensure_verity_info(), so remove the parameter desc_ret in
fsverity_get_descriptor() too.

Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20220518132256.2297655-1-chris.zjh@huawei.com
2022-05-18 11:01:31 -07:00
Rob Clark
cec4e5cbb9 drm/msm: Fix fb plane offset calculation
The offset got dropped by accident.

Fixes: d413e6f971 ("drm/msm: Drop msm_gem_iova()")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Stephen Boyd <swboyd@chromium.org> # CoachZ
Link: https://lore.kernel.org/r/20220510165216.3577068-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-18 10:55:47 -07:00
Miaoqian Lin
c56de48309 drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.

a6xx_gmu_init() passes the node to of_find_device_by_node()
and of_dma_configure(), of_find_device_by_node() will takes its
reference, of_dma_configure() doesn't need the node after usage.

Add missing of_node_put() to avoid refcount leak.

Fixes: 4b565ca5a2 ("drm/msm: Add A6XX device support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220512121955.56937-1-linmq006@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-18 10:54:39 -07:00
Douglas Anderson
ec7981e6c6 drm/msm/dsi: don't powerup at modeset time for parade-ps8640
Commit 7d8e9a9050 ("drm/msm/dsi: move DSI host powerup to modeset
time") caused sc7180 Chromebooks that use the parade-ps8640 bridge
chip to fail to turn the display back on after it turns off.

Unfortunately, it doesn't look easy to fix the parade-ps8640 driver to
handle the new power sequence. The Linux driver has almost nothing in
it and most of the logic for this bridge chip is in black-box firmware
that the bridge chip uses.

Also unfortunately, reverting the patch will break "tc358762".

The long term solution here is probably Dave Stevenson's series [1]
that would give more flexibility. However, that is likely not a quick
fix.

For the short term, we'll look at the compatible of the next bridge in
the chain and go back to the old way for the Parade PS8640 bridge
chip. If it's found that other bridge chips also need this workaround
then we can add them to the list or consider inverting the
condition. However, the hope is that the framework will not take too
much longer to land and we won't have to add anything other than
ps8640 here.

[1] https://lore.kernel.org/r/cover.1646406653.git.dave.stevenson@raspberrypi.com

Fixes: 7d8e9a9050 ("drm/msm/dsi: move DSI host powerup to modeset time")
Suggested-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://lore.kernel.org/r/20220513131504.v5.1.Ia196e35ad985059e77b038a41662faae9e26f411@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-18 10:51:42 -07:00
Xiu Jianfeng
29ed17389c cgroup: Make cgroup_debug static
Make cgroup_debug static since it's only used in cgroup.c

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-05-18 06:59:20 -10:00
Christoph Hellwig
354201c53e nvme: add support for TP4084 - Time-to-Ready Enhancements
Add support for using longer timeouts during controller initialization
and letting the controller come up with namespaces that are not ready
for I/O yet.  We skip these not ready namespaces during scanning and
only bring them online once anoter scan is kicked off by the AEN that
is set when the NRDY bit gets set in the  I/O Command Set Independent
Identify Namespace Data Structure.   This asynchronous probing avoids
blocking the kernel boot when controllers take a very long time to
recover after unclean shutdowns (up to minutes).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2022-05-18 18:54:17 +02:00
Al Viro
fb4554c223 Fix double fget() in vhost_net_set_backend()
Descriptor table is a shared resource; two fget() on the same descriptor
may return different struct file references.  get_tap_ptr_ring() is
called after we'd found (and pinned) the socket we'll be using and it
tries to find the private tun/tap data structures associated with it.
Redoing the lookup by the same file descriptor we'd used to get the
socket is racy - we need to same struct file.

Thanks to Jason for spotting a braino in the original variant of patch -
I'd missed the use of fd == -1 for disabling backend, and in that case
we can end up with sock == NULL and sock != oldsock.

Cc: stable@kernel.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-18 12:33:51 -04:00
Eli Cohen
acde392949 vdpa/mlx5: Use consistent RQT size
The current code evaluates RQT size based on the configured number of
virtqueues. This can raise an issue in the following scenario:

Assume MQ was negotiated.
1. mlx5_vdpa_set_map() gets called.
2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
   than the configured max VQs.
3. A second set_map gets called, but now a smaller number of VQs is used
   to evaluate the size of the RQT.
4. handle_ctrl_mq() is called with a value larger than what the RQT can
   hold. This will emit errors and the driver state is compromised.

To fix this, we use a new field in struct mlx5_vdpa_net to hold the
required number of entries in the RQT. This value is evaluated in
mlx5_vdpa_set_driver_features() where we have the negotiated features
all set up.

In addition to that, we take into consideration the max capability of RQT
entries early when the device is added so we don't need to take consider
it when creating the RQT.

Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
max_vas / 2 and make the code clearer.

Fixes: 52893733f2 ("vdpa/mlx5: Add multiqueue support")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-18 12:31:31 -04:00
Daire McNamara
7013654af6 PCI: microchip: Fix potential race in interrupt handling
Clear the MSI bit in ISTATUS_LOCAL register after reading it, but
before reading and handling individual MSI bits from the ISTATUS_MSI
register. This avoids a potential race where new MSI bits may be set
on the ISTATUS_MSI register after it was read and be missed when the
MSI bit in the ISTATUS_LOCAL register is cleared.

ISTATUS_LOCAL is a read/write/clear register; the register's bits
are set when the corresponding interrupt source is activated. Each
source is independent and thus multiple sources may be active
simultaneously. The processor can monitor and clear status
bits. If one or more ISTATUS_LOCAL interrupt sources are active,
the RootPort issues an interrupt towards the processor (on
the AXI domain). Bit 28 of this register reports an MSI has been
received by the RootPort.

ISTATUS_MSI is a read/write/clear register. Bits 31-0 are asserted
when an MSI with message number 31-0 is received by the RootPort.
The processor must monitor and clear these bits.

Effectively, Bit 28 of ISTATUS_LOCAL informs the processor that
an MSI has arrived at the RootPort and ISTATUS_MSI informs the
processor which MSI (in the range 0 - 31) needs handling.

Reported by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/linux-pci/20220127202000.GA126335@bhelgaas/

Link: https://lore.kernel.org/r/20220517141622.145581-1-daire.mcnamara@microchip.com
Fixes: 6f15a9c9f9 ("PCI: microchip: Add Microchip PolarFire PCIe controller driver")
Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2022-05-18 17:14:21 +01:00
Abhishek Sahu
7ab5e10eda vfio/pci: Move the unused device into low power state with runtime PM
Currently, there is very limited power management support
available in the upstream vfio_pci_core based drivers. If there
are no users of the device, then the PCI device will be moved into
D3hot state by writing directly into PCI PM registers. This D3hot
state help in saving power but we can achieve zero power consumption
if we go into the D3cold state. The D3cold state cannot be possible
with native PCI PM. It requires interaction with platform firmware
which is system-specific. To go into low power states (including D3cold),
the runtime PM framework can be used which internally interacts with PCI
and platform firmware and puts the device into the lowest possible
D-States.

This patch registers vfio_pci_core based drivers with the
runtime PM framework.

1. The PCI core framework takes care of most of the runtime PM
   related things. For enabling the runtime PM, the PCI driver needs to
   decrement the usage count and needs to provide 'struct dev_pm_ops'
   at least. The runtime suspend/resume callbacks are optional and needed
   only if we need to do any extra handling. Now there are multiple
   vfio_pci_core based drivers. Instead of assigning the
   'struct dev_pm_ops' in individual parent driver, the vfio_pci_core
   itself assigns the 'struct dev_pm_ops'. There are other drivers where
   the 'struct dev_pm_ops' is being assigned inside core layer
   (For example, wlcore_probe() and some sound based driver, etc.).

2. This patch provides the stub implementation of 'struct dev_pm_ops'.
   The subsequent patch will provide the runtime suspend/resume
   callbacks. All the config state saving, and PCI power management
   related things will be done by PCI core framework itself inside its
   runtime suspend/resume callbacks (pci_pm_runtime_suspend() and
   pci_pm_runtime_resume()).

3. Inside pci_reset_bus(), all the devices in dev_set needs to be
   runtime resumed. vfio_pci_dev_set_pm_runtime_get() will take
   care of the runtime resume and its error handling.

4. Inside vfio_pci_core_disable(), the device usage count always needs
   to be decremented which was incremented in vfio_pci_core_enable().

5. Since the runtime PM framework will provide the same functionality,
   so directly writing into PCI PM config register can be replaced with
   the use of runtime PM routines. Also, the use of runtime PM can help
   us in more power saving.

   In the systems which do not support D3cold,

   With the existing implementation:

   // PCI device
   # cat /sys/bus/pci/devices/0000\:01\:00.0/power_state
   D3hot
   // upstream bridge
   # cat /sys/bus/pci/devices/0000\:00\:01.0/power_state
   D0

   With runtime PM:

   // PCI device
   # cat /sys/bus/pci/devices/0000\:01\:00.0/power_state
   D3hot
   // upstream bridge
   # cat /sys/bus/pci/devices/0000\:00\:01.0/power_state
   D3hot

   So, with runtime PM, the upstream bridge or root port will also go
   into lower power state which is not possible with existing
   implementation.

   In the systems which support D3cold,

   // PCI device
   # cat /sys/bus/pci/devices/0000\:01\:00.0/power_state
   D3hot
   // upstream bridge
   # cat /sys/bus/pci/devices/0000\:00\:01.0/power_state
   D0

   With runtime PM:

   // PCI device
   # cat /sys/bus/pci/devices/0000\:01\:00.0/power_state
   D3cold
   // upstream bridge
   # cat /sys/bus/pci/devices/0000\:00\:01.0/power_state
   D3cold

   So, with runtime PM, both the PCI device and upstream bridge will
   go into D3cold state.

6. If 'disable_idle_d3' module parameter is set, then also the runtime
   PM will be enabled, but in this case, the usage count should not be
   decremented.

7. vfio_pci_dev_set_try_reset() return value is unused now, so this
   function return type can be changed to void.

8. Use the runtime PM API's in vfio_pci_core_sriov_configure().
   The device can be in low power state either with runtime
   power management (when there is no user) or PCI_PM_CTRL register
   write by the user. In both the cases, the PF should be moved to
   D0 state. For preventing any runtime usage mismatch, pci_num_vf()
   has been called explicitly during disable.

Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
Link: https://lore.kernel.org/r/20220518111612.16985-5-abhsahu@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-18 10:00:48 -06:00
Abhishek Sahu
54918c2874 vfio/pci: Virtualize PME related registers bits and initialize to zero
If any PME event will be generated by PCI, then it will be mostly
handled in the host by the root port PME code. For example, in the case
of PCIe, the PME event will be sent to the root port and then the PME
interrupt will be generated. This will be handled in
drivers/pci/pcie/pme.c at the host side. Inside this, the
pci_check_pme_status() will be called where PME_Status and PME_En bits
will be cleared. So, the guest OS which is using vfio-pci device will
not come to know about this PME event.

To handle these PME events inside guests, we need some framework so
that if any PME events will happen, then it needs to be forwarded to
virtual machine monitor. We can virtualize PME related registers bits
and initialize these bits to zero so vfio-pci device user will assume
that it is not capable of asserting the PME# signal from any power state.

Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
Link: https://lore.kernel.org/r/20220518111612.16985-4-abhsahu@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-18 10:00:48 -06:00
Abhishek Sahu
f4162eb1e2 vfio/pci: Change the PF power state to D0 before enabling VFs
According to [PCIe v5 9.6.2] for PF Device Power Management States

 "The PF's power management state (D-state) has global impact on its
  associated VFs. If a VF does not implement the Power Management
  Capability, then it behaves as if it is in an equivalent
  power state of its associated PF.

  If a VF implements the Power Management Capability, the Device behavior
  is undefined if the PF is placed in a lower power state than the VF.
  Software should avoid this situation by placing all VFs in lower power
  state before lowering their associated PF's power state."

From the vfio driver side, user can enable SR-IOV when the PF is in D3hot
state. If VF does not implement the Power Management Capability, then
the VF will be actually in D3hot state and then the VF BAR access will
fail. If VF implements the Power Management Capability, then VF will
assume that its current power state is D0 when the PF is D3hot and
in this case, the behavior is undefined.

To support PF power management, we need to create power management
dependency between PF and its VF's. The runtime power management support
may help with this where power management dependencies are supported
through device links. But till we have such support in place, we can
disallow the PF to go into low power state, if PF has VF enabled.
There can be a case, where user first enables the VF's and then
disables the VF's. If there is no user of PF, then the PF can put into
D3hot state again. But with this patch, the PF will still be in D0
state after disabling VF's since detecting this case inside
vfio_pci_core_sriov_configure() requires access to
struct vfio_device::open_count along with its locks. But the subsequent
patches related to runtime PM will handle this case since runtime PM
maintains its own usage count.

Also, vfio_pci_core_sriov_configure() can be called at any time
(with and without vfio pci device user), so the power state change
and SR-IOV enablement need to be protected with the required locks.

Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
Link: https://lore.kernel.org/r/20220518111612.16985-3-abhsahu@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-18 10:00:48 -06:00
Abhishek Sahu
2b2c651baf vfio/pci: Invalidate mmaps and block the access in D3hot power state
According to [PCIe v5 5.3.1.4.1] for D3hot state

 "Configuration and Message requests are the only TLPs accepted by a
  Function in the D3Hot state. All other received Requests must be
  handled as Unsupported Requests, and all received Completions may
  optionally be handled as Unexpected Completions."

Currently, if the vfio PCI device has been put into D3hot state and if
user makes non-config related read/write request in D3hot state, these
requests will be forwarded to the host and this access may cause
issues on a few systems.

This patch leverages the memory-disable support added in commit
'abafbc551f ("vfio-pci: Invalidate mmaps and block MMIO access on
disabled memory")' to generate page fault on mmap access and
return error for the direct read/write. If the device is D3hot state,
then the error will be returned for MMIO access. The IO access generally
does not make the system unresponsive so the IO access can still happen
in D3hot state. The default value should be returned in this case
without bringing down the complete system.

Also, the power related structure fields need to be protected so
we can use the same 'memory_lock' to protect these fields also.
This protection is mainly needed when user changes the PCI
power state by writing into PCI_PM_CTRL register.
vfio_lock_and_set_power_state() wrapper function will take the
required locks and then it will invoke the vfio_pci_set_power_state().

Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
Link: https://lore.kernel.org/r/20220518111612.16985-2-abhsahu@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-18 09:59:18 -06:00
Linus Torvalds
ef1302160b sound fixes for 5.18
A collection of last-minute HD- an USB-audio quirks in addition to
 a fix for the legacy ISA wavefront driver.
 
 All look small and easy.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmKDXUMOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE+PRw/+Pmfan+rymr86zTZ7kVmCzeVMB3r/1W6dPsMk
 rBYXOpdBrok7ovpgCvjx0WDLCCics/gSZCXg7RVzY/KypKrthPBpiUshe6on82zR
 Bmv9UMDKELuMeUPBHSCXxFDlTD0nzWk5Oz+WQN3hujM8HKBhT6v2jyuiGLMzBYkC
 pjnauvEVf9uNZq33oHyKtTaI43OIkrReG4q7GCpDpqMS4rKlt6hI5BGtUXpTHQsC
 CaR1d6XQcfjGr1peCcz0YZddxqhVOwujDCSyNSAted39HolsQUcIXVITsKYB4WXl
 rrLvUy+zxf8bKYO4bGfS04sXP9t7x0BOWUWeX0ApO97cQ1v95j4d363YFMDw3VE1
 /PIQxt/F+cq5cnhhl0goDTTN6pVA7eWcIck3XKmpqbduUGmWhhTm4WSuTeceK/7s
 Q0gWrX6vecLAFJjj2SBU4Se5rehDwoHbQ1xA4Pp2C1FgQf/47zg09K30UYEC3+iO
 Io8ohJZonQEf3+Y+NjFy72gaJ8mxMTA/kWCUIAm+DTr/4zGo/uC7I/5i0PbmeqVb
 OW26pfMYFcQ0e8MrqolQ9B0EBF+7/JEv/89w7F9CU/aF1VklWqUu0fWbXpHd9W0v
 JzhDzByc5NuUrdlfvwZl40Xd8Mf+LsnTO0nbdB+KqYQFUErmWYZTdzf+8zaC0Vka
 /6hpbMY=
 =9TJZ
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of last-minute HD- an USB-audio quirks in addition to a
  fix for the legacy ISA wavefront driver.

  All look small and easy"

* tag 'sound-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Restore Rane SL-1 quirk
  ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for HP machine
  ALSA: hda/realtek: Add quirk for TongFang devices with pop noise
  ALSA: hda/realtek: Add quirk for the Framework Laptop
  ALSA: wavefront: Proper check of get_user() error
  ALSA: hda/realtek: Add quirk for Dell Latitude 7520
  ALSA: hda - fix unused Realtek function when PM is not enabled
  ALSA: usb-audio: Don't get sample rate for MCT Trigger 5 USB-to-HDMI
2022-05-18 05:53:43 -10:00
Pablo Neira Ayuso
9e539c5b6d netfilter: nf_tables: disable expression reduction infra
Either userspace or kernelspace need to pre-fetch keys inconditionally
before comparisons for this to work. Otherwise, register tracking data
is misleading and it might result in reducing expressions which are not
yet registers.

First expression is also guaranteed to be evaluated always, however,
certain expressions break before writing data to registers, before
comparing the data, leaving the register in undetermined state.

This patch disables this infrastructure by now.

Fixes: b2d306542f ("netfilter: nf_tables: do not reduce read-only expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-18 17:34:26 +02:00
Ritaro Takenaka
2738d9d963 netfilter: flowtable: move dst_check to packet path
Fixes sporadic IPv6 packet loss when flow offloading is enabled.

IPv6 route GC and flowtable GC are not synchronized.
When dst_cache becomes stale and a packet passes through the flow before
the flowtable GC teardowns it, the packet can be dropped.
So, it is necessary to check dst every time in packet path.

Fixes: 227e1e4d0d ("netfilter: nf_flowtable: skip device lookup from interface index")
Signed-off-by: Ritaro Takenaka <ritarot634@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-18 17:34:26 +02:00
Pablo Neira Ayuso
e5eaac2beb netfilter: flowtable: fix TCP flow teardown
This patch addresses three possible problems:

1. ct gc may race to undo the timeout adjustment of the packet path, leaving
   the conntrack entry in place with the internal offload timeout (one day).

2. ct gc removes the ct because the IPS_OFFLOAD_BIT is not set and the CLOSE
   timeout is reached before the flow offload del.

3. tcp ct is always set to ESTABLISHED with a very long timeout
   in flow offload teardown/delete even though the state might be already
   CLOSED. Also as a remark we cannot assume that the FIN or RST packet
   is hitting flow table teardown as the packet might get bumped to the
   slow path in nftables.

This patch resets IPS_OFFLOAD_BIT from flow_offload_teardown(), so
conntrack handles the tcp rst/fin packet which triggers the CLOSE/FIN
state transition.

Moreover, teturn the connection's ownership to conntrack upon teardown
by clearing the offload flag and fixing the established timeout value.
The flow table GC thread will asynchonrnously free the flow table and
hardware offload entries.

Before this patch, the IPS_OFFLOAD_BIT remained set for expired flows on
which is also misleading since the flow is back to classic conntrack
path.

If nf_ct_delete() removes the entry from the conntrack table, then it
calls nf_ct_put() which decrements the refcnt. This is not a problem
because the flowtable holds a reference to the conntrack object from
flow_offload_alloc() path which is released via flow_offload_free().

This patch also updates nft_flow_offload to skip packets in SYN_RECV
state. Since we might miss or bump packets to slow path, we do not know
what will happen there while we are still in SYN_RECV, this patch
postpones offload up to the next packet which also aligns to the
existing behaviour in tc-ct.

flow_offload_teardown() does not reset the existing tcp state from
flow_offload_fixup_tcp() to ESTABLISHED anymore, packets bump to slow
path might have already update the state to CLOSE/FIN.

Joint work with Oz and Sven.

Fixes: 1e5b2471bc ("netfilter: nf_flow_table: teardown flow timeout race")
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-18 17:34:26 +02:00
Eric Biggers
c069db76ed ext4: fix memory leak in parse_apply_sb_mount_options()
If processing the on-disk mount options fails after any memory was
allocated in the ext4_fs_context, e.g. s_qf_names, then this memory is
leaked.  Fix this by calling ext4_fc_free() instead of kfree() directly.

Reproducer:

    mkfs.ext4 -F /dev/vdc
    tune2fs /dev/vdc -E mount_opts=usrjquota=file
    echo clear > /sys/kernel/debug/kmemleak
    mount /dev/vdc /vdc
    echo scan > /sys/kernel/debug/kmemleak
    sleep 5
    echo scan > /sys/kernel/debug/kmemleak
    cat /sys/kernel/debug/kmemleak

Fixes: 7edfd85b1f ("ext4: Completely separate options parsing and sb setup")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220513231605.175121-2-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-18 11:24:22 -04:00
Eric Biggers
cb8435dc8b ext4: reject the 'commit' option on ext2 filesystems
The 'commit' option is only applicable for ext3 and ext4 filesystems,
and has never been accepted by the ext2 filesystem driver, so the ext4
driver shouldn't allow it on ext2 filesystems.

This fixes a failure in xfstest ext4/053.

Fixes: 8dc0aa8cf0 ("ext4: check incompatible mount options while mounting ext2/3")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220510183232.172615-1-ebiggers@kernel.org
2022-05-18 11:24:22 -04:00
Yang Li
b10b6278ae ext4: remove duplicated #include of dax.h in inode.c
Fix following includecheck warning:
./fs/ext4/inode.c: linux/dax.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220504225025.44753-1-yang.lee@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-18 11:24:22 -04:00
Fabiano Rosas
ad55bae7dc KVM: PPC: Book3S HV: Fix vcore_blocked tracepoint
We removed most of the vcore logic from the P9 path but there's still
a tracepoint that tried to dereference vc->runner.

Fixes: ecb6a7207f ("KVM: PPC: Book3S HV P9: Remove most of the vcore logic")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220328215831.320409-1-farosas@linux.ibm.com
2022-05-19 00:44:50 +10:00
Alexey Kardashevskiy
b22af90419 KVM: PPC: Book3s: Remove real mode interrupt controller hcalls handlers
Currently we have 2 sets of interrupt controller hypercalls handlers
for real and virtual modes, this is from POWER8 times when switching
MMU on was considered an expensive operation.

POWER9 however does not have dependent threads and MMU is enabled for
handling hcalls so the XIVE native or XICS-on-XIVE real mode handlers
never execute on real P9 and later CPUs.

This untemplate the handlers and only keeps the real mode handlers for
XICS native (up to POWER8) and remove the rest of dead code. Changes
in functions are mechanical except few missing empty lines to make
checkpatch.pl happy.

The default implemented hcalls list already contains XICS hcalls so
no change there.

This should not cause any behavioral change.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220509071150.181250-1-aik@ozlabs.ru
2022-05-19 00:44:28 +10:00