My compiler yells:
.../drivers/gpu/drm/msm/adreno/adreno_gpu.c:69:27:
error: '/*' within block comment [-Werror,-Wcomment]
Let's fix.
Fixes: 6a0dea02c2 ("drm/msm: support firmware-name for zap fw (v2)")
Link: https://patchwork.freedesktop.org/patch/348519/
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
This member of struct axp20x_usb_power is not used anywhere.
Remove it.
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
The AC power supply input can be used as a wakeup source. Hook up the
ACIN_PLUGIN IRQ to trigger wakeup based on userspace configuration.
To do this, we must remember the list of IRQs for the life of the
device. To know how much space to allocate for the flexible array
member, we switch from using a NULL sentinel to using an array length.
Because we now depend on the specific order of the IRQs (we assume
ACIN_PLUGIN is first and always present), failing to acquire an IRQ
during probe must be a fatal error.
To avoid spuriously waking up the system when the AC power supply is
not configured as a wakeup source, we must explicitly disable all non-
wake IRQs during system suspend. This is because the SoC's NMI input is
shared among all IRQs on the AXP PMIC. Due to the use of regmap-irq, the
individual IRQs within the PMIC are nested threaded interrupts, and are
therefore not automatically disabled during system suspend.
The upshot is that if any other device within the MFD (such as the power
key) is an enabled wakeup source, all enabled IRQs within the PMIC will
cause wakeup. We still need to call enable_irq_wake() when we *do* want
wakeup, in case those other wakeup sources on the PMIC are all disabled.
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
AXP803/AXP813 have a flag that enables/disables the AC power supply
input. Allow control of this flag via the ONLINE property on those
variants.
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
AXP803/AXP813 have a flag that enables/disables the AC power supply
input. This flag does not affect the status bits in PWR_INPUT_STATUS.
Its effect can be verified by checking the battery charge/discharge
state (bit 2 of PWR_INPUT_STATUS), or by examining the current draw on
the AC input.
Take this flag into account when getting the ONLINE property of the AC
input, on PMICs where this flag is present.
Fixes: 7693b5643f ("power: supply: add AC power supply driver for AXP813")
Cc: stable@vger.kernel.org
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
This uses the new pata-common.yaml schema to
convert the Faraday FTIDE010 to DT schema.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
I need to create subnodes for drives connected to PATA
host controllers, and this needs to be supported generally,
so create a common YAML binding for "ide" that will support
subnodes with ports.
This has been designed as a subset of
ata/ahci-platform.txt with the bare essentials and
should be possible to extend or superset to cover the
common bindings.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
I need to create subnodes for drives connected to SATA
host controllers, and this needs to be supported
generally, so create a common YAML binding for
"sata" that will support subnodes with ports.
This has been designed as a subset of
ata/ahci-platform.txt with the bare essentials and
should be possible to extend or superset to cover the
common bindings.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[robh: fixup sata-port unit-address pattern]
Signed-off-by: Rob Herring <robh@kernel.org>
Currently, we reset the timer after a pre-eemption event. This has the
side-effect that the timeslice runs into the second context after the
first is completed after a normal promotion event, causing the second
context to be swapped out early and switched for a third context. To be
more fair, we want to reset the clock after promotion as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200113214546.1990139-1-chris@chris-wilson.co.uk
Boards with some of the 32-bit SoCs (mostly Meson8 and Meson8m2) use a
Ricoh RN5T618 PMU which acts as system power controller. The driver for
the system power controller may need to the I2C bus just before shutting
down or rebooting the system. At this stage the interrupts may be
disabled already.
Implement the master_xfer_atomic callback so the driver for the RN5T618
PMU can communicate properly with the PMU when shutting down or
rebooting the board. The CTRL register has a status bit which can be
polled to determine when processing has completed. According to the
public S805 datasheet the value 0 means "idle" and 1 means "running".
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
[wsa: converted some whitespace alignment]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
[BUG]
There are several different KASAN reports for balance + snapshot
workloads. Involved call paths include:
should_ignore_root+0x54/0xb0 [btrfs]
build_backref_tree+0x11af/0x2280 [btrfs]
relocate_tree_blocks+0x391/0xb80 [btrfs]
relocate_block_group+0x3e5/0xa00 [btrfs]
btrfs_relocate_block_group+0x240/0x4d0 [btrfs]
btrfs_relocate_chunk+0x53/0xf0 [btrfs]
btrfs_balance+0xc91/0x1840 [btrfs]
btrfs_ioctl_balance+0x416/0x4e0 [btrfs]
btrfs_ioctl+0x8af/0x3e60 [btrfs]
do_vfs_ioctl+0x831/0xb10
create_reloc_root+0x9f/0x460 [btrfs]
btrfs_reloc_post_snapshot+0xff/0x6c0 [btrfs]
create_pending_snapshot+0xa9b/0x15f0 [btrfs]
create_pending_snapshots+0x111/0x140 [btrfs]
btrfs_commit_transaction+0x7a6/0x1360 [btrfs]
btrfs_mksubvol+0x915/0x960 [btrfs]
btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs]
btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs]
btrfs_ioctl+0x241b/0x3e60 [btrfs]
do_vfs_ioctl+0x831/0xb10
btrfs_reloc_pre_snapshot+0x85/0xc0 [btrfs]
create_pending_snapshot+0x209/0x15f0 [btrfs]
create_pending_snapshots+0x111/0x140 [btrfs]
btrfs_commit_transaction+0x7a6/0x1360 [btrfs]
btrfs_mksubvol+0x915/0x960 [btrfs]
btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs]
btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs]
btrfs_ioctl+0x241b/0x3e60 [btrfs]
do_vfs_ioctl+0x831/0xb10
[CAUSE]
All these call sites are only relying on root->reloc_root, which can
undergo btrfs_drop_snapshot(), and since we don't have real refcount
based protection to reloc roots, we can reach already dropped reloc
root, triggering KASAN.
[FIX]
To avoid such access to unstable root->reloc_root, we should check
BTRFS_ROOT_DEAD_RELOC_TREE bit first.
This patch introduces wrappers that provide the correct way to check the
bit with memory barriers protection.
Most callers don't distinguish merged reloc tree and no reloc tree. The
only exception is should_ignore_root(), as merged reloc tree can be
ignored, while no reloc tree shouldn't.
[CRITICAL SECTION ANALYSIS]
Although test_bit()/set_bit()/clear_bit() doesn't imply a barrier, the
DEAD_RELOC_TREE bit has extra help from transaction as a higher level
barrier, the lifespan of root::reloc_root and DEAD_RELOC_TREE bit are:
NULL: reloc_root is NULL PTR: reloc_root is not NULL
0: DEAD_RELOC_ROOT bit not set DEAD: DEAD_RELOC_ROOT bit set
(NULL, 0) Initial state __
| /\ Section A
btrfs_init_reloc_root() \/
| __
(PTR, 0) reloc_root initialized /\
| |
btrfs_update_reloc_root() | Section B
| |
(PTR, DEAD) reloc_root has been merged \/
| __
=== btrfs_commit_transaction() ====================
| /\
clean_dirty_subvols() |
| | Section C
(NULL, DEAD) reloc_root cleanup starts \/
| __
btrfs_drop_snapshot() /\
| | Section D
(NULL, 0) Back to initial state \/
Every have_reloc_root() or test_bit(DEAD_RELOC_ROOT) caller holds
transaction handle, so none of such caller can cross transaction boundary.
In Section A, every caller just found no DEAD bit, and grab reloc_root.
In the cross section A-B, caller may get no DEAD bit, but since reloc_root
is still completely valid thus accessing reloc_root is completely safe.
No test_bit() caller can cross the boundary of Section B and Section C.
In Section C, every caller found the DEAD bit, so no one will access
reloc_root.
In the cross section C-D, either caller gets the DEAD bit set, avoiding
access reloc_root no matter if it's safe or not. Or caller get the DEAD
bit cleared, then access reloc_root, which is already NULL, nothing will
be wrong.
The memory write barriers are between the reloc_root updates and bit
set/clear, the pairing read side is before test_bit.
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Fixes: d2311e6985 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ barriers ]
Signed-off-by: David Sterba <dsterba@suse.com>
In a recent change to the SPI subsystem [1], a new `delay` struct was added
to replace the `delay_usecs`. This change replaces the current `delay_usecs`
with `delay` for this driver.
The `spi_transfer_delay_exec()` function [in the SPI framework] makes sure
that both `delay_usecs` & `delay` are used (in this order to preserve
backwards compatibility).
[1] commit bebcfd272d ("spi: introduce `delay` field for
`spi_transfer` + spi_transfer_delay_exec()")
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Update sampling rate when oversampling ratio is changed
through the IIO ABI.
Signed-off-by: Olivier Moysan <olivier.moysan@st.com>
Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Apply data formatting to single conversion,
as this is already done in continuous and trigger modes.
Fixes: 102afde629 ("iio: adc: stm32-dfsdm: manage data resolution in trigger mode")
Signed-off-by: Olivier Moysan <olivier.moysan@st.com>
Cc: <Stable@vger.kernel.org>
Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Device property API allows to gather device resources from different sources,
such as ACPI. Convert the drivers to unleash the power of device property API.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Since we have access to the struct device_driver and thus to the ID table,
there is no need to supply special parameters to st_sensors_of_name_probe().
Besides that we have a common API to get driver match data, there is
no need to do matching separately for OF and ACPI.
Taking into consideration above, simplify the ST sensors code.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The commit 41c128cb25 ("iio: st_gyro: Add lsm9ds0-gyro support")
assumes that gyro in LSM9DS0 is the same as others with 0xd4 WAI ID,
but datasheet tells slight different story, i.e. the first scale factor
for the chip is 245 dps, and not 250 dps.
Correct this by introducing a separate settings for LSM9DS0.
Fixes: 41c128cb25 ("iio: st_gyro: Add lsm9ds0-gyro support")
Depends-on: 45a4e4220b ("iio: gyro: st_gyro: fix L3GD20H support")
Cc: Leonard Crestez <leonard.crestez@nxp.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
When resuming from hibernation (S4, also known as "suspend to disk") on a
VM, we have seen invalid config space, e.g.,
serial 0000:00:16.3: restoring config space at offset 0x14 (was 0x9104e000, writing 0xffffffff)
To help debug problems like this, log the config space being saved before
suspend, similar to the log in pci_restore_config_dword() when resuming.
Link: https://lore.kernel.org/r/20200113060724.19571-1-yu.c.chen@intel.com
[bhelgaas: commit log]
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Len Brown <lenb@kernel.org>
Add missing return value check in st_lsm6dsx_read_oneshot disabling the
sensor. The issue is reported by coverity with the following error:
Unchecked return value:
If the function returns an error value, the error value may be mistaken
for a normal value.
Addresses-Coverity-ID: 1446733 ("Unchecked return value")
Fixes: b5969abfa8 ("iio: imu: st_lsm6dsx: add motion events")
Fixes: 290a6ce11d ("iio: imu: add support to lsm6dsx driver")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
This fixes some problems that caused build errors in the
lgm-io schema file:
- No "bindings" infix in the schema id
- Move the allOf inclusion for pinconf and pinmux nodes into
the patternProperties for the -pins node
- We want "groups" not "group" to be compulsory for a pinmux
node blended with a pin config node.
- Fix the generic pinmux-schema to list "groups" rather than
"group" for a pinmux node, this might have led to some confusion.
This is a first user of the generic schema so a bit of a bumpy
road.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rahul Tanwar <rahul.tanwar@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
With CONFIG_PROVE_RCU_LIST, I had many suspicious RCU warnings
when I ran ftracetest trigger testcases.
-----
# dmesg -c > /dev/null
# ./ftracetest test.d/trigger
...
# dmesg | grep "RCU-list traversed" | cut -f 2 -d ] | cut -f 2 -d " "
kernel/trace/trace_events_hist.c:6070
kernel/trace/trace_events_hist.c:1760
kernel/trace/trace_events_hist.c:5911
kernel/trace/trace_events_trigger.c:504
kernel/trace/trace_events_hist.c:1810
kernel/trace/trace_events_hist.c:3158
kernel/trace/trace_events_hist.c:3105
kernel/trace/trace_events_hist.c:5518
kernel/trace/trace_events_hist.c:5998
kernel/trace/trace_events_hist.c:6019
kernel/trace/trace_events_hist.c:6044
kernel/trace/trace_events_trigger.c:1500
kernel/trace/trace_events_trigger.c:1540
kernel/trace/trace_events_trigger.c:539
kernel/trace/trace_events_trigger.c:584
-----
I investigated those warnings and found that the RCU-list
traversals in event trigger and hist didn't need to use
RCU version because those were called only under event_mutex.
I also checked other RCU-list traversals related to event
trigger list, and found that most of them were called from
event_hist_trigger_func() or hist_unregister_trigger() or
register/unregister functions except for a few cases.
Replace these unneeded RCU-list traversals with normal list
traversal macro and lockdep_assert_held() to check the
event_mutex is held.
Link: http://lkml.kernel.org/r/157680910305.11685.15110237954275915782.stgit@devnote2
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If 'CONFIG_DRM_I915_CAPTURE_ERROR' not configured, there is an error
when compile the kernel:
drivers/gpu/drm/i915/gt/intel_reset.c:
In function intel_gt_handle_error:
drivers/gpu/drm/i915/gt/intel_reset.c:1233:3:
error: too few arguments to function i915_capture_error_state
i915_capture_error_state(gt->i915);
^~~~~~~~~~~~~~~~~~~~~~~~
In file included
from ./drivers/gpu/drm/i915/i915_drv.h:97:0,
from ./drivers/gpu/drm/i915/display/intel_display_types.h:46,
from drivers/gpu/drm/i915/gt/intel_reset.c:10:
./drivers/gpu/drm/i915/i915_gpu_error.h:267:20: note: declared here
static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
Fixes: 742379c0c4 ("drm/i915: Start chopping up the GPU error capture")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200113081942.15982-1-zhangxiaoxu5@huawei.com
If 'CONFIG_DRM_I915_CAPTURE_ERROR' not configured, there are some
errors like:
drivers/gpu/drm/i915/i915_irq.o:
In function `i915_vma_capture_finish':
./drivers/gpu/drm/i915/i915_gpu_error.h:312:
multiple definition of `i915_vma_capture_finish'
drivers/gpu/drm/i915/i915_drv.o:
./drivers/gpu/drm/i915/i915_gpu_error.h:312: first defined here
So, add 'static inline' on the defineation of the 'i915_vma_capture_finish'
Fixes: d713e3ab93fdc("drm/i915: Correct typo in i915_vma_compress_finish stub")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200113104009.13274-1-zhangxiaoxu5@huawei.com
The following tests:
* Fetch FD, and then compare via kcmp
* Make sure getfd can be blocked by blocking ptrace_may_access
* Making sure fetching bad FDs fails
* Make sure trying to set flags to non-zero results in an EINVAL
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200107175927.4558-5-sargun@sargun.me
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
This wires up the pidfd_getfd syscall for all architectures.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200107175927.4558-4-sargun@sargun.me
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
This syscall allows for the retrieval of file descriptors from other
processes, based on their pidfd. This is possible using ptrace, and
injection of parasitic code to inject code which leverages SCM_RIGHTS
to move file descriptors between a tracee and a tracer. Unfortunately,
ptrace comes with a high cost of requiring the process to be stopped,
and breaks debuggers. This does not require stopping the process under
manipulation.
One reason to use this is to allow sandboxers to take actions on file
descriptors on the behalf of another process. For example, this can be
combined with seccomp-bpf's user notification to do on-demand fd
extraction and take privileged actions. One such privileged action
is binding a socket to a privileged port.
/* prototype */
/* flags is currently reserved and should be set to 0 */
int sys_pidfd_getfd(int pidfd, int fd, unsigned int flags);
/* testing */
Ran self-test suite on x86_64
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200107175927.4558-3-sargun@sargun.me
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
This introduces a function which can be used to fetch a file, given an
arbitrary task. As long as the user holds a reference (refcnt) to the
task_struct it is safe to call, and will either return NULL on failure,
or a pointer to the file, with a refcnt.
This patch is based on Oleg Nesterov's (cf. [1]) patch from September
2018.
[1]: Link: https://lore.kernel.org/r/20180915160423.GA31461@redhat.com
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200107175927.4558-2-sargun@sargun.me
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
The firmware-name property in the zap node can be used to specify a
device specific zap firmware.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
For newer devices we want to require the path to come from the
firmware-name property in the zap-shader dt node.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Since zap firmware can be device specific, allow for a firmware-name
property in the zap node to specify which firmware to load, similarly to
the scheme used for dsp/wifi/etc.
v2: only need a single error msg when we can't load from firmware-name
specified path, and fix comment [Bjorn A.]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Qualcomm Technologies Inc WCD9340/WCD9341 Audio Codec has integrated
gpio controller to control 5 gpios on the chip. This patch adds
required device tree bindings for it.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20200107130844.20763-2-srinivas.kandagatla@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The writer for async_file holds the ucontext_lock, while the readers are
left unlocked. Most readers rely on an implicit locking, either by having
a uobject (which cannot be created before a context) or by holding the
ib_ufile kref.
However ib_uverbs_free_hw_resources() has no implicit lock and has a
possible race. Make this all clear and sane by using READ_ONCE
consistently.
Link: https://lore.kernel.org/r/1578504126-9400-15-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This makes async events aligned with completion events as both are full
uobjects of FD type and use the same uobject lifecycle.
A bunch of duplicate code is consolidated and the general flow between the
two FDs is now very similar.
Link: https://lore.kernel.org/r/1578504126-9400-14-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now that all callers provide a non-NULL attrs the ufile is redundant.
Adjust things so that the context handling is done inside alloc_uobj,
and the ib_uverbs_get_ucontext_file() is avoided if we already have the
context.
Link: https://lore.kernel.org/r/1578504126-9400-13-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This function works on an ib_uverbs_async_file. Accept that as a parameter
instead of the struct ib_uverbs_file.
Consoldiate all the callers working from an ib_uevent_object to a single
function and locate the async_file directly from the struct ib_uobject
instead of using context_ptr.
Link: https://lore.kernel.org/r/1578504126-9400-11-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Any uobject that sends events into the async_event_file should be using
ib_uevent_object so it can use the standard uevent based helper
functions. CQ pushes events into both the async_event and the comp_channel
in an open coded way. Move the async events related stuff to
ib_uevent_object.
Link: https://lore.kernel.org/r/1578504126-9400-6-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is a left over from an earlier version that creates a lot of
complexity for error unwind, particularly for FD uobjects.
The only reason this was done is so that anon_inode_get_file() could be
called with the final fops and a fully setup uobject. Both need to be
setup since unwinding anon_inode_get_file() via fput will call the
driver's release().
Now that the driver does not provide release, we no longer need to worry
about this complicated sequence, simply create the struct file at the
start and allow the core code's release function to deal with the abort
case.
This allows all the confusing error paths around commit to be removed.
Link: https://lore.kernel.org/r/1578504126-9400-5-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
With the new FD structure the async commands do not need to hold any
references while running. The existing mlx5_cmd_exec_cb() and
mlx5_cmd_cleanup_async_ctx() provide enough synchronization to ensure
that all outstanding commands are completed before the uobject can be
destructed.
Remove the now confusing get_file() and the type erasure of the
devx_async_cmd_event_file.
Link: https://lore.kernel.org/r/1578504126-9400-4-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
FD uobjects have a weird split between the struct file and uobject
world. Simplify this to make them pure uobjects and use a generic release
method for all struct file operations.
This fixes the control flow so that mlx5_cmd_cleanup_async_ctx() is always
called before erasing the linked list contents to make the concurrancy
simpler to understand.
For this to work the uobject destruction must fence anything that it is
cleaning up - the design must not rely on struct file lifetime.
Only deliver_event() relies on the struct file to when adding new events
to the queue, add a is_destroyed check under lock to block it.
Link: https://lore.kernel.org/r/1578504126-9400-3-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
dispatch_event_fd() runs from a notifier with minimal locking, and relies
on RCU and a file refcount to keep the uobject and eventfd alive.
As the next patch wants to remove the file_operations release function
from the drivers, re-organize things so that the devx_event_notifier()
path uses the existing RCU to manage the lifetime of the uobject and
eventfd.
Move the refcount puts to a call_rcu so that the objects are guaranteed to
exist and remove the indirect file refcount.
Link: https://lore.kernel.org/r/1578504126-9400-2-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
After device disassociation the uapi_objects are destroyed and freed,
however it is still possible that core code can be holding a kref on the
uobject. When it finally goes to uverbs_uobject_free() via the kref_put()
it can trigger a use-after-free on the uapi_object.
Since needs_kfree_rcu is a micro optimization that only benefits file
uobjects, just get rid of it. There is no harm in using kfree_rcu even if
it isn't required, and the number of involved objects is small.
Link: https://lore.kernel.org/r/20200113143306.GA28717@ziepe.ca
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
We want to specify per-device firmware-name, so move the zap node into
the .dts file for individual boards/devices. This lets us get rid of
the /delete-node/ for cheza, which does not use zap.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20200112195405.1132288-5-robdclark@gmail.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Previously, we call check_and_add_serial when serialization is
enabled for write IO, but it could allocate and free memory
back and forth.
Now, let's just get an element from memory pool with the new
function, then insert node to rb tree if no collision happens.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Since raid1 had already used bucket based mechanism to reduce
the conflict between write IO and resync IO, it is possible to
speed up performance for io serialization with refer to the
same mechanism.
To align with the barrier bucket mechanism, we created arrays
(with the same number of BARRIER_BUCKETS_NR) for spinlock, rb
tree and waitqueue. Then we can reduce lock competition with
multiple spinlocks, boost search performance with multiple rb
trees and also reduce thundering herd problem with multiple
waitqueues.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Obviously, IO serialization could cause the degradation of
performance a lot. In order to reduce the degradation, so a
rb interval tree is added in raid1 to speed up the check of
collision.
So, a rb root is needed in md_rdev, then abstract all the
serialize related members to a new struct (serial_in_rdev),
embed it into md_rdev.
Of course, we need to free the struct if it is not needed
anymore, so rdev/rdevs_uninit_serial are added accordingly.
And they should be called when destroty memory pool or can't
alloc memory.
And we need to consider to call mddev_destroy_serial_pool
in case serialize_policy/write-behind is disabled, bitmap
is destroyed or in __md_stop_writes.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <songliubraving@fb.com>