Commit graph

947024 commits

Author SHA1 Message Date
Meir Lichtinger
896ec97353 RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7
Up to ConnectX-7 UMR is not used when user passes relaxed ordering access
flag. ConnectX-7 supports setting relaxed ordering read/write mkey
attribute by UMR, indicated by new HCA capabilities.

With ConnectX-7 driver uses UMR when user set relaxed ordering access
flag, in contrast to previous silicon models. Specifically it includes
setting relvant flags of mkey context mask in UMR control segment, and
relaxed ordering write and read flags in UMR mkey context segment.

Link: https://lore.kernel.org/r/20200716105248.1423452-4-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-27 11:19:00 -03:00
Meir Lichtinger
2224635938 RDMA/mlx5: Use MLX5_SET macro instead of local structure
Use generic mlx5 structure defined in mlx5_ifc.h to represent ConnectX
device data structures instead of using structure defined specifically for
mlx5_ib module.

Link: https://lore.kernel.org/r/20200716105248.1423452-3-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-27 11:19:00 -03:00
Colton Lewis
cfd97f94d0
spi: correct kernel-doc inconsistency
Silence documentation build warnings by correcting kernel-doc comment
for spi_transfer struct.

Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com>
Link: https://lore.kernel.org/r/20200725050242.279548-1-colton.w.lewis@protonmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:55:22 +01:00
Jonathan Liu
241b888791
spi: sun4i: update max transfer size reported
The spi-sun4i driver already has the ability to do large transfers.
However, the max transfer size reported is still fifo depth - 1.

Update the max transfer size reported to the max value possible.

Fixes: 196737912d ("spi: sun4i: Allow transfers larger than FIFO size")
Signed-off-by: Jonathan Liu <net147@gmail.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200727072328.510798-1-net147@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:55:21 +01:00
Clark Wang
525c9e5a32
spi: imx: enable runtime pm support
Enable runtime pm support for spi-imx driver.

Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Link: https://lore.kernel.org/r/20200727063354.17031-1-xiaoning.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:55:20 +01:00
Huacai Chen
033555f6eb MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
Commit c34b26b98c ("KVM: MIPS: clean up redundant 'kvm_run'
parameters") remove the 'kvm_run' parameter in kvm_mips_complete_mmio_
load(), but forget to update all callers.

Fixes: c34b26b98c ("KVM: MIPS: clean up redundant 'kvm_run' parameters")
Reported-by: kernel test robot <lkp@intel.com>
Cc: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1595154207-9787-1-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 09:53:55 -04:00
Thomas Weißschuh
e33929537b platform/x86: thinkpad_acpi: use standard charge control attribute names
The standard attributes were only introduced after the ones from
thinkpad_acpi in commit 813cab8f39 ("power: supply: core:
Add CHARGE_CONTROL_{START_THRESHOLD,END_THRESHOLD} properties").

The new standard attributes are aliased to their previous names,
preserving backwards compatibility.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-07-27 16:36:24 +03:00
Thomas Weißschuh
c1c04fbcb9 platform/x86: thinkpad_acpi: remove unused defines
They were never used.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-07-27 16:36:23 +03:00
Pi-Hsun Shih
a233547660
platform/chrome: cros_ec: Fix host command for regulator control.
Since the host command number 0x012B conflicts with other EC host
command, add one to all regulator control related host command.

Also fix a wrong alignment on struct and sync the comment with the one
in ChromeOS EC codebase.

Fixes: dff08caf35 ("platform/chrome: cros_ec: Add command for regulator control.")
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Acked-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Link: https://lore.kernel.org/r/20200724080358.619245-1-pihsun@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:34:23 +01:00
Axel Lin
3bda44ffd9
regulator: pca9450: Convert to use module_i2c_driver
Use module_i2c_driver to simplify driver init boilerplate.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20200725014414.1825183-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:34:22 +01:00
Greg Kroah-Hartman
ca63779009 Revert "usb: dwc2: override PHY input signals with usb role switch support"
This reverts commit bc0f0d4a58.

It was not meant to be applied yet.

Cc: Minas Harutyunyan <hminas@synopsys.com>
Cc: Amelie Delaunay <amelie.delaunay@st.com>
Cc: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-27 15:34:15 +02:00
Randy Dunlap
50c8a002bf platform/x86: ISST: drop a duplicated word in isst_if.h
Drop the repeated word "for" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: platform-driver-x86@vger.kernel.org
Cc: Darren Hart <dvhart@infradead.org>
Cc: Andy Shevchenko <andy@infradead.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-07-27 16:34:12 +03:00
Greg Kroah-Hartman
09df709cb5 Revert "usb: dwc2: don't use ID/Vbus detection if usb-role-switch on STM32MP15 SoCs"
This reverts commit 916f8b6272.

This was not meant to be applied as-is at the moment.

Cc: Minas Harutyunyan <hminas@synopsys.com>
Cc: Amelie Delaunay <amelie.delaunay@st.com>
Cc: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-27 15:33:12 +02:00
Colin Ian King
eb27e5a314 ACPI: APEI: remove redundant assignment to variable rc
The variable rc is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 15:22:58 +02:00
Mark Brown
950039fcb3
Merge series "ASoC: intel: use asoc_substream_to_rtd()" from Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>:
Hi Mark

I have posted "ASoC: add asoc_substream_to_rtd() macro"
patch-set to ALSA SoC ML (= see Link), and then Pierre-Louis
wanted that Intel patch was separated for boards.

The patches which are not for Intel were already accepted.
This is for Intel, and Intel boards.

Link: https://lore.kernel.org/r/87y2nf0yw2.wl-kuninori.morimoto.gx@renesas.com
Link: https://lore.kernel.org/r/877duz0ysw.wl-kuninori.morimoto.gx@renesas.com

Kuninori Morimoto (2):
  ASoC: intel/boards: use asoc_substream_to_rtd()
  ASoC: intel: use asoc_substream_to_rtd()

 sound/soc/intel/atom/sst-mfld-platform-pcm.c     |  6 +++---
 sound/soc/intel/baytrail/sst-baytrail-pcm.c      | 16 ++++++++--------
 sound/soc/intel/boards/bdw-rt5650.c              |  2 +-
 sound/soc/intel/boards/bdw-rt5677.c              |  4 ++--
 sound/soc/intel/boards/broadwell.c               |  2 +-
 sound/soc/intel/boards/bxt_rt298.c               |  2 +-
 sound/soc/intel/boards/byt-rt5640.c              |  2 +-
 sound/soc/intel/boards/bytcht_da7213.c           |  4 ++--
 sound/soc/intel/boards/bytcr_rt5640.c            |  2 +-
 sound/soc/intel/boards/bytcr_rt5651.c            |  2 +-
 sound/soc/intel/boards/cht_bsw_max98090_ti.c     |  2 +-
 sound/soc/intel/boards/cht_bsw_nau8824.c         |  2 +-
 sound/soc/intel/boards/cht_bsw_rt5645.c          |  2 +-
 sound/soc/intel/boards/cht_bsw_rt5672.c          |  2 +-
 sound/soc/intel/boards/cml_rt1011_rt5682.c       |  4 ++--
 sound/soc/intel/boards/ehl_rt5660.c              |  2 +-
 sound/soc/intel/boards/glk_rt5682_max98357a.c    |  2 +-
 sound/soc/intel/boards/haswell.c                 |  2 +-
 sound/soc/intel/boards/kbl_da7219_max98927.c     |  8 ++++----
 sound/soc/intel/boards/kbl_rt5660.c              |  2 +-
 sound/soc/intel/boards/kbl_rt5663_max98927.c     |  4 ++--
 .../intel/boards/kbl_rt5663_rt5514_max98927.c    |  4 ++--
 sound/soc/intel/boards/skl_nau88l25_max98357a.c  |  2 +-
 sound/soc/intel/boards/skl_nau88l25_ssm4567.c    |  2 +-
 sound/soc/intel/boards/skl_rt286.c               |  2 +-
 sound/soc/intel/boards/sof_da7219_max98373.c     |  2 +-
 sound/soc/intel/boards/sof_maxim_common.c        |  4 ++--
 sound/soc/intel/boards/sof_pcm512x.c             |  4 ++--
 sound/soc/intel/boards/sof_rt5682.c              |  4 ++--
 sound/soc/intel/boards/sof_sdw_rt1308.c          |  2 +-
 sound/soc/intel/boards/sof_wm8804.c              |  2 +-
 sound/soc/intel/haswell/sst-haswell-pcm.c        | 12 ++++++------
 sound/soc/intel/keembay/kmb_platform.c           |  2 +-
 sound/soc/intel/skylake/skl-pcm.c                |  8 ++++----
 34 files changed, 62 insertions(+), 62 deletions(-)

--
2.25.1
2020-07-27 14:21:10 +01:00
Stephan Gerhold
34facb0422
ASoC: dt-bindings: q6asm: Add Q6ASM_DAI_{TX_RX, TX, RX} defines
Right now the direction of a DAI has to be specified as a literal
number in the device tree, e.g.:

	dai@0 {
		reg = <0>;
		direction = <2>;
	};

but this does not make it immediately clear that this is a
playback/RX-only DAI.

Actually, q6asm-dai.c has useful defines for this. Move them to the
dt-bindings header to allow using them in the dts(i) files.
The example above then becomes:

	dai@0 {
		reg = <0>;
		direction = <Q6ASM_DAI_RX>;
	};

which is immediately recognizable as playback/RX-only DAI.

Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200727082502.2341-1-stephan@gerhold.net
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:21:09 +01:00
Akshu Agrawal
1255296cf0
ASoC: AMD: Restore PME_EN state at Power On
PME_EN state needs to restored to the value set by fmw.
For the devices which are not using I2S wake event which gets
enabled by PME_EN bit, keeping PME_EN enabled burns considerable amount
of power as it blocks low power state.
For the devices using I2S wake event, PME_EN gets enabled in fmw and the
state should be maintained after ACP Power On.

Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Link: https://lore.kernel.org/r/20200724195600.11798-1-akshu.agrawal@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:21:08 +01:00
Hanjun Guo
5b1e80204d ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check
acpi_map_pxm_to_node() will never return a NUMA node greater than
MAX_NUMNODES, so the 'node >= MAX_NUMNODES' check is not needed.
Remove it.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 15:19:12 +02:00
Hanjun Guo
1c60f91c31 ACPI: NUMA: Remove the useless sub table pointer check
In acpi_parse_entries_array(), the subtable entries (entry.hdr)
will never be NULL, so for ACPI subtable handler in struct
acpi_subtable_proc, will never handle NULL subtable entries.

Remove those useless subtable pointer checks in the callback
handlers.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 15:19:12 +02:00
Hanjun Guo
24194a7e03 ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array()
acpi_disabled, pointer id and table_header are checked in
acpi_table_parse_entries_array(), and acpi_parse_entries_array() is
only called by acpi_table_parse_entries_array(), so those checks in
acpi_parse_entries_array() are duplicate.

Remove those duplicated checks and move the table_size check to
acpi_table_parse_entries_array() as well.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 15:19:12 +02:00
peterz@infradead.org
ed00495333 locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
Prior to commit:

  859d069ee1 ("lockdep: Prepare for NMI IRQ state tracking")

IRQ state tracking was disabled in NMIs due to nmi_enter()
doing lockdep_off() -- with the obvious requirement that NMI entry
call nmi_enter() before trace_hardirqs_off().

[ AFAICT, PowerPC and SH violate this order on their NMI entry ]

However, that commit explicitly changed lockdep_hardirqs_*() to ignore
lockdep_off() and breaks every architecture that has irq-tracing in
it's NMI entry that hasn't been fixed up (x86 being the only fixed one
at this point).

The reason for this change is that by ignoring lockdep_off() we can:

  - get rid of 'current->lockdep_recursion' in lockdep_assert_irqs*()
    which was going to to give header-recursion issues with the
    seqlock rework.

  - allow these lockdep_assert_*() macros to function in NMI context.

Restore the previous state of things and allow an architecture to
opt-in to the NMI IRQ tracking support, however instead of relying on
lockdep_off(), rely on in_nmi(), both are part of nmi_enter() and so
over-all entry ordering doesn't need to change.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200727124852.GK119549@hirez.programming.kicks-ass.net
2020-07-27 15:13:29 +02:00
Paolo Bonzini
5e105c88ab KVM: nVMX: check for invalid hdr.vmx.flags
hdr.vmx.flags is meant for future extensions to the ABI, rejecting
invalid flags is necessary to avoid broken half-loads of the
nVMX state.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 09:04:50 -04:00
Paolo Bonzini
0f02bd0ade KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE
A missing VMCS12 was not causing -EINVAL (it was just read with
copy_from_user, so it is not a security issue, but it is still
wrong).  Test for VMCS12 validity and reject the nested state
if a VMCS12 is required but not present.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 09:04:49 -04:00
Paolo Bonzini
9319676595 selftests: kvm: do not set guest mode flag
Setting KVM_STATE_NESTED_GUEST_MODE enables various consistency checks
on VMCS12 and therefore causes KVM_SET_NESTED_STATE to fail spuriously
with -EINVAL.  Do not set the flag so that we're sure to cover the
conditions included by the test, and cover the case where VMCS12 is
set and KVM_SET_NESTED_STATE is called with invalid VMCS12 contents.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 09:04:48 -04:00
Kuninori Morimoto
2ab9a40966
ASoC: intel: use asoc_substream_to_rtd()
Now we can use asoc_substream_to_rtd() macro,
let's use it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/87tuxtydcz.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:00:23 +01:00
Kuninori Morimoto
2207b93bc7
ASoC: intel/boards: use asoc_substream_to_rtd()
Now we can use asoc_substream_to_rtd() macro,
let's use it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/87v9i9yddc.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-27 14:00:22 +01:00
Bob Moore
2861ba7a0c ACPICA: Update version to 20200717
ACPICA commit c1adb9a2a775df7a85df0103342ebf090e1b2016

Version 20200717.

Link: c1adb9a2
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:55:42 +02:00
Erik Kaneda
6a54ebae6d ACPICA: Do not increment operation_region reference counts for field units
ACPICA commit e17b28cfcc31918d0db9547b6b274b09c413eb70

Object reference counts are used as a part of ACPICA's garbage
collection mechanism. This mechanism keeps track of references to
heap-allocated structures such as the ACPI operand objects.

Recent server firmware has revealed that this reference count can
overflow on large servers that declare many field units under the
same operation_region. This occurs because each field unit declaration
will add a reference count to the source operation_region.

This change solves the reference count overflow for operation_regions
objects by preventing fieldunits from incrementing their
operation_region's reference count. Each operation_region's reference
count will not be changed by named objects declared under the Field
operator. During namespace deletion, the operation_region namespace
node will be deleted and each fieldunit will be deleted without
touching the deleted operation_region object.

Link: e17b28cf
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:55:42 +02:00
Gustavo A. R. Silva
10cfde5dc6 ACPICA: Replace one-element array with flexible-array
ACPICA commit 7ba2f3d91a32f104765961fda0ed78b884ae193d

The current codebase makes use of one-element arrays in the following
form:

struct something {
    int length;
    u8 data[1];
};

struct something *instance;

instance = kmalloc(sizeof(*instance) + size, GFP_KERNEL);
instance->length = size;
memcpy(instance->data, source, size);

but the preferred mechanism to declare variable-length types such as
these ones is a flexible array member[1][2], introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure,
which will help us prevent some kind of undefined behavior bugs from
being inadvertently introduced[3] to the linux codebase from now on.

This issue was found with the help of Coccinelle and audited _manually_.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Link: 7ba2f3d9
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:55:42 +02:00
Alexander A. Klimov
4ce7796632 ACPI: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:47:08 +02:00
Arnd Bergmann
66d3037898 AT91 defconfig for 5.9
- Add ClassD, KSZ ethernet switches, brdige, vlan to sama5_defconfig
  - Reenable CAN support in sama5_defconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEycoQi/giopmpPgB12wIijOdRNOUFAl8d2HEACgkQ2wIijOdR
 NOU7OxAAhlq87INveQUTtU1dry7AQTxXDaQWxmH1x8M4eLPumgvO8m7eqapcqQ1D
 +cNoGqPAz3L95jKnhNlc5MVr8ZNP0i6sllLdluSCuDTMOM547uUSVWP/j/f1nrKR
 x9rkEcgUHgHRVxDigPVN7rk/7A6nSt18sn4uJcuCGqAu5J3oxhXQF7HqD6/5aaSz
 RL97XOtrn+9FeVAYQPD/mIv1o254AgBeymC5SwXVclHoH+NGKmsthxhMTRxMgYzl
 O2kyMGqpSo96CVkBraNRh0/XsxVILgwr/HUhY5ZxxKyHZ3wNsqLFbpYMY70QuwBi
 IRNefNaczzaknr2VPovr8JReWmSgVOhBnaxzlQVaZLaQ1MJ2BwAFrZIxUMfLeMQk
 X2DESL8sZXyux9fOQgynnKpfJCVIg+tqxdG1/kAxdQccb1w4SCnEQrXYpfA4QOWQ
 U2RSO8P+FuRwsTuCW4R/4HRSZfyNMmKYLkBGjBWWsB3dnobqdZTR34LpBdWokgNZ
 1NmFEv4NU5MDMiO18s3NEQmpfaLYnMno6GoVdEJkvlhnoABsAm4K46+QI5N2Rr4r
 B8hmrnCxI8Z8EELCIhynzYMyj9RCboWy8K8ZWSwYBu5ZvC7BF55SA2NorBRog1Q7
 cdDwZUHaPsJL2Te1GTzoujtGhbq3Lv9mvgg/guxL/NGuGpR5puU=
 =SguR
 -----END PGP SIGNATURE-----

Merge tag 'at91-defconfig-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/defconfig

AT91 defconfig for 5.9

 - Add ClassD, KSZ ethernet switches, brdige, vlan to sama5_defconfig
 - Reenable CAN support in sama5_defconfig

* tag 'at91-defconfig-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
  ARM: configs: at91: sama5: enable CAN PLATFORM driver
  ARM: configs: at91: sama5: enable bridge and VLAN filtering
  ARM: configs: at91: sama5: add support for KSZ ethernet switches
  ARM: configs: at91: sama5: Enable CLASSD

Link: https://lore.kernel.org/r/20200726192810.GA181818@piout.net
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-07-27 14:35:28 +02:00
Arnd Bergmann
9c52a2647a SOC: TI Keystone driver update for v5.9
- TI K3 Ring Accelerator updates
  - Few non critical warining fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJfHJ+3AAoJEHJsHOdBp5c/qgAP/RxwxIgRI2iXG+tqgotdvTfm
 wgXZOW18nAyNAkGeuoWmFV1eaEx7mYtn58wCVU+lX+7Hn3JlecGCCwLJlwQzxt88
 i0sCFHFYlDYecaPojEwB3/HFh/ADNIsos52laWgBA5P9wtxBDCK9/kJCn1GBIfTm
 ExXU/VLZffS3u66WYkSgSM+bDRgkB0ZB1h8oNebZ7/iDPSPUGw31/4Gzs0iQbghc
 Wf1ooOqyLBG+yVA1wzePmyCkmGaACJ049+AdNJHQ9rOoYVJ8RDOgY94+AWfogJfd
 SsuKbuchxAieSdaL8EcU8wZZYEuyKKqRZKxgLifMfcsYizrP93nKHTktgAcyvjp+
 JPBfQVvB85MrEDI8nxbuCU1Pu7L0v76ujsfIoC5RHFX+x/8W5IOKBwmpHcFPxWoe
 RLGzOyzytCJ29sQLH7o/15A0jnWUIiquro+4ZavWckXXUNhsPJAPzhfFSH/kpB2K
 47EMK9GX8wPk/HuDCZSjRGI7OYz0JvXL9FHwUgvERcWhpUOS3VyI1+Wj6jRdrTML
 qbd+J0Bg5GzzxgQmUbvyqIv2eqOJUx/PEe3QriO2t5MDOXtwl/UsN3zuxPsRItJw
 FFA2wb5l55Z717FxjdhCFUdCIPUTKauRbNaGrV5dJyAQiWCeZdtc0KbA/e6e2Lpm
 gSUghw3Gu5JdO6T3ghbn
 =Y8Sq
 -----END PGP SIGNATURE-----

Merge tag 'drivers_soc_for_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into arm/drivers

SOC: TI Keystone driver update for v5.9

 - TI K3 Ring Accelerator updates
 - Few non critical warining fixes

* tag 'drivers_soc_for_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
  soc: TI knav_qmss: make symbol 'knav_acc_range_ops' static
  firmware: ti_sci: Replace HTTP links with HTTPS ones
  soc: ti/ti_sci_protocol.h: drop a duplicated word + clarify
  soc: ti: k3: fix semicolon.cocci warnings
  soc: ti: k3-ringacc: fix: warn: variable dereferenced before check 'ring'
  dmaengine: ti: k3-udma: Switch to k3_ringacc_request_rings_pair
  soc: ti: k3-ringacc: separate soc specific initialization
  soc: ti: k3-ringacc: add request pair of rings api.
  soc: ti: k3-ringacc: add ring's flags to dump
  soc: ti: k3-ringacc: Move state tracking variables under a struct
  dt-bindings: soc: ti: k3-ringacc: convert bindings to json-schema

Link: https://lore.kernel.org/r/1595711814-7015-1-git-send-email-santosh.shilimkar@oracle.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-07-27 14:24:51 +02:00
Sumeet Pawnikar
8365a898fe powercap: Add Power Limit4 support
Modern Intel Mobile platforms support power limit4 (PL4), which is
the SoC package level maximum power limit (in Watts). It can be used
to preemptively limits potential SoC power to prevent power spikes
from tripping the power adapter and battery over-current protection.
This patch enables this feature by exposing package level peak power
capping control to userspace via RAPL sysfs interface. With this,
application like DTPF can modify PL4 power limit, the similar way
of other package power limit (PL1).
As this feature is not tested on previous generations, here it is
enabled only for the platform that has been verified to work,
for safety concerns.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Co-developed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:17:36 +02:00
Tiezhu Yang
0585c1c06a ACPI: Use valid link to the ACPI specification
Currently, acpi.info is an invalid link to access ACPI specification,
the new valid link is https://uefi.org/specifications.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:11:22 +02:00
PeiSen Hou
6fa38ef153 ALSA: hda/realtek: Fix add a "ultra_low_power" function for intel reference board (alc256)
Intel requires to enable power saving mode for intel reference board (alc256)

Signed-off-by: PeiSen Hou <pshou@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200727115647.10967-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-07-27 13:58:04 +02:00
Paul Cercueil
02fd86b661 mmc: jz4740: Use pm_ptr() macro
Use the newly introduced pm_ptr() macro to simplify the code.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 13:52:36 +02:00
Paul Cercueil
756a64ce34 PM: Make *_DEV_PM_OPS macros use __maybe_unused
This way, when the dev_pm_ops instance is not referenced anywhere, it
will simply be dropped by the compiler without a warning.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 13:52:36 +02:00
Paul Cercueil
7a82e97a11 PM: core: introduce pm_ptr() macro
This macro is analogous to the infamous of_match_ptr(). If CONFIG_PM
is enabled, this macro will resolve to its argument, otherwise to NULL.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 13:52:36 +02:00
Greg Kroah-Hartman
17a8271658 USB: iowarrior: fix up report size handling for some devices
In previous patches that added support for new iowarrior devices, the
handling of the report size was not done correct.

Fix that up and update the copyright date for the driver

Reworked from an original patch written by Christoph Jung.

Fixes: bab5417f5f ("USB: misc: iowarrior: add support for the 100 device")
Fixes: 5f6f8da2d7 ("USB: misc: iowarrior: add support for the 28 and 28L devices")
Fixes: 461d8deb26 ("USB: misc: iowarrior: add support for 2 OEMed devices")
Cc: stable <stable@kernel.org>
Reported-by: Christoph Jung <jung@codemercs.com>
Link: https://lore.kernel.org/r/20200726094939.1268978-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-27 13:19:58 +02:00
Greg Kroah-Hartman
e98ba8cc3f USB: changes for v5.9 merge window
CDNS3 got several improvements, most of which are non-critical fixes.
 DWC3 has a reset fix for the meson platform, while dwc2 has
 improvements for role switch on STM32MP15 SoCs.
 
 Apart from these, we have the usual set of non-critical fixes all over
 the place and support for new Ingenic SoC to their PHY driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElLzh7wn96CXwjh2IzL64meEamQYFAl8etZ0ACgkQzL64meEa
 mQYbWA/+OcnH1oCqG1uD4M4NUzwAX8yM+ObEQ26e1K6r5z7G3kdqKbiOuH7Ksw35
 VeHaP9XuOVtTMF+YBJptINeuwmF+KMJen+So3x5b6JOdTHdVsu/RbSBmZUPhHeUU
 rUJbJlDPKUcms7IsPGKtNaKkqcBDvVUgh3LvqB7UOjMRHemf7rXp3Mp6Jvx1GrhJ
 C5eUqpNsPWd3JykYCfsyxGN1+crkSDRZ6EEBKRw2CLq0IaRLesK0SRX91vuD/YUV
 XBgLauop1MQr/cJR9OWEaFhOGNTDPvP5ijKJ98JYfGDmT5fxJh+XgYHm0LOqEy9X
 Vjsifox2tFqIhs8a9M4Xuk5CH/wAS07VUaS7zhHvI1k+QbExJPBkT8wFqPRHUVwq
 sfIAy9oHA/rAOnpq4RolBZUBgypKO9qkSIbozJ0aEEMcFWqcC1zvj5+8LBZx/VG5
 WiTlBqGyTaNClaT9FSJqJ3JlRdl7Sm719WpqJ4fUpHhlTXIv0PPhf5ov6a/nOYH4
 qjmF5LlK9Tf96iEoVTS+5EJkos2jyWX2RHNrQBMY+mm9Du2DrmzAE4tNqz8Em/yC
 rUUua8L0n0popcQAQqeLw0GX7xgol+K9smAULXlWLi3+TCVOHpousB3uMCca8cIw
 ERxAL1mPYMUXsravwu517Ky0lDDW429O/7Gk3YyzSAjAx3zn/Sw=
 =xNwh
 -----END PGP SIGNATURE-----

Merge tag 'usb-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next

Felipe writes:

USB: changes for v5.9 merge window

CDNS3 got several improvements, most of which are non-critical fixes.
DWC3 has a reset fix for the meson platform, while dwc2 has
improvements for role switch on STM32MP15 SoCs.

Apart from these, we have the usual set of non-critical fixes all over
the place and support for new Ingenic SoC to their PHY driver.

* tag 'usb-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (38 commits)
  usb: dwc3: gadget: when the started list is empty stop the active xfer
  usb: dwc3: gadget: make starting isoc transfers more robust
  usb: dwc3: gadget: add frame number mask
  usb: gadget: function: printer: Interface is disabled and returns error
  usb: gadget: f_uac2: fix AC Interface Header Descriptor wTotalLength
  dt-bindings: usb: ti,keystone-dwc3.yaml: Improve schema
  usb: bdc: Use devm_clk_get_optional()
  usb: bdc: Halt controller on suspend
  usb: bdc: driver runs out of buffer descriptors on large ADB transfers
  usb: bdc: Adb shows offline after resuming from S2
  bdc: Fix bug causing crash after multiple disconnects
  usb: bdc: Add compatible string for new style USB DT nodes
  dt-bindings: usb: bdc: Update compatible strings
  USB: PHY: JZ4770: Reformat the code to align it.
  USB: PHY: JZ4770: Add support for new Ingenic SoCs.
  USB: PHY: JZ4770: Unify code style and simplify code.
  dt-bindings: USB: Add bindings for new Ingenic SoCs.
  usb: gadget: net2280: fix memory leak on probe error handling paths
  usb: cdns3: drd: simplify *switch_gadet and *switch_host
  usb: cdns3: core: removed overwriting some error code
  ...
2020-07-27 13:16:18 +02:00
Wei Hu
d6af2ed29c PCI: hv: Fix a timing issue which causes kdump to fail occasionally
Kdump could fail sometime on Hyper-V guest because the retry in
hv_pci_enter_d0() releases child device structures in hv_pci_bus_exit().

Although there is a second asynchronous device relations message sending
from the host, if this message arrives to the guest after
hv_send_resource_allocated() is called, the retry would fail.

Fix the problem by moving retry to hv_pci_probe() and start the retry
from hv_pci_query_relations() call.  This will cause a device relations
message to arrive to the guest synchronously; the guest would then be
able to rebuild the child device structures before calling
hv_send_resource_allocated().

Link: https://lore.kernel.org/r/20200727071731.18516-1-weh@microsoft.com
Fixes: c81992e7f4 ("PCI: hv: Retry PCI bus D0 entry on invalid device state")
Signed-off-by: Wei Hu <weh@microsoft.com>
[lorenzo.pieralisi@arm.com: fixed a comment and commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
2020-07-27 12:10:16 +01:00
Todd Brandt
68f9b22842 pm-graph v5.7 - important s2idle fixes
Important fixes:

 - in s2idle, use timekeeping_freeze trace mark instead of
   machine_suspend to denote entry into s2idle mode.

 - in s2idle, use machine_suspend trace mark to create a new virtual
   device called "s2idle_enter_<n>x". It denotes an s2idle_enter call
   loop of <n> iterations where s2idle was never actually achieved.
   It isn't counted as "freeze time" in the header.

 - in s2idle, only show multiple freeze times if s2idle went in and
   out of resume_noirq. Otherwise multiple freezes are shown with
   "waking" time subtracted (waking time is time spent outside s2idle
   dealing with wakeups).

 - in s2idle summaries, include "FREEZEWAKE" as an issue when at
   least 1ms is spent waking from s2idle. A clean run should only
   wake for the rtc timer.

 - add support for device callbacks with matching names in the same
   phase. In rare cases some devices register multiple callbacks from
   separate drivers using the same name. Without this fix only one is
   shown.

 - add kparamsfmt string back to fix bootgraph

General updates:

 - when suspend_machine is missing, error says "failed in
   suspend_machine"

 - extract target count/time and add to summary title if -multi
   used

 - include any instances of "timeout" in dmesg as issues to be
   logged.

 - fix ftrace parse to handle any number of flags (instead of
   just 4).

 - remove sync/async_device string from device detail, remains in
   hover.

 - when using callgraph (-f) add driver name to callgraph titles.

Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 13:03:36 +02:00
Filipe Manana
5e548b3201 btrfs: do not set the full sync flag on the inode during page release
When removing an extent map at try_release_extent_mapping(), called through
the page release callback (btrfs_releasepage()), we always set the full
sync flag on the inode, which forces the next fsync to use a slower code
path.

This hurts performance for workloads that dirty an amount of data that
exceeds or is very close to the system's RAM memory and do frequent fsync
operations (like database servers can for example). In particular if there
are concurrent fsyncs against different files, by falling back to a full
fsync we do a lot more checksum lookups in the checksums btree, as we do
it for all the extents created in the current transaction, instead of only
the new ones since the last fsync. These checksums lookups not only take
some time but, more importantly, they also cause contention on the
checksums btree locks due to the concurrency with checksum insertions in
the btree by ordered extents from other inodes.

We actually don't need to set the full sync flag on the inode, because we
only remove extent maps that are in the list of modified extents if they
were created in a past transaction, in which case an fsync skips them as
it's pointless to log them. So stop setting the full fsync flag on the
inode whenever we remove an extent map.

This patch is part of a patchset that consists of 3 patches, which have
the following subjects:

1/3 btrfs: fix race between page release and a fast fsync
2/3 btrfs: release old extent maps during page release
3/3 btrfs: do not set the full sync flag on the inode during page release

Performance tests were ran against a branch (misc-next) containing the
whole patchset. The test exercises a workload where there are multiple
processes writing to files and fsyncing them (each writing and fsyncing
its own file), and in total the amount of data dirtied ranges from 2x to
4x the system's RAM memory (16GiB), so that the page release callback is
invoked frequently.

The following script, using fio, was used to perform the tests:

  $ cat test-fsync.sh
  #!/bin/bash

  DEV=/dev/sdk
  MNT=/mnt/sdk
  MOUNT_OPTIONS="-o ssd"
  MKFS_OPTIONS="-d single -m single"

  if [ $# -ne 3 ]; then
      echo "Use $0 NUM_JOBS FILE_SIZE FSYNC_FREQ"
      exit 1
  fi

  NUM_JOBS=$1
  FILE_SIZE=$2
  FSYNC_FREQ=$3

  cat <<EOF > /tmp/fio-job.ini
  [writers]
  rw=write
  fsync=$FSYNC_FREQ
  fallocate=none
  group_reporting=1
  direct=0
  bs=64k
  ioengine=sync
  size=$FILE_SIZE
  directory=$MNT
  numjobs=$NUM_JOBS
  thread
  EOF

  echo "Using config:"
  echo
  cat /tmp/fio-job.ini
  echo

  mkfs.btrfs -f $MKFS_OPTIONS $DEV &> /dev/null
  mount $MOUNT_OPTIONS $DEV $MNT
  fio /tmp/fio-job.ini
  umount $MNT

The tests were performed for different numbers of jobs, file sizes and
fsync frequency. A qemu VM using kvm was used, with 8 cores (the host has
12 cores, with cpu governance set to performance mode on all cores), 16GiB
of ram (the host has 64GiB) and using a NVMe device directly (without an
intermediary filesystem in the host). While running the tests, the host
was not used for anything else, to avoid disturbing the tests.

The obtained results were the following, and the last line printed by
fio is pasted (includes aggregated throughput and test run time).

    *****************************************************
    ****     1 job, 32GiB file, fsync frequency 1     ****
    *****************************************************

Before patchset:

WRITE: bw=29.1MiB/s (30.5MB/s), 29.1MiB/s-29.1MiB/s (30.5MB/s-30.5MB/s), io=32.0GiB (34.4GB), run=1127557-1127557msec

After patchset:

WRITE: bw=29.3MiB/s (30.7MB/s), 29.3MiB/s-29.3MiB/s (30.7MB/s-30.7MB/s), io=32.0GiB (34.4GB), run=1119042-1119042msec
(+0.7% throughput, -0.8% run time)

    *****************************************************
    ****     2 jobs, 16GiB files, fsync frequency 1   ****
    *****************************************************

Before patchset:

WRITE: bw=33.5MiB/s (35.1MB/s), 33.5MiB/s-33.5MiB/s (35.1MB/s-35.1MB/s), io=32.0GiB (34.4GB), run=979000-979000msec

After patchset:

WRITE: bw=39.9MiB/s (41.8MB/s), 39.9MiB/s-39.9MiB/s (41.8MB/s-41.8MB/s), io=32.0GiB (34.4GB), run=821283-821283msec
(+19.1% throughput, -16.1% runtime)

    *****************************************************
    ****     4 jobs, 8GiB files, fsync frequency 1    ****
    *****************************************************

Before patchset:

WRITE: bw=52.1MiB/s (54.6MB/s), 52.1MiB/s-52.1MiB/s (54.6MB/s-54.6MB/s), io=32.0GiB (34.4GB), run=629130-629130msec

After patchset:

WRITE: bw=71.8MiB/s (75.3MB/s), 71.8MiB/s-71.8MiB/s (75.3MB/s-75.3MB/s), io=32.0GiB (34.4GB), run=456357-456357msec
(+37.8% throughput, -27.5% runtime)

    *****************************************************
    ****     8 jobs, 4GiB files, fsync frequency 1    ****
    *****************************************************

Before patchset:

WRITE: bw=76.1MiB/s (79.8MB/s), 76.1MiB/s-76.1MiB/s (79.8MB/s-79.8MB/s), io=32.0GiB (34.4GB), run=430708-430708msec

After patchset:

WRITE: bw=133MiB/s (140MB/s), 133MiB/s-133MiB/s (140MB/s-140MB/s), io=32.0GiB (34.4GB), run=245458-245458msec
(+74.7% throughput, -43.0% run time)

    *****************************************************
    ****    16 jobs, 2GiB files, fsync frequency 1    ****
    *****************************************************

Before patchset:

WRITE: bw=74.7MiB/s (78.3MB/s), 74.7MiB/s-74.7MiB/s (78.3MB/s-78.3MB/s), io=32.0GiB (34.4GB), run=438625-438625msec

After patchset:

WRITE: bw=184MiB/s (193MB/s), 184MiB/s-184MiB/s (193MB/s-193MB/s), io=32.0GiB (34.4GB), run=177864-177864msec
(+146.3% throughput, -59.5% run time)

    *****************************************************
    ****    32 jobs, 2GiB files, fsync frequency 1    ****
    *****************************************************

Before patchset:

WRITE: bw=72.6MiB/s (76.1MB/s), 72.6MiB/s-72.6MiB/s (76.1MB/s-76.1MB/s), io=64.0GiB (68.7GB), run=902615-902615msec

After patchset:

WRITE: bw=227MiB/s (238MB/s), 227MiB/s-227MiB/s (238MB/s-238MB/s), io=64.0GiB (68.7GB), run=288936-288936msec
(+212.7% throughput, -68.0% run time)

    *****************************************************
    ****    64 jobs, 1GiB files, fsync frequency 1    ****
    *****************************************************

Before patchset:

WRITE: bw=98.8MiB/s (104MB/s), 98.8MiB/s-98.8MiB/s (104MB/s-104MB/s), io=64.0GiB (68.7GB), run=663126-663126msec

After patchset:

WRITE: bw=294MiB/s (308MB/s), 294MiB/s-294MiB/s (308MB/s-308MB/s), io=64.0GiB (68.7GB), run=222940-222940msec
(+197.6% throughput, -66.4% run time)

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:48 +02:00
Filipe Manana
fbc2bd7e7a btrfs: release old extent maps during page release
When removing an extent map at try_release_extent_mapping(), called through
the page release callback (btrfs_releasepage()), we never release an extent
map that is in the list of modified extents. This is to prevent races with
a concurrent fsync using the fast path, which could lead to not logging an
extent created in the current transaction.

However we can safely remove an extent map created in a past transaction
that is still in the list of modified extents (because no one fsynced yet
the inode after that transaction got commited), because such extents are
skipped during an fsync as it is pointless to log them. This change does
that.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:48 +02:00
Filipe Manana
3d6448e631 btrfs: fix race between page release and a fast fsync
When releasing an extent map, done through the page release callback, we
can race with an ongoing fast fsync and cause the fsync to miss a new
extent and not log it. The steps for this to happen are the following:

1) A page is dirtied for some inode I;

2) Writeback for that page is triggered by a path other than fsync, for
   example by the system due to memory pressure;

3) When the ordered extent for the extent (a single 4K page) finishes,
   we unpin the corresponding extent map and set its generation to N,
   the current transaction's generation;

4) The btrfs_releasepage() callback is invoked by the system due to
   memory pressure for that no longer dirty page of inode I;

5) At the same time, some task calls fsync on inode I, joins transaction
   N, and at btrfs_log_inode() it sees that the inode does not have the
   full sync flag set, so we proceed with a fast fsync. But before we get
   into btrfs_log_changed_extents() and lock the inode's extent map tree:

6) Through btrfs_releasepage() we end up at try_release_extent_mapping()
   and we remove the extent map for the new 4Kb extent, because it is
   neither pinned anymore nor locked. By calling remove_extent_mapping(),
   we remove the extent map from the list of modified extents, since the
   extent map does not have the logging flag set. We unlock the inode's
   extent map tree;

7) The task doing the fast fsync now enters btrfs_log_changed_extents(),
   locks the inode's extent map tree and iterates its list of modified
   extents, which no longer has the 4Kb extent in it, so it does not log
   the extent;

8) The fsync finishes;

9) Before transaction N is committed, a power failure happens. After
   replaying the log, the 4K extent of inode I will be missing, since
   it was not logged due to the race with try_release_extent_mapping().

So fix this by teaching try_release_extent_mapping() to not remove an
extent map if it's still in the list of modified extents.

Fixes: ff44c6e36d ("Btrfs: do not hold the write_lock on the extent tree while logging")
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:47 +02:00
Johannes Thumshirn
88c4703f00 btrfs: open-code remount flag setting in btrfs_remount
When we're (re)mounting a btrfs filesystem we set the
BTRFS_FS_STATE_REMOUNTING state in fs_info to serialize against async
reclaim or defrags.

This flag is set in btrfs_remount_prepare() called by btrfs_remount().
As btrfs_remount_prepare() does nothing but setting this flag and
doesn't have a second caller, we can just open-code the flag setting in
btrfs_remount().

Similarly do for so clearing of the flag by moving it out of
btrfs_remount_cleanup() into btrfs_remount() to be symmetrical.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:47 +02:00
Josef Bacik
162e0a16b7 btrfs: if we're restriping, use the target restripe profile
Previously we depended on some weird behavior in our chunk allocator to
force the allocation of new stripes, so by the time we got to doing the
reduce we would usually already have a chunk with the proper target.

However that behavior causes other problems and needs to be removed.
First however we need to remove this check to only restripe if we
already have those available profiles, because if we're allocating our
first chunk it obviously will not be available.  Simply use the target
as specified, and if that fails it'll be because we're out of space.

Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:47 +02:00
Josef Bacik
349e120ece btrfs: don't adjust bg flags and use default allocation profiles
btrfs/061 has been failing consistently for me recently with a
transaction abort.  We run out of space in the system chunk array, which
means we've allocated way too many system chunks than we need.

Chris added this a long time ago for balance as a poor mans restriping.
If you had a single disk and then added another disk and then did a
balance, update_block_group_flags would then figure out which RAID level
you needed.

Fast forward to today and we have restriping behavior, so we can
explicitly tell the fs that we're trying to change the raid level.  This
is accomplished through the normal get_alloc_profile path.

Furthermore this code actually causes btrfs/061 to fail, because we do
things like mkfs -m dup -d single with multiple devices.  This trips
this check

alloc_flags = update_block_group_flags(fs_info, cache->flags);
if (alloc_flags != cache->flags) {
	ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);

in btrfs_inc_block_group_ro.  Because we're balancing and scrubbing, but
not actually restriping, we keep forcing chunk allocation of RAID1
chunks.  This eventually causes us to run out of system space and the
file system aborts and flips read only.

We don't need this poor mans restriping any more, simply use the normal
get_alloc_profile helper, which will get the correct alloc_flags and
thus make the right decision for chunk allocation.  This keeps us from
allocating a billion system chunks and falling over.

Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:47 +02:00
Josef Bacik
ab0db043c3 btrfs: fix lockdep splat from btrfs_dump_space_info
When running with -o enospc_debug you can get the following splat if one
of the dump_space_info's trip

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc5+ #20 Tainted: G           OE
  ------------------------------------------------------
  dd/563090 is trying to acquire lock:
  ffff9e7dbf4f1e18 (&ctl->tree_lock){+.+.}-{2:2}, at: btrfs_dump_free_space+0x2b/0xa0 [btrfs]

  but task is already holding lock:
  ffff9e7e2284d428 (&cache->lock){+.+.}-{2:2}, at: btrfs_dump_space_info+0xaa/0x120 [btrfs]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (&cache->lock){+.+.}-{2:2}:
	 _raw_spin_lock+0x25/0x30
	 btrfs_add_reserved_bytes+0x3c/0x3c0 [btrfs]
	 find_free_extent+0x7ef/0x13b0 [btrfs]
	 btrfs_reserve_extent+0x9b/0x180 [btrfs]
	 btrfs_alloc_tree_block+0xc1/0x340 [btrfs]
	 alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
	 __btrfs_cow_block+0x122/0x530 [btrfs]
	 btrfs_cow_block+0x106/0x210 [btrfs]
	 commit_cowonly_roots+0x55/0x300 [btrfs]
	 btrfs_commit_transaction+0x4ed/0xac0 [btrfs]
	 sync_filesystem+0x74/0x90
	 generic_shutdown_super+0x22/0x100
	 kill_anon_super+0x14/0x30
	 btrfs_kill_super+0x12/0x20 [btrfs]
	 deactivate_locked_super+0x36/0x70
	 cleanup_mnt+0x104/0x160
	 task_work_run+0x5f/0x90
	 __prepare_exit_to_usermode+0x1bd/0x1c0
	 do_syscall_64+0x5e/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #2 (&space_info->lock){+.+.}-{2:2}:
	 _raw_spin_lock+0x25/0x30
	 btrfs_block_rsv_release+0x1a6/0x3f0 [btrfs]
	 btrfs_inode_rsv_release+0x4f/0x170 [btrfs]
	 btrfs_clear_delalloc_extent+0x155/0x480 [btrfs]
	 clear_state_bit+0x81/0x1a0 [btrfs]
	 __clear_extent_bit+0x25c/0x5d0 [btrfs]
	 clear_extent_bit+0x15/0x20 [btrfs]
	 btrfs_invalidatepage+0x2b7/0x3c0 [btrfs]
	 truncate_cleanup_page+0x47/0xe0
	 truncate_inode_pages_range+0x238/0x840
	 truncate_pagecache+0x44/0x60
	 btrfs_setattr+0x202/0x5e0 [btrfs]
	 notify_change+0x33b/0x490
	 do_truncate+0x76/0xd0
	 path_openat+0x687/0xa10
	 do_filp_open+0x91/0x100
	 do_sys_openat2+0x215/0x2d0
	 do_sys_open+0x44/0x80
	 do_syscall_64+0x52/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (&tree->lock#2){+.+.}-{2:2}:
	 _raw_spin_lock+0x25/0x30
	 find_first_extent_bit+0x32/0x150 [btrfs]
	 write_pinned_extent_entries.isra.0+0xc5/0x100 [btrfs]
	 __btrfs_write_out_cache+0x172/0x480 [btrfs]
	 btrfs_write_out_cache+0x7a/0xf0 [btrfs]
	 btrfs_write_dirty_block_groups+0x286/0x3b0 [btrfs]
	 commit_cowonly_roots+0x245/0x300 [btrfs]
	 btrfs_commit_transaction+0x4ed/0xac0 [btrfs]
	 close_ctree+0xf9/0x2f5 [btrfs]
	 generic_shutdown_super+0x6c/0x100
	 kill_anon_super+0x14/0x30
	 btrfs_kill_super+0x12/0x20 [btrfs]
	 deactivate_locked_super+0x36/0x70
	 cleanup_mnt+0x104/0x160
	 task_work_run+0x5f/0x90
	 __prepare_exit_to_usermode+0x1bd/0x1c0
	 do_syscall_64+0x5e/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&ctl->tree_lock){+.+.}-{2:2}:
	 __lock_acquire+0x1240/0x2460
	 lock_acquire+0xab/0x360
	 _raw_spin_lock+0x25/0x30
	 btrfs_dump_free_space+0x2b/0xa0 [btrfs]
	 btrfs_dump_space_info+0xf4/0x120 [btrfs]
	 btrfs_reserve_extent+0x176/0x180 [btrfs]
	 __btrfs_prealloc_file_range+0x145/0x550 [btrfs]
	 cache_save_setup+0x28d/0x3b0 [btrfs]
	 btrfs_start_dirty_block_groups+0x1fc/0x4f0 [btrfs]
	 btrfs_commit_transaction+0xcc/0xac0 [btrfs]
	 btrfs_alloc_data_chunk_ondemand+0x162/0x4c0 [btrfs]
	 btrfs_check_data_free_space+0x4c/0xa0 [btrfs]
	 btrfs_buffered_write.isra.0+0x19b/0x740 [btrfs]
	 btrfs_file_write_iter+0x3cf/0x610 [btrfs]
	 new_sync_write+0x11e/0x1b0
	 vfs_write+0x1c9/0x200
	 ksys_write+0x68/0xe0
	 do_syscall_64+0x52/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &ctl->tree_lock --> &space_info->lock --> &cache->lock

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&cache->lock);
				 lock(&space_info->lock);
				 lock(&cache->lock);
    lock(&ctl->tree_lock);

   *** DEADLOCK ***

  6 locks held by dd/563090:
   #0: ffff9e7e21d18448 (sb_writers#14){.+.+}-{0:0}, at: vfs_write+0x195/0x200
   #1: ffff9e7dd0410ed8 (&sb->s_type->i_mutex_key#19){++++}-{3:3}, at: btrfs_file_write_iter+0x86/0x610 [btrfs]
   #2: ffff9e7e21d18638 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40b/0x5b0 [btrfs]
   #3: ffff9e7e1f05d688 (&cur_trans->cache_write_mutex){+.+.}-{3:3}, at: btrfs_start_dirty_block_groups+0x158/0x4f0 [btrfs]
   #4: ffff9e7e2284ddb8 (&space_info->groups_sem){++++}-{3:3}, at: btrfs_dump_space_info+0x69/0x120 [btrfs]
   #5: ffff9e7e2284d428 (&cache->lock){+.+.}-{2:2}, at: btrfs_dump_space_info+0xaa/0x120 [btrfs]

  stack backtrace:
  CPU: 3 PID: 563090 Comm: dd Tainted: G           OE     5.8.0-rc5+ #20
  Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011
  Call Trace:
   dump_stack+0x96/0xd0
   check_noncircular+0x162/0x180
   __lock_acquire+0x1240/0x2460
   ? wake_up_klogd.part.0+0x30/0x40
   lock_acquire+0xab/0x360
   ? btrfs_dump_free_space+0x2b/0xa0 [btrfs]
   _raw_spin_lock+0x25/0x30
   ? btrfs_dump_free_space+0x2b/0xa0 [btrfs]
   btrfs_dump_free_space+0x2b/0xa0 [btrfs]
   btrfs_dump_space_info+0xf4/0x120 [btrfs]
   btrfs_reserve_extent+0x176/0x180 [btrfs]
   __btrfs_prealloc_file_range+0x145/0x550 [btrfs]
   ? btrfs_qgroup_reserve_data+0x1d/0x60 [btrfs]
   cache_save_setup+0x28d/0x3b0 [btrfs]
   btrfs_start_dirty_block_groups+0x1fc/0x4f0 [btrfs]
   btrfs_commit_transaction+0xcc/0xac0 [btrfs]
   ? start_transaction+0xe0/0x5b0 [btrfs]
   btrfs_alloc_data_chunk_ondemand+0x162/0x4c0 [btrfs]
   btrfs_check_data_free_space+0x4c/0xa0 [btrfs]
   btrfs_buffered_write.isra.0+0x19b/0x740 [btrfs]
   ? ktime_get_coarse_real_ts64+0xa8/0xd0
   ? trace_hardirqs_on+0x1c/0xe0
   btrfs_file_write_iter+0x3cf/0x610 [btrfs]
   new_sync_write+0x11e/0x1b0
   vfs_write+0x1c9/0x200
   ksys_write+0x68/0xe0
   do_syscall_64+0x52/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is because we're holding the block_group->lock while trying to dump
the free space cache.  However we don't need this lock, we just need it
to read the values for the printk, so move the free space cache dumping
outside of the block group lock.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:47 +02:00
Josef Bacik
01d01caf19 btrfs: move the chunk_mutex in btrfs_read_chunk_tree
We are currently getting this lockdep splat in btrfs/161:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc5+ #20 Tainted: G            E
  ------------------------------------------------------
  mount/678048 is trying to acquire lock:
  ffff9b769f15b6e0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: clone_fs_devices+0x4d/0x170 [btrfs]

  but task is already holding lock:
  ffff9b76abdb08d0 (&fs_info->chunk_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x6a/0x800 [btrfs]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
	 __mutex_lock+0x8b/0x8f0
	 btrfs_init_new_device+0x2d2/0x1240 [btrfs]
	 btrfs_ioctl+0x1de/0x2d20 [btrfs]
	 ksys_ioctl+0x87/0xc0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x52/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
	 __lock_acquire+0x1240/0x2460
	 lock_acquire+0xab/0x360
	 __mutex_lock+0x8b/0x8f0
	 clone_fs_devices+0x4d/0x170 [btrfs]
	 btrfs_read_chunk_tree+0x330/0x800 [btrfs]
	 open_ctree+0xb7c/0x18ce [btrfs]
	 btrfs_mount_root.cold+0x13/0xfa [btrfs]
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 fc_mount+0xe/0x40
	 vfs_kern_mount.part.0+0x71/0x90
	 btrfs_mount+0x13b/0x3e0 [btrfs]
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 do_mount+0x7de/0xb30
	 __x64_sys_mount+0x8e/0xd0
	 do_syscall_64+0x52/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&fs_info->chunk_mutex);
				 lock(&fs_devs->device_list_mutex);
				 lock(&fs_info->chunk_mutex);
    lock(&fs_devs->device_list_mutex);

   *** DEADLOCK ***

  3 locks held by mount/678048:
   #0: ffff9b75ff5fb0e0 (&type->s_umount_key#63/1){+.+.}-{3:3}, at: alloc_super+0xb5/0x380
   #1: ffffffffc0c2fbc8 (uuid_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x54/0x800 [btrfs]
   #2: ffff9b76abdb08d0 (&fs_info->chunk_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x6a/0x800 [btrfs]

  stack backtrace:
  CPU: 2 PID: 678048 Comm: mount Tainted: G            E     5.8.0-rc5+ #20
  Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011
  Call Trace:
   dump_stack+0x96/0xd0
   check_noncircular+0x162/0x180
   __lock_acquire+0x1240/0x2460
   ? asm_sysvec_apic_timer_interrupt+0x12/0x20
   lock_acquire+0xab/0x360
   ? clone_fs_devices+0x4d/0x170 [btrfs]
   __mutex_lock+0x8b/0x8f0
   ? clone_fs_devices+0x4d/0x170 [btrfs]
   ? rcu_read_lock_sched_held+0x52/0x60
   ? cpumask_next+0x16/0x20
   ? module_assert_mutex_or_preempt+0x14/0x40
   ? __module_address+0x28/0xf0
   ? clone_fs_devices+0x4d/0x170 [btrfs]
   ? static_obj+0x4f/0x60
   ? lockdep_init_map_waits+0x43/0x200
   ? clone_fs_devices+0x4d/0x170 [btrfs]
   clone_fs_devices+0x4d/0x170 [btrfs]
   btrfs_read_chunk_tree+0x330/0x800 [btrfs]
   open_ctree+0xb7c/0x18ce [btrfs]
   ? super_setup_bdi_name+0x79/0xd0
   btrfs_mount_root.cold+0x13/0xfa [btrfs]
   ? vfs_parse_fs_string+0x84/0xb0
   ? rcu_read_lock_sched_held+0x52/0x60
   ? kfree+0x2b5/0x310
   legacy_get_tree+0x30/0x50
   vfs_get_tree+0x28/0xc0
   fc_mount+0xe/0x40
   vfs_kern_mount.part.0+0x71/0x90
   btrfs_mount+0x13b/0x3e0 [btrfs]
   ? cred_has_capability+0x7c/0x120
   ? rcu_read_lock_sched_held+0x52/0x60
   ? legacy_get_tree+0x30/0x50
   legacy_get_tree+0x30/0x50
   vfs_get_tree+0x28/0xc0
   do_mount+0x7de/0xb30
   ? memdup_user+0x4e/0x90
   __x64_sys_mount+0x8e/0xd0
   do_syscall_64+0x52/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is because btrfs_read_chunk_tree() can come upon DEV_EXTENT's and
then read the device, which takes the device_list_mutex.  The
device_list_mutex needs to be taken before the chunk_mutex, so this is a
problem.  We only really need the chunk mutex around adding the chunk,
so move the mutex around read_one_chunk.

An argument could be made that we don't even need the chunk_mutex here
as it's during mount, and we are protected by various other locks.
However we already have special rules for ->device_list_mutex, and I'd
rather not have another special case for ->chunk_mutex.

CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27 12:55:46 +02:00