Commit graph

75139 commits

Author SHA1 Message Date
Daniel Rosenberg
edc440e3d2 fscrypt: improve format of no-key names
When an encrypted directory is listed without the key, the filesystem
must show "no-key names" that uniquely identify directory entries, are
at most 255 (NAME_MAX) bytes long, and don't contain '/' or '\0'.
Currently, for short names the no-key name is the base64 encoding of the
ciphertext filename, while for long names it's the base64 encoding of
the ciphertext filename's dirhash and second-to-last 16-byte block.

This format has the following problems:

- Since it doesn't always include the dirhash, it's incompatible with
  directories that will use a secret-keyed dirhash over the plaintext
  filenames.  In this case, the dirhash won't be computable from the
  ciphertext name without the key, so it instead must be retrieved from
  the directory entry and always included in the no-key name.
  Casefolded encrypted directories will use this type of dirhash.

- It's ambiguous: it's possible to craft two filenames that map to the
  same no-key name, since the method used to abbreviate long filenames
  doesn't use a proper cryptographic hash function.

Solve both these problems by switching to a new no-key name format that
is the base64 encoding of a variable-length structure that contains the
dirhash, up to 149 bytes of the ciphertext filename, and (if any bytes
remain) the SHA-256 of the remaining bytes of the ciphertext filename.

This ensures that each no-key name contains everything needed to find
the directory entry again, contains only legal characters, doesn't
exceed NAME_MAX, is unambiguous unless there's a SHA-256 collision, and
that we only take the performance hit of SHA-256 on very long filenames.

Note: this change does *not* address the existing issue where users can
modify the 'dirhash' part of a no-key name and the filesystem may still
accept the name.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved comments and commit message, fixed checking return value
 of base64_decode(), check for SHA-256 error, continue to set disk_name
 for short names to keep matching simpler, and many other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-22 14:50:03 -08:00
Daniel Rosenberg
aa408f835d fscrypt: derive dirhash key for casefolded directories
When we allow indexed directories to use both encryption and
casefolding, for the dirhash we can't just hash the ciphertext filenames
that are stored on-disk (as is done currently) because the dirhash must
be case insensitive, but the stored names are case-preserving.  Nor can
we hash the plaintext names with an unkeyed hash (or a hash keyed with a
value stored on-disk like ext4's s_hash_seed), since that would leak
information about the names that encryption is meant to protect.

Instead, if we can accept a dirhash that's only computable when the
fscrypt key is available, we can hash the plaintext names with a keyed
hash using a secret key derived from the directory's fscrypt master key.
We'll use SipHash-2-4 for this purpose.

Prepare for this by deriving a SipHash key for each casefolded encrypted
directory.  Make sure to handle deriving the key not only when setting
up the directory's fscrypt_info, but also in the case where the casefold
flag is enabled after the fscrypt_info was already set up.  (We could
just always derive the key regardless of casefolding, but that would
introduce unnecessary overhead for people not using casefolding.)

Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved commit message, updated fscrypt.rst, squashed with change
 that avoids unnecessarily deriving the key, and many other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-22 14:49:55 -08:00
Daniel Rosenberg
6e1918cfb2 fscrypt: don't allow v1 policies with casefolding
Casefolded encrypted directories will use a new dirhash method that
requires a secret key.  If the directory uses a v2 encryption policy,
it's easy to derive this key from the master key using HKDF.  However,
v1 encryption policies don't provide a way to derive additional keys.

Therefore, don't allow casefolding on directories that use a v1 policy.
Specifically, make it so that trying to enable casefolding on a
directory that has a v1 policy fails, trying to set a v1 policy on a
casefolded directory fails, and trying to open a casefolded directory
that has a v1 policy (if one somehow exists on-disk) fails.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved commit message, updated fscrypt.rst, and other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-22 14:47:15 -08:00
Alexei Starovoitov
be8704ff07 bpf: Introduce dynamic program extensions
Introduce dynamic program extensions. The users can load additional BPF
functions and replace global functions in previously loaded BPF programs while
these programs are executing.

Global functions are verified individually by the verifier based on their types only.
Hence the global function in the new program which types match older function can
safely replace that corresponding function.

This new function/program is called 'an extension' of old program. At load time
the verifier uses (attach_prog_fd, attach_btf_id) pair to identify the function
to be replaced. The BPF program type is derived from the target program into
extension program. Technically bpf_verifier_ops is copied from target program.
The BPF_PROG_TYPE_EXT program type is a placeholder. It has empty verifier_ops.
The extension program can call the same bpf helper functions as target program.
Single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB and all other program
types. The verifier allows only one level of replacement. Meaning that the
extension program cannot recursively extend an extension. That also means that
the maximum stack size is increasing from 512 to 1024 bytes and maximum
function nesting level from 8 to 16. The programs don't always consume that
much. The stack usage is determined by the number of on-stack variables used by
the program. The verifier could have enforced 512 limit for combined original
plus extension program, but it makes for difficult user experience. The main
use case for extensions is to provide generic mechanism to plug external
programs into policy program or function call chaining.

BPF trampoline is used to track both fentry/fexit and program extensions
because both are using the same nop slot at the beginning of every BPF
function. Attaching fentry/fexit to a function that was replaced is not
allowed. The opposite is true as well. Replacing a function that currently
being analyzed with fentry/fexit is not allowed. The executable page allocated
by BPF trampoline is not used by program extensions. This inefficiency will be
optimized in future patches.

Function by function verification of global function supports scalars and
pointer to context only. Hence program extensions are supported for such class
of global functions only. In the future the verifier will be extended with
support to pointers to structures, arrays with sizes, etc.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-2-ast@kernel.org
2020-01-22 23:04:52 +01:00
Florent Revest
6beea7afcc ima: add the ability to query the cached hash of a given file
This allows other parts of the kernel (perhaps a stacked LSM allowing
system monitoring, eg. the proposed KRSI LSM [1]) to retrieve the hash
of a given file from IMA if it's present in the iint cache.

It's true that the existence of the hash means that it's also in the
audit logs or in /sys/kernel/security/ima/ascii_runtime_measurements,
but it can be difficult to pull that information out for every
subsequent exec. This is especially true if a given host has been up
for a long time and the file was first measured a long time ago.

It should be kept in mind that this function gives access to cached
entries which can be removed, for instance on security_inode_free().

This is based on Peter Moody's patch:
 https://sourceforge.net/p/linux-ima/mailman/message/33036180/

[1] https://lkml.org/lkml/2019/9/10/393

Signed-off-by: Florent Revest <revest@google.com>
Reviewed-by: KP Singh <kpsingh@chromium.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-01-22 15:22:51 -05:00
Ming Lei
11ea68f553 genirq, sched/isolation: Isolate from handling managed interrupts
The affinity of managed interrupts is completely handled in the kernel and
cannot be changed via the /proc/irq/* interfaces from user space. As the
kernel tries to spread out interrupts evenly accross CPUs on x86 to prevent
vector exhaustion, it can happen that a managed interrupt whose affinity
mask contains both isolated and housekeeping CPUs is routed to an isolated
CPU. As a consequence IO submitted on a housekeeping CPU causes interrupts
on the isolated CPU.

Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding
logic in the interrupt affinity selection code.

The subparameter indicates to the interrupt affinity selection logic that
it should try to avoid the above scenario.

This isolation is best effort and only effective if the automatically
assigned interrupt mask of a device queue contains isolated and
housekeeping CPUs. If housekeeping CPUs are online then such interrupts are
directed to the housekeeping CPU so that IO submitted on the housekeeping
CPU cannot disturb the isolated CPU.

If a queue's affinity mask contains only isolated CPUs then this parameter
has no effect on the interrupt routing decision, though interrupts are only
happening when tasks running on those isolated CPUs submit IO. IO submitted
on housekeeping CPUs has no influence on those queues.

If the affinity mask contains both housekeeping and isolated CPUs, but none
of the contained housekeeping CPUs is online, then the interrupt is also
routed to an isolated CPU. Interrupts are only delivered when one of the
isolated CPUs in the affinity mask submits IO. If one of the contained
housekeeping CPUs comes online, the CPU hotplug logic migrates the
interrupt automatically back to the upcoming housekeeping CPU. Depending on
the type of interrupt controller, this can require that at least one
interrupt is delivered to the isolated CPU in order to complete the
migration.

[ tglx: Removed unused parameter, added and edited comments/documentation
  	and rephrased the changelog so it contains more details. ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200120091625.17912-1-ming.lei@redhat.com
2020-01-22 16:29:49 +01:00
Marc Zyngier
f4a81f5a85 irqchip/gic-v4.1: Allow direct invalidation of VLPIs
Just like for INVALL, GICv4.1 has grown a VPE-aware INVLPI register.
Let's plumb it in and make use of the DirectLPI code in that case.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-16-maz@kernel.org
2020-01-22 14:22:21 +00:00
Marc Zyngier
b4a4bd0f26 irqchip/gic-v4.1: Add VPE INVALL callback
GICv4.1 redistributors have a VPE-aware INVALL register. Progress!
We can now emulate a guest-requested INVALL without emiting a
VINVALL command.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-14-maz@kernel.org
2020-01-22 14:22:21 +00:00
Marc Zyngier
91bf6395f7 irqchip/gic-v4.1: Add VPE residency callback
Making a VPE resident on GICv4.1 is pretty simple, as it is just a
single write to the local redistributor. We just need extra information
about which groups to enable, which the KVM code will have to provide.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-12-maz@kernel.org
2020-01-22 14:22:20 +00:00
Marc Zyngier
d97c97baa2 irqchip/gic-v4.1: Add mask/unmask doorbell callbacks
masking/unmasking doorbells on GICv4.1 relies on a new INVDB command,
which broadcasts the invalidation to all RDs.

Implement the new command as well as the masking callbacks, and plug
the whole thing into the v4.1 VPE irqchip.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-11-maz@kernel.org
2020-01-22 14:22:20 +00:00
Marc Zyngier
64edfaa9a2 irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP
The ITS VMAPP command gains some new fields with GICv4.1:
- a default doorbell, which allows a single doorbell to be used for
  all the VLPIs routed to a given VPE
- a pointer to the configuration table (instead of having it in a register
  that gets context switched)
- a flag indicating whether this is the first map or the last unmap for
  this particular VPE
- a flag indicating whether the pending table is known to be zeroed, or not

Plumb in the new fields in the VMAPP builder, and add the map/unmap
refcounting so that the ITS can do the right thing.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-7-maz@kernel.org
2020-01-22 14:22:19 +00:00
Marc Zyngier
5e5168461c irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation
GICv4.1 defines a new VPE table that is potentially shared between
both the ITSs and the redistributors, following complicated affinity
rules.

To make things more confusing, the programming of this table at
the redistributor level is reusing the GICv4.0 GICR_VPROPBASER register
for something completely different.

The code flow is somewhat complexified by the need to respect the
affinities required by the HW, meaning that tables can either be
inherited from a previously discovered ITS or redistributor.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-6-maz@kernel.org
2020-01-22 14:22:19 +00:00
Marc Zyngier
f2d834092e irqchip/gic-v3: Add GICv4.1 VPEID size discovery
While GICv4.0 mandates 16 bit worth of VPEIDs, GICv4.1 allows smaller
implementations to be built. Add the required glue to dynamically
compute the limit.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-3-maz@kernel.org
2020-01-22 14:22:19 +00:00
Marc Zyngier
b25319d279 irqchip/gic-v3: Detect GICv4.1 supporting RVPEID
GICv4.1 supports the RVPEID ("Residency per vPE ID"), which allows for
a much efficient way of making virtual CPUs resident (to allow direct
injection of interrupts).

The functionnality needs to be discovered on each and every redistributor
in the system, and disabled if the settings are inconsistent.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-2-maz@kernel.org
2020-01-22 14:22:19 +00:00
Greg Kroah-Hartman
10d3e38c79 interconnect patches for 5.6
Here are the interconnect patches for the 5.6-rc1 merge window.
 
 - New core helper functions for some common functionalities in drivers.
 - Improvements in the information exposed via debugfs.
 - Basic tracepoints support.
 - New interconnect driver for msm8916 platforms.
 - Misc fixes.
 
 Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJeIXe2AAoJEIDQzArG2BZjmToQAMijCGSN8TkzoS1/9w95ML9Q
 vbth57JAjHRoigpIt2nVP9c/oIAqxprIB+ZsXX6pKUtnRVTJo/gviQEw1UCqMp58
 d1v2Yb/b8lyzAbXdiVHbFKjF7pdNRUZDraMZ7HSC3UGD1LIy5cx0YjqwYyWFJEz6
 +9Ko3uDYEGqh3HM4Ybl5X5F019tdYY8TrTpajnKnh9yQDh3SAzw5mEDc7PuldAFn
 jgd0KVx56LRKWpA7SKfCjfsS8h4Ft48yzaUATJdNlcihl09mvNYcJ8ZHrGod/n/B
 cBKSwdaKwK0/LuGHrEM6I4UmbbfTmiNJ9dImtMwo5XapBG+OFBNhcVukJrW5xINe
 D3N9hXdUWSnnQyT+sOm8U3cL70PiueKP/CofhEJ4UxBrlckYfPGCZLLTAi4+dfPS
 T2rTonMGNIBCoqZ7iLDGff8uVQxGfOJkdNfd/UpJ64OLshQIXU+WQfa9PbVzkTly
 5opGJlU4X261S3lxxYCKFAP0QGUyDvrzItSMBegcDvUFFwQN06ISMYdXp0VJh3xk
 vYXV3vokqKJZcU8/YBGlZU8DAlHMxtEETt2B+Ua2tuYMzyHFWd76aQt3q7Id0xRB
 Dp6FoZROyBMgH00FCAdAK25dhL0M2CiUpSjmgDXprZeMX1YJRz24zR4u+fJOmq5g
 XZpJM/HIdRlq9pHHEZ/h
 =5hpf
 -----END PGP SIGNATURE-----

Merge tag 'icc-5.6-rc1' of https://git.linaro.org/people/georgi.djakov/linux into char-misc-next

Georgi writes:

interconnect patches for 5.6

Here are the interconnect patches for the 5.6-rc1 merge window.

- New core helper functions for some common functionalities in drivers.
- Improvements in the information exposed via debugfs.
- Basic tracepoints support.
- New interconnect driver for msm8916 platforms.
- Misc fixes.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>

* tag 'icc-5.6-rc1' of https://git.linaro.org/people/georgi.djakov/linux:
  interconnect: qcom: Add MSM8916 interconnect provider driver
  dt-bindings: interconnect: Add Qualcomm MSM8916 DT bindings
  interconnect: Check for valid path in icc_set_bw()
  interconnect: Print the tag in the debugfs summary
  interconnect: Add interconnect_graph file to debugfs
  interconnect: qcom: Use the standard aggregate function
  interconnect: Add a common standard aggregate function
  interconnect: Add basic tracepoints
  interconnect: Add a name to struct icc_path
  interconnect: Move internal structs into a separate file
  interconnect: qcom: Use the new common helper for node removal
  interconnect: Add a common helper for removing all nodes
2020-01-22 10:04:39 +01:00
Tudor Ambarus
b46f36c05a crypto: atmel-{aes,sha,tdes} - Retire crypto_platform_data
These drivers no longer need it as they are only probed via DT.
crypto_platform_data was allocated but unused, so remove it.
This is a follow up for:
commit 45a536e3a7 ("crypto: atmel-tdes - Retire dma_request_slave_channel_compat()")
commit db28512f48 ("crypto: atmel-sha - Retire dma_request_slave_channel_compat()")
commit 62f72cbdcf ("crypto: atmel-aes - Retire dma_request_slave_channel_compat()")

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:09 +08:00
Greg Kroah-Hartman
fd2d11cc8a Merge 5.5-rc7 into char-misc-next
We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22 09:08:01 +01:00
Greg Kroah-Hartman
c318f074d9 Merge 5.5-rc7 into staging-next
We want the staging fixes in here as well

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22 09:05:34 +01:00
Olof Johansson
88b4750151 arm64: soc: ZynqMP SoC changes for v5.6
- Extend firmware interface for feature checking
 - Use mailbox for communication with firmware for power management
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCXibIfgAKCRDKSWXLKUoM
 ISEiAJ0dOki4FCVTEloO1IIcbgY38SgvlwCfTSN4RQThu5iiG40ODZp4cJMaZmA=
 =jGIz
 -----END PGP SIGNATURE-----

Merge tag 'zynqmp-soc-for-v5.6' of https://github.com/Xilinx/linux-xlnx into arm/drivers

arm64: soc: ZynqMP SoC changes for v5.6

- Extend firmware interface for feature checking
- Use mailbox for communication with firmware for power management

* tag 'zynqmp-soc-for-v5.6' of https://github.com/Xilinx/linux-xlnx:
  drivers: soc: xilinx: Use mailbox IPI callback
  dt-bindings: power: reset: xilinx: Add bindings for ipi mailbox
  drivers: firmware: xilinx: Add support for feature check

Link: https://lore.kernel.org/r/f6fb26f8-b00d-a3e8-bf7d-c7ff2a8483b1@monstr.eu
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-21 15:05:15 -08:00
Greg Kroah-Hartman
dd7d99dc68 Merge 5.5-rc7 into usb-next
We need the USB fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-21 19:36:59 +01:00
Mark Brown
ea87683909
Merge branch 'regmap-5.6' into regmap-next 2020-01-21 17:29:48 +00:00
Jason Gunthorpe
e8b3a426fb Use ODP MRs for kernel ULPs
The following series extends MR creation routines to allow creation of
 user MRs through kernel ULPs as a proxy. The immediate use case is to
 allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
 MRs to be created and such MRs were not possible to create prior this
 series.
 
 The first part of this patchset extends RDMA to have special verb
 ib_reg_user_mr(). The common use case that uses this function is a
 userspace application that allocates memory for HCA access but the
 responsibility to register the memory at the HCA is on an kernel ULP.
 This ULP acts as an agent for the userspace application.
 
 The second part provides advise MR functionality for ULPs. This is
 integral part of ODP flows and used to trigger pagefaults in advance
 to prepare memory before running working set.
 
 The third part is actual user of those in-kernel APIs.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQT1m3YD37UfMCUQBNwp8NhrnBAZsQUCXiVO8AAKCRAp8NhrnBAZ
 scTrAP9gb0d3qv0IOtHw5aGI1DAgjTUn/SzUOnsjDEn7DIoh9gEA2+ZmaEyLXKrl
 +UcZb31auy5P8ueJYokRLhLAyRcOIAg=
 =yaHb
 -----END PGP SIGNATURE-----

Merge tag 'rds-odp-for-5.5' into rdma.git for-next

From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma

Leon Romanovsky says:

====================
Use ODP MRs for kernel ULPs

The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created and such MRs were not possible to create prior this
series.

The first part of this patchset extends RDMA to have special verb
ib_reg_user_mr(). The common use case that uses this function is a
userspace application that allocates memory for HCA access but the
responsibility to register the memory at the HCA is on an kernel ULP.
This ULP acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is
integral part of ODP flows and used to trigger pagefaults in advance
to prepare memory before running working set.

The third part is actual user of those in-kernel APIs.
====================

* tag 'rds-odp-for-5.5':
  net/rds: Use prefetch for On-Demand-Paging MR
  net/rds: Handle ODP mr registration/unregistration
  net/rds: Detect need of On-Demand-Paging memory registration
  RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
  IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
  RDMA/mlx5: Don't fake udata for kernel path
  IB/mlx5: Add ODP WQE handlers for kernel QPs
  IB/core: Add interface to advise_mr for kernel users
  IB/core: Introduce ib_reg_user_mr
  IB: Allow calls to ib_umem_get from kernel ULPs

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-21 09:55:04 -04:00
Milan Pandurov
09cbcef6c6 kvm: Refactor handling of VM debugfs files
We can store reference to kvm_stats_debugfs_item instead of copying
its values to kvm_stat_data.
This allows us to remove duplicated code and usage of temporary
kvm_stat_data inside vm_stat_get et al.

Signed-off-by: Milan Pandurov <milanpa@amazon.de>
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21 14:45:33 +01:00
Arvind Sankar
9167bd9634 sparc/console: kill off obsolete declarations
commit 09d3f3f0e0 ("sparc: Kill PROM console driver.") missed removing
the declarations of the deleted prom_con structure and prom_con_init
function from console.h. Kill them off now.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21 13:28:24 +01:00
David S. Miller
4f2c17e0f3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2020-01-21

1) Add support for TCP encapsulation of IKE and ESP messages,
   as defined by RFC 8229. Patchset from Sabrina Dubroca.

Please note that there is a merge conflict in:

net/unix/af_unix.c

between commit:

3c32da19a8 ("unix: Show number of pending scm files of receive queue in fdinfo")

from the net-next tree and commit:

b50b0580d2 ("net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram")

from the ipsec-next tree.

The conflict can be solved as done in linux-next.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21 12:18:20 +01:00
Heiner Kallweit
bbbf8430af net: phy: add new version of phy_do_ioctl
Add a new version of phy_do_ioctl that doesn't check whether net_device
is running. It will typically be used if suitable drivers attach the
PHY in probe already.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21 10:50:41 +01:00
Heiner Kallweit
3231e5d222 net: phy: rename phy_do_ioctl to phy_do_ioctl_running
We just added phy_do_ioctl, but it turned out that we need another
version of this function that doesn't check whether net_device is
running. So rename phy_do_ioctl to phy_do_ioctl_running.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21 10:50:41 +01:00
Geert Uytterhoeven
c3c431de99 dmaengine: Move dma_get_{,any_}slave_channel() to private dmaengine.h
The functions dma_get_slave_channel() and dma_get_any_slave_channel()
are called from DMA engine drivers only.  Hence move their declarations
from the public header file <linux/dmaengine.h> to the private header
file drivers/dma/dmaengine.h.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200121093311.28639-4-geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 15:05:29 +05:30
Geert Uytterhoeven
71ca5b7823 dmaengine: Remove dma_request_slave_channel_compat() wrapper
At its original introduction, dma_request_slave_channel_compat() used a
wrapper, to accommodate filter functions that modify the mask passed.
Filter functions can no longer modify masks, and the mask parameter was
made const in commit a53e28da57 ("dma: Make the 'mask' parameter
of __dma_request_channel const") consecutively.

Hence remove the wrapper, and rename __dma_request_slave_channel_compat()
to dma_request_slave_channel_compat(), to get rid of one more function
name starting with a double underscore.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200121093311.28639-3-geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 15:05:28 +05:30
Grygorii Strashko
d702419134 dmaengine: ti: k3-udma: Add glue layer for non DMAengine users
Certain users can not use right now the DMAengine API due to missing
features in the core. Prime example is Networking.

These users can use the glue layer interface to avoid misuse of DMAengine
API and when the core gains the needed features they can be converted to
use generic API.

The most prominent features the glue layer clients are depending on:

- most PSI-L native peripheral use extra rflow ranges on a receive channel
   and depending on the peripheral's configuration packets from a single
   free descriptor ring is going to be received to different receive ring
  - it is also possible to have different free descriptor rings per rflow
    and an rflow can also support 4 additional free descriptor ring based
    on the size of the incoming packet
- out of order completion of descriptors on a channel
 - when we have several queues to handle different priority packets the
   descriptors will be completed 'out-of-order'
- the notion of prep_slave_sg is not matching with what the streaming type
   of operation is demanding for networking
- Streaming type of operation
 - Ability to fill the free descriptor ring with descriptors in
   anticipation of incoming traffic and when a packet arrives UDMAP will
   form a packet and gives it to the client driver
 - the descriptors are not backed with exact size data buffers as we don't
   know the size of the packet we will receive, but as a generic pool of
   buffers to be used by the receive channel
- NAPI type of operation (polling instead of interrupt driven transfer)
 - without this we can not sustain gigabit speeds and we need to support NAPI
 - not to limit this to networking, but other high performance operations

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-12-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:06:12 +05:30
Peter Ujfalusi
8c6bb62f6b dmaengine: ti: k3 PSI-L remote endpoint configuration
In K3 architecture the DMA operates within threads. One end of the thread
is UDMAP, the other is on the peripheral side.

The UDMAP channel configuration depends on the needs of the remote
endpoint and it can be differ from peripheral to peripheral.

This patch adds database for am654 and j721e and small API to fetch the
PSI-L endpoint configuration from the database which should only used by
the DMA driver(s).

Another API is added for native peripherals to give possibility to pass new
configuration for the threads they are using, which is needed to be able to
handle changes caused by different firmware loaded for the peripheral for
example.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-9-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:06:12 +05:30
Peter Ujfalusi
69bafc3185 dmaengine: ti: Add cppi5 header for K3 NAVSS/UDMA
The K3 DMA architecture uses CPPI5 (Communications Port Programming
Interface) specified descriptors over PSI-L bus within NAVSS.

The header provides helpers, macros to work with these descriptors in a
consistent way.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-8-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:06:12 +05:30
Peter Ujfalusi
816ebf4844 dmaengine: Add helper function to convert direction value to text
dmaengine_get_direction_text() can be useful when the direction is printed
out. The text is easier to comprehend than the number.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-7-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:06:12 +05:30
Peter Ujfalusi
6755ec06d1 dmaengine: Add support for reporting DMA cached data amount
A DMA hardware can have big cache or FIFO and the amount of data sitting in
the DMA fabric can be an interest for the clients.

For example in audio we want to know the delay in the data flow and in case
the DMA have significantly large FIFO/cache, it can affect the latenc/delay

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-6-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:06:12 +05:30
Peter Ujfalusi
4db8fd32ed dmaengine: Add metadata_ops for dma_async_tx_descriptor
The metadata is best described as side band data or parameters traveling
alongside the data DMAd by the DMA engine. It is data
which is understood by the peripheral and the peripheral driver only, the
DMA engine see it only as data block and it is not interpreting it in any
way.

The metadata can be different per descriptor as it is a parameter for the
data being transferred.

If the DMA supports per descriptor metadata it can implement the attach,
get_ptr/set_len callbacks.

Client drivers must only use either attach or get_ptr/set_len to avoid
misconfiguration.

Client driver can check if a given metadata mode is supported by the
channel during probe time with
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_CLIENT);
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_ENGINE);

and based on this information can use either mode.

Wrappers are also added for the metadata_ops.

To be used in DESC_METADATA_CLIENT mode:
dmaengine_desc_attach_metadata()

To be used in DESC_METADATA_ENGINE mode:
dmaengine_desc_get_metadata_ptr()
dmaengine_desc_set_metadata_len()

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-5-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:06:12 +05:30
Vinod Koul
5fe4beaac2 SOC: TI Keystone Ring Accelerator driver
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
 enable straightforward passing of work between a producer and a consumer.
 There is one RINGACC module per NAVSS on TI AM65x SoCs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJeH1VYAAoJEHJsHOdBp5c/k4cP/RHrk67mB8L6JC7G9H1XWedM
 w4aG80q7u8dAydk9k3wJM1Y4ChXfflR3JYc5KBghrsQiRPF8myh4JEwi3+8jpTT5
 LUOJifaEtR79VuNCHyG9Wgni3LWYnp7HmG4jM1bzcDwLgpf4I8cBwXkCfW0kmyPq
 HAaFLWo1net3TkhSnUkbFUn54vESl569/D6FIfFFK2DPJH/gLUKQ6W8Xnd1uKBaI
 9RpnN6eGA0ZWZkZG+Xu7XA/VfC/AR+wxbCr0l1jHB9+4CrngOEwsTnP1SFxFsD1v
 L7zAT5JguNg13jbELVgTpUX8X5/XBfbOzIrNZpeXnl0fl8p2SU25n8hw8KdYjKCP
 5puO6wye5NnliQs5hLlMTydF77VazAqdBhLz5i1OMa+cuA1zVzm7dyIZBAWbI8IC
 TY86h9eGyGKELjljrYhW8ijARbzu/J3SpO3cSklvL/BfXtlVYen5d0mZxsKBDWOi
 bni1yV37b65IWjkimwzkaVq/sN9jWTAF2a192SkKeEBqnms5jxKDeCRq9qSxpCWv
 ONFROd6WXImDTk/MLgo4EdAq+ProoBFR+YDLModSAv3fZaBFUgi5mU5Xnx1+cAFq
 SL9TguzvXhUv6o6ywNZSsSM/7I2iB5uOMRKjCeGg2x1JN9lyOMzrVlJ8JzzUyKV3
 iiKDJjNdZD0/Rn2fl5a1
 =m1rI
 -----END PGP SIGNATURE-----

Merge TI ringacc driver from Santosh

This is for dependency of new TI ringacc dmaengine drivers

Merge tag 'drivers_soc_for_5.6' into topic/ti

SOC: TI Keystone Ring Accelerator driver

The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x SoCs.

Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 11:01:26 +05:30
Pavel Begunkov
4e5ef02317 pcpu_ref: add percpu_ref_tryget_many()
Add percpu_ref_tryget_many(), which works the same way as
percpu_ref_tryget(), but grabs specified number of refs.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:02 -07:00
Jens Axboe
db08ca2525 mm: make do_madvise() available internally
This is in preparation for enabling this functionality through io_uring.
Add a helper that is just exporting what sys_madvise() does, and have the
system call use it.

No functional changes in this patch.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:02 -07:00
Olof Johansson
684415d0de cmdq:
- clean ups of unused code and debuggability
 - add cmdq_instruction to make the function call interface more readable
 - add functions for polling and providing info for the user of cmdq
 
 scpsys:
 - add bindings for MT6765
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEUdvKHhzqrUYPB/u8L21+TfbCqH4FAl4cPekXHG1hdHRoaWFz
 LmJnZ0BnbWFpbC5jb20ACgkQL21+TfbCqH6YDA/+MRYiMb26fu+Ac3pF1SW6ao0A
 fCI0JH2L8Zj4nklg3Y/hoi2PG7DJL2vaG71Ng3iQxQqKLWJdS1WbF/DBl8JdozQx
 mJXXZ0JM5cc/gIvrG8PLAauWRTDUIPDC2ZZ66mO4CiR6T7Kp/LpnmZF5ZR4qxu/W
 KNAYxSeyPJ36WQ3HCmubBWOn4JaetOeBFID6z3roKhsGyFXLmCo0Djw4c0Lpk79t
 3qVpo7TcC3hintZSC8ufIjQZA/KU3kwy3pS2O+HKMDb4EwORl14Grd6/cN9rAf0p
 twprn4Y9Jx4cBy0jGyZShKMbIn6wT6zF8GqpoV71vvXTgknyr8XoaiepnjMd2Enm
 BXw7R8kUt3w4mHnMwX9aUPAP48S1DVMm0OsbwOOqRcSXyJWnQBFB85LnkKeSfzRW
 iWfeXQBaESCQziYkJSkXz2D4epV/Rzq/L1y5IrddQuwIlpQ3eeqYUWBDYhuV3VlY
 K4r/AawG0NKPWTvH0bgLZAiWybnzXPZUvVJzwkdwRBjpiyhsl0FX1oZb1H8Vf6mK
 yrCUgaTIVp6z0pMd8kaxPTyi2AXryKODOXp+XJ2JFmdLFuBPQMSLCiG+B6sDa2Sw
 eYMcZpFtL5B/p/NpNuSUlGZ2HXTxknS3RnIDTUJ583BZLeItShHnBW3P/8qjx7Wm
 8oGsLh0ZLF0G5iqdeuM=
 =Q9/6
 -----END PGP SIGNATURE-----

Merge tag 'v5.5-next-soc' of https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into arm/drivers

cmdq:
- clean ups of unused code and debuggability
- add cmdq_instruction to make the function call interface more readable
- add functions for polling and providing info for the user of cmdq

scpsys:
- add bindings for MT6765

* tag 'v5.5-next-soc' of https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux:
  dt-bindings: mediatek: add MT6765 power dt-bindings
  soc: mediatek: cmdq: delete not used define
  soc: mediatek: cmdq: add cmdq_dev_get_client_reg function
  soc: mediatek: cmdq: add polling function
  soc: mediatek: cmdq: define the instruction struct
  soc: mediatek: cmdq: remove OR opertaion from err return

Link: https://lore.kernel.org/r/9b365e76-e346-f813-d750-d7cfd0d16e4e@gmail.com
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-20 11:26:10 -08:00
Pi-Hsun Shih
7017996951 rpmsg: add rpmsg support for mt8183 SCP.
Add a simple rpmsg support for mt8183 SCP, that use IPI / IPC directly.

Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Link: https://lore.kernel.org/r/20191112110330.179649-4-pihsun@chromium.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-01-20 10:29:56 -08:00
Erin Lo
63c13d61ea remoteproc/mediatek: add SCP support for mt8183
Provide a basic driver to control Cortex M4 co-processor

Signed-off-by: Erin Lo <erin.lo@mediatek.com>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Link: https://lore.kernel.org/r/20191112110330.179649-3-pihsun@chromium.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-01-20 10:29:54 -08:00
Kadlecsik József
32c72165db netfilter: ipset: use bitmap infrastructure completely
The bitmap allocation did not use full unsigned long sizes
when calculating the required size and that was triggered by KASAN
as slab-out-of-bounds read in several places. The patch fixes all
of them.

Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com
Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com
Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com
Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com
Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com
Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com
Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-20 17:41:45 +01:00
Dennis Zhou
e837dfde15 bitmap: genericize percpu bitmap region iterators
Bitmaps are fairly popular for their space efficiency, but we don't have
generic iterators available. Make percpu's bitmap region iterators
available to everyone.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20 16:40:56 +01:00
Chen Yu
e79f15a459 x86/resctrl: Add task resctrl information display
Monitoring tools that want to find out which resctrl control and monitor
groups a task belongs to must currently read the "tasks" file in every
group until they locate the process ID.

Add an additional file /proc/{pid}/cpu_resctrl_groups to provide this
information:

1)   res:
     mon:

resctrl is not available.

2)   res:/
     mon:

Task is part of the root resctrl control group, and it is not associated
to any monitor group.

3)  res:/
    mon:mon0

Task is part of the root resctrl control group and monitor group mon0.

4)  res:group0
    mon:

Task is part of resctrl control group group0, and it is not associated
to any monitor group.

5) res:group0
   mon:mon1

Task is part of resctrl control group group0 and monitor group mon1.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Jinshi Chen <jinshi.chen@intel.com>
Link: https://lkml.kernel.org/r/20200115092851.14761-1-yu.c.chen@intel.com
2020-01-20 16:19:10 +01:00
Heiner Kallweit
2ab1d925aa net: phy: add generic ndo_do_ioctl handler phy_do_ioctl
A number of network drivers has the same glue code to use phy_mii_ioctl
as ndo_do_ioctl handler. So let's add such a generic ndo_do_ioctl
handler to phylib.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20 10:43:24 +01:00
Yash Shah
b01ecceaf2 genirq: Introduce irq_domain_translate_onecell
Add a new function irq_domain_translate_onecell() that is to be used as
the translate function in struct irq_domain_ops.

Signed-off-by: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1575976274-13487-2-git-send-email-yash.shah@sifive.com
2020-01-20 09:19:33 +00:00
Ingo Molnar
cb6c82df68 Linux 5.5-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl4k7i8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvk0IAKRenVOdiudY77SQ
 VZjsteyrYTTQtPPv494ToIRjR0XQ+gYp8vyWzXTUC5Nm9Y9U3VzDqUPUjWszrSXE
 6mU+tzcMc9qwuUxnIFn8zfg64ygw+37sn/w3xqeH4QmF9Z5Wl3EX3SdXTs7jp3RS
 VxiztkUNI5ZBV2GDtla5K/9qLPqCQnUYXIiyi5lAtBtiitZDVXFp7dy7hMgEiaEO
 +78K5Kh3xlt5ndDsBFOlwIb2Oof3KL7bBXntdbSBc/bjol6IRvAgln48HWCv59G2
 jzAp2tj2KobX9GRAEPj+v4TQZEW0SXDNDi8MgQsM+3DYVCTmANsv57CBKRuf01+F
 nB1kAys=
 =zSnJ
 -----END PGP SIGNATURE-----

Merge tag 'v5.5-rc7' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-20 08:43:44 +01:00
Ingo Molnar
837171fe77 Linux 5.5-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl4k7i8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvk0IAKRenVOdiudY77SQ
 VZjsteyrYTTQtPPv494ToIRjR0XQ+gYp8vyWzXTUC5Nm9Y9U3VzDqUPUjWszrSXE
 6mU+tzcMc9qwuUxnIFn8zfg64ygw+37sn/w3xqeH4QmF9Z5Wl3EX3SdXTs7jp3RS
 VxiztkUNI5ZBV2GDtla5K/9qLPqCQnUYXIiyi5lAtBtiitZDVXFp7dy7hMgEiaEO
 +78K5Kh3xlt5ndDsBFOlwIb2Oof3KL7bBXntdbSBc/bjol6IRvAgln48HWCv59G2
 jzAp2tj2KobX9GRAEPj+v4TQZEW0SXDNDi8MgQsM+3DYVCTmANsv57CBKRuf01+F
 nB1kAys=
 =zSnJ
 -----END PGP SIGNATURE-----

Merge tag 'v5.5-rc7' into locking/kcsan, to refresh the tree

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-20 08:42:47 +01:00
Dan Williams
484a418d07 efi: Fix handling of multiple efi_fake_mem= entries
Dave noticed that when specifying multiple efi_fake_mem= entries only
the last entry was successfully being reflected in the efi memory map.
This is due to the fact that the efi_memmap_insert() is being called
multiple times, but on successive invocations the insertion should be
applied to the last new memmap rather than the original map at
efi_fake_memmap() entry.

Rework efi_fake_memmap() to install the new memory map after each
efi_fake_mem= entry is parsed.

This also fixes an issue in efi_fake_memmap() that caused it to litter
emtpy entries into the end of the efi memory map. An empty entry causes
efi_memmap_insert() to attempt more memmap splits / copies than
efi_memmap_split_count() accounted for when sizing the new map. When
that happens efi_memmap_insert() may overrun its allocation, and if you
are lucky will spill over to an unmapped page leading to crash
signature like the following rather than silent corruption:

    BUG: unable to handle page fault for address: ffffffffff281000
    [..]
    RIP: 0010:efi_memmap_insert+0x11d/0x191
    [..]
    Call Trace:
     ? bgrt_init+0xbe/0xbe
     ? efi_arch_mem_reserve+0x1cb/0x228
     ? acpi_parse_bgrt+0xa/0xd
     ? acpi_table_parse+0x86/0xb8
     ? acpi_boot_init+0x494/0x4e3
     ? acpi_parse_x2apic+0x87/0x87
     ? setup_acpi_sci+0xa2/0xa2
     ? setup_arch+0x8db/0x9e1
     ? start_kernel+0x6a/0x547
     ? secondary_startup_64+0xb6/0xc0

Commit af16489848 "x86/efi: Update e820 with reserved EFI boot
services data to fix kexec breakage" introduced more occurrences where
efi_memmap_insert() is invoked after an efi_fake_mem= configuration has
been parsed. Previously the side effects of vestigial empty entries were
benign, but with commit af16489848 that follow-on efi_memmap_insert()
invocation triggers efi_memmap_insert() overruns.

Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20191231014630.GA24942@dhcp-128-65.nay.redhat.com
Link: https://lore.kernel.org/r/20200113172245.27925-14-ardb@kernel.org
2020-01-20 08:14:29 +01:00
Dan Williams
1db91035d0 efi: Add tracking for dynamically allocated memmaps
In preparation for fixing efi_memmap_alloc() leaks, add support for
recording whether the memmap was dynamically allocated from slab,
memblock, or is the original physical memmap provided by the platform.

Given this tracking is established in efi_memmap_alloc() and needs to be
carried to efi_memmap_install(), use 'struct efi_memory_map_data' to
convey the flags.

Some small cleanups result from this reorganization, specifically the
removal of local variables for 'phys' and 'size' that are already
tracked in @data.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200113172245.27925-12-ardb@kernel.org
2020-01-20 08:14:29 +01:00