A set of new API functions exported for the drivers will soon use
'bpf_offload_dev_' as a prefix. Rename the bpf_offload_dev_match()
which is internal to the core (used by the verifier) to avoid any
confusion.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
BPF code should unregister the offload capabilities from .ndo_uninit(),
to make sure the operation is atomic with unlist_netdevice(). Plumb
the init/uninit NDOs for vNICs and representors.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Move bound program information from netdevsim to shared sub-object,
as programs will soon be shared between netdevs of the same ASIC.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Factor out sharable netdevsim sub-object and use IFLA_LINK to link
netdevsims together at creation time. Sharable object will have
its own DebugFS directory.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Grouping netdevsim devices into "ASICs" will soon be supported.
Add switch_id attribute to all netdevsims. For now each netdevsim
will have its switch_id matching the device id.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The aspeed pwm driver always sets the clock source to 24MHz, specify
the fixed clock in device tree to make sure the driver is using the
correct clock frequency to calculate the fan speed.
Signed-off-by: Lei YU <mine260309@gmail.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Pointer sg is being assigned but is never used hence it is
redundant and can be removed.
Cleans up clang warning:
warning: variable 'sg' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The progs local variable in compute_effective_progs() is marked
as __rcu, which is not correct. This is a local pointer, which
is initialized by bpf_prog_array_alloc(), which also now
returns a generic non-rcu pointer.
The real rcu-protected pointer is *array (array is a pointer
to an RCU-protected pointer), so the assignment should be performed
using rcu_assign_pointer().
Fixes: 324bda9e6c ("bpf: multi program support for cgroup+bpf")
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Currently the return type of the bpf_prog_array_alloc() is
struct bpf_prog_array __rcu *, which is not quite correct.
Obviously, the returned pointer is a generic pointer, which
is valid for an indefinite amount of time and it's not shared
with anyone else, so there is no sense in marking it as __rcu.
This change eliminate the following sparse warnings:
kernel/bpf/core.c:1544:31: warning: incorrect type in return expression (different address spaces)
kernel/bpf/core.c:1544:31: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1544:31: got void *
kernel/bpf/core.c:1548:17: warning: incorrect type in return expression (different address spaces)
kernel/bpf/core.c:1548:17: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1548:17: got struct bpf_prog_array *<noident>
kernel/bpf/core.c:1681:15: warning: incorrect type in assignment (different address spaces)
kernel/bpf/core.c:1681:15: expected struct bpf_prog_array *array
kernel/bpf/core.c:1681:15: got struct bpf_prog_array [noderef] <asn:4>*
Fixes: 324bda9e6c ("bpf: multi program support for cgroup+bpf")
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Allow platform specific drivers to provide their own set_cs callback when
the IP integration requires it.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAltPMSETHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0BizB/9lSfg4JrgJdAP8mDWfi7L5LjbpP9bB
9n1MlUFGRGU0Uc/yOD99QA6kB1WSpthFcgLBj+JDn3X30MzOqfZcHb3d9x5Z+XiV
YnIPzYqBPnnMSeTAX21c3Tz5nwO9hgz/xYr8onwYPeJWmt4RltYhiMdCgCQFWm9c
QaYNHiHymJP0ZFPXzLXlnIf+OW21ZOO9bH+eX1PAzbuWoDPxJZ63Eir22eKwTNey
bIIu19sxZE6o6Rd2Z0XAPtSGbd/r/9ULZAQcP429H+ETIfyefC6/IZv+yd4NrHmc
7WpWYi0JNdPLCXgLOvg5mwPBgDfs3tzoInWQu305882qlaK6kkivCgBu
=JPLR
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAltPMZgTHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0JxACACAVtjwSy7KjJZ92ZljiyDP2MUydcM8
+6WiN96NCQVgMwFb6WBfePSJFeDeAnr0t4oANZpfa9rGV7YEYB5KlPmkGvuP+u6N
Jj+kRYXNcsu4LiyfZ82b/in3nBQ370Ufe32c4OECD/WDBLUx9TfWcur0sHLjHyp0
wh/9VxqQcGjgt/espt3QGDrASdymWHwF3sTEAoAIOeUzOgO61J2/4fQotUhPVPTN
q7jS6tmbR9GapEKGPOEzRz+3Ici3zK75zM3Yk2wSfh2xXk9Ly9q2HVO7ju7tC8Pe
XMnPxermqar/+PRgYYgLxrUDNyKHlzce4+hqc1cwj0XwAlvMiImB2nsB
=3WLv
-----END PGP SIGNATURE-----
Merge tag 'spi-dw-set-cs' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi into spi-4.19
spi: dw: Allow custom set_cs_callback
Allow platform specific drivers to provide their own set_cs callback when
the IP integration requires it.
Allow platform specific drivers to provide their own set_cs callback when
the IP integration requires it.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
It is possible to get an interrupt as soon as it is requested. dw_spi_irq
does spi_controller_get_devdata(master) and expects it to be different than
NULL. However, spi_controller_set_devdata() is called after request_irq(),
resulting in the following crash:
CPU 0 Unable to handle kernel paging request at virtual address 00000030, epc == 8058e09c, ra == 8018ff90
[...]
Call Trace:
[<8058e09c>] dw_spi_irq+0x8/0x64
[<8018ff90>] __handle_irq_event_percpu+0x70/0x1d4
[<80190128>] handle_irq_event_percpu+0x34/0x8c
[<801901c4>] handle_irq_event+0x44/0x80
[<801951a8>] handle_level_irq+0xdc/0x194
[<8018f580>] generic_handle_irq+0x38/0x50
[<804c6924>] ocelot_irq_handler+0x104/0x1c0
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Extend the existing support for backup mode to toggle power switches.
With a toggle power switch (or level signal), the following steps must
be followed exactly:
1. Configure PMIC for backup mode, to change the role of the
accessory power switch from a power switch to a wake-up switch,
2. Switch accessory power switch off, to prepare for system suspend,
which is a manual step not controlled by software,
3. Suspend system,
4. Switch accessory power switch on, to resume the system.
Hence the PMIC is configured for backup mode when "on" or "1" is written
to the PMIC's "backup_mode" virtual file in sysfs. Conversely, writing
"off" or "0" reverts the role of the accessory switch to a power
switch.
Unlike with momentary switches, backup mode is not enabled by default,
as enabling it prevents the board from being powered off using the power
switch, which may confuse the user.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently the BD9571MWV PMIC driver uses the standard "wake_up" sysfs
file to control enablement of DDR Backup Mode.
However, configuring DDR Backup Mode is not really equivalent to
configuring the PMIC as a wake-up source. To avoid confusion, use a
custom "backup_mode" attribute file in sysfs instead.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add the DT bindings documentation for axg's TDM formatters: TDMIN
and TDMOUT.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add support for the spdif output serializer of the axg SoC family
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add the DT bindings documentation for axg's SPDIF output.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add the capture memory interface of Amlogic's axg SoCs.
TDM, SPDIF or PDM input devices place audio samples inside this FIFO.
The FIFO content is then pushed to DDR
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add the playback memory interface of Amlogic's axg SoCs.
This device pulls data from DDR to an internal FIFO.
This FIFO is then used to feed TDM and SPDIF Output devices.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Amlogic's axg SoCs have two types of fifos which are the memory
interfaces of the audio subsystem. FRDDR provides the playback
interface while TODDR provides the capture interface.
The way these fifos operate is very similar. Only a few settings
are specific to each.
They implement the same pcm driver here and the specifics of each
will be dealt with the related DAI driver.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add the DT bindings documentation for axg's FIFOs: TODDR and FRDDR.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add documentation for power management of HDAC HDMI codec device for
various scenarios such as S0/S3, probe and playback use case.
Signed-off-by: Sriram Periyasamy <sriramx.periyasamy@intel.com>
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Now that the component framework is integrated into the ASoC core,
remove any redundant code in this driver.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Now that the component framework is integrated into the ASoC core,
remove any redundant code in this driver.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Now that the component framework is integrated into the ASoC core,
remove any redundant code in this driver.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Now that the component framework is integrated into the ASoC core,
remove any redundant code in this driver.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch aims at achieving dynamic behaviour of audio card when
the dependent components disappear and reappear.
With this patch the card is removed if any of the dependent component
is removed and card is added back if the dependent component comes back.
All this is done using component framework and matching based on
component name.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744 (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Reviewed-by: Andreas Herrmann <aherrmann@suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch aimed to prevent deadlock during digsig verification.The point
of issue - user space utility modprobe and/or it's dependencies (ld-*.so,
libz.so.*, libc-*.so and /lib/modules/ files) that could be used for
kernel modules load during digsig verification and could be signed by
digsig in the same time.
First at all, look at crypto_alloc_tfm() work algorithm:
crypto_alloc_tfm() will first attempt to locate an already loaded
algorithm. If that fails and the kernel supports dynamically loadable
modules, it will then attempt to load a module of the same name or alias.
If that fails it will send a query to any loaded crypto manager to
construct an algorithm on the fly.
We have situation, when public_key_verify_signature() in case of RSA
algorithm use alg_name to store internal information in order to construct
an algorithm on the fly, but crypto_larval_lookup() will try to use
alg_name in order to load kernel module with same name.
1) we can't do anything with crypto module work, since it designed to work
exactly in this way;
2) we can't globally filter module requests for modprobe, since it
designed to work with any requests.
In this patch, I propose add an exception for "crypto-pkcs1pad(rsa,*)"
module requests only in case of enabled integrity asymmetric keys support.
Since we don't have any real "crypto-pkcs1pad(rsa,*)" kernel modules for
sure, we are safe to fail such module request from crypto_larval_lookup().
In this way we prevent modprobe execution during digsig verification and
avoid possible deadlock if modprobe and/or it's dependencies also signed
with digsig.
Requested "crypto-pkcs1pad(rsa,*)" kernel module name formed by:
1) "pkcs1pad(rsa,%s)" in public_key_verify_signature();
2) "crypto-%s" / "crypto-%s-all" in crypto_larval_lookup().
"crypto-pkcs1pad(rsa," part of request is a constant and unique and could
be used as filter.
Signed-off-by: Mikhail Kurinnoi <viewizard@viewizard.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
include/linux/integrity.h | 13 +++++++++++++
security/integrity/digsig_asymmetric.c | 23 +++++++++++++++++++++++
security/security.c | 7 ++++++-
3 files changed, 42 insertions(+), 1 deletion(-)
SHA1 is reasonable in HMAC constructs, but it's desirable to be able to
use stronger hashes in digital signatures. Modify the EVM crypto code so
the hash type is imported from the digital signature and passed down to
the hash calculation code, and return the digest size to higher layers
for validation.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
When EVM attempts to appraise a file signed with a crypto algorithm the
kernel doesn't have support for, it will cause the kernel to trigger a
module load. If the EVM policy includes appraisal of kernel modules this
will in turn call back into EVM - since EVM is holding a lock until the
crypto initialisation is complete, this triggers a deadlock. Add a
CRYPTO_NOLOAD flag and skip module loading if it's set, and add that flag
in the EVM case in order to fail gracefully with an error message
instead of deadlocking.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
When CONFIG_SECURITYFS is not enabled, securityfs_create_dir returns
-ENODEV which throws the following error:
"Unable to create integrity sysfs dir: -19"
However, if the feature is disabled, it can't be warning and hence
we need to silence the error. This patch checks for the error -ENODEV
which is returned when CONFIG_SECURITYFS is disabled to stop the error
being thrown.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The AUDIT_INTEGRITY_RULE is used for auditing IMA policy rules and
the IMA "audit" policy action. This patch defines
AUDIT_INTEGRITY_POLICY_RULE to reflect the IMA policy rules.
Since we defined a new message type we can now also pass the
audit_context and get an associated SYSCALL record. This now produces
the following records when parsing IMA policy's rules:
type=UNKNOWN[1807] msg=audit(1527888965.738:320): action=audit \
func=MMAP_CHECK mask=MAY_EXEC res=1
type=UNKNOWN[1807] msg=audit(1527888965.738:320): action=audit \
func=FILE_CHECK mask=MAY_READ res=1
type=SYSCALL msg=audit(1527888965.738:320): arch=c000003e syscall=1 \
success=yes exit=17 a0=1 a1=55bcfcca9030 a2=11 a3=7fcc1b55fb38 \
items=0 ppid=1567 pid=1601 auid=0 uid=0 gid=0 euid=0 suid=0 \
fsuid=0 egid=0 sgid=0 fsgid=0 tty=tty2 ses=2 comm="echo" \
exe="/usr/bin/echo" \
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
If Integrity is not auditing, IMA shouldn't audit, either.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Remove the usage of audit_log_string() and replace it with
audit_log_format().
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Suggested-by: Steve Grubb <sgrubb@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The parameters passed to this logging function are all provided by
a privileged user and therefore we can call audit_log_string()
rather than audit_log_untrustedstring().
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Suggested-by: Steve Grubb <sgrubb@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed54 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some of the assembly files use instructions specific to BookE or E500,
which are rejected with the now-default -mcpu=powerpc, so we must pass
-me500 to the assembler just as we pass -me200 for E200.
Fixes: 4bf4f42a2f ("powerpc/kbuild: Set default generic machine type for 32-bit compile")
Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some Thinkpads have a single battery, but expose it as BAT1. Use the quirks
engine to force these machines into always addressing the primary battery.
Without this, the battery name would resolve to the non-existent secondary
battery and ACPI calls would fail.
Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Modern Thinkpads have three character model designators. Previously, they
would be accepted, but recorded incompletely. Revision matching extracted
the wrong bytes from the ID string. This made the use of quirks for modern
machines impossible.
Fixes: 1b0eb5bc24
Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Variables slope and offset are being assigned but are never used hence
they are redundant and can be removed.
Cleans up clang warnings:
warning: variable 'slope' set but not used [-Wunused-but-set-variable]
warning: variable 'offset' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
As already done treewide, switch from open-coded multiplication to using
2-factor allocation helpers.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The PCI SSID 1558:95e1 needs the same quirk for other Clevo P950
models, too. Otherwise no sound comes out of speakers.
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1101143
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Inside a nested guest, access to hardware can be slow enough that
tsc_read_refs always return ULLONG_MAX, causing tsc_refine_calibration_work
to be called periodically and the nested guest to spend a lot of time
reading the ACPI timer.
However, if the TSC frequency is available from the pvclock page,
we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the recalibration.
'refine' operation.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
[Commit message rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When eVMCS is enabled, all VMCS allocated to be used by KVM are marked
with revision_id of KVM_EVMCS_VERSION instead of revision_id reported
by MSR_IA32_VMX_BASIC.
However, even though not explictly documented by TLFS, VMXArea passed
as VMXON argument should still be marked with revision_id reported by
physical CPU.
This issue was found by the following setup:
* L0 = KVM which expose eVMCS to it's L1 guest.
* L1 = KVM which consume eVMCS reported by L0.
This setup caused the following to occur:
1) L1 execute hardware_enable().
2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON.
3) L0 intercept L1 VMXON and execute handle_vmon() which notes
vmxarea->revision_id != VMCS12_REVISION and therefore fails with
nested_vmx_failInvalid() which sets RFLAGS.CF.
4) L1 kvm_cpu_vmxon() don't check RFLAGS.CF for failure and therefore
hardware_enable() continues as usual.
5) L1 hardware_enable() then calls ept_sync_global() which executes
INVEPT.
6) L0 intercept INVEPT and execute handle_invept() which notes
!vmx->nested.vmxon and thus raise a #UD to L1.
7) Raised #UD caused L1 to panic.
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: stable@vger.kernel.org
Fixes: 773e8a0425
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A comment warning against this bug is there, but the code is not doing what
the comment says. Therefore it is possible that an EPOLLHUP races against
irq_bypass_register_consumer. The EPOLLHUP handler schedules irqfd_shutdown,
and if that runs soon enough, you get a use-after-free.
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Syzbot reports crashes in kvm_irqfd_assign(), caused by use-after-free
when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel
for one specific eventfd. When the assign path hasn't finished but irqfd
has been added to kvm->irqfds.items list, another thead may deassign the
eventfd and free struct kvm_kernel_irqfd(). The assign path then uses
the struct kvm_kernel_irqfd that has been freed by deassign path. To avoid
such issue, keep irqfd under kvm->irq_srcu protection after the irqfd
has been added to kvm->irqfds.items list, and call synchronize_srcu()
in irq_shutdown() to make sure that irqfd has been fully initialized in
the assign path.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tianyu Lan <tianyu.lan@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add missing definitions from nf_osf.h in order to extract Passive OS
fingerprint infrastructure from xt_osf.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>