There is no module named tipc_diag.
The assignment to tipc_diag-y has no effect.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow all the RGMII modes to be used. This would allow us to represent
the hardware better in the device tree with RGMII_ID where in most
cases the PHY's internal delays for both RX and TX are used.
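
For reference, the four RGMII variants that a device tree phy-mode can name differ only in where the PHY inserts its internal clock delay. The sketch below models that mapping as described by the generic Ethernet PHY binding; the helper names are made up for illustration and are not taken from the stmmac glue drivers.

```python
# Hypothetical helpers for illustration only, not driver code.
# Maps each RGMII phy-mode string to the directions in which the PHY
# adds its internal clock delay.
RGMII_PHY_DELAYS = {
    "rgmii": frozenset(),                  # no PHY-internal delay
    "rgmii-id": frozenset({"rx", "tx"}),   # PHY delays both directions
    "rgmii-rxid": frozenset({"rx"}),       # PHY delays RX only
    "rgmii-txid": frozenset({"tx"}),       # PHY delays TX only
}


def is_rgmii(phy_mode: str) -> bool:
    """Return True for any of the four RGMII variants."""
    return phy_mode in RGMII_PHY_DELAYS


def phy_delays(phy_mode: str) -> frozenset:
    """Return the directions delayed by the PHY for an RGMII mode."""
    if not is_rgmii(phy_mode):
        raise ValueError(f"not an RGMII mode: {phy_mode}")
    return RGMII_PHY_DELAYS[phy_mode]
```

Accepting every key of such a table, rather than plain "rgmii" only, is what lets a board describe its hardware as "rgmii-id".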
Fixes: 9f93ac8d40 ("net-next: stmmac: Add dwmac-sun8i")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow all the RGMII modes to be used. This would allow us to represent
the hardware better in the device tree with RGMII_ID where in most
cases the PHY's internal delays for both RX and TX are used.
Fixes: af0bd4e9ba ("net: stmmac: sunxi platform extensions for GMAC in Allwinner A20 SoC's")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series adds 2 sets of changes to the mlx5 driver
1) Misc updates and cleanups:
1.1) Stack usage warning cleanups and log level reduction
1.2) Increase the max number of supported rings
1.3) Support accept TC action on native NIC netdev.
2) Software steering support for multi-destination steering rules:
First three patches from Erez are adding the low level FW command support
and SW steering infrastructure to create the multi-destination FW tables.
Last four patches from Alex are introducing the needed changes and APIs in
SW steering to create and manage multi-destination actions and rules.
Merge tag 'mlx5-updates-2020-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-01-07
This series adds 2 sets of changes to the mlx5 driver
1) Misc updates and cleanups:
1.1) Stack usage warning cleanups and log level reduction
1.2) Increase the max number of supported rings
1.3) Support accept TC action on native NIC netdev.
2) Software steering support for multi-destination steering rules:
First three patches from Erez are adding the low level FW command support
and SW steering infrastructure to create the multi-destination FW tables.
Last four patches from Alex are introducing the needed changes and APIs in
SW steering to create and manage multi-destination actions and rules.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable the fimd device node, which is a display controller, and add the
panel node it requires.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
The clock setup on Meson8 cannot achieve a Mali frequency of exactly
182.15MHz. The vendor driver uses "FCLK_DIV7 / 1" for this frequency,
which translates to 2550MHz / 7 / 1 = 364285714Hz.
Update the GPU operating point to that specific frequency to not confuse
myself when comparing the frequency from the .dts with the actual clock
rate on the system.
Fixes: c3ea80b613 ("ARM: dts: meson8b: add the Mali-450 MP2 GPU")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The clock setup on Meson8 cannot achieve a Mali frequency of exactly
182.15MHz. The vendor driver uses "FCLK_DIV7 / 2" for this frequency,
which translates to 2550MHz / 7 / 2 = 182142857Hz.
Update the GPU operating point to that specific frequency to not confuse
myself when comparing the frequency from the .dts with the actual clock
rate on the system.
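
The arithmetic behind the two operating points above can be checked with a short sketch. It assumes the 2550MHz source rate quoted in the text and that rates are truncated to whole Hz; the function name is made up for this example.

```python
FCLK_HZ = 2_550_000_000  # 2550MHz source rate quoted in the text


def mali_rate_hz(post_div: int) -> int:
    """Rate of "FCLK_DIV7 / post_div" in Hz, truncated to an integer."""
    return FCLK_HZ // 7 // post_div


print(mali_rate_hz(1))  # "FCLK_DIV7 / 1"
print(mali_rate_hz(2))  # "FCLK_DIV7 / 2"
```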
Fixes: 7d3f6b536e ("ARM: dts: meson8: add the Mali-450 MP6 GPU")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The Meson8b clock controller is an evolution of the Meson8 clock
controller. The clock controller on Meson8b contains two identical mali
clock trees for glitch-free rate switching.
Use the correct compatible string to make use of the glitch free mux.
Fixes: b6db3936f2 ("ARM: dts: meson: switch the clock controller to the HHI register area")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add the property describing the depth of the audio FIFO on the axg, g12a
and sm1 SoC families.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
force_iret() was originally intended to prevent the return to user mode with
the SYSRET or SYSEXIT instructions, in cases where the register state could
have been changed to be incompatible with those instructions. The entry code
has been significantly reworked since then, and register state is validated
before SYSRET or SYSEXIT are used. force_iret() no longer serves its original
purpose and can be eliminated.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lkml.kernel.org/r/20191219115812.102620-1-brgerst@gmail.com
Introduce a new probe section (misc) for probes not related to concrete
map types, program types, functions or kernel configuration. Introduce a
probe for large INSN limit as the first one in that section.
Example outputs:

  # bpftool feature probe
  [...]
  Scanning miscellaneous eBPF features...
  Large program size limit is available

  # bpftool feature probe macros
  [...]
  /*** eBPF misc features ***/
  #define HAVE_HAVE_LARGE_INSN_LIMIT

  # bpftool feature probe -j | jq '.["misc"]'
  {
    "have_large_insn_limit": true
  }
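
A script consuming the JSON form might look roughly like the sketch below. The sample string is trimmed to the "misc" section shown above (real output contains more sections), and the function name is an assumption made for this example.

```python
import json

# Sample trimmed to the "misc" section; real output from
# `bpftool feature probe -j` contains additional sections.
SAMPLE = '{"misc": {"have_large_insn_limit": true}}'


def has_large_insn_limit(probe_json: str) -> bool:
    """Return True if the misc section reports the large INSN limit."""
    misc = json.loads(probe_json).get("misc", {})
    return bool(misc.get("have_large_insn_limit", False))


print(has_large_insn_limit(SAMPLE))
```

Treating a missing section as False keeps the check usable against output from a bpftool that predates the misc section.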
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20200108162428.25014-3-mrostecki@opensuse.org
Introduce a new probe which checks whether the kernel has the large maximum
program size, which was increased in the following commit:
c04c0d2b96 ("bpf: increase complexity limit and maximum program size")
Based on the similar check in Cilium[0], authored by Daniel Borkmann.
[0] 657d0f585a
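
One way such a probe can work, sketched here without claiming it mirrors the actual implementation, is to build a trivially valid program that is one instruction longer than the old 4096-instruction limit (BPF_MAXINSNS) and see whether the kernel accepts it. The sketch only builds the instruction buffer; loading it through bpf(2) is left out, and the helper names are made up for this example.

```python
import struct

BPF_MAXINSNS = 4096   # old instruction limit from the uapi headers
OP_MOV64_IMM = 0xB7   # BPF_ALU64 | BPF_MOV | BPF_K
OP_EXIT = 0x95        # BPF_JMP | BPF_EXIT


def insn(code: int, dst: int = 0, src: int = 0, off: int = 0, imm: int = 0) -> bytes:
    """Encode one 8-byte struct bpf_insn, assuming a little-endian host."""
    return struct.pack("<BBhi", code, (src << 4) | dst, off, imm)


def build_oversized_prog() -> bytes:
    """BPF_MAXINSNS copies of 'r0 = 1' followed by 'exit'."""
    return insn(OP_MOV64_IMM, dst=0, imm=1) * BPF_MAXINSNS + insn(OP_EXIT)


prog = build_oversized_prog()
print(len(prog) // 8)  # number of instructions in the buffer
```

A kernel that still enforces the old limit would be expected to reject a program of this length at load time, which is the signal the probe relies on.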
Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20200108162428.25014-2-mrostecki@opensuse.org
Initial support for hierarchical CPU arrangement, managed by PSCI and its
corresponding cpuidle driver. This support is based upon using the generic
PM domain, which already supports devices belonging to CPUs.
Finally, there is a DTS patch that enables the hierarchical topology to be
used for the Qcom 410c Dragonboard, which supports the PSCI OS-initiated
mode.
Merge tag 'cpuidle_psci-v5.5-rc4' of git://git.linaro.org/people/ulf.hansson/linux-pm into arm/drivers
Initial support for hierarchical CPU arrangement, managed by PSCI and its
corresponding cpuidle driver. This support is based upon using the generic
PM domain, which already supports devices belonging to CPUs.
Finally, there is a DTS patch that enables the hierarchical topology to be
used for the Qcom 410c Dragonboard, which supports the PSCI OS-initiated
mode.
* tag 'cpuidle_psci-v5.5-rc4' of git://git.linaro.org/people/ulf.hansson/linux-pm: (611 commits)
arm64: dts: Convert to the hierarchical CPU topology layout for MSM8916
cpuidle: psci: Add support for PM domains by using genpd
PM / Domains: Introduce a genpd OF helper that removes a subdomain
cpuidle: psci: Support CPU hotplug for the hierarchical model
cpuidle: psci: Manage runtime PM in the idle path
cpuidle: psci: Prepare to use OS initiated suspend mode via PM domains
cpuidle: psci: Attach CPU devices to their PM domains
cpuidle: psci: Add a helper to attach a CPU to its PM domain
cpuidle: psci: Support hierarchical CPU idle states
cpuidle: psci: Simplify OF parsing of CPU idle state nodes
cpuidle: dt: Support hierarchical CPU idle states
of: base: Add of_get_cpu_state_node() to get idle states for a CPU node
firmware: psci: Export functions to manage the OSI mode
dt: psci: Update DT bindings to support hierarchical PSCI states
cpuidle: psci: Align psci_power_state count with idle state count
Linux 5.5-rc4
locks: print unsigned ino in /proc/locks
riscv: export flush_icache_all to modules
riscv: reject invalid syscalls below -1
riscv: fix compile failure with EXPORT_SYMBOL() & !MMU
...
Link: https://lore.kernel.org/r/20200102160820.3572-1-ulf.hansson@linaro.org
Signed-off-by: Olof Johansson <olof@lixom.net>
Fixes for some badly applied patches that went into 5.5. There is also
a fix for an incorrect i2c address.
Merge tag 'aspeed-5.5-devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/aspeed into arm/fixes
ASPEED device tree fixes for 5.5
Fixes for some badly applied patches that went into 5.5. There is also
a fix for an incorrect i2c address.
* tag 'aspeed-5.5-devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/aspeed:
ARM: dts: aspeed: rainier: Fix fan fault and presence
ARM: dts: aspeed: rainier: Remove duplicate i2c busses
ARM: dts: aspeed: tacoma: Remove duplicate flash nodes
ARM: dts: aspeed: tacoma: Remove duplicate i2c busses
ARM: dts: aspeed: tacoma: Fix fsi master node
ARM: dts: aspeed-g6: Fix FSI master location
Link: https://lore.kernel.org/r/CACPK8XcjazgORXNZBU1ECMukXG4HA8D9VeDxiSPifDk_iB7_dw@mail.gmail.com
Signed-off-by: Olof Johansson <olof@lixom.net>
With the new omap_prm driver added unconditionally, omap2 builds
fail when the reset controller subsystem is disabled:
drivers/soc/ti/omap_prm.o: In function `omap_prm_probe':
omap_prm.c:(.text+0x2d4): undefined reference to `devm_reset_controller_register'
Link: https://lore.kernel.org/r/20191216132132.3330811-1-arnd@arndb.de
Fixes: 3e99cb214f ("soc: ti: add initial PRM driver with reset control support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Selecting RESET_CONTROLLER is actually required, otherwise we
can get a link failure in the clock driver:
drivers/clk/davinci/psc.o: In function `__davinci_psc_register_clocks':
psc.c:(.text+0x9a0): undefined reference to `devm_reset_controller_register'
drivers/clk/davinci/psc-da850.o: In function `da850_psc0_init':
psc-da850.c:(.text+0x24): undefined reference to `reset_controller_add_lookup'
Link: https://lore.kernel.org/r/20191210195202.622734-1-arnd@arndb.de
Fixes: f962396ce2 ("ARM: davinci: support multiplatform build for ARM v5")
Cc: <stable@vger.kernel.org> # v5.4
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Since the if statement only checks for the value of the `id` variable,
it can be replaced by the more concise BUG_ON() macro for error
reporting.
Issue found using coccinelle.
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200102095515.7106-1-wambui.karugax@gmail.com
Leave one space on each side of a binary operator; it is more elegant.
Signed-off-by: Pan Zhang <zhangpan26@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Linux commit b6e43c0e31 ("arm64: remove __exception annotations")
removed the __exception_text_start and __exception_text_end sections.
Remove the references to __exception_text_start and __exception_text_end
from asm/section.h.
Cc: James Morse <james.morse@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Will Deacon <will@kernel.org>
Remove the CONFIG_ prefix from the select statement for ARM_GIC_V3.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Will Deacon <will@kernel.org>
In commit c0d8832e78 ("arm64: Ensure the instruction emulation is
ready for userspace"), armv8_deprecated_init() was promoted to
core_initcall() but the comments were left unchanged. Update them now.
Spotted by some random reading of the code.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
[will: "can guarantee" => "guarantees"]
Signed-off-by: Will Deacon <will@kernel.org>
Broadcom Brahma-B53 CPUs do not implement ID_AA64PFR0_EL1.CSV3 but are
not susceptible to Meltdown, so add all Brahma-B53 part numbers to
kpti_safe_list[].
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
When the control of the selected speculation misbehavior is unsupported,
the kernel should return ENODEV according to the documentation:
https://www.kernel.org/doc/html/v4.17/userspace-api/spec_ctrl.html
The current aarch64 implementation of SSB control sometimes returns EINVAL,
which is reserved for unimplemented prctl and for violations of reserved
arguments. This change makes the aarch64 implementation consistent with
the x86 implementation and with the documentation.
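
From the userspace side, the distinction described above could be handled along these lines. This is only a sketch of how a caller might interpret the two errno values named in the text; the function name is made up, and other errno values the interface may return are passed through untouched.

```python
import errno


def describe_spec_ctrl_error(err: int) -> str:
    """Interpret an errno returned by the speculation control prctl."""
    if err == errno.ENODEV:
        return "control of the selected speculation misfeature is unsupported"
    if err == errno.EINVAL:
        return "prctl not implemented, or reserved arguments were violated"
    # Anything else is reported by its symbolic name when known.
    return "other error: " + errno.errorcode.get(err, str(err))


print(describe_spec_ctrl_error(errno.ENODEV))
```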
Signed-off-by: Anthony Steinhauser <asteinhauser@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Merge tag 'tpmdd-next-20200108' of git://git.infradead.org/users/jjs/linux-tpmdd
Pull more tpmdd fixes from Jarkko Sakkinen:
"One critical regression fix (the faulty commit got merged in rc3, but
also marked for stable)"
* tag 'tpmdd-next-20200108' of git://git.infradead.org/users/jjs/linux-tpmdd:
tpm: Handle negative priv->response_len in tpm_common_read()
WARN if root_hpa is invalid when handling a page fault. The check on
root_hpa exists for historical reasons that no longer apply to the
current KVM code base.
Remove an equivalent debug-only warning in direct_page_fault(), whose
existence more or less confirms that root_hpa should always be valid
when handling a page fault.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
WARN on the existing invalid root_hpa checks in __direct_map() and
FNAME(fetch). The "legitimate" path that invalidated root_hpa in the
middle of a page fault is long since gone, i.e. it should no longer be
possible to invalidate root_hpa in the middle of a page fault[*].
The root_hpa checks were added by two related commits
989c6b34f6 ("KVM: MMU: handle invalid root_hpa at __direct_map")
37f6a4e237 ("KVM: x86: handle invalid root_hpa everywhere")
to fix a bug where nested_vmx_vmexit() could be called *in the middle*
of a page fault. At the time, vmx_interrupt_allowed(), which was and
still is used by kvm_can_do_async_pf() via ->interrupt_allowed(),
directly invoked nested_vmx_vmexit() to switch from L2 to L1 to emulate
a VM-Exit on a pending interrupt. Emulating the nested VM-Exit resulted
in root_hpa being invalidated by kvm_mmu_reset_context() without
explicitly terminating the page fault.
Now that root_hpa is checked for validity by kvm_mmu_page_fault(), WARN
on an invalid root_hpa to detect any flows that reset the MMU while
handling a page fault. The broken vmx_interrupt_allowed() behavior has
long since been fixed and resetting the MMU during a page fault should
not be considered legal behavior.
[*] It's actually technically possible in FNAME(page_fault)() because it
calls inject_page_fault() when the guest translation is invalid, but
in that case the page fault handling is immediately terminated.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a check on root_hpa at the beginning of the page fault handler to
consolidate several checks on root_hpa that are scattered throughout the
page fault code. This is a preparatory step towards eventually removing
such checks altogether, or at the very least WARNing if an invalid root
is encountered. Remove only the checks that can be easily audited to
confirm that root_hpa cannot be invalidated between their current
location and the new check in kvm_mmu_page_fault(), and aren't currently
protected by mmu_lock, i.e. keep the checks in __direct_map() and
FNAME(fetch) for the time being.
The root_hpa checks that are consolidated were all added by commit
37f6a4e237 ("KVM: x86: handle invalid root_hpa everywhere")
which was a follow up to a bug fix for __direct_map(), commit
989c6b34f6 ("KVM: MMU: handle invalid root_hpa at __direct_map")
At the time, nested VMX had, in hindsight, crazy handling of nested
interrupts and would trigger a nested VM-Exit in ->interrupt_allowed(),
and thus unexpectedly reset the MMU in flows such as can_do_async_pf().
Now that the wonky nested VM-Exit behavior is gone, the root_hpa checks
are bogus and confusing, e.g. it's not at all obvious what they actually
protect against, and at first glance they appear to be broken since many
of them run without holding mmu_lock.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the calls to thp_adjust() down a level from the page fault handlers
to the map/fetch helpers and remove the page count shuffling done in
thp_adjust().
Despite holding a reference to the underlying page while processing a
page fault, the page fault flows don't actually rely on holding a
reference to the page when thp_adjust() is called. At that point, the
fault handlers hold mmu_lock, which prevents mmu_notifier from completing
any invalidations, and have verified no invalidations from mmu_notifier
have occurred since the page reference was acquired (which is done prior
to taking mmu_lock).
The kvm_release_pfn_clean()/kvm_get_pfn() dance in thp_adjust() is a
quirk that is necessitated because thp_adjust() modifies the pfn that is
consumed by its caller. Because the page fault handlers call
kvm_release_pfn_clean() on said pfn, thp_adjust() needs to transfer the
reference to the correct pfn purely for correctness when the pfn is
released.
Calling thp_adjust() from __direct_map() and FNAME(fetch) means the pfn
adjustment doesn't change the pfn as seen by the page fault handlers,
i.e. the pfn released by the page fault handlers is the same pfn that
was returned by gfn_to_pfn().
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move thp_adjust() above __direct_map() in preparation of calling
thp_adjust() from __direct_map() and FNAME(fetch).
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Consolidate the direct MMU page fault handlers into a common helper,
direct_page_fault(). Except for unique max level conditions, the tdp
and nonpaging fault handlers are functionally identical.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename __direct_map()'s param that controls whether or not a disallowed
NX large page should be accounted to match what it actually does. The
nonpaging_page_fault() case unconditionally passes %false for the param
even though it locally sets lpage_disallowed.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Persist the max page level calculated via gfn_lpage_is_disallowed() to
the max level "returned" by mapping_level() so that it's naturally taken
into account by the max level check that conditions calling
transparent_hugepage_adjust().
Drop the gfn_lpage_is_disallowed() check in thp_adjust() as it's now
handled by mapping_level() and its callers.
Add a comment to document the behavior of host_mapping_level() and its
interaction with max level and transparent huge pages.
Note, transferring the gfn_lpage_is_disallowed() check from thp_adjust() to
mapping_level() superficially affects how changes to a memslot's
disallow_lpage count will be handled due to thp_adjust() being run while
holding mmu_lock.
In the more common case where a different vCPU increments the count via
account_shadowed(), gfn_lpage_is_disallowed() is rechecked by set_spte()
to ensure a writable large page isn't created.
In the less common case where the count is decremented to zero due to
all shadow pages in the memslot being zapped, THP behavior now matches
hugetlbfs behavior in the sense that a small page will be created when a
large page could be used if the count reaches zero in the minuscule
window between mapping_level() and acquiring mmu_lock.
Lastly, the new THP behavior also follows hugetlbfs behavior in the
absurdly unlikely scenario of a memslot being moved such that the
memslot's compatibility with respect to large pages changes, but without
changing the validity of the gfn->pfn walk. I.e. if a memslot is moved
between mapping_level() and snapshotting mmu_seq, it's theoretically
possible to consume a stale disallow_lpage count. But, since KVM zaps
all shadow pages when moving a memslot and forces all vCPUs to reload a
new MMU, the inserted spte will always be thrown away prior to
completing the memslot move, i.e. whether or not the spte accurately
reflects disallow_lpage is irrelevant.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Restrict the max level for a shadow page based on the guest's level
instead of capping the level after the fact for host-mapped huge pages,
e.g. hugetlbfs pages. Explicitly capping the max level using the guest
mapping level also eliminates FNAME(page_fault)'s subtle dependency on
THP only supporting 2MB pages.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Refactor the page fault handlers and mapping_level() to track the max
allowed page level instead of only tracking if a 4k page is mandatory
due to one restriction or another. This paves the way for cleanly
consolidating tdp_page_fault() and nonpaging_page_fault(), and for
eliminating a redundant check on mmu_gfn_lpage_is_disallowed().
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Invert the loop which adjusts the allowed page level based on what's
compatible with the associated memslot to use a largest-to-smallest
page size walk. This paves the way for passing around a "max level"
variable instead of having redundant checks and/or multiple booleans.
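
The largest-to-smallest walk can be pictured roughly as follows. This is a simplified model, not the KVM code: the level numbering mirrors the x86 page table levels (1 = 4K, 2 = 2M, 3 = 1G), and is_disallowed is a hypothetical stand-in for the per-memslot compatibility check.

```python
from typing import Callable

LEVEL_4K, LEVEL_2M, LEVEL_1G = 1, 2, 3


def max_allowed_level(max_level: int,
                      is_disallowed: Callable[[int], bool]) -> int:
    """Walk down from max_level and return the first level that is allowed.

    4K pages are always allowed, so the walk never goes below LEVEL_4K.
    """
    level = max_level
    while level > LEVEL_4K and is_disallowed(level):
        level -= 1
    return level
```

Because the walk yields a single "max level" value, callers can pass that one number along instead of juggling several booleans.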
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pre-calculate the max level for a TDP page with respect to MTRR cache
consistency in preparation of replacing force_pt_level with max_level,
and eventually combining the bulk of nonpaging_page_fault() and
tdp_page_fault() into a common helper.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move nonpaging_page_fault() below try_async_pf() to eliminate the
forward declaration of try_async_pf() and to prepare for combining the
bulk of nonpaging_page_fault() and tdp_page_fault() into a common
helper.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fold nonpaging_map() into its sole caller, nonpaging_page_fault(), in
preparation for combining the bulk of nonpaging_page_fault() and
tdp_page_fault() into a common helper.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move make_mmu_pages_available() above its first user to put it closer
to related code and eliminate a forward declaration.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert a plethora of parameters and variables in the MMU and page fault
flows from type gva_t to gpa_t to properly handle TDP on 32-bit KVM.
Thanks to PSE and PAE paging, 32-bit kernels can access 64-bit physical
addresses. When TDP is enabled, the fault address is a guest physical
address and thus can be a 64-bit value, even when both KVM and its guest
are using 32-bit virtual addressing, e.g. VMX's VMCS.GUEST_PHYSICAL is a
64-bit field, not a natural width field.
Using a gva_t for the fault address means KVM will incorrectly drop the
upper 32 bits of the GPA. Ditto for gva_to_gpa() when it is used to
translate L2 GPAs to L1 GPAs.
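
The truncation can be modelled in a few lines. The sketch assumes a 32-bit build, where gva_t (an unsigned long) is 32 bits wide while gpa_t is a 64-bit type; the function names and the sample address are made up for illustration.

```python
def as_gva_t_32bit(addr: int) -> int:
    """Model storing an address in a 32-bit unsigned long (gva_t)."""
    return addr & 0xFFFF_FFFF


def as_gpa_t(addr: int) -> int:
    """Model storing an address in a 64-bit gpa_t."""
    return addr & 0xFFFF_FFFF_FFFF_FFFF


fault_gpa = 0x1_2345_6000  # a guest physical address above 4GiB

print(hex(as_gva_t_32bit(fault_gpa)))  # upper bits are lost
print(hex(as_gpa_t(fault_gpa)))        # full address preserved
```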
Opportunistically rename variables and parameters to better reflect the
dual address modes, e.g. use "cr2_or_gpa" for fault addresses and plain
"addr" instead of "vaddr" when the address may be either a GVA or an L2
GPA. Similarly, use "gpa" in the nonpaging_page_fault() flows to avoid
a confusing "gpa_t gva" declaration; this also sets the stage for a
future patch to combine nonpaging_page_fault() and tdp_page_fault() with
minimal churn.
Sprinkle in a few comments to document flows where an address is known
to be a GVA and thus can be safely truncated to a 32-bit value. Add
WARNs in kvm_handle_page_fault() and FNAME(gva_to_gpa_nested)() to help
document such cases and detect bugs.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
WARN once in kvm_load_guest_fpu() if TIF_NEED_FPU_LOAD is observed, as
that would mean that KVM is corrupting userspace's FPU by saving
unknown register state into arch.user_fpu. Add a comment to explain
why KVM WARNs on TIF_NEED_FPU_LOAD instead of implementing logic
similar to fpu__copy().
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unlike most state managed by XSAVE, MPX is initialized to zero on INIT.
Because INITs are usually recognized in the context of a VCPU_RUN call,
kvm_vcpu_reset() puts the guest's FPU so that the FPU state is resident
in memory, zeros the MPX state, and reloads FPU state to hardware. But,
in the unlikely event that an INIT is recognized during
kvm_arch_vcpu_ioctl_get_mpstate() via kvm_apic_accept_events(),
kvm_vcpu_reset() will call kvm_put_guest_fpu() without a preceding
kvm_load_guest_fpu() and corrupt the guest's FPU state (and possibly
userspace's FPU state as well).
Given that MPX is being removed from the kernel[*], fix the bug with the
simple-but-ugly approach of loading the guest's FPU during
KVM_GET_MP_STATE.
[*] See commit f240652b60 ("x86/mpx: Remove MPX APIs").
Fixes: f775b13eed ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the out_err jump label to handle resource release. Releasing
resources in one place is good practice and helps eliminate
duplicated code.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the out_err jump label to handle resource release. Releasing
resources in one place is good practice and helps eliminate
duplicated code.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two declarations of kvm_vcpu_kick() in kvm_host.h where
one of them is redundant. Remove it to keep git grep output a bit cleaner.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>