Pull sparc updates from David Miller:
"Here we go:
- Fix various long standing issues in the sparc 32-bit IOMMU support
code, from Christoph Hellwig.
- Various other code cleanups and simplifications all over. From
Gustavo A. R. Silva, Jagadeesh Pagadala, Masahiro Yamada, Mauro
Carvalho Chehab, Mike Rapoport"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: simplify reduce_memory() function
sparc: use struct_size() in kzalloc()
docs: sparc: convert to ReST
sparc/iommu: merge iommu_get_one and __sbus_iommu_map_page
sparc/iommu: use __sbus_iommu_map_page to implement the map_sg path
sparc/iommu: fix __sbus_iommu_map_page for highmem pages
sparc/iommu: move per-page flushing into __sbus_iommu_map_page
sparc/iommu: pass a physical address to iommu_get_one
sparc/iommu: create a common helper for map_sg
sparc/iommu: merge iommu_release_one and sbus_iommu_unmap_page
sparc/iommu: use sbus_iommu_unmap_page in sbus_iommu_unmap_sg
sparc/iommu: use !PageHighMem to check if a page has a kernel mapping
sparc: vdso: add FORCE to the build rule of %.so
arch:sparc:kernel/uprobes.c : Remove duplicate header
around because we've finally gotten around to fixing some long standing issues.
There's still work to do though, so this PR is largely laying down the
foundation for all the driver changes to come in the next merge window.
The first problem we're alleviating is how parents of clks are specified. With
the new method, we should see lots of drivers migrate away from the current
design of string comparisons on the entire clk tree to a more direct method
where they can use clk_hw pointers or more localized names specified in DT or
via clkdev. This should reduce our reliance on string comparisons for all the
topology description logic that we've been using for years and hopefully speed
some things up while avoiding problems we have with generating clk names.
Beyond that we also got rid of the CLK_IS_BASIC flag because it wasn't really
helping anyone and we introduced big-endian versions of the basic clk types so
that we can get rid of clk_{readl,writel}(). Both of these are things that
driver developers have tried to use over the years that I typically bat away
during code reviews because they're not useful. It's great to see these two
things go away so maintainers can save time not worrying about these things.
On the driver side we got the usual collection of new SoC support and
non-critical fixes and updates to existing code. The big topics that stand out
are the new driver support for Mediatek MT8183 and MT8516 SoCs, Amlogic Meson8b
and G12a SoCs, and the SiFive FU540 SoC. The other patches in the driver pile
are mostly fixes for things that are being used for the first time or additions
for clks that couldn't be tested before because there wasn't a consumer driver
that exercised them. Details are below and also in the sub-maintainer tags.
Core:
- Remove clk_readl() and introduce BE versions of basic clk types
- Rewrite how clk parents can be specified to allow DT/clkdev lookups
- Removal of the CLK_IS_BASIC clk flag
- Framework documentation updates and fixes
New Drivers:
- Support for STM32F769
- AT91 sam9x60 PMC support
- SiFive FU540 PRCI and PLL support
- Qualcomm QCS404 CDSP clk support
- Qualcomm QCS404 Turing clk support
- Mediatek MT8183 clock support
- Mediatek MT8516 clock support
- Milbeaut M10V clk controller support
- Support for Cirrus Logic Lochnagar clks
Updates:
- Rework AT91 sckc DT bindings
- Fix slow RC oscillator issue on sama5d3
- Mark UFS clk as critical on Hi-Silicon hi3660 SoCs
- Various static analysis fixes/finds and const markings
- Video Engine (ECLK) support on Aspeed SoCs
- Xilinx ZynqMP Versal platform support
- Convert Xilinx ZynqMP driver to be struct oriented
- Fixes for Rockchip rk3328 and rk3288 SoCs
- Sub-type for Rockchip SoCs where mux and divider aren't a single register
- Remove SNVS clock from i.MX7UPL clock driver and bindings
- Improve i.MX5 clock driver for i.MX50 support
- Addition of ADC clock definition for Exynos 5410 SoC (Odroid XU)
- Export a new clock for the MBUS controller on the A13
- Allwinner H6 fixes to support a finer clocking of the video and VPU engines
- Add g12a support in the Amlogic axg audio clock controller
- Add missing PCI USB clock on Rensas RZ/N1
- Add Z2 (Cortex-A53) clocks on Rensas R-Car E3 and RZ/G2E
- A new helper DIV64_U64_ROUND_CLOSEST() in <linux/math64.h>
- VPU and Video Decoder clocks on Amlogic Meson8b
- Finally remove the wrong ABP Meson8b clock id
- Add Video Decoder, PCIe PLL, and CPU Clocks on Amlogic G12A
- Re-expose SAR_ADC_SEL and CTS_OSCIN on Amlogic G12A AO clock controller
- Un-expose some Amlogic AXG-Audio input clocks IDs
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAlzUabQRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSUMmBAAr0WrvWa3s1Ue+lPfmehcAfeI2NkBPC/E
uhKD+vHBil/Aha33tFTPtVjsZiaMuUETNGPppEUUrHgu4K3UMJZl0iYql6XNVP77
OObIM5wqXoJ5Yv1e1G0p7X/Qztx7UxEtPwbXJ/9kNN2t6yzg4y8vD2cmXgV5KzHp
yRUDFNbH9JEyWFbrPhPjD3Bk1PCwdmXNFQg/uYk79g3c84js9MCbWvIqVEuU3vps
3/9lsDkhbp/flrSOA7D1eloQ6aPXdkLsFzDkJ+6mCA3zxsW/i2N38ZKVDTYG/5rx
USh3Z0Vyd0f9pKlQqwe1tyr2PBJrYWTtcPBSDcdr4BI1209xseFe4TaqHw1IRKcB
uYX0gtNTTcgx+8Znvp9y+hE8DhbVNpZvblZuab+rbfb/Gte2wC5/zvzEAz1EqPap
43VYdi3JR9iWGsC/r4+5OdVFRgWmXFsn5ysQRLkRgE41fKRn7joGHhPS5xDTI0l/
1rA/8Oh0GMAcSOQ0aSBtavmMCsyJJTjG7s6MpqiO5u/DHnb4MB1Jd0rEWNIjiVmJ
cqS8II+EbaLWZgt9r3W7ePhkfpHlLw+c4mwI9lRF7Zo67Rz9lrlt1l9YxPnJHewN
uTRn2ch5W90Jr289wymDNQZGGvCyr+nxKYlSd+kXjlH6poDn6bxYdKHgJexDYmZR
NVylbizS6lg=
=/uso
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk framework updates from Stephen Boyd:
"We have a couple new features and changes in the core clk framework
this time around because we've finally gotten around to fixing some
long standing issues. There's still work to do though, so this pull
request is largely laying down the foundation for all the driver
changes to come in the next merge window.
The first problem we're alleviating is how parents of clks are
specified. With the new method, we should see lots of drivers migrate
away from the current design of string comparisons on the entire clk
tree to a more direct method where they can use clk_hw pointers or
more localized names specified in DT or via clkdev. This should reduce
our reliance on string comparisons for all the topology description
logic that we've been using for years and hopefully speed some things
up while avoiding problems we have with generating clk names.
Beyond that we also got rid of the CLK_IS_BASIC flag because it wasn't
really helping anyone and we introduced big-endian versions of the
basic clk types so that we can get rid of clk_{readl,writel}(). Both
of these are things that driver developers have tried to use over the
years that I typically bat away during code reviews because they're
not useful. It's great to see these two things go away so maintainers
can save time not worrying about these things.
On the driver side we got the usual collection of new SoC support and
non-critical fixes and updates to existing code. The big topics that
stand out are the new driver support for Mediatek MT8183 and MT8516
SoCs, Amlogic Meson8b and G12a SoCs, and the SiFive FU540 SoC. The
other patches in the driver pile are mostly fixes for things that are
being used for the first time or additions for clks that couldn't be
tested before because there wasn't a consumer driver that exercised
them. Details are below and also in the sub-maintainer tags.
Core:
- Remove clk_readl() and introduce BE versions of basic clk types
- Rewrite how clk parents can be specified to allow DT/clkdev lookups
- Removal of the CLK_IS_BASIC clk flag
- Framework documentation updates and fixes
New Drivers:
- Support for STM32F769
- AT91 sam9x60 PMC support
- SiFive FU540 PRCI and PLL support
- Qualcomm QCS404 CDSP clk support
- Qualcomm QCS404 Turing clk support
- Mediatek MT8183 clock support
- Mediatek MT8516 clock support
- Milbeaut M10V clk controller support
- Support for Cirrus Logic Lochnagar clks
Updates:
- Rework AT91 sckc DT bindings
- Fix slow RC oscillator issue on sama5d3
- Mark UFS clk as critical on Hi-Silicon hi3660 SoCs
- Various static analysis fixes/finds and const markings
- Video Engine (ECLK) support on Aspeed SoCs
- Xilinx ZynqMP Versal platform support
- Convert Xilinx ZynqMP driver to be struct oriented
- Fixes for Rockchip rk3328 and rk3288 SoCs
- Sub-type for Rockchip SoCs where mux and divider aren't a single register
- Remove SNVS clock from i.MX7UPL clock driver and bindings
- Improve i.MX5 clock driver for i.MX50 support
- Addition of ADC clock definition for Exynos 5410 SoC (Odroid XU)
- Export a new clock for the MBUS controller on the A13
- Allwinner H6 fixes to support a finer clocking of the video and VPU engines
- Add g12a support in the Amlogic axg audio clock controller
- Add missing PCI USB clock on Rensas RZ/N1
- Add Z2 (Cortex-A53) clocks on Rensas R-Car E3 and RZ/G2E
- A new helper DIV64_U64_ROUND_CLOSEST() in <linux/math64.h>
- VPU and Video Decoder clocks on Amlogic Meson8b
- Finally remove the wrong ABP Meson8b clock id
- Add Video Decoder, PCIe PLL, and CPU Clocks on Amlogic G12A
- Re-expose SAR_ADC_SEL and CTS_OSCIN on Amlogic G12A AO clock controller
- Un-expose some Amlogic AXG-Audio input clocks IDs"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (172 commits)
clk: Cache core in clk_fetch_parent_index() without names
clk: imx: correct pfdv2 gate_bit/vld_bit operations
clk: sifive: add a driver for the SiFive FU540 PRCI IP block
clk: analogbits: add Wide-Range PLL library
clk: imx: clk-pllv3: mark expected switch fall-throughs
clk: imx8mq: Add dsi_ipg_div
clk: imx: pllv4: add fractional-N pll support
clk: sunxi-ng: Use the correct style for SPDX License Identifier
clk: sprd: Use the correct style for SPDX License Identifier
clk: renesas: Use the correct style for SPDX License Identifier
clk: qcom: Use the correct style for SPDX License Identifier
clk: davinci: Use the correct style for SPDX License Identifier
clk: actions: Use the correct style for SPDX License Identifier
clk: imx: keep uart clock on during system boot
clk: imx: correct i.MX7D AV PLL num/denom offset
dt-bindings: clk: add documentation for the SiFive PRCI driver
clk: stm32mp1: Add ddrperfm clock
clk: Remove CLK_IS_BASIC clk flag
clock: milbeaut: Add Milbeaut M10V clock controller
dt-bindings: clock: milbeaut: add Milbeaut clock description
...
The reduce_memory() function clampls the available memory to a limit
defined by the "mem=" command line parameter. It takes into account the
amount of already reserved memory and excludes it from the limit
calculations.
Rather than traverse memblocks and remove them by hand, use
memblock_reserved_size() to account the reserved memory and
memblock_enforce_memory_limit() to clamp the available memory.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull intgrity updates from James Morris:
"This contains just three patches, the remainder were either included
in other pull requests (eg. audit, lockdown) or will be upstreamed via
other subsystems (eg. kselftests, Power).
Included here is one bug fix, one documentation update, and extending
the x86 IMA arch policy rules to coordinate the different kernel
module signature verification methods"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
doc/kernel-parameters.txt: Deprecate ima_appraise_tcb
x86/ima: add missing include
x86/ima: require signed kernel modules
Use __iomem pointers for efficiency to write the DMA registers.
This avoids the compiler emitting several instructions for each access
to calculate the register address.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
All we really need is the DMA address and size, we don't need the
baggage of a full scatterlist structure.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
DMA got broken a while back in two different ways:
1) a change in the behaviour of disable_irq() to wait for the interrupt
to finish executing causes us to deadlock at the end of DMA.
2) a change to avoid modifying the scatterlist left the first transfer
uninitialised.
DMA is only used with expansion cards, so has gone unnoticed.
Fixes: fa4e998999 ("[ARM] dma: RiscPC: don't modify DMA SG entries")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Multiple printk() statements appear to get broken into separate lines,
which messes up the formatting. Fix these up.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Fix lack of keyboard interrupts for RiscPC due to incorrect conversion.
Fixes: e8d36d5dbb ("ARM: kill off set_irq_flags usage")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
- remove the already broken support for NULL dev arguments to the
DMA API calls
- Kconfig tidyups
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlzT00YLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYO66hAAx2kCUIh+K2gFB5uxHqZiG62UmRjkPzolxcR5/Jx9
4Rz6NRAE+rp8v2fbBr2bveDx7cF5bm1L+pRyRsFMfwkm3a8dCHQ51ldIm5VFoI3e
NiX6Zoxk02BCXP/Qk//aHeNW9dBmuiemiXzdPEhOvWvVzqTO5JZrECQpkHEkG+8A
R/IWU15sr5xzw9Td/HVN9CRJri/qiTAuB9nSoP6BGjZeHkQjREJKNMGKDTvSzH4L
tlyD1G7yEymQvLBqGGO64ztuav00l8sqjI3tn1mmwpw4VTajabeRHPnWh+7g9Od+
sH1pRvIOTvEMc456fizufYIOedB5Ze344kgfrxhngRbBVXmMfShr8ZLzdIUGhGjY
1cdGqIUOEKywiDf13KrHVkNU+lJtvjMCMxvV93mAYRLOIQg0Jf4T2kklgKyEhqrG
rqFdbbtSBzmLjPyqc1FS0heDWmA+yJsKAumGcH4blJXCpsD1rHWGe0AJ34x+OHPT
tw5l+P4zAH1eO1qHCtmxN9s0lXZv1VLcFkOrJH91LPvAhZsUCrdqDjyJpTUYaIao
yzkiLbDwFO7SVoqzaVNlVZIJ/9LX0qfAnl2Atty+sAQomrQMoviNSzGbLSLQqhHN
FbTIEBMxrxS49+3lfzHOS/lYPpJp6B31yotNM+6YpXmbRQZN5gjGNYBqhKD+7Rgn
L0Y=
=IdsP
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.2' of git://git.infradead.org/users/hch/dma-mapping
Pull DMA mapping updates from Christoph Hellwig:
- remove the already broken support for NULL dev arguments to the DMA
API calls
- Kconfig tidyups
* tag 'dma-mapping-5.2' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence
dma-mapping: remove an unnecessary NULL check
x86/dma: Remove the x86_dma_fallback_dev hack
dma-mapping: remove leftover NULL device support
arm: use a dummy struct device for ISA DMA use of the DMA API
pxa3xx-gcu: pass struct device to dma_mmap_coherent
gbefb: switch to managed version of the DMA allocator
da8xx-fb: pass struct device to DMA API functions
parport_ip32: pass struct device to DMA API functions
dma: select GENERIC_ALLOCATOR for DMA_REMAP
Replace the old gettimeoffset() interface (which became buggy in
several ways) with a clocksource that atomically reads the count
and status from the timer, and corrects the count as appropriate
ensuring proper resolution of time without time warping backwards.
We keep the original periodic timer non-clock event implementation
to provide the kernel with a regular source of interrupts, which
are required to keep the clocksource properly updated.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The APIC timer calibration (calibrate_APIC_timer()) can be skipped
in cases where we know the APIC timer frequency. On Intel SoCs,
we believe that the APIC is fed by the crystal clock; this would make
sense, and the crystal clock frequency has been verified against the
APIC timer calibration result on ApolloLake, GeminiLake, Kabylake,
CoffeeLake, WhiskeyLake and AmberLake.
Set lapic_timer_period based on the crystal clock frequency
accordingly.
APIC timer calibration would normally be skipped on modern CPUs
by nature of the TSC deadline timer being used instead,
however this change is still potentially useful, e.g. if the
TSC deadline timer has been disabled with a kernel parameter.
calibrate_APIC_timer() uses the legacy timer, but we are seeing
new platforms that omit such legacy functionality, so avoiding
such codepaths is becoming more important.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-3-drake@endlessm.com
Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1904031206440.1967@nanos.tec.linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This variable is a period unit (number of clock cycles per jiffy),
not a frequency (which is number of cycles per second).
Give it a more appropriate name.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-2-drake@endlessm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
native_calibrate_tsc() had a data mapping Intel CPU families
and crystal clock speed, but hardcoded tables are not ideal, and this
approach was already problematic at least in the Skylake X case, as
seen in commit:
b511203093 ("x86/tsc: Fix erroneous TSC rate on Skylake Xeon")
By examining CPUID data from http://instlatx64.atw.hu/ and units
in the lab, we have found that 3 different scenarios need to be dealt
with, and we can eliminate most of the hardcoded data using an approach a
little more advanced than before:
1. ApolloLake, GeminiLake, CannonLake (and presumably all new chipsets
from this point) report the crystal frequency directly via CPUID.0x15.
That's definitive data that we can rely upon.
2. Skylake, Kabylake and all variants of those two chipsets report a
crystal frequency of zero, however we can calculate the crystal clock
speed by condidering data from CPUID.0x16.
This method correctly distinguishes between the two crystal clock
frequencies present on different Skylake X variants that caused
headaches before.
As the calculations do not quite match the previously-hardcoded values
in some cases (e.g. 23913043Hz instead of 24MHz), TSC refinement is
enabled on all platforms where we had to calculate the crystal
frequency in this way.
3. Denverton (GOLDMONT_X) reports a crystal frequency of zero and does
not support CPUID.0x16, so we leave this entry hardcoded.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-1-drake@endlessm.com
Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This is a bit of a mess, to put it mildly. But, it's a bug
that only seems to have showed up in 4.20 but wasn't noticed
until now, because nobody uses MPX.
MPX has the arch_unmap() hook inside of munmap() because MPX
uses bounds tables that protect other areas of memory. When
memory is unmapped, there is also a need to unmap the MPX
bounds tables. Barring this, unused bounds tables can eat 80%
of the address space.
But, the recursive do_munmap() that gets called vi arch_unmap()
wreaks havoc with __do_munmap()'s state. It can result in
freeing populated page tables, accessing bogus VMA state,
double-freed VMAs and more.
See the "long story" further below for the gory details.
To fix this, call arch_unmap() before __do_unmap() has a chance
to do anything meaningful. Also, remove the 'vma' argument
and force the MPX code to do its own, independent VMA lookup.
== UML / unicore32 impact ==
Remove unused 'vma' argument to arch_unmap(). No functional
change.
I compile tested this on UML but not unicore32.
== powerpc impact ==
powerpc uses arch_unmap() well to watch for munmap() on the
VDSO and zeroes out 'current->mm->context.vdso_base'. Moving
arch_unmap() makes this happen earlier in __do_munmap(). But,
'vdso_base' seems to only be used in perf and in the signal
delivery that happens near the return to userspace. I can not
find any likely impact to powerpc, other than the zeroing
happening a little earlier.
powerpc does not use the 'vma' argument and is unaffected by
its removal.
I compile-tested a 64-bit powerpc defconfig.
== x86 impact ==
For the common success case this is functionally identical to
what was there before. For the munmap() failure case, it's
possible that some MPX tables will be zapped for memory that
continues to be in use. But, this is an extraordinarily
unlikely scenario and the harm would be that MPX provides no
protection since the bounds table got reset (zeroed).
I can't imagine anyone doing this:
ptr = mmap();
// use ptr
ret = munmap(ptr);
if (ret)
// oh, there was an error, I'll
// keep using ptr.
Because if you're doing munmap(), you are *done* with the
memory. There's probably no good data in there _anyway_.
This passes the original reproducer from Richard Biener as
well as the existing mpx selftests/.
The long story:
munmap() has a couple of pieces:
1. Find the affected VMA(s)
2. Split the start/end one(s) if neceesary
3. Pull the VMAs out of the rbtree
4. Actually zap the memory via unmap_region(), including
freeing page tables (or queueing them to be freed).
5. Fix up some of the accounting (like fput()) and actually
free the VMA itself.
This specific ordering was actually introduced by:
dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
during the 4.20 merge window. The previous __do_munmap() code
was actually safe because the only thing after arch_unmap() was
remove_vma_list(). arch_unmap() could not see 'vma' in the
rbtree because it was detached, so it is not even capable of
doing operations unsafe for remove_vma_list()'s use of 'vma'.
Richard Biener reported a test that shows this in dmesg:
[1216548.787498] BUG: Bad rss-counter state mm:0000000017ce560b idx:1 val:551
[1216548.787500] BUG: non-zero pgtables_bytes on freeing mm: 24576
What triggered this was the recursive do_munmap() called via
arch_unmap(). It was freeing page tables that has not been
properly zapped.
But, the problem was bigger than this. For one, arch_unmap()
can free VMAs. But, the calling __do_munmap() has variables
that *point* to VMAs and obviously can't handle them just
getting freed while the pointer is still in use.
I tried a couple of things here. First, I tried to fix the page
table freeing problem in isolation, but I then found the VMA
issue. I also tried having the MPX code return a flag if it
modified the rbtree which would force __do_munmap() to re-walk
to restart. That spiralled out of control in complexity pretty
fast.
Just moving arch_unmap() and accepting that the bonkers failure
case might eat some bounds tables seems like the simplest viable
fix.
This was also reported in the following kernel bugzilla entry:
https://bugzilla.kernel.org/show_bug.cgi?id=203123
There are some reports that this commit triggered this bug:
dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
While that commit certainly made the issues easier to hit, I believe
the fundamental issue has been with us as long as MPX itself, thus
the Fixes: tag below is for one of the original MPX commits.
[ mingo: Minor edits to the changelog and the patch. ]
Reported-by: Richard Biener <rguenther@suse.de>
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-um@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: stable@vger.kernel.org
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Link: http://lkml.kernel.org/r/20190419194747.5E1AD6DC@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJc04M6AAoJEAx081l5xIa+SJgP/0uIgIOM53vPpydgmr+2IEHF
jbDqrd+mipgNriRVHjDsWdUHCUNtyhB7YEBCMrj3mY0rRFI7FlQQf4lOwYGoHiKP
4JZg4kwC37997lFXl1uabGj3DmJLtxKL2/D15zCH/uLe+2EDzWznP6NVdFT3WK0P
YKZQCWT19PWSsLoBRPutWxkmop4AYvkqE0a6vXUlJlFYZK3Bbytx6/179uWKfiX5
ZkKEEtx1XiDAvcp5gBb6PISurycrBY0e/bkPBnK3ES5vawMbTU5IrmWOrQ4D8yOd
z9qOVZawZ6+b2XBDgBWjQ9bM7I5R7Il1q/LglYEaFI9+wHUnlUdDSm6ft5/5BiCZ
fqgkh5Bj2iEsajbSsacoljMOpxpYPqj63mqc+7fAGXF34V+B+9U1bpt8kCbMKowf
7Abb7IuiCR6vLDapjP6VqTMvdQ4O466OEAN83ULGFTdmMqYYH4AxaIwc+xcAk/aP
RNq7/RHhh4FRynRAj9fCkGlF3ArnM88gLINwWuEQq4SClWGcvdw7eaHpwWo77c4g
iccCnTLqSIg5pDVu07AQzzBlW6KulWxh5o72x+Xx+EXWdYUDHQ1SlNs11bSNUBV1
5MkrzY2GuD+NFEjsXJEDIPOr40mQOyJCXnxq8nXPsz/hD9kHeJPvWn3J3eVKyb5B
Z6/knNqM0BDn3SaYR/rD
=YFiQ
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"This has two exciting community drivers for ARM Mali accelerators.
Since ARM has never been open source friendly on the GPU side of the
house, the community has had to create open source drivers for the
Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx
series. Well done to all involved and hopefully this will help ARM
head in the right direction.
There is also now the ability if you don't have any of the legacy
drivers enabled (pre-KMS) to remove all the pre-KMS support code from
the core drm, this saves 10% or so in codesize on my machine.
i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo
moves out of staging.
There are also some rcar-du patches which crossover with media tree
but all should be acked by Mauro.
Summary:
uapi changes:
- Colorspace connector property
- fourcc - new YUV formts
- timeline sync objects initially merged
- expose FB_DAMAGE_CLIPS to atomic userspace
new drivers:
- vboxvideo: moved out of staging
- aspeed: ASPEED SoC BMC chip display support
- lima: ARM Mali4xx GPU acceleration driver support
- panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support
core:
- component helper docs
- unplugging fixes
- devm device init
- MIPI/DSI rate control
- shmem backed gem objects
- connector, display_info, edid_quirks cleanups
- dma_buf fence chain support
- 64-bit dma-fence seqno comparison fixes
- move initial fb config code to core
- gem fence array helpers for Lima
- ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size)
- lease fixes
ttm:
- unified DRM_FILE_PAGE_OFFSET handling
- Account for kernel allocations in kernel zone only
panel:
- OSD070T1718-19TS panel support
- panel-tpo-td028ttec1 backlight support
- Ronbo RB070D30 MIPI/DSI
- Feiyang FY07024DI26A30-D MIPI-DSI panel
- Rocktech jh057n00900 MIPI-DSI panel
i915:
- Comet Lake (Gen9) PCI IDs
- Updated Icelake PCI IDs
- Elkhartlake (Gen11) support
- DP MST property addtions
- plane and watermark fixes
- Icelake port sync and VEBOX disable fixes
- struct_mutex usage reduction
- Icelake gamma fix
- GuC reset fixes
- make mmap more asynchronous
- sound display power well race fixes
- DDI/MIPI-DSI clocks for Icelake
- Icelake RPS frequency changing support
- Icelake workarounds
amdgpu:
- Use HMM for userptr
- vega20 experimental smu11 support
- RAS support for vega20
- BACO support for vega12 + fixes for vega20
- reworked IH interrupt handling
- amdkfd RAS support
- Freesync improvements
- initial timeline sync object support
- DC Z ordering fixes
- NV12 planes support
- colorspace properties for planes=
- eDP opts if eDP already initialized
nouveau:
- misc fixes
etnaviv:
- misc fixes
msm:
- GPU zap shader support expansion
- robustness ABI addition
exynos:
- Logging cleanups
tegra:
- Shared reset fix
- CPU cache maintenance fix
cirrus:
- driver rewritten using simple helpers
meson:
- G12A support
vmwgfx:
- Resource dirtying management improvements
- Userspace logging improvements
virtio:
- PRIME fixes
rockchip:
- rk3066 hdmi support
sun4i:
- DSI burst mode support
vc4:
- load tracker to detect underflow
v3d:
- v3d v4.2 support
malidp:
- initial Mali D71 support in komeda driver
tfp410:
- omap related improvement
omapdrm:
- drm bridge/panel support
- drop some omap specific panels
rcar-du:
- Display writeback support"
* tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits)
drm/msm/a6xx: No zap shader is not an error
drm/cma-helper: Fix drm_gem_cma_free_object()
drm: Fix timestamp docs for variable refresh properties.
drm/komeda: Mark the local functions as static
drm/komeda: Fixed warning: Function parameter or member not described
drm/komeda: Expose bus_width to Komeda-CORE
drm/komeda: Add sysfs attribute: core_id and config_id
drm: add non-desktop quirk for Valve HMDs
drm/panfrost: Show stored feature registers
drm/panfrost: Don't scream about deferred probe
drm/panfrost: Disable PM on probe failure
drm/panfrost: Set DMA masks earlier
drm/panfrost: Add sanity checks to submit IOCTL
drm/etnaviv: initialize idle mask before querying the HW db
drm: introduce a capability flag for syncobj timeline support
drm: report consistent errors when checking syncobj capibility
drm/nouveau/nouveau: forward error generated while resuming objects tree
drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully"
drm/nouveau/i2c: Disable i2c bus access after ->fini()
drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition
...
When implementing the KUAP support on Radix we fixed one case where
mmu_has_feature() was being called too early in boot via
__put_user_size().
However since then some new code in linux-next has created a new path
via which we can end up calling mmu_has_feature() too early.
On P9 this leads to crashes early in boot if we have both PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. Our early boot code
calls printk() which calls probe_kernel_read(), that does a
__copy_from_user_inatomic() which calls into set_kuap() and that uses
mmu_has_feature().
At that point in boot we haven't patched MMU features yet so the debug
code in mmu_has_feature() complains, and calls printk(). At that point
we recurse, eg:
...
dump_stack+0xdc
probe_kernel_read+0x1a4
check_pointer+0x58
...
printk+0x40
dump_stack_print_info+0xbc
dump_stack+0x8
probe_kernel_read+0x1a4
probe_kernel_read+0x19c
check_pointer+0x58
...
printk+0x40
cpufeatures_process_feature+0xc8
scan_cpufeatures_subnodes+0x380
of_scan_flat_dt_subnodes+0xb4
dt_cpu_ftrs_scan_callback+0x158
of_scan_flat_dt+0xf0
dt_cpu_ftrs_scan+0x3c
early_init_devtree+0x360
early_setup+0x9c
And so on for infinity, symptom is a dead system.
Even more fun is what happens when using the hash MMU (ie. p8 or p9
with Radix disabled), and when we don't have
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. With the debug disabled
we don't check if static keys have been initialised, we just rely on
the jump label. But the jump label defaults to true so we just whack
the AMR even though Radix is not enabled.
Clearing the AMR is fine, but after we've done the user copy we write
(0b11 << 62) into AMR. When using hash that makes all pages with key
zero no longer readable or writable. All kernel pages implicitly have
key zero, and so all of a sudden the kernel can't read or write any of
its memory. Again dead system.
In the medium term we have several options for fixing this.
probe_kernel_read() doesn't need to touch AMR at all, it's not doing a
user access after all, but it uses __copy_from_user_inatomic() just
because it's easy, we could fix that.
It would also be safe to default to not writing to the AMR during
early boot, until we've detected features. But it's not clear that
flipping all the MMU features to static_key_false won't introduce
other bugs.
But for now just switch to early_mmu_has_feature() in set_kuap(), that
avoids all the problems with jump labels. It adds the overhead of a
global lookup and test, but that's probably trivial compared to the
writes to the AMR anyway.
Fixes: 890274c2dc ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Russell Currey <ruscur@russell.cc>
There is only one caller of iommu_get_one left, so merge it into
that one to clean things up a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This means we handle > PAGE_SIZE offsets fine, and grow the size check
so far only performed in the map_page path. We lose the optimization
to not double flush a page if it apears in multiple consecutive SG list
entries. But at least for block I/O those don't happen anymore since
we properly merge in higher layers anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
__sbus_iommu_map_page currently assumes all pages are mapped into the
kernel direct mapping. Switch to using physical address instead of
virtual ones for all the normal mapping operations, and only use
the virtual addresses for cache flushing when not operating on
a highmem page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This prepares for reusing __sbus_iommu_map_page in the map_sg path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need for the page structure, just the paddr / pfn. This is
going to simplify fixes to the callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Share the code for the global and per-page flush map_sg loops using a
simple bool parameter to disable the per-page flush for the former
variant.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is only one caller of iommu_release_one left, so merge it into
that one to clean things up a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the page-level helper instead of duplicating the logic, while also
fixing the incorrect handling of larger than page sized offsets in
the sg variant.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This deobsfucates the check a bit, and prepares for future changes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
$(call if_changed,...) must have FORCE as a prerequisite.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove duplicate header which is included twice.
Signed-off-by: Jagadeesh Pagadala <jagdsh.linux@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- A set of memblock initialization improvements thanks to Serge Semin,
tidying up after our conversion from bootmem to memblock back in
v4.20.
- Our eBPF JIT the previously supported only MIPS64r2 through MIPS64r5
is improved to also support MIPS64r6. Support for MIPS32 systems is
introduced, with the caveat that it only works for programs that don't
use 64 bit registers or operations - those will bail out & need to be
interpreted.
- Improvements to the allocation & configuration of our exception vector
that should fix issues seen on some platforms using recent versions of
U-Boot.
- Some minor improvements to code generated for jump labels, along with
enabling them by default for generic kernels.
-----BEGIN PGP SIGNATURE-----
iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXNNB2RUccGF1bC5idXJ0
b25AbWlwcy5jb20ACgkQPqefrLV1AN1zeAD/U/ScowcQE8ynoY97nA70d3UmbETH
YETUX5WcOfR65O8A/1hvMX8QJ1x87XUlNTkE6Gdh/itAZJpJWiSo3dnd1GoF
=L9IJ
-----END PGP SIGNATURE-----
Merge tag 'mips_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Paul Burton:
- A set of memblock initialization improvements thanks to Serge Semin,
tidying up after our conversion from bootmem to memblock back in
v4.20.
- Our eBPF JIT the previously supported only MIPS64r2 through MIPS64r5
is improved to also support MIPS64r6. Support for MIPS32 systems is
introduced, with the caveat that it only works for programs that
don't use 64 bit registers or operations - those will bail out & need
to be interpreted.
- Improvements to the allocation & configuration of our exception
vector that should fix issues seen on some platforms using recent
versions of U-Boot.
- Some minor improvements to code generated for jump labels, along with
enabling them by default for generic kernels.
* tag 'mips_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (27 commits)
mips: Manually call fdt_init_reserved_mem() method
mips: Make sure dt memory regions are valid
mips: Perform early low memory test
mips: Dump memblock regions for debugging
mips: Add reserve-nomap memory type support
mips: Use memblock to reserve the __nosave memory range
mips: Discard post-CMA-init foreach loop
mips: Reserve memory for the kernel image resources
MIPS: Remove duplicate EBase configuration
MIPS: Sync icache for whole exception vector
MIPS: Always allocate exception vector for MIPSr2+
MIPS: Use memblock_phys_alloc() for exception vector
mips: Combine memblock init and memory reservation loops
mips: Discard rudiments from bootmem_init
mips: Make sure kernel .bss exists in boot mem pool
mips: vdso: drop unnecessary cc-ldoption
Revert "MIPS: ralink: fix cpu clock of mt7621 and add dt clk devices"
MIPS: generic: Enable CONFIG_JUMP_LABEL
MIPS: jump_label: Use compact branches for >= r6
MIPS: jump_label: Remove redundant nops
...
- allow users to invoke 'make' out of the source tree
- refactor scripts/mkmakefile
- deprecate KBUILD_SRC, which was used to track the source tree
location for O= build.
- fix recordmcount.pl in case objdump output is localized
- turn unresolved symbols in external modules to errors from warnings
by default; pass KBUILD_MODPOST_WARN=1 to get them back to warnings
- generate modules.builtin.modinfo to collect .modinfo data from
built-in modules
- misc Makefile cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJc0upmAAoJED2LAQed4NsGreoP/jkKQojVdzTjM/nn/Xe9FrE9
elQ59Wu/gW/q7jouN733WTJ2fRPAEVaGPw/CT/cCRWjNJjd6MM4voDKVxifZE+u4
qz5XUGq4Mjozd0MJdU1LA1dIRJrEaPofOj7E0JEIerEdLPKpv4kjCCbbHOZwN2Kl
YrpPFJkYspCRVG+txEWY8YaZeU/+OAdNckJnMkX8hnUwsNplAlPw4L/5Y12Uncuz
7g/3T1f701pJhStoO4OPR/+Rivi0EX5AP7TZyv2/FOwTO+Oau5G1rF0j1BT0+ceg
d+bApuaHLR4ocZ6GvWfACMYhH7ais0BUgAwi6HY/b78SGMmNB6xlD5biVqcklx6c
mjrPNaj6WJEQIflZ3yr83tRNhTu7xXRKZkrGHHUjOGNsRULuD8RFrXtusl06A22S
hA3vt2SF4FuLBabPw5yFkKNOTX+P5dG79BC1fWIQjglBGoYqWTRQNYoottr/7t2T
Si+E6J2lAD0EDNuc/EFKWfrgONfatgcdgQuTibwCcE2KfYnMG/C+9DvxRV4lAQtn
fYpap9gws+0AisbVx0m1W088NMFiwrMbN3n3KITGY/15XDmeySEA1QgPKFN8JHJd
Do0duaMx8BHjkA5ms/Bf0AZd0tWUkWVdt0epwU2KafWNGK7VaPnhKxn5eQOGkWV+
gT3J3XZc3b7OEOQ4XX3Z
=zj9w
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- allow users to invoke 'make' out of the source tree
- refactor scripts/mkmakefile
- deprecate KBUILD_SRC, which was used to track the source tree
location for O= build.
- fix recordmcount.pl in case objdump output is localized
- turn unresolved symbols in external modules to errors from warnings
by default; pass KBUILD_MODPOST_WARN=1 to get them back to warnings
- generate modules.builtin.modinfo to collect .modinfo data from
built-in modules
- misc Makefile cleanups
* tag 'kbuild-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
.gitignore: add more all*.config patterns
moduleparam: Save information about built-in modules in separate file
Remove MODULE_ALIAS() calls that take undefined macro
.gitignore: add leading and trailing slashes to generated directories
scripts/tags.sh: fix direct execution of scripts/tags.sh
scripts: override locale from environment when running recordmcount.pl
samples: kobject: allow CONFIG_SAMPLE_KOBJECT to become y
samples: seccomp: turn CONFIG_SAMPLE_SECCOMP into a bool option
kbuild: move Documentation to vmlinux-alldirs
kbuild: move samples/ to KBUILD_VMLINUX_OBJS
modpost: make KBUILD_MODPOST_WARN also configurable for external modules
kbuild: check arch/$(SRCARCH)/include/generated before out-of-tree build
kbuild: remove unneeded dependency for include/config/kernel.release
memory: squash drivers/memory/Makefile.asm-offsets
kbuild: use $(srctree) instead of KBUILD_SRC to check out-of-tree build
kbuild: mkmakefile: generate a simple wrapper of top Makefile
kbuild: mkmakefile: do not check the generated Makefile marker
kbuild: allow Kbuild to start from any directory
kbuild: pass $(MAKECMDGOALS) to sub-make as is
kbuild: fix warning "overriding recipe for target 'Makefile'"
...
Here are the patches which made on 5.1-rc6 and all are tested in our
buildroot gitlab CI:
https://gitlab.com/c-sky/buildroot/pipelines/57892579
- Fixup vdsp&fpu issues in kernel
- Add dynamic function tracer
- Use in_syscall & forget_syscall instead of r11_sig
- Reconstruct signal processing
- Support dynamic start physical address
- Fixup wrong update_mmu_cache implementation
- Support vmlinux bootup with MMU off
- Use va_pa_offset instead of phys_offset
- Fixup syscall_trace return processing flow
- Add perf callchain support
- Add perf_arch_fetch_caller_regs support
- Add page fault perf event support
- Add support for perf registers sampling
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE2KAv+isbWR/viAKHAXH1GYaIxXsFAlzReFYSHHJlbl9ndW9A
Yy1za3kuY29tAAoJEAFx9RmGiMV71cwP/RQX7VHqH4Slak5myDlHCabzkzYJjU/a
zy8RdFSfzjXkNWzz8zo68Ze+DqDBDLfRHftqikg7CpZF6nQo5vC0VfDtKJZ6x2bi
8bzuTIb5q0eZEUJmJROjnT4At3mVr+sCZlrEshI0DIAbd1BT34LLF2hE06pbBlej
kH9ngVJw2XpIFJE/S2Gs0ieftOuOEvXgwxFdXrEdfr2TFR+Z9UGd8imLL4LF/2Fu
rbfxFDNUAAmm6CzIzDz7BMGlQAXdMGsQiQDa3YuJ/v4yg5EuqzDz+p1bwEGxGelB
R0FYjOxfLB8PUkpigb3v1E21eUHbVQsWgvWdH0ksoJIFUboH0TRRmtsjQnbBaSkE
DT5tJHTzqTIIawo+PuOlQ8to3i8L6oWE0MlNG354wjgwfus21v2mWwbXzWtnYsN+
dv4VZ9NDwifrpGY6k9j+vVmqzQ2FYBtrX4R/oPBc3StYdPqKJHp2K87Y3UQVLqPH
D7GxV6LHLy4RlpSbA4Zq6zZRQTQmgLFaVRXbnMuicy1QZyYzp8wagpYRh+XJVvg7
rQr8EnP2B7G8N1JLrh3B46TxuZHOSQXXTsg3fbcIlv5at3vQ1xx6KU2InKLBvALU
3ICsOnJ/UzID35zmTY0KSxzEL7u5LxJUYPd/YxKpfuN405jvnvnSCAYKm0fFuYnJ
tFVd9uZc5W4q
=oQ+E
-----END PGP SIGNATURE-----
Merge tag 'csky-for-linus-5.2-rc1' of git://github.com/c-sky/csky-linux
Pull arch/csky updates from Guo Ren:
- Fixup vdsp&fpu issues in kernel
- Add dynamic function tracer
- Use in_syscall & forget_syscall instead of r11_sig
- Reconstruct signal processing
- Support dynamic start physical address
- Fixup wrong update_mmu_cache implementation
- Support vmlinux bootup with MMU off
- Use va_pa_offset instead of phys_offset
- Fixup syscall_trace return processing flow
- Add perf callchain support
- Add perf_arch_fetch_caller_regs support
- Add page fault perf event support
- Add support for perf registers sampling
* tag 'csky-for-linus-5.2-rc1' of git://github.com/c-sky/csky-linux:
csky/syscall_trace: Fixup return processing flow
csky: Fixup compile warning
csky: Add support for perf registers sampling
csky: add page fault perf event support
csky: Use va_pa_offset instead of phys_offset
csky: Support vmlinux bootup with MMU off
csky: Add perf_arch_fetch_caller_regs support
csky: Fixup wrong update_mmu_cache implementation
csky: Support dynamic start physical address
csky: Reconstruct signal processing
csky: Use in_syscall & forget_syscall instead of r11_sig
csky: Add non-uapi asm/ptrace.h namespace
csky: mm/fault.c: Remove duplicate header
csky: remove redundant generic-y
csky: Update syscall_trace_enter/exit implementation
csky: Add perf callchain support
csky/ftrace: Add dynamic function tracer (include graph tracer)
csky: Fixup vdsp&fpu issues in kernel
Nex drivers:
- New driver for Bitmain BM1880 pin controller
- New driver for Mediatek MT8516
- New driver for Cirrus Logich Lochnagar PMIC pins
Updates:
- Incremental development on Renesas SH-PFC
- Incremental development on Intel pin controller and some
particular updates for Cedarfork.
- Pin configuration support in Allwinner SunXi drivers
- Suspend/resume support in the NXP/Freescale i.MX8MQ driver
- Support for more packaging of the ST Micro STM32
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJc0ouWAAoJEEEQszewGV1zGe0P/jo3cOjcU1pZ1KUMWOBPdh4p
7Ks3sjjDPpJ5JB8Vv9piXhlglZQMixh2PFsmsWYN35CP+4but1BJ9x2NqAOrKEnY
3IJcaXxjyYLyqMlYTCLNLHcMca1Ku9KcGHMX0/5JKpM84qfrOMQmjzv9RMC9UcFb
3AAZgDHKGdu6kqs6nm5l62z5bg+GKTzjuOZ+a+YsiOLHRHz2loGdP7egZy6VtmuQ
L1bFibfJX6Nexmh/TF0iceX82385aPfU8AMbxdhfMUEI3OJqR4gtSEVOxzW1FJaA
+Ursxeess7QiAyyStOTszccOsJJe6ZjThU25s5kVLPygx0+kv2Z94Vvyb+ii1yU+
UWk1zbOcYkd1mPAA6NGgzrpLgUUrL8Wzz45OwZKFxLWHjNRB2vDI6p8ABL+TRlXA
cHOSpszHbAh/oqEZ45WmqFC0S011xmniOSqHuGHxanKAYKQjdaxO/rHtWkrgsEAJ
NybpV67Ar2UtW2/3VqQHV97FcPzXoUMorn1qFs8aZeFxkqjnEMXP4zCgujLeJNn7
Aa8Q9pWAOspMaudD96BUNsbjavn8PLRtzkOGRtL/Hfzfnt03SDTHUcrlxrwQr1fq
dpw1bn2XIOjWwgzyrYXJhQAVOADuRks7wSVAP+/Vg2Gke8XmzAJ4x9tiLKDy7oyE
DReRSj47xF01ns2AY9Bu
=ovk5
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"It is pretty calm and chill in pin control for the moment. Just
incremental development.
There is an odd patch to the Super-H architecture, it's coming from
the maintainers so should be fine.
Summary:
New drivers:
- Bitmain BM1880 pin controller
- Mediatek MT8516
- Cirrus Logich Lochnagar PMIC pins
Updates:
- Incremental development on Renesas SH-PFC
- Incremental development on Intel pin controller and some particular
updates for Cedarfork.
- Pin configuration support in Allwinner SunXi drivers
- Suspend/resume support in the NXP/Freescale i.MX8MQ driver
- Support for more packaging of the ST Micro STM32"
* tag 'pinctrl-v5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (72 commits)
pinctrl: mcp23s08: Do not complain about unsupported params
pinctrl: Rework Kconfig dependency for BM1880 pinctrl driver
MAINTAINERS: Add entry for BM1880 pinctrl
pinctrl: Add pinctrl support for BM1880 SoC
dt-bindings: pinctrl: Add BM1880 pinctrl binding
pinctrl: stm32: check irq controller availability at probe
pinctrl: mediatek: Add MT8516 Pinctrl driver
pinctrl: zte: fix leaked of_node references
pinctrl: intel: Increase readability of intel_gpio_update_pad_mode()
pinctrl: intel: Retain HOSTSW_OWN for requested gpio pin
pinctrl: pistachio: fix leaked of_node references
pinctrl: sunxi: Support I/O bias voltage setting on H6
pinctrl: sunxi: Prepare for alternative bias voltage setting methods
pinctrl: st: fix leaked of_node references
pinctrl: samsung: fix leaked of_node references
pinctrl: stm32: align stm32mp157 pin names
pinctrl: stm32: add package information for stm32mp157c
pinctrl: stm32: introduce package support
dt-bindings: pinctrl: stm32: add new entry for package information
pinctrl: imx8mq: Add suspend/resume ops
...
The commit
0a9fe8ca84 ("x86/mm: Validate kernel_physical_mapping_init() PTE population")
triggers this warning in SEV guests:
WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
Call Trace:
kernel_physical_mapping_init+0xce/0x259
early_set_memory_enc_dec+0x10f/0x160
kvm_smp_prepare_boot_cpu+0x71/0x9d
start_kernel+0x1c9/0x50b
secondary_startup_64+0xa4/0xb0
A SEV guest calls kernel_physical_mapping_init() to clear the encryption
mask from an existing mapping. While doing so, it also splits large
pages into smaller.
To split a page, kernel_physical_mapping_init() allocates a new page and
updates the existing entry. The set_{pud,pmd}_safe() helpers trigger a
warning when updating an entry with a page in the present state.
Add a new kernel_physical_mapping_change() helper which uses the
non-safe variants of set_{pmd,pud,p4d}() and {pmd,pud,p4d}_populate()
routines when updating the entry.
Since kernel_physical_mapping_change() may replace an existing
entry with a new entry, the caller is responsible to flush
the TLB at the end. Change early_set_memory_enc_dec() to use
kernel_physical_mapping_change() when it wants to clear the memory
encryption mask from the page table entry.
[ bp:
- massage commit message.
- flesh out comment according to dhansen's request.
- align function arguments at opening brace. ]
Fixes: 0a9fe8ca84 ("x86/mm: Validate kernel_physical_mapping_init() PTE population")
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190417154102.22613-1-brijesh.singh@amd.com
Here is the big set of USB and PHY driver patches for 5.2-rc1
There is the usual set of:
- USB gadget updates
- PHY driver updates and additions
- USB serial driver updates and fixes
- typec updates and new chips supported
- mtu3 driver updates
- xhci driver updates
- other tiny driver updates
Nothing really interesting, just constant forward progress.
All of these have been in linux-next for a while with no reported
issues. The usb-gadget and usb-serial trees were merged a bit "late",
but both of them had been in linux-next before they got merged here last
Friday.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXNKuwg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymRUgCfa8Ri7KrCaBR5NHQcLhbdrX90ToQAmgNw7vpo
fqt0XpNM0CSa9O/gOr79
=8HFh
-----END PGP SIGNATURE-----
Merge tag 'usb-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/PHY updates from Greg KH:
"Here is the big set of USB and PHY driver patches for 5.2-rc1
There is the usual set of:
- USB gadget updates
- PHY driver updates and additions
- USB serial driver updates and fixes
- typec updates and new chips supported
- mtu3 driver updates
- xhci driver updates
- other tiny driver updates
Nothing really interesting, just constant forward progress.
All of these have been in linux-next for a while with no reported
issues. The usb-gadget and usb-serial trees were merged a bit "late",
but both of them had been in linux-next before they got merged here
last Friday"
* tag 'usb-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (206 commits)
USB: serial: f81232: implement break control
USB: serial: f81232: add high baud rate support
USB: serial: f81232: clear overrun flag
USB: serial: f81232: fix interrupt worker not stop
usb: dwc3: Rename DWC3_DCTL_LPM_ERRATA
usb: dwc3: Fix default lpm_nyet_threshold value
usb: dwc3: debug: Print GET_STATUS(device) tracepoint
usb: dwc3: Do core validation early on probe
usb: dwc3: gadget: Set lpm_capable
usb: gadget: atmel: tie wake lock to running clock
usb: gadget: atmel: support USB suspend
usb: gadget: atmel_usba_udc: simplify setting of interrupt-enabled mask
dwc2: gadget: Fix completed transfer size calculation in DDMA
usb: dwc2: Set lpm mode parameters depend on HW configuration
usb: dwc2: Fix channel disable flow
usb: dwc2: Set actual frame number for completed ISOC transfer
usb: gadget: do not use __constant_cpu_to_le16
usb: dwc2: gadget: Increase descriptors count for ISOC's
usb: introduce usb_ep_type_string() function
usb: dwc3: move synchronize_irq() out of the spinlock protected block
...
Nicolai Stange discovered[1] that if live kernel patching is enabled, and the
function tracer started tracing the same function that was patched, the
conversion of the fentry call site during the translation of going from
calling the live kernel patch trampoline to the iterator trampoline, would
have as slight window where it didn't call anything.
As live kernel patching depends on ftrace to always call its code (to
prevent the function being traced from being called, as it will redirect
it). This small window would allow the old buggy function to be called, and
this can cause undesirable results.
Nicolai submitted new patches[2] but these were controversial. As this is
similar to the static call emulation issues that came up a while ago[3].
But after some debate[4][5] adding a gap in the stack when entering the
breakpoint handler allows for pushing the return address onto the stack to
easily emulate a call.
[1] http://lkml.kernel.org/r/20180726104029.7736-1-nstange@suse.de
[2] http://lkml.kernel.org/r/20190427100639.15074-1-nstange@suse.de
[3] http://lkml.kernel.org/r/3cf04e113d71c9f8e4be95fb84a510f085aa4afa.1541711457.git.jpoimboe@redhat.com
[4] http://lkml.kernel.org/r/CAHk-=wh5OpheSU8Em_Q3Hg8qw_JtoijxOdPtHru6d+5K8TWM=A@mail.gmail.com
[5] http://lkml.kernel.org/r/CAHk-=wjvQxY4DvPrJ6haPgAa6b906h=MwZXO6G8OtiTGe=N7_w@mail.gmail.com
[
Live kernel patching is not implemented on x86_32, thus the emulate
calls are only for x86_64.
]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: the arch/x86 maintainers <x86@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nayna Jain <nayna@linux.ibm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: "open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: b700e7f03d ("livepatch: kernel: add support for live patching")
Tested-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Changed to only implement emulated calls for x86_64 ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
In order to allow breakpoints to emulate call instructions, they need to push
the return address onto the stack. The x86_64 int3 handler adds a small gap
to allow the stack to grow some. Use this gap to add the return address to
be able to emulate a call instruction at the breakpoint location.
These helper functions are added:
int3_emulate_jmp(): changes the location of the regs->ip to return there.
(The next two are only for x86_64)
int3_emulate_push(): to push the address onto the gap in the stack
int3_emulate_call(): push the return address and change regs->ip
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: the arch/x86 maintainers <x86@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nayna Jain <nayna@linux.ibm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: "open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: b700e7f03d ("livepatch: kernel: add support for live patching")
Tested-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Modified to only work for x86_64 and added comment to int3_emulate_push() ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
To allow an int3 handler to emulate a call instruction, it must be able to
push a return address onto the stack. Add a gap to the stack to allow the
int3 handler to push the return address and change the return from int3 to
jump straight to the emulated called function target.
Link: http://lkml.kernel.org/r/20181130183917.hxmti5josgq4clti@treble
Link: http://lkml.kernel.org/r/20190502162133.GX2623@hirez.programming.kicks-ass.net
[
Note, this is needed to allow Live Kernel Patching to not miss calling a
patched function when tracing is enabled. -- Steven Rostedt
]
Cc: stable@vger.kernel.org
Fixes: b700e7f03d ("livepatch: kernel: add support for live patching")
Tested-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix the following build error when the kernel is built with CONFIG_KASAN
broken since commit 98587c2d89 ("s390: simplify disabled_wait"):
arch/s390/mm/kasan_init.c: In function 'kasan_early_panic':
arch/s390/mm/kasan_init.c:31:2: error: too many arguments to function
'disabled_wait'
31 | disabled_wait(0);
Fixes: 98587c2d89 ("s390: simplify disabled_wait")
Reported-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The recently introduced XUSB support for Jetson TX2 is causing boot, CPU
hotplug and suspend/resume failures according to several reports.
Temporarily work around this by disabling the XUSB controller and XUSB
pad controller nodes in device tree, while we figure out what's causing
this.
Reported-by: Bitan Biswas <bbiswas@nvidia.com>
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Tested-by: Bitan Biswas <bbiswas@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Commit 954a03be03 ("iommu/arm-smmu: Break insecure users by disabling
bypass by default") intentionally breaks all devices using the SMMU in
bypass mode. This breaks, among other things, PCI support on Tegra186.
Fix this by populating the iommus property and friends for the PCIe
controller.
Fixes: 954a03be03 ("iommu/arm-smmu: Break insecure users by disabling bypass by default")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Commit 954a03be03 ("iommu/arm-smmu: Break insecure users by disabling
bypass by default") intentionally breaks all devices using the SMMU in
bypass mode. This is breaking various devices on Tegra186 which include
the ethernet, BPMP and HDA device. Fix this by populating the iommus
property for these devices with their stream ID.
Fixes: 954a03be03 ("iommu/arm-smmu: Break insecure users by disabling bypass by default")
Signed-off-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Enable ARM_GIC_PM for 64-bit Tegra devices. This is required to ensure
that the driver gets built into kernel and helps to register the AGIC
device when enabled in DT.
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
nested_run_pending=1 implies we have successfully entered guest mode.
Move setting from external state in vmx_set_nested_state() until after
all other checks are complete.
Based on a patch by Aaron Lewis.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move call to nested_enable_evmcs until after free_nested() is complete.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function is referenced from assembler, so in LTO
it needs to be global and visible to not be optimized away.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20190330004743.29541-7-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>