With VMBus protocol 5.0, we're able to better support new features, e.g.
running two or more VMBus drivers simultaneously in a single VM -- note:
we can't simply load the current VMBus driver twice, instead, a secondary
VMBus driver must be implemented.
This patch adds the support for the new VMBus protocol, which is available
on new Windows hosts, by:
1) We still use SINT2 for compatibility;
2) We must use Connection ID 4 for the Initiate Contact Message, and for
subsequent messages, we must use the Message Connection ID field in
the host-returned VersionResponse Message.
Notes for developers of the secondary VMBus driver:
1) Must use VMBus protocol 5.0 as well;
2) Must use a different SINT number that is not in use.
3) Must use Connection ID 4 for the Initiate Contact Message, and for
subsequent messages, must use the Message Connection ID field in
the host-returned VersionResponse Message.
4) It's possible that the primary VMBus driver using protocol version 4.0
can work with a secondary VMBus driver using protocol version 5.0, but it's
recommended that both should use 5.0 for new Hyper-V features in the future.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up existing instances of unnecessary parentheses in if
statement and change order of conditionals to make it easier to read
The opening /* should be followed by a single space and the closing */
should be preceded with a single space.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Commit 9e343e87d2 ("mtd: cfi: convert inline functions to macros")
changed map_word_andequal() into a macro, but also changed the right
hand side of the comparison from val3 to val2. Change it back to use
val3 on the right hand side.
Thankfully this did not cause a regression because all callers
currently pass the same argument for val2 and val3.
Fixes: 9e343e87d2 ("mtd: cfi: convert inline functions to macros")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
s390 is now the last architecture that entirely overwrites
local_softirq_pending() and uses the according default definitions of
set_softirq_pending() and or_softirq_pending().
Just move these to s390 to debloat the generic code complexity.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/1525786706-22846-12-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Consolidate and optimize default softirq mask API implementations.
Per-CPU operations are expected to be faster and a few architectures
already rely on them to implement local_softirq_pending() and related
accessors/mutators. Those will be migrated to the new generic code.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/1525786706-22846-6-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to optimize and consolidate softirq mask accesses, let's
convert the default irq_cpustat_t implementation to per-CPU standard API.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/1525786706-22846-5-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- AEMIF driver update to support board files and remove
need of mach-davinci aemif code
- Use percpu counters for qmss datapath stats
- License update for TI SCI
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJa8hefAAoJEHJsHOdBp5c/Z7gP/RJcEM/bUrmIj+iAf+h/Azp3
f5KiFrBmwIcRPC4VULRL06uuRgiExWcDY6j3gheKdJzqHOKprRysdRDEkHLKnmoy
EGUS2HKo6Bbig/G/lMy9YhmrOEqm2tsh008TwSj6V8ZHSXgdyd3R4Kbe0YM5bbVp
TMUMTuGN6EP1RMAMk4zh9jGxCSDgfzI6FZd2Yf8pxAhsIAa7ssbzGGT85p3lBP3T
PTQ5h/aMP833gf7Ir7z5wEdqvdmfLVIxyu2bOzbP+rPUnaUGI9E2qaKPgiz8Tw5W
sCICBv2ELauoSyLLUaJ8BOVU6pI07Cm0DisdUmhuHex3EOm+U1Leg3XX5pyJ/Sh7
sGeLfFpMYWflinp6owq3J/z4sOQCq6pHugfS+6H3k6+uBX1S4RVkhcLda/Z0zz7p
LAGpFkSSHP1vFtQLU7phSu7m+v4KqtnUEDZelCLroJIkncpwotYaAjisIOxOif0j
WgNCHMNdTF/oxGCWBxqk5GH3bvZg53uUK9iy+WjgfKLOe+2RKEM+MbPpmVBWJxag
ZV9vwRpgmi/5I/cUNt7sTJ8ine66I5N68ps4K6e8I8JNHqBqpcZFQwnZ+x+Ib0l9
vv3unT92f/7q2XtIGkxkDNgqnsLF/b6sxqmy8jnNme/FIOWPX4iq2TGSnZ151FQ1
nH3qGLDyn3UAEdBXMnkb
=uqd7
-----END PGP SIGNATURE-----
Merge tag 'soc_drivers_for_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into next/drivers
ARM: SOC driver update for 4.18
- AEMIF driver update to support board files and remove
need of mach-davinci aemif code
- Use percpu counters for qmss datapath stats
- License update for TI SCI
* tag 'soc_drivers_for_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
firmware: ti_sci: Switch to SPDX Licensing
soc: ti: knav_qmss: Use percpu instead atomic for stats counter
memory: aemif: add support for board files
memory: aemif: don't rely on kbuild for driver's name
Signed-off-by: Olof Johansson <olof@lixom.net>
This series improves the ti-sysc interconnect target module driver to
the point where a most of SoC can be booted with interconnect target
module data configured in device tree instead of legacy platform data.
The related device tree changes need some more work though, and can
wait for v4.19. Also some drivers using nested interconnects like DSS
need more work.
We can now remove the unused pm-noop code that is not doing anything
any longer. And we can now initialize things for PM and display pdata
later to prepare things for using ti-sysc driver.
We also need to add some more quirk handling so we can boot both with
platform data and dts data.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAlrsg9QRHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXNmwxAAzPt1GpHQSw/XhhtK8+DLlqo9fdPQ9C65
Iw+PyvQYy2bTj0y64VkZ4Msmi3SOhfr5zKhIrwBHEG59/LW81oXqnb9JZHPP+YC+
/A6ZufXIt8X+Nd1L6id2OD9ItJoLXk7llBvDckwb/zUgcVib9cA79GvWgexfRWBE
w/bjZfUibddYdhKoCGmvWcZBapDKHMfv8MdN8h0QUyofTIefZZeykRvb1Pmn7Ntl
vz3QfPUq3oyRfG9PMRI7mjHrW7jxEKgjvWANbUg64UQJN7s1tfa8ICpzycc4/X/a
pdetH7G+BPaRdeqDCmGrcGHfO4b5HyD7nkTD3R6yzV+Dw8nWl+aWGJHAsPYRUJkd
o/BroflhqK2ICfEkeK6AWebbicOSlF5P+EEFwp6pHSd/9JiEqR1IkhcCvTdV8CB1
qyUQxD+iKof+rY5f1EicaGq8HXhkV+9aIOoqBH6C0qObEJDUWvVoGIzDdN2vwVAu
C2w9WqdQII3R4g2ZX1SmdEqFO/f6PkAoKiyNt+WGBGBUfYo1sfwpkFAEeGU50moJ
5m9TtLcAbbvgMwy2ttfWcHPn5z3p4Ocf7aN93TZ6RPk6A6R57PzCcYqJ2bXsumeV
5yaP9w4pbFj+FQuu8jA8s/cSwhIP8SwqwFWKCi2JcU3ugEdJfwF555y5bm0R9MDz
7W82aAicw+M=
=jYZ6
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.18/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/soc
ti-sysc driver related changes for omap variants
This series improves the ti-sysc interconnect target module driver to
the point where a most of SoC can be booted with interconnect target
module data configured in device tree instead of legacy platform data.
The related device tree changes need some more work though, and can
wait for v4.19. Also some drivers using nested interconnects like DSS
need more work.
We can now remove the unused pm-noop code that is not doing anything
any longer. And we can now initialize things for PM and display pdata
later to prepare things for using ti-sysc driver.
We also need to add some more quirk handling so we can boot both with
platform data and dts data.
* tag 'omap-for-v4.18/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
bus: ti-sysc: Show module information for suspend if DEBUG is enabled
bus: ti-sysc: Tag sdio and wdt with legacy mode for suspend
bus: ti-sysc: Detect UARTs for SYSC_QUIRK_LEGACY_IDLE quirk on omap4
bus: ti-sysc: Detect omap4 type timers for quirk
bus: ti-sysc: Add initial support for external resets
bus: ti-sysc: Improve suspend and resume handling
bus: ti-sysc: Tag some modules resource providers for noirq suspend
bus: ti-sysc: Add handling for clkctrl opt clocks
bus: ti-sysc: Make child clock alias handling more generic
bus: ti-sysc: Handle simple-bus for nested children
ARM: OMAP2+: Make display related init into device_initcall
ARM: OMAP2+: Initialize SoC PM later
ARM: OMAP2+: Only probe SDMA via ti-sysc if configured in dts
ARM: OMAP2+: Use signed value for sysc register offsets
ARM: OMAP2+: Allow using ti-sysc for system timers
ARM: OMAP2+: Drop unused pm-noop
Signed-off-by: Olof Johansson <olof@lixom.net>
In the following commit:
247f2f6f3c ("sched/core: Don't schedule threads on pre-empted vCPUs")
... we distinguish between idle_cpu() when the vCPU is not running for
scheduling threads.
However, the idle_cpu() function is used in other places for
actually checking whether the state of the CPU is idle or not.
Hence split the use of that function based on the desired return value,
by introducing the available_idle_cpu() function.
This fixes a (slight) regression in that initial vCPU commit, because
some code paths (like the load-balancer) don't care and shouldn't care
if the vCPU is preempted or not, they just want to know if there's any
tasks on the CPU.
Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dhaval.giani@oracle.com
Cc: linux-kernel@vger.kernel.org
Cc: matt@codeblueprint.co.uk
Cc: steven.sistare@oracle.com
Cc: subhra.mazumdar@oracle.com
Link: http://lkml.kernel.org/r/1525883988-10356-1-git-send-email-rohit.k.jain@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
kbuild bot reported against an intermediate RT patch that the build
fails with:
> In file included from include/linux/completion.h:12:0,
> from include/linux/rcupdate_wait.h:10,
> from kernel/rcu/srcutiny.c:27:
> kernel/rcu/srcutiny.c: In function 'srcu_drive_gp':
> >> include/linux/swait.h:172:7: error: implicit declaration of function '___wait_is_interruptible'; did you mean '__swait_event_interruptible'?
> if (___wait_is_interruptible(state) && __int) { \
That error vanishes a few patches later (in the RT queue) because wait.h
is then pulled in by other means. It does not seem to surface on !RT.
I think that swait should include a header file for a function/macro
(___wait_is_interruptible()) it is using.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504104224.20218-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to use the helper macros that perform type mangling with the
EFI PCI I/O protocol struct typedefs, align their Linux typenames with
the convention we use for definitionns that originate in the UEFI spec,
and add the trailing _t to each.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-14-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Recognize the IA32/X64 Processor Error Section.
Do the section decoding in a new "cper-x86.c" file and add this to the
Makefile depending on a new "UEFI_CPER_X86" config option.
Print the Local APIC ID and CPUID info from the Processor Error Record.
The "Processor Error Info" and "Processor Context" fields will be
decoded in following patches.
Based on UEFI 2.7 Table 252. Processor Error Record.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-5-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Based on UEFI 2.7 Table 255. Processor Error Record, the "Local APIC_ID"
field is 8 bytes but Linux defines this field as 1 byte.
Fix this in the struct cper_sec_proc_ia definition.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-4-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mixed mode allows a kernel built for x86_64 to interact with 32-bit
EFI firmware, but requires us to define all struct definitions carefully
when it comes to pointer sizes.
'struct efi_pci_io_protocol_32' currently uses a 'void *' for the
'romimage' field, which will be interpreted as a 64-bit field
on such kernels, potentially resulting in bogus memory references
and subsequent crashes.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-13-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Needed by ext4 to test frozen fs before updating s_last_mounted.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Pull x86/pti updates from Thomas Gleixner:
"A mixed bag of fixes and updates for the ghosts which are hunting us.
The scheduler fixes have been pulled into that branch to avoid
conflicts.
- A set of fixes to address a khread_parkme() race which caused lost
wakeups and loss of state.
- A deadlock fix for stop_machine() solved by moving the wakeups
outside of the stopper_lock held region.
- A set of Spectre V1 array access restrictions. The possible
problematic spots were discuvered by Dan Carpenters new checks in
smatch.
- Removal of an unused file which was forgotten when the rest of that
functionality was removed"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Remove unused file
perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
sched/core: Introduce set_special_state()
kthread, sched/wait: Fix kthread_parkme() completion issue
kthread, sched/wait: Fix kthread_parkme() wait-loop
sched/fair: Fix the update of blocked load when newly idle
stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
GICv3 offers the possibility to signal SPIs using a pair of doorbells
(SETPI, CLRSPI) under the name of Message Based Interrupts (MBI).
They can be used as either traditional (edge) MSIs, or the more exotic
level-triggered flavour.
Let's implement support for platform MSI, which is the original intent
for this feature.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-8-marc.zyngier@arm.com
At the beginning of times, irq_find_host() was simple. Each device node
implemented at most one irq domain, and we were happy. Over time, things
have become more complex, and we now have nodes implementing a plurality
of domains, tagged by "bus_token".
Crutially, users of irq_find_host() all expect the most basic domain
to be returned, and not any other domain such as a bus-specific MSI
domain.
So let's change irq_find_host() to first look for a DOMAIN_BUS_WIRED
domain, and only if this fails fallback to DOMAIN_BUS_ANY. Note that
this is consistent with what irq_create_fwspec_mapping is already
doing, see 530cbe100e ("irqdomain: Allow domain lookup with
DOMAIN_BUS_WIRED token").
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-6-marc.zyngier@arm.com
Inclusion of include/dma-iommu.h when CONFIG_IOMMU_DMA is not selected
results in the following splat:
In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0:
./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’
static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
^~~~~~~~~~
./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration
static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
^~~~~~~~~
scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed
Fix it by including linux/types.h.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com
Nobody would be insane enough to try and use level triggered
MSIs on PCI, but let's make sure it doesn't happen. Also,
let's mandate that the irqchip backing the platform MSI domain
is providing the IRQCHIP_SUPPORTS_LEVEL_MSI flag.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-3-marc.zyngier@arm.com
So far, MSIs have been used to signal edge-triggered interrupts, as
a write is a good model for an edge (you can't "unwrite" something).
On the other hand, routing zillions of wires in an SoC because you
need level interrupts is a bit extreme.
People have come up with a variety of schemes to support this, which
involves sending two messages: one to signal the interrupt, and one
to clear it. Since the kernel cannot represent this, we've ended up
with side-band mechanisms that are pretty awful.
Instead, let's acknoledge the requirement, and ensure that, under the
right circumstances, the irq_compose_msg and irq_write_msg can take
as a parameter an array of two messages instead of a pointer to a
single one. We also add some checking that the compose method only
clobbers the second message if the MSI domain has been created with
the MSI_FLAG_LEVEL_CAPABLE flags.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-2-marc.zyngier@arm.com
This contains:
- Support for SoundWire Streaming
- Documentation updates for streaming
- Cadence and Intel driver updates for streaming
- ASoC API for programming soundwire stream
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJa98rIAAoJEHwUBw8lI4NHrU4P/2h6FLtl1RETg28dcUiGUXj4
ALgMej7IbkRXaqEClfA6sU9qe9NCi1UrWD4bEnwetjgMki8/WeJP1pSgoRwZT/YA
38lle3KoGaptBRf2uZe9hfiKKyvp3oI9XPNHEsk6Qqw0qSSUr8t6k5FAWBiupkiV
s8R/dYDW/qSEmoWODodFFuJFYr1Xok5L95vt4dyCg4FXglr8Sym8Sk6DmjzcqEy/
w4YsWKvg62RbOZ4hbuZcwY3NunIvvMAx0+A3xweZaLobpCwJ0KB3ApYZeOybF5px
ODDRNuh76scgz+UMBr796vfvzPAztbUUwcCmXVJrZkWC98OCSOUXfQSZOKnIRmFL
AZICxsSPOmip++O3C221bKeI65ldXiPHBFzzmzAHOI5IyMfZREu3bbE69x+cwTUi
1UKrQb0uQRPCQJtaR9hMc4CadRA/SwhOIk42G5aNJ0R4CRpVpVhixuXUHv9+Huaj
AHSYr3BKCiAW5FshcDcfl9PUvmgULgO66Nk5YURSEdaY39E+1NElHSeKxbbSYRqc
SNr+gAUKP9Pqfpa8ToOUvb7yg4X72Tw4U9tUbj0i18vb7robkJD4BwlIikUpyBK/
w8Ew/6WqatmHvX8GpQPRIuE3IbbGKYJquQjTF9tRwzuJBzWFr3JYRU3dZ8g1q+I7
o9amLP9RcMVwD+qYFKtA
=KZwI
-----END PGP SIGNATURE-----
Merge tag 'soundwire-streaming' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next
Vinod writes:
soundwire streaming
This contains:
- Support for SoundWire Streaming
- Documentation updates for streaming
- Cadence and Intel driver updates for streaming
- ASoC API for programming soundwire stream
In commit e7ff3a4763 (x86/amd: Check for the C1E bug post ACPI subsystem
init) a new function arch_post_acpi_subsys_init() was introduced. This
weak function can potentially be overridden on a per arch basis, introduce
the prototype for clarity.
Silence the following gcc warning (W=1):
init/main.c:484:20: warning: no previous prototype for ‘arch_post_acpi_subsys_init’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove redunadant description of label in struct mlxreg_hotplug_device.
Change location of access_mode in struct mlxreg_hotplug_device.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Move the tsl2772 driver out of staging and into mainline.
Signed-off-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Merge misc fixes from Andrew Morton:
"13 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
rbtree: include rcu.h
scripts/faddr2line: fix error when addr2line output contains discriminator
ocfs2: take inode cluster lock before moving reflinked inode from orphan dir
mm, oom: fix concurrent munlock and oom reaper unmap, v3
mm: migrate: fix double call of radix_tree_replace_slot()
proc/kcore: don't bounds check against address 0
mm: don't show nr_indirectly_reclaimable in /proc/vmstat
mm: sections are not offlined during memory hotremove
z3fold: fix reclaim lock-ups
init: fix false positives in W+X checking
lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit()
KASAN: prohibit KASAN+STRUCTLEAK combination
MAINTAINERS: update Shuah's email address
The bpf syscall and selftests conflicts were trivial
overlapping changes.
The r8169 change involved moving the added mdelay from 'net' into a
different function.
A TLS close bug fix overlapped with the splitting of the TLS state
into separate TX and RX parts. I just expanded the tests in the bug
fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf
== X".
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit c1adf20052 ("Introduce rb_replace_node_rcu()")
rbtree_augmented.h uses RCU related data structures but does not include
the header file. It works as long as it gets somehow included before
that and fails otherwise.
Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.
This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma. Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.
This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a
kernel oops.
Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap(). It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.
This issue fixes CVE-2018-1000200.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
1) Verify lengths of keys provided by the user is AF_KEY, from Kevin
Easton.
2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka.
3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R.
Silva.
4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most
grateful for this fix.
5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we
do appreciate.
6) Use after free in TLS code. Once again we are blessed by the
honorable Eric Dumazet with this fix.
7) Fix regression in TLS code causing stalls on partial TLS records.
This fix is bestowed upon us by Andrew Tomt.
8) Deal with too small MTUs properly in LLC code, another great gift
from Eric Dumazet.
9) Handle cached route flushing properly wrt. MTU locking in ipv4, to
Hangbin Liu we give thanks for this.
10) Fix regression in SO_BINDTODEVIC handling wrt. UDP socket demux.
Paolo Abeni, he gave us this.
11) Range check coalescing parameters in mlx4 driver, thank you Moshe
Shemesh.
12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother
David Howells.
13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens,
you're the best!
14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata
Benerjee saved us!
15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship
is now water tight, thanks to Andrey Ignatov.
16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes
everywhere!
17) Fix error path in tcf_proto_create, Jiri Pirko what would we do
without you!
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits)
net sched actions: fix refcnt leak in skbmod
net: sched: fix error path in tcf_proto_create() when modules are not configured
net sched actions: fix invalid pointer dereferencing if skbedit flags missing
ixgbe: fix memory leak on ipsec allocation
ixgbevf: fix ixgbevf_xmit_frame()'s return type
ixgbe: return error on unsupported SFP module when resetting
ice: Set rq_last_status when cleaning rq
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
bonding: send learning packets for vlans on slave
bonding: do not allow rlb updates to invalid mac
net/mlx5e: Err if asked to offload TC match on frag being first
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
net/mlx5: Free IRQs in shutdown path
rxrpc: Trace UDP transmission failure
rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages
rxrpc: Fix the min security level for kernel calls
rxrpc: Fix error reception on AF_INET6 sockets
rxrpc: Fix missing start of call timeout
qed: fix spelling mistake: "taskelt" -> "tasklet"
...
Bump the internal tag to 32, instead of stealing the last tag in
our regular command space. This works just fine, since we don't
actually need a separate hardware tag for this. Internal commands
cannot coexist with NCQ commands.
As a bonus, we get rid of the special casing of what tag to use
for the internal command.
This is in preparation for utilizing all 32 commands for normal IO.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Some check for the value directly, use the provided helper instead.
Also make it return a bool, since that's what it does.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
This is in preparation for allowing full usage of the tag space,
which means that our reserved error handling command will be
using an internal tag value of 32. This doesn't fit in a u32, so
move to a u64.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Rigth now these are the same, but drivers should be using ->hw_tag
for their command setup and issue.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Clean up: Eliminate a structure that is no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
While sending each RPC Reply, svc_rdma_sendto allocates and DMA-
maps a separate buffer where the RPC/RDMA transport header is
constructed. The buffer is unmapped and released in the Send
completion handler. This is significant per-RPC overhead,
especially for small RPCs.
Instead, allocate and DMA-map a buffer, and cache it in each
svc_rdma_send_ctxt. This buffer and its mapping can be re-used
for each RPC, saving the cost of memory allocation and DMA
mapping.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up: Now that the send_wr is part of the svc_rdma_send_ctxt,
svc_rdma_post_send_wr is nearly empty.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Receive buffers are always the same size, but each Send WR has a
variable number of SGEs, based on the contents of the xdr_buf being
sent.
While assembling a Send WR, keep track of the number of SGEs so that
we don't exceed the device's maximum, or walk off the end of the
Send SGE array.
For now the Send path just fails if it exceeds the maximum.
The current logic in svc_rdma_accept bases the maximum number of
Send SGEs on the largest NFS request that can be sent or received.
In the transport layer, the limit is actually based on the
capabilities of the underlying device, not on properties of the
Upper Layer Protocol.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt
free list. This eliminates the overhead of calling kmalloc / kfree,
both of which grab a globally shared lock that disables interrupts.
Introduce a replacement to svc_rdma_op_ctxt's that is built
especially for the svcrdma Send path.
Subsequent patches will take advantage of this new structure by
allocating real resources which are then cached in these objects.
The allocations are freed when the transport is torn down.
I've renamed the structure so that static type checking can be used
to ensure that uses of op_ctxt and send_ctxt are not confused. As an
additional clean up, structure fields are renamed to conform with
kernel coding conventions.
Additional clean ups:
- Handle svc_rdma_send_ctxt_get allocation failure at each call
site, rather than pre-allocating and hoping we guessed correctly
- All send_ctxt_put call-sites request page freeing, so remove
the @free_pages argument
- All send_ctxt_put call-sites unmap SGEs, so fold that into
svc_rdma_send_ctxt_put
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up: Since there's already a svc_rdma_op_ctxt being passed
around with the running count of mapped SGEs, drop unneeded
parameters to svc_rdma_post_send_wr().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up: svc_rdma_dma_map_buf does mostly the same thing as
svc_rdma_dma_map_page, so let's fold these together.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
There is a significant latency penalty when processing an ingress
Receive if the Receive buffer resides in memory that is not on the
same NUMA node as the the CPU handling completions for a CQ.
The system administrator and the device driver determine which CPU
handles completions. This CPU does not change during life of the CQ.
Further the Upper Layer does not have any visibility of which CPU it
is.
Allocating Receive buffers in the Receive completion handler
guarantees that Receive buffers are allocated on the preferred NUMA
node for that CQ.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The current Receive path uses an array of pages which are allocated
and DMA mapped when each Receive WR is posted, and then handed off
to the upper layer in rqstp::rq_arg. The page flip releases unused
pages in the rq_pages pagelist. This mechanism introduces a
significant amount of overhead.
So instead, kmalloc the Receive buffer, and leave it DMA-mapped
while the transport remains connected. This confers a number of
benefits:
* Each Receive WR requires only one receive SGE, no matter how large
the inline threshold is. This helps the server-side NFS/RDMA
transport operate on less capable RDMA devices.
* The Receive buffer is left allocated and mapped all the time. This
relieves svc_rdma_post_recv from the overhead of allocating and
DMA-mapping a fresh buffer.
* svc_rdma_wc_receive no longer has to DMA unmap the Receive buffer.
It has to DMA sync only the number of bytes that were received.
* svc_rdma_build_arg_xdr no longer has to free a page in rq_pages
for each page in the Receive buffer, making it a constant-time
function.
* The Receive buffer is now plugged directly into the rq_arg's
head[0].iov_vec, and can be larger than a page without spilling
over into rq_arg's page list. This enables simplification of
the RDMA Read path in subsequent patches.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Currently svc_rdma_recv_ctxt_put's callers have to know whether they
want to free the ctxt's pages or not. This means the human
developers have to know when and why to set that free_pages
argument.
Instead, the ctxt should carry that information with it so that
svc_rdma_recv_ctxt_put does the right thing no matter who is
calling.
We want to keep track of the number of pages in the Receive buffer
separately from the number of pages pulled over by RDMA Read. This
is so that the correct number of pages can be freed properly and
that number is well-documented.
So now, rc_hdr_count is the number of pages consumed by head[0]
(ie., the page index where the Read chunk should start); and
rc_page_count is always the number of pages that need to be released
when the ctxt is put.
The @free_pages argument is no longer needed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up: No need to retain rq_depth in struct svcrdma_xprt, it is
used only in svc_rdma_accept().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt
free list. This eliminates the overhead of calling kmalloc / kfree,
both of which grab a globally shared lock that disables interrupts.
To reduce contention further, separate the use of these objects in
the Receive and Send paths in svcrdma.
Subsequent patches will take advantage of this separation by
allocating real resources which are then cached in these objects.
The allocations are freed when the transport is torn down.
I've renamed the structure so that static type checking can be used
to ensure that uses of op_ctxt and recv_ctxt are not confused. As an
additional clean up, structure fields are renamed to conform with
kernel coding conventions.
As a final clean up, helpers related to recv_ctxt are moved closer
to the functions that use them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>