linux-xiaomi-chiron

Author	SHA1	Message	Date
Dexuan Cui	ae20b25430	Drivers: hv: vmbus: enable VMBus protocol version 5.0 With VMBus protocol 5.0, we're able to better support new features, e.g. running two or more VMBus drivers simultaneously in a single VM -- note: we can't simply load the current VMBus driver twice, instead, a secondary VMBus driver must be implemented. This patch adds the support for the new VMBus protocol, which is available on new Windows hosts, by: 1) We still use SINT2 for compatibility; 2) We must use Connection ID 4 for the Initiate Contact Message, and for subsequent messages, we must use the Message Connection ID field in the host-returned VersionResponse Message. Notes for developers of the secondary VMBus driver: 1) Must use VMBus protocol 5.0 as well; 2) Must use a different SINT number that is not in use. 3) Must use Connection ID 4 for the Initiate Contact Message, and for subsequent messages, must use the Message Connection ID field in the host-returned VersionResponse Message. 4) It's possible that the primary VMBus driver using protocol version 4.0 can work with a secondary VMBus driver using protocol version 5.0, but it's recommended that both should use 5.0 for new Hyper-V features in the future. Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Michael Kelley <mikelley@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-14 16:06:48 +02:00
Bruce Allan	0fccb85ad2	virtchnl: Whitespace and parenthesis cleanup Clean up existing instances of unnecessary parentheses in if statements and change the order of conditionals to make them easier to read. The opening /* should be followed by a single space and the closing */ should be preceded by a single space. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-14 07:05:16 -07:00
Al Viro	2220c5b0a7	make xattr_getsecurity() static many years overdue... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-14 09:51:34 -04:00
Bartlomiej Zolnierkiewicz	e7deb3c774	drm: shmobile: remove unused MERAM support Since commit `a521422ea4` ("ARM: shmobile: mackerel: Remove Legacy C board code") MERAM functionality is unused. Remove it. Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2018-05-14 15:47:30 +02:00
Ben Hutchings	ea739a287f	mtd: Fix comparison in map_word_andequal() Commit `9e343e87d2` ("mtd: cfi: convert inline functions to macros") changed map_word_andequal() into a macro, but also changed the right hand side of the comparison from val3 to val2. Change it back to use val3 on the right hand side. Thankfully this did not cause a regression because all callers currently pass the same argument for val2 and val3. Fixes: `9e343e87d2` ("mtd: cfi: convert inline functions to macros") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>	2018-05-14 14:46:20 +02:00
Matthias Brugger	0afd32c6bb	Merge commit '`f15cd6d991`' into v.4.17-next/soc-test	2018-05-14 12:22:03 +02:00
Frederic Weisbecker	48bda43eab	softirq/s390: Move default mutators of overwritten softirq mask to s390 s390 is now the last architecture that entirely overwrites local_softirq_pending() and uses the according default definitions of set_softirq_pending() and or_softirq_pending(). Just move these to s390 to debloat the generic code complexity. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/1525786706-22846-12-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 11:25:28 +02:00
Frederic Weisbecker	0fd7d86285	softirq/core: Consolidate default local_softirq_pending() implementations Consolidate and optimize default softirq mask API implementations. Per-CPU operations are expected to be faster and a few architectures already rely on them to implement local_softirq_pending() and related accessors/mutators. Those will be migrated to the new generic code. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/1525786706-22846-6-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 11:25:27 +02:00
Frederic Weisbecker	0f6f47bacb	softirq/core: Turn default irq_cpustat_t to standard per-cpu In order to optimize and consolidate softirq mask accesses, let's convert the default irq_cpustat_t implementation to per-CPU standard API. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/1525786706-22846-5-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 11:25:27 +02:00
Ingo Molnar	4b96583869	Linux 4.17-rc5 Merge tag 'v4.17-rc5' into irq/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 11:22:59 +02:00
Olof Johansson	71fe67e0e2	Merge tag 'soc_drivers_for_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into next/drivers ARM: SOC driver update for 4.18 - AEMIF driver update to support board files and remove need of mach-davinci aemif code - Use percpu counters for qmss datapath stats - License update for TI SCI * tag 'soc_drivers_for_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone: firmware: ti_sci: Switch to SPDX Licensing soc: ti: knav_qmss: Use percpu instead atomic for stats counter memory: aemif: add support for board files memory: aemif: don't rely on kbuild for driver's name Signed-off-by: Olof Johansson <olof@lixom.net>	2018-05-14 01:27:47 -07:00
Olof Johansson	e5d9875ecd	Merge tag 'omap-for-v4.18/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/soc ti-sysc driver related changes for omap variants This series improves the ti-sysc interconnect target module driver to the point where most SoCs can be booted with interconnect target module data configured in device tree instead of legacy platform data.
The related device tree changes need some more work though, and can wait for v4.19. Also some drivers using nested interconnects like DSS need more work. We can now remove the unused pm-noop code that is not doing anything any longer. And we can now initialize things for PM and display pdata later to prepare things for using ti-sysc driver. We also need to add some more quirk handling so we can boot both with platform data and dts data. * tag 'omap-for-v4.18/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: bus: ti-sysc: Show module information for suspend if DEBUG is enabled bus: ti-sysc: Tag sdio and wdt with legacy mode for suspend bus: ti-sysc: Detect UARTs for SYSC_QUIRK_LEGACY_IDLE quirk on omap4 bus: ti-sysc: Detect omap4 type timers for quirk bus: ti-sysc: Add initial support for external resets bus: ti-sysc: Improve suspend and resume handling bus: ti-sysc: Tag some modules resource providers for noirq suspend bus: ti-sysc: Add handling for clkctrl opt clocks bus: ti-sysc: Make child clock alias handling more generic bus: ti-sysc: Handle simple-bus for nested children ARM: OMAP2+: Make display related init into device_initcall ARM: OMAP2+: Initialize SoC PM later ARM: OMAP2+: Only probe SDMA via ti-sysc if configured in dts ARM: OMAP2+: Use signed value for sysc register offsets ARM: OMAP2+: Allow using ti-sysc for system timers ARM: OMAP2+: Drop unused pm-noop Signed-off-by: Olof Johansson <olof@lixom.net>	2018-05-14 01:18:44 -07:00
Rohit Jain	943d355d7f	sched/core: Distinguish between idle_cpu() calls based on desired effect, introduce available_idle_cpu() In the following commit: `247f2f6f3c` ("sched/core: Don't schedule threads on pre-empted vCPUs") ... we distinguish between idle_cpu() when the vCPU is not running for scheduling threads. However, the idle_cpu() function is used in other places for actually checking whether the state of the CPU is idle or not. Hence split the use of that function based on the desired return value, by introducing the available_idle_cpu() function. This fixes a (slight) regression in that initial vCPU commit, because some code paths (like the load-balancer) don't care and shouldn't care if the vCPU is preempted or not, they just want to know if there's any tasks on the CPU. Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dhaval.giani@oracle.com Cc: linux-kernel@vger.kernel.org Cc: matt@codeblueprint.co.uk Cc: steven.sistare@oracle.com Cc: subhra.mazumdar@oracle.com Link: http://lkml.kernel.org/r/1525883988-10356-1-git-send-email-rohit.k.jain@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 09:12:26 +02:00
Sebastian Andrzej Siewior	a59a68fee0	sched/wait: Include <linux/wait.h> in <linux/swait.h> kbuild bot reported against an intermediate RT patch that the build fails with: > In file included from include/linux/completion.h:12:0, > from include/linux/rcupdate_wait.h:10, > from kernel/rcu/srcutiny.c:27: > kernel/rcu/srcutiny.c: In function 'srcu_drive_gp': > >> include/linux/swait.h:172:7: error: implicit declaration of function '___wait_is_interruptible'; did you mean '__swait_event_interruptible'? > if (___wait_is_interruptible(state) && __int) { \ That error vanishes a few patches later (in the RT queue) because wait.h is then pulled in by other means. It does not seem to surface on !RT. I think that swait should include a header file for a function/macro (___wait_is_interruptible()) it is using. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20180504104224.20218-1-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 09:12:25 +02:00
Ard Biesheuvel	cb0ba79352	efi: Align efi_pci_io_protocol typedefs to type naming convention In order to use the helper macros that perform type mangling with the EFI PCI I/O protocol struct typedefs, align their Linux typenames with the convention we use for definitions that originate in the UEFI spec, and add the trailing _t to each. Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-14-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 08:57:48 +02:00
Yazen Ghannam	f9e1bdb9f3	efi: Decode IA32/X64 Processor Error Section Recognize the IA32/X64 Processor Error Section. Do the section decoding in a new "cper-x86.c" file and add this to the Makefile depending on a new "UEFI_CPER_X86" config option. Print the Local APIC ID and CPUID info from the Processor Error Record. The "Processor Error Info" and "Processor Context" fields will be decoded in following patches. Based on UEFI 2.7 Table 252. Processor Error Record. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-5-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 08:57:47 +02:00
Yazen Ghannam	742632d237	efi: Fix IA32/X64 Processor Error Record definition Based on UEFI 2.7 Table 252. Processor Error Record, the "Local APIC_ID" field is 8 bytes but Linux defines this field as 1 byte. Fix this in the struct cper_sec_proc_ia definition. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-4-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 08:57:47 +02:00
Ard Biesheuvel	0b3225ab94	efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode Mixed mode allows a kernel built for x86_64 to interact with 32-bit EFI firmware, but requires us to define all struct definitions carefully when it comes to pointer sizes. 'struct efi_pci_io_protocol_32' currently uses a 'void *' for the 'romimage' field, which will be interpreted as a 64-bit field on such kernels, potentially resulting in bogus memory references and subsequent crashes. Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-13-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-14 08:56:29 +02:00
Amir Goldstein	0c8e3fe35d	vfs: add the sb_start_intwrite_trylock() helper Needed by ext4 to test frozen fs before updating s_last_mounted. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>	2018-05-13 22:40:30 -04:00
Linus Torvalds	66e1c94db3	Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "A mixed bag of fixes and updates for the ghosts which are hunting us. The scheduler fixes have been pulled into that branch to avoid conflicts. - A set of fixes to address a kthread_parkme() race which caused lost wakeups and loss of state. - A deadlock fix for stop_machine() solved by moving the wakeups outside of the stopper_lock held region. - A set of Spectre V1 array access restrictions. The possible problematic spots were discovered by Dan Carpenter's new checks in smatch. - Removal of an unused file which was forgotten when the rest of that functionality was removed" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Remove unused file perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map() perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_* perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[] sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] sched/core: Introduce set_special_state() kthread, sched/wait: Fix kthread_parkme() completion issue kthread, sched/wait: Fix kthread_parkme() wait-loop sched/fair: Fix the update of blocked load when newly idle stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock	2018-05-13 10:53:08 -07:00
Marc Zyngier	505287525c	irqchip/gic-v3: Add support for Message Based Interrupts as an MSI controller GICv3 offers the possibility to signal SPIs using a pair of doorbells (SETSPI, CLRSPI) under the name of Message Based Interrupts (MBI). They can be used as either traditional (edge) MSIs, or the more exotic level-triggered flavour. Let's implement support for platform MSI, which is the original intent for this feature. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-8-marc.zyngier@arm.com	2018-05-13 15:59:01 +02:00
Marc Zyngier	6461934371	irqdomain: Let irq_find_host default to DOMAIN_BUS_WIRED At the beginning of time, irq_find_host() was simple. Each device node implemented at most one irq domain, and we were happy. Over time, things have become more complex, and we now have nodes implementing a plurality of domains, tagged by "bus_token". Crucially, users of irq_find_host() all expect the most basic domain to be returned, and not any other domain such as a bus-specific MSI domain. So let's change irq_find_host() to first look for a DOMAIN_BUS_WIRED domain, and only if this fails fall back to DOMAIN_BUS_ANY. Note that this is consistent with what irq_create_fwspec_mapping is already doing, see `530cbe100e` ("irqdomain: Allow domain lookup with DOMAIN_BUS_WIRED token"). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-6-marc.zyngier@arm.com	2018-05-13 15:59:00 +02:00
Marc Zyngier	8a22a3e1e7	dma-iommu: Fix compilation when !CONFIG_IOMMU_DMA Inclusion of include/dma-iommu.h when CONFIG_IOMMU_DMA is not selected results in the following splat: In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0: ./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’ static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) ^~~~~~~~~~ ./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list) ^~~~~~~~~ scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed Fix it by including linux/types.h. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com	2018-05-13 15:59:00 +02:00
Marc Zyngier	6988e0e0d2	genirq/msi: Limit level-triggered MSI to platform devices Nobody would be insane enough to try and use level triggered MSIs on PCI, but let's make sure it doesn't happen. Also, let's mandate that the irqchip backing the platform MSI domain is providing the IRQCHIP_SUPPORTS_LEVEL_MSI flag. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-3-marc.zyngier@arm.com	2018-05-13 15:58:59 +02:00
Marc Zyngier	0be8153cbc	genirq/msi: Allow level-triggered MSIs to be exposed by MSI providers So far, MSIs have been used to signal edge-triggered interrupts, as a write is a good model for an edge (you can't "unwrite" something). On the other hand, routing zillions of wires in an SoC because you need level interrupts is a bit extreme. People have come up with a variety of schemes to support this, which involve sending two messages: one to signal the interrupt, and one to clear it. Since the kernel cannot represent this, we've ended up with side-band mechanisms that are pretty awful. Instead, let's acknowledge the requirement, and ensure that, under the right circumstances, the irq_compose_msg and irq_write_msg can take as a parameter an array of two messages instead of a pointer to a single one. We also add some checking that the compose method only clobbers the second message if the MSI domain has been created with the MSI_FLAG_LEVEL_CAPABLE flags. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-2-marc.zyngier@arm.com	2018-05-13 15:58:59 +02:00
Greg Kroah-Hartman	176c2572cd	Merge tag 'soundwire-streaming' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next Vinod writes: soundwire streaming This contains: - Support for SoundWire Streaming - Documentation updates for streaming - Cadence and Intel driver updates for streaming - ASoC API for programming soundwire stream	2018-05-13 11:53:18 +02:00
Mathieu Malaterre	63fab6977a	ACPI: Add missing prototype for arch_post_acpi_subsys_init() In commit `e7ff3a4763` (x86/amd: Check for the C1E bug post ACPI subsystem init) a new function arch_post_acpi_subsys_init() was introduced. This weak function can potentially be overridden on a per-arch basis; introduce the prototype for clarity. Silence the following gcc warning (W=1): init/main.c:484:20: warning: no previous prototype for ‘arch_post_acpi_subsys_init’ [-Wmissing-prototypes] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-05-13 11:17:13 +02:00
Vadim Pasternak	98004a78bb	platform_data/mlxreg: Document fixes for hotplug device Remove redundant description of label in struct mlxreg_hotplug_device. Change location of access_mode in struct mlxreg_hotplug_device. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>	2018-05-12 15:38:40 -07:00
Brian Masney	c06c4d7935	staging: iio: tsl2x7x/tsl2772: move out of staging Move the tsl2772 driver out of staging and into mainline. Signed-off-by: Brian Masney <masneyb@onstation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2018-05-12 12:40:04 +01:00
Linus Torvalds	f0ab773f5c	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: rbtree: include rcu.h scripts/faddr2line: fix error when addr2line output contains discriminator ocfs2: take inode cluster lock before moving reflinked inode from orphan dir mm, oom: fix concurrent munlock and oom reaper unmap, v3 mm: migrate: fix double call of radix_tree_replace_slot() proc/kcore: don't bounds check against address 0 mm: don't show nr_indirectly_reclaimable in /proc/vmstat mm: sections are not offlined during memory hotremove z3fold: fix reclaim lock-ups init: fix false positives in W+X checking lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit() KASAN: prohibit KASAN+STRUCTLEAK combination MAINTAINERS: update Shuah's email address	2018-05-11 18:04:12 -07:00
David S. Miller	b2d6cee117	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The bpf syscall and selftests conflicts were trivial overlapping changes. The r8169 change involved moving the added mdelay from 'net' into a different function. A TLS close bug fix overlapped with the splitting of the TLS state into separate TX and RX parts. I just expanded the tests in the bug fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf == X". Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 20:53:22 -04:00
Sebastian Andrzej Siewior	2075b16e32	rbtree: include rcu.h Since commit `c1adf20052` ("Introduce rb_replace_node_rcu()") rbtree_augmented.h uses RCU related data structures but does not include the header file. It works as long as it gets somehow included before that and fails otherwise. Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-05-11 17:28:45 -07:00
David Rientjes	27ae357fa8	mm, oom: fix concurrent munlock and oom reaper unmap, v3 Since exit_mmap() is done without the protection of mm->mmap_sem, it is possible for the oom reaper to concurrently operate on an mm until MMF_OOM_SKIP is set. This allows munlock_vma_pages_all() to concurrently run while the oom reaper is operating on a vma. Since munlock_vma_pages_range() depends on clearing VM_LOCKED from vm_flags before actually doing the munlock to determine if any other vmas are locking the same memory, the check for VM_LOCKED in the oom reaper is racy. This is especially noticeable on architectures such as powerpc where clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd is zapped by the oom reaper during follow_page_mask() after the check for pmd_none() is bypassed, this ends up dereferencing a NULL ptl or a kernel oops. Fix this by manually freeing all possible memory from the mm before doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not run on the mm anymore so the munlock is safe to do in exit_mmap(). It also matches the logic that the oom reaper currently uses for determining when to set MMF_OOM_SKIP itself, so there's no new risk of excessive oom killing. This issue fixes CVE-2018-1000200. Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com Fixes: `2129258024` ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: David Rientjes <rientjes@google.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-05-11 17:28:45 -07:00
Linus Torvalds	4bc871984f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Verify lengths of keys provided by the user in AF_KEY, from Kevin Easton. 2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka. 3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R. Silva. 4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most grateful for this fix. 5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we do appreciate. 6) Use after free in TLS code. Once again we are blessed by the honorable Eric Dumazet with this fix. 7) Fix regression in TLS code causing stalls on partial TLS records. This fix is bestowed upon us by Andrew Tomt. 8) Deal with too small MTUs properly in LLC code, another great gift from Eric Dumazet. 9) Handle cached route flushing properly wrt. MTU locking in ipv4, to Hangbin Liu we give thanks for this. 10) Fix regression in SO_BINDTODEVICE handling wrt. UDP socket demux. Paolo Abeni, he gave us this. 11) Range check coalescing parameters in mlx4 driver, thank you Moshe Shemesh. 12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother David Howells. 13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens, you're the best! 14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata Benerjee saved us! 15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship is now water tight, thanks to Andrey Ignatov. 16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes everywhere! 17) Fix error path in tcf_proto_create, Jiri Pirko what would we do without you!
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits) net sched actions: fix refcnt leak in skbmod net: sched: fix error path in tcf_proto_create() when modules are not configured net sched actions: fix invalid pointer dereferencing if skbedit flags missing ixgbe: fix memory leak on ipsec allocation ixgbevf: fix ixgbevf_xmit_frame()'s return type ixgbe: return error on unsupported SFP module when resetting ice: Set rq_last_status when cleaning rq ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()' bonding: send learning packets for vlans on slave bonding: do not allow rlb updates to invalid mac net/mlx5e: Err if asked to offload TC match on frag being first net/mlx5: E-Switch, Include VF RDMA stats in vport statistics net/mlx5: Free IRQs in shutdown path rxrpc: Trace UDP transmission failure rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages rxrpc: Fix the min security level for kernel calls rxrpc: Fix error reception on AF_INET6 sockets rxrpc: Fix missing start of call timeout qed: fix spelling mistake: "taskelt" -> "tasklet" ...	2018-05-11 14:14:46 -07:00
Jens Axboe	28361c4036	libata: add extra internal command Bump the internal tag to 32, instead of stealing the last tag in our regular command space. This works just fine, since we don't actually need a separate hardware tag for this. Internal commands cannot coexist with NCQ commands. As a bonus, we get rid of the special casing of what tag to use for the internal command. This is in preparation for utilizing all 32 commands for normal IO. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-11 13:10:44 -07:00
Jens Axboe	2e2cc676ce	libata: use ata_tag_internal() consistently Some check for the value directly, use the provided helper instead. Also make it return a bool, since that's what it does. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-11 13:10:43 -07:00
Jens Axboe	e3ed893964	libata: bump ->qc_active to a 64-bit type This is in preparation for allowing full usage of the tag space, which means that our reserved error handling command will be using an internal tag value of 32. This doesn't fit in a u32, so move to a u64. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-11 13:10:43 -07:00
Jens Axboe	5ac40790b4	libata: introduce notion of separate hardware tags Right now these are the same, but drivers should be using ->hw_tag for their command setup and issue. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-11 13:10:42 -07:00
Chuck Lever	51cc257a11	svcrdma: Remove unused svc_rdma_op_ctxt Clean up: Eliminate a structure that is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	99722fe4d5	svcrdma: Persistently allocate and DMA-map Send buffers While sending each RPC Reply, svc_rdma_sendto allocates and DMA-maps a separate buffer where the RPC/RDMA transport header is constructed. The buffer is unmapped and released in the Send completion handler. This is significant per-RPC overhead, especially for small RPCs. Instead, allocate and DMA-map a buffer, and cache it in each svc_rdma_send_ctxt. This buffer and its mapping can be re-used for each RPC, saving the cost of memory allocation and DMA mapping. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	986b78894b	svcrdma: Remove post_send_wr Clean up: Now that the send_wr is part of the svc_rdma_send_ctxt, svc_rdma_post_send_wr is nearly empty. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	25fd86eca1	svcrdma: Don't overrun the SGE array in svc_rdma_send_ctxt Receive buffers are always the same size, but each Send WR has a variable number of SGEs, based on the contents of the xdr_buf being sent. While assembling a Send WR, keep track of the number of SGEs so that we don't exceed the device's maximum, or walk off the end of the Send SGE array. For now the Send path just fails if it exceeds the maximum. The current logic in svc_rdma_accept bases the maximum number of Send SGEs on the largest NFS request that can be sent or received. In the transport layer, the limit is actually based on the capabilities of the underlying device, not on properties of the Upper Layer Protocol. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	4201c74647	svcrdma: Introduce svc_rdma_send_ctxt svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt free list. This eliminates the overhead of calling kmalloc / kfree, both of which grab a globally shared lock that disables interrupts. Introduce a replacement to svc_rdma_op_ctxt's that is built especially for the svcrdma Send path. Subsequent patches will take advantage of this new structure by allocating real resources which are then cached in these objects. The allocations are freed when the transport is torn down. I've renamed the structure so that static type checking can be used to ensure that uses of op_ctxt and send_ctxt are not confused. As an additional clean up, structure fields are renamed to conform with kernel coding conventions. Additional clean ups: - Handle svc_rdma_send_ctxt_get allocation failure at each call site, rather than pre-allocating and hoping we guessed correctly - All send_ctxt_put call-sites request page freeing, so remove the @free_pages argument - All send_ctxt_put call-sites unmap SGEs, so fold that into svc_rdma_send_ctxt_put Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	232627905f	svcrdma: Clean up Send SGE accounting Clean up: Since there's already a svc_rdma_op_ctxt being passed around with the running count of mapped SGEs, drop unneeded parameters to svc_rdma_post_send_wr(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	f016f305f9	svcrdma: Refactor svc_rdma_dma_map_buf Clean up: svc_rdma_dma_map_buf does mostly the same thing as svc_rdma_dma_map_page, so let's fold these together. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	eb5d7a622e	svcrdma: Allocate recv_ctxt's on CPU handling Receives There is a significant latency penalty when processing an ingress Receive if the Receive buffer resides in memory that is not on the same NUMA node as the CPU handling completions for a CQ. The system administrator and the device driver determine which CPU handles completions. This CPU does not change during the life of the CQ. Further, the Upper Layer does not have any visibility of which CPU it is. Allocating Receive buffers in the Receive completion handler guarantees that Receive buffers are allocated on the preferred NUMA node for that CQ. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	3316f06311	svcrdma: Persistently allocate and DMA-map Receive buffers The current Receive path uses an array of pages which are allocated and DMA mapped when each Receive WR is posted, and then handed off to the upper layer in rqstp::rq_arg. The page flip releases unused pages in the rq_pages pagelist. This mechanism introduces a significant amount of overhead. So instead, kmalloc the Receive buffer, and leave it DMA-mapped while the transport remains connected. This confers a number of benefits: * Each Receive WR requires only one receive SGE, no matter how large the inline threshold is. This helps the server-side NFS/RDMA transport operate on less capable RDMA devices. * The Receive buffer is left allocated and mapped all the time. This relieves svc_rdma_post_recv from the overhead of allocating and DMA-mapping a fresh buffer. * svc_rdma_wc_receive no longer has to DMA unmap the Receive buffer. It has to DMA sync only the number of bytes that were received. * svc_rdma_build_arg_xdr no longer has to free a page in rq_pages for each page in the Receive buffer, making it a constant-time function. * The Receive buffer is now plugged directly into the rq_arg's head[0].iov_vec, and can be larger than a page without spilling over into rq_arg's page list. This enables simplification of the RDMA Read path in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	1e5f416074	svcrdma: Simplify svc_rdma_recv_ctxt_put Currently svc_rdma_recv_ctxt_put's callers have to know whether they want to free the ctxt's pages or not. This means the human developers have to know when and why to set that free_pages argument. Instead, the ctxt should carry that information with it so that svc_rdma_recv_ctxt_put does the right thing no matter who is calling. We want to keep track of the number of pages in the Receive buffer separately from the number of pages pulled over by RDMA Read. This is so that the correct number of pages can be freed properly and that number is well-documented. So now, rc_hdr_count is the number of pages consumed by head[0] (i.e., the page index where the Read chunk should start); and rc_page_count is always the number of pages that need to be released when the ctxt is put. The @free_pages argument is no longer needed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	2c577bfea8	svcrdma: Remove sc_rq_depth Clean up: No need to retain rq_depth in struct svcrdma_xprt, it is used only in svc_rdma_accept(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00
Chuck Lever	ecf85b2384	svcrdma: Introduce svc_rdma_recv_ctxt svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt free list. This eliminates the overhead of calling kmalloc / kfree, both of which grab a globally shared lock that disables interrupts. To reduce contention further, separate the use of these objects in the Receive and Send paths in svcrdma. Subsequent patches will take advantage of this separation by allocating real resources which are then cached in these objects. The allocations are freed when the transport is torn down. I've renamed the structure so that static type checking can be used to ensure that uses of op_ctxt and recv_ctxt are not confused. As an additional clean up, structure fields are renamed to conform with kernel coding conventions. As a final clean up, helpers related to recv_ctxt are moved closer to the functions that use them. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-05-11 15:48:57 -04:00

67868 commits