Commit graph

62682 commits

Author SHA1 Message Date
Sowmini Varadhan
6f89dbce8e skbuff: export mm_[un]account_pinned_pages for other modules
RDS would like to use the helper functions for managing pinned pages
added by Commit a91dbff551 ("sock: ulimit on MSG_ZEROCOPY pages")

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 16:04:16 -05:00
David S. Miller
da27988766 skbuff: Fix comment mis-spelling.
'peform' --> 'perform'

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:52:42 -05:00
Linus Torvalds
1e3510b2b0 A few dma-mapping fixes for the fallout from the changes in rc-1.
-----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlqHGfMLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYNqBhAAicRKvMghVLqrmW8wiy81cBxCZ96UL6gaogmtVnL/
 jQ37zcgX77qKMzf/5M2grHQsttURBGa3TaMGPC21E6g8vJ++Oe7gTDhswDGj24yY
 yJOK5PrqKAqaTSjHn9c64DsCNia8BwMnY2ypT+c9nCAsUh1Jk+bBJMkyQQAx0/i5
 /z2rsc7FDZB9Lq7+DOApQB86ALfbeRaS29QRl1yl6wlLKmKKC57mFjHKom9HujsY
 UUuzHO8TFppbv/Gsl/UPns3ONPT6of88iCbSTIC44lO0WFtk/lS0qP3KVI9K96uo
 /DTmpTJOZn5d1GPGW0tQ23KjRXH+6MZryMX5SRoPZnJJvQLzLHDCu2OCRNFN3SXD
 t+wWBS6kW2ZoeDOAwh2Ncp1SC1hhri9WBAT2MS41kwTeMJ4fHt7rofsIRkMjRJEr
 vx6j9fmloL9rYT3KOu0eMapfYIlkg549FsPK5QZfOuXDyNdPw+Wxq7wRoEsTjTkI
 32rLWnl+5/1nHMlSjPTpnbK9V+42WL8pTy8Rz2TkmjiiNh9WAsxHVg1XzsrEWwKD
 5RQBQl7LBFI8jNlF2Ke9iubm45R3Eu9U8BmduF7pfaACrF8uh5KPMkhKFQs/KHl7
 NPvFGbKD/1c3BMsRO0ehnoEchL1mo6K4Tnwos9u4TzxcC/bniWmllV0gRAAvs5TF
 pQQ=
 =p0Hm
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:
 "A few dma-mapping fixes for the fallout from the changes in rc1"

* tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping:
  powerpc/macio: set a proper dma_coherent_mask
  dma-mapping: fix a comment typo
  dma-direct: comment the dma_direct_free calling convention
  dma-direct: mark as is_phys
  ia64: fix build failure with CONFIG_SWIOTLB
2018-02-16 12:22:33 -08:00
Linus Walleij
11da04af0d
regulator: da9211: Pass descriptors instead of GPIO numbers
This augments the DA9211 regulator driver to fetch its GPIO descriptors
directly from the device tree using the newly exported
devm_get_gpiod_from_child().

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-16 17:05:52 +00:00
Linus Walleij
8d05560d1d
regulator: da9055: Pass descriptor instead of GPIO number
When setting up a fixed regulator on the DA9055, pass a descriptor
instead of a global GPIO number. This facility is not used in the
kernel so we can easily just say that this should be a descriptor
if/when put to use.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-16 17:05:45 +00:00
Linus Walleij
e45e290a88
regulator: core: Support passing an initialized GPIO enable descriptor
We are currently passing a GPIO number from the global GPIO numberspace
into the regulator core for handling enable GPIOs. This is not good
since it ties into the global GPIO numberspace and uses gpio_to_desc()
to overcome this.

Start supporting passing an already initialized GPIO descriptor to the
core instead: leaf drivers pick their descriptors, associated directly
with the device node (or from ACPI or from a board descriptor table)
and use that directly without any roundtrip over the global GPIO
numberspace.

This looks messy since it adds a bunch of extra code in the core, but
at the end of the patch series we will delete the handling of the GPIO
number and only deal with descriptors so things end up neat.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-16 17:04:02 +00:00
Sinan Kaya
5cf0c37a71 PCI: Remove pci_get_bus_and_slot() function
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.

Now that all users of pci_get_bus_and_slot() switched to
pci_get_domain_bus_and_slot(), it is now safe to remove this function.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
2018-02-16 08:51:21 -06:00
NeilBrown
0957a2c1d9 sched/wait: add wait_event_idle() functions.
The new TASK_IDLE state (TASK_UNINTERRUPTIBLE | __TASK_NOLOAD)
is not much used.  One way to make it easier to use is to
add wait_event*() family functions that make use of it.
This patch adds:
  wait_event_idle()
  wait_event_idle_timeout()
  wait_event_idle_exclusive()
  wait_event_idle_exclusive_timeout()

This set was chosen because lustre needs them before
it can discard its own l_wait_event() macro.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 15:19:09 +01:00
Aaron Ma
6de0b13cc0 HID: core: Fix size as type u32
When size is negative, calling memset will make segment fault.
Declare the size as type u32 to keep memset safe.

size in struct hid_report is unsigned, fix return type of
hid_report_len to u32.

Cc: stable@vger.kernel.org
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-02-16 13:30:56 +01:00
Michael Kelley
d207af2eab cpumask: Make for_each_cpu_wrap() available on UP as well
for_each_cpu_wrap() was originally added in the #else half of a
large "#if NR_CPUS == 1" statement, but was omitted in the #if
half.  This patch adds the missing #if half to prevent compile
errors when NR_CPUS is 1.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kys@microsoft.com
Cc: martin.petersen@oracle.com
Cc: mikelley@microsoft.com
Fixes: c743f0a5c5 ("sched/fair, cpumask: Export for_each_cpu_wrap()")
Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16 10:40:24 +01:00
Boris Brezillon
9c3736a3de mtd: nand: Add core infrastructure to deal with NAND devices
Add an intermediate layer to abstract NAND device interface so that
some logic can be shared between SPI NANDs, parallel/raw NANDs,
OneNANDs, ...

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-02-16 10:10:53 +01:00
Boris Brezillon
43a0a45abc mtd: nand: Get rid of comments giving the file path inside the file itself
Some files add a comment giving the path of the file inside the Linux
tree, which is pretty useless since the reader had to find the file to
open it.

Getting rid of these comments will also allow us to easily move these
files around when needed.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-02-16 10:08:27 +01:00
Randy Dunlap
562c45d635 headers: Drop two #included headers from <linux/interrupt.h>
It seems that <linux/interrupt.h> does not need <linux/linkage.h>
nor <linux/preempt.h>.  8 kernels builds are successful without
these 2 headers (allmodconfig, allyesconfig, allnoconfig, and
tinyconfig on both i386 and x86_64).

<linux/interrupt.h> is #included 3875 times in 4.16-rc1, so this
reduces #include processing of these 2 files by a total of 7750 times.

Since I only tested x86 builds, this needs to be tested on other
$ARCHes as well.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/b24b9ec8-4970-65f5-759a-911d4ba2fcf0@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16 08:59:16 +01:00
Linus Torvalds
b63b1e5730 ACPI updates for v4.16-rc2
- Revert a problematic EC driver change from the 4.13 cycle that
    introduced a system resume regression on Thinkpad X240 (Rafael
    Wysocki).
 
  - Clean up device tables handling in the ACPI core and the related
    part of the device properties framework (Andy Shevchenko).
 
  - Update the sysfs ABI documentatio of the dock and the INT3407
    special device drivers (Aishwarya Pant).
 
  - Add an expected switch fall-through marker to the SPCR table
    parsing code (Gustavo Silva).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJahfZtAAoJEILEb/54YlRxi38P/i+I+LcWj06h4aj2xXqoqyqk
 No7/BZrcQyg4Y8goRagsYwbxkC1WqcrlDcjNkaHV+tTjR77pAJlFsNhYNG4lo4ch
 hlA3ickDXhC71Sm/vUQ1SpOKRUAOojFyWkBf82JSqTiOkjJ3NpNy0z//JX6lM1II
 YwK45QK2GGQt6USJvU6pfBEBdDETfYq4l4xV7FhfpTrcqs3SFOHNBlbUYjtwoZ9l
 RCVvdUjAmYd3LTMyuLuQnj0g+oCul230CmAb2xd5E82jep+Wdne/oXmNMeJw/6vm
 hb2SAdHvJqAIm/yV25fKYt+/h8rjoUVdILoDtmjByvc3h5No6OvjhxL4zu4kg9O4
 EEVEnKGs55syk+fHpyhfawxdj/qe1XQHw2QUKh/gCbE/ObOnx+WC9Ot1gB+6Sw0w
 08CFzb5PJ74Atf/6ceFotSksWZzOsEM/QixqKVZ4u0QUiG42rO7rYTa8TVHxwGVv
 LOdIpShWbOzXaqBH+Se/9loKJVG+UnCWyfRlU+W1JjZs+I1c0PDgbIGyaOmgZXjk
 n5FxQ/dCudKWfZwU/z9dWya5O58aQM/aWP6KweuLtv2ZQ03aKNpen+HJC7MN3Xh2
 0x7RPCTmK9QROsXDOOeKckiCLSGiuSQZ2VMReY5C6EtlCX5GiD9jAG9IQiLDGVfb
 93VxegPRbEyY5HQe6Udy
 =gdK1
 -----END PGP SIGNATURE-----

Merge tag 'acpi-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix a system resume regression from the 4.13 cycle, clean up
  device table handling in the ACPI core, update sysfs ABI documentation
  of a couple of drivers and add an expected switch fall-through marker
  to the SPCR table parsing code.

  Specifics:

   - Revert a problematic EC driver change from the 4.13 cycle that
     introduced a system resume regression on Thinkpad X240 (Rafael
     Wysocki).

   - Clean up device tables handling in the ACPI core and the related
     part of the device properties framework (Andy Shevchenko).

   - Update the sysfs ABI documentatio of the dock and the INT3407
     special device drivers (Aishwarya Pant).

   - Add an expected switch fall-through marker to the SPCR table
     parsing code (Gustavo Silva)"

* tag 'acpi-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: dock: document sysfs interface
  ACPI / DPTF: Document dptf_power sysfs atttributes
  device property: Constify device_get_match_data()
  ACPI / bus: Rename acpi_get_match_data() to acpi_device_get_match_data()
  ACPI / bus: Remove checks in acpi_get_match_data()
  ACPI / bus: Do not traverse through non-existed device table
  ACPI: SPCR: Mark expected switch fall-through in acpi_parse_spcr
  ACPI / EC: Restore polling during noirq suspend/resume phases
2018-02-15 14:50:32 -08:00
Linus Torvalds
8bb8966603 Power management updates for v4.16-rc2
- Fix a recently introduced build issue related to cpuidle by
    covering all of the relevant combinations of Kconfig options
    in its header (Rafael Wysocki).
 
  - Add missing invocation of pm_runtime_drop_link() to the
    !CONFIG_SRCU variant of __device_link_del() (Lukas Wunner).
 
  - Fix unbalanced IRQ enable in the wakeup interrupts framework
    (Tony Lindgren).
 
  - Update cpuidle sysfs ABI documentation (Aishwarya Pant).
 
  - Use GFP_KERNEL instead of GFP_ATOMIC for allocating memory
    in dev_pm_opp_init_cpufreq_table() (Jia-Ju Bai).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJahfXsAAoJEILEb/54YlRxvCQP/jtIJ6wtIPVCKBY0RrWEsya8
 MkFrsaXBVA2mTfBbRxiCr/gMEP/7OFfV1WN2Y0lLUhCckI9EB5Q4ctF4UFRv6f4y
 ZfPF6THLLN4Li7l+1XUulfom8mtO1ZJAcfRV2iwd4Lgk3bX8l74ViZXyhb/3LckF
 BMKbkQi7sLvCx01Cw63YcEuqLJz1wwQPq3yGEUvRySSKB1qQsFabqR8cuRVkp5Js
 e3dDoi0Pe0i80m+E2Ophi5k3QpiTEejaPTmHVzs35Z1sgUXGne7Fa+TI1WTKoe9b
 9M2mXRe9l1rJ8t4IFwtOV5Cv6onSt9kFP+rIEnLu1urf8GhHgzHXbzvoTeK89NkR
 dA+/iS2q9cU7R3hAiear/Treb8TiaCBHES4koE/exLjtjyfuM5gcf7GG4IjXVIu6
 zVT4ioCqegexcgCBEm3uCu7YWiObWPjmG6KC8pH0+MKlaKtyKfZqtnAkD1ZMzDlG
 Rh9r++NO1AZMMHVlMxydNYVBYy4Vm6WTrA311/41GB3bgkFxxHZ+opio1t3wfebW
 It5Gar6yJfrzNLruu2GGjqnfDLKIrXxCc2NUvuWhtRR2KDuEz7Z3ugvdEEMM+QSI
 YwMtSPMUmOJOaqN5brkmHz9pl/bxWjFOwldboS8XVCC6Oeiodg2UOHcTjDbkFfJ0
 yBoNdooSQ0X+Woq7J2np
 =U6ev
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a recently introduced build issue related to cpuidle and two
  bugs in the PM core, update cpuidle documentation and clean up memory
  allocations in the operating performance points (OPP) framework.

  Specifics:

   - Fix a recently introduced build issue related to cpuidle by
     covering all of the relevant combinations of Kconfig options
     in its header (Rafael Wysocki).

   - Add missing invocation of pm_runtime_drop_link() to the
     !CONFIG_SRCU variant of __device_link_del() (Lukas Wunner).

   - Fix unbalanced IRQ enable in the wakeup interrupts framework
     (Tony Lindgren).

   - Update cpuidle sysfs ABI documentation (Aishwarya Pant).

   - Use GFP_KERNEL instead of GFP_ATOMIC for allocating memory
     in dev_pm_opp_init_cpufreq_table() (Jia-Ju Bai)"

* tag 'pm-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: cpuidle: Fix cpuidle_poll_state_init() prototype
  PM / runtime: Update links_count also if !CONFIG_SRCU
  PM / wakeirq: Fix unbalanced IRQ enable for wakeirq
  Documentation/ABI: update cpuidle sysfs documentation
  opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15 14:40:01 -08:00
Kirill Tkhai
d8d211a2a0 net: Make extern and export get_net_ns()
This function will be used to obtain net of tun device.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:34:42 -05:00
Linus Torvalds
e9e3b3002f Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "This contains two qspinlock fixes and three documentation and comment
  fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/semaphore: Update the file path in documentation
  locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()
  locking/qspinlock: Ensure node->count is updated before initialising node
  locking/qspinlock: Ensure node is initialised before updating prev->next
  Documentation/locking/mutex-design: Update to reflect latest changes
2018-02-15 09:05:26 -08:00
Minwoo Im
096392e071 block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS
Update comment typo _consisitent_ to _consistent_ from following commit.
commit 0206319fdf ("blk-mq: Fix poll_stat for new size-based bucketing.")

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-15 08:27:06 -07:00
Rafael J. Wysocki
822ffaa581 Merge branches 'pm-cpuidle' and 'pm-opp'
* pm-cpuidle:
  PM: cpuidle: Fix cpuidle_poll_state_init() prototype
  Documentation/ABI: update cpuidle sysfs documentation

* pm-opp:
  opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15 12:01:53 +01:00
Yonatan Cohen
388ca8be00 IB/mlx5: Implement fragmented completion queue (CQ)
The current implementation of create CQ requires contiguous
memory, such requirement is problematic once the memory is
fragmented or the system is low in memory, it causes for
failures in dma_zalloc_coherent().

This patch implements new scheme of fragmented CQ to overcome
this issue by introducing new type: 'struct mlx5_frag_buf_ctrl'
to allocate fragmented buffers, rather than contiguous ones.

Base the Completion Queues (CQs) on this new fragmented buffer.

It fixes following crashes:
kworker/29:0: page allocation failure: order:6, mode:0x80d0
CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
Workqueue: ib_cm cm_work_handler [ib_cm]
Call Trace:
[<>] dump_stack+0x19/0x1b
[<>] warn_alloc_failed+0x110/0x180
[<>] __alloc_pages_slowpath+0x6b7/0x725
[<>] __alloc_pages_nodemask+0x405/0x420
[<>] dma_generic_alloc_coherent+0x8f/0x140
[<>] x86_swiotlb_alloc_coherent+0x21/0x50
[<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
[<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
[<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
[<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
[<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
[<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]

Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-15 00:30:03 -08:00
Saeed Mahameed
3ec5693b17 net/mlx5: Remove redundant EQ API exports
EQ structure and API is private to mlx5_core driver only, external
drivers should not have access or the means to manipulate EQ objects.

Remove redundant exports and move API functions out of the linux/mlx5
include directory into the driver's mlx5_core.h private include file.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15 00:30:02 -08:00
Saeed Mahameed
3ac7afdbcf net/mlx5: Move CQ completion and event forwarding logic to eq.c
Since CQ tree is now per EQ, CQ completion and event forwarding became
specific implementation of EQ logic, this patch moves that logic to eq.c
and makes those functions static.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15 00:30:02 -08:00
Saeed Mahameed
f105b45bf7 net/mlx5: CQ hold/put API
Now as the CQ table is per EQ, add an API to hold/put CQ to be used from
eq.c in downstream patch.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15 00:30:01 -08:00
Saeed Mahameed
02d92f7903 net/mlx5: CQ Database per EQ
Before this patch the driver had one CQ database protected via one
spinlock, this spinlock is meant to synchronize between CQ
adding/removing and CQ IRQ interrupt handling.

On a system with large number of CPUs and on a work load that requires
lots of interrupts, this global spinlock becomes a very nasty hotspot
and introduces a contention between the active cores, which will
significantly hurt performance and becomes a bottleneck that prevents
seamless cpu scaling.

To solve this we simply move the CQ database and its spinlock to be per
EQ (IRQ), thus per core.

Tested with:
system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores
netperf command: ./super_netperf 200 -P 0 -t TCP_RR  -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M

WITHOUT THIS PATCH:
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft %steal  %guest  %gnice   %idle
Average:     all    4.32    0.00   36.15    0.09    0.00   34.02   0.00    0.00    0.00   25.41

Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271
Overhead  Command          Shared Object                 Symbol
+   14.28%  swapper          [kernel.vmlinux]              [k] intel_idle
+   12.25%  swapper          [kernel.vmlinux]              [k] queued_spin_lock_slowpath
+   10.29%  netserver        [kernel.vmlinux]              [k] queued_spin_lock_slowpath
+    1.32%  netserver        [kernel.vmlinux]              [k] mlx5e_xmit

WITH THIS PATCH:
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all    4.27    0.00   34.31    0.01    0.00   18.71    0.00    0.00    0.00   42.69

Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483
Overhead  Command          Shared Object             Symbol
+   23.33%  swapper          [kernel.vmlinux]          [k] intel_idle
+    1.69%  netserver        [kernel.vmlinux]          [k] mlx5e_xmit

Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15 00:29:54 -08:00
Linus Torvalds
e525de3ab0 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes all across the map:

   - /proc/kcore vsyscall related fixes
   - LTO fix
   - build warning fix
   - CPU hotplug fix
   - Kconfig NR_CPUS cleanups
   - cpu_has() cleanups/robustification
   - .gitignore fix
   - memory-failure unmapping fix
   - UV platform fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
  x86/error_inject: Make just_return_func() globally visible
  x86/platform/UV: Fix GAM Range Table entries less than 1GB
  x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore
  x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
  x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally
  vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page
  x86/Kconfig: Further simplify the NR_CPUS config
  x86/Kconfig: Simplify NR_CPUS config
  x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()"
  x86/cpufeature: Update _static_cpu_has() to use all named variables
  x86/cpufeature: Reindent _static_cpu_has()
2018-02-14 17:31:51 -08:00
Linus Torvalds
d4667ca142 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar:
 "Here's the latest set of Spectre and PTI related fixes and updates:

  Spectre:
   - Add entry code register clearing to reduce the Spectre attack
     surface
   - Update the Spectre microcode blacklist
   - Inline the KVM Spectre helpers to get close to v4.14 performance
     again.
   - Fix indirect_branch_prediction_barrier()
   - Fix/improve Spectre related kernel messages
   - Fix array_index_nospec_mask() asm constraint
   - KVM: fix two MSR handling bugs

  PTI:
   - Fix a paranoid entry PTI CR3 handling bug
   - Fix comments

  objtool:
   - Fix paranoid_entry() frame pointer warning
   - Annotate WARN()-related UD2 as reachable
   - Various fixes
   - Add Add Peter Zijlstra as objtool co-maintainer

  Misc:
   - Various x86 entry code self-test fixes
   - Improve/simplify entry code stack frame generation and handling
     after recent heavy-handed PTI and Spectre changes. (There's two
     more WIP improvements expected here.)
   - Type fix for cache entries

  There's also some low risk non-fix changes I've included in this
  branch to reduce backporting conflicts:

   - rename a confusing x86_cpu field name
   - de-obfuscate the naming of single-TLB flushing primitives"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  x86/entry/64: Fix CR3 restore in paranoid_exit()
  x86/cpu: Change type of x86_cache_size variable to unsigned int
  x86/spectre: Fix an error message
  x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
  selftests/x86/mpx: Fix incorrect bounds with old _sigfault
  x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
  x86/speculation: Add <asm/msr-index.h> dependency
  nospec: Move array_index_nospec() parameter checking into separate macro
  x86/speculation: Fix up array_index_nospec_mask() asm constraint
  x86/debug: Use UD2 for WARN()
  x86/debug, objtool: Annotate WARN()-related UD2 as reachable
  objtool: Fix segfault in ignore_unreachable_insn()
  selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
  selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
  selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
  selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory
  selftests/x86/pkeys: Remove unused functions
  selftests/x86: Clean up and document sscanf() usage
  selftests/x86: Fix vDSO selftest segfault for vsyscall=none
  x86/entry/64: Remove the unused 'icebp' macro
  ...
2018-02-14 17:02:15 -08:00
Will Deacon
8fa80c503b nospec: Move array_index_nospec() parameter checking into separate macro
For architectures providing their own implementation of
array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to
complain about out-of-range parameters using WARN_ON() results in a mess
of mutually-dependent include files.

Rather than unpick the dependencies, simply have the core code in nospec.h
perform the checking for us.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:51 +01:00
David S. Miller
bdc8587ad7 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-02-14

This patch series enables the new mqprio hardware offload mechanism
creating traffic classes on VFs for XL710 devices. The parameters
needed to configure these traffic classes/queue channels are provides
by the user via the tc tool. A maximum of four traffic classes can be
created on each VF. This patch series also enables application of cloud
filters to each of these traffic classes. The cloud filters are applied
using the tc-flower classifier.

Example:
    1. tc qdisc add dev vf0 root mqprio num_tc 4 map 0 0 0 0 1 2 2 3\
        queues 2@0 2@2 1@4 1@5 hw 1 mode channel
    2. tc qdisc add dev vf0 ingress
    3. ethtool -K vf0 hw-tc-offload on
    4. ip link set eth0 vf 0 spoofchk off
    5. tc filter add dev vf0 protocol ip parent ffff: prio 1 flower dst_ip\
        192.168.3.5/32 ip_proto udp dst_port 25 skip_sw hw_tc 2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:44:05 -05:00
Andrew Lunn
052fddfb3c net: ptp: Add stub for ptp_classify_raw()
When NET_PTP_CLASSIFY is disabled, a stub function is required in
order that the drivers compile.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:33:36 -05:00
Suman Anna
0693036ca8 ARM: OMAP2+: Cleanup omap_mcbsp_dev_attr and other legacy data
The omap_mcbsp_dev_attr data was used to supply instance-specific
data for legacy non-DT devices. The legacy McBSP device support
including the usage of the hwmod class revision data has been
dropped in commit 48f6693790 ("ARM: OMAP2+: Remove unused legacy
code for McBSP") and this data is therefore no longer needed.
So, cleanup the structure and all the associated data in various
hwmod data files.

Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-02-14 10:28:13 -08:00
Suman Anna
1cddc36458 ARM: OMAP2+: Cleanup omap2_spi_dev_attr and other legacy data
The omap2_spi_dev_attr data was used to supply instance-specific
data for legacy non-DT devices. The SPI legacy device support
including the usage of the hwmod class revision data has been
dropped in commit 6f3ab009a1 ("ARM: OMAP2+: Remove unused legacy
code for device init") and this data is therefore no longer needed.
So, cleanup the structure and all the associated data in various
hwmod data files.

Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-02-14 10:28:12 -08:00
Suman Anna
a0e37da2a5 ARM: OMAP2+: Cleanup omap_gpio_dev_attr usage
The omap_gpio_dev_attr data was used to supply instance-specific
data for legacy non-DT devices. The GPIO legacy device support has
been cleaned up in commit 14944934f8 ("ARM: OMAP2+: Remove legacy
gpio code") a while ago and this data is therefore no longer needed.
So, cleanup the structure and all the associated data in various
hwmod data files.

Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-02-14 10:28:12 -08:00
Harshitha Ramamurthy
3872c8d44c virtchnl: Add filter data structures
This patch adds infrastructure to send virtchnl messages to the
PF to configure filters on the VF. The patch adds a struct
called virtchnl_filter which contains information about the fields
in the user-specified tc filter.

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-02-14 09:43:22 -08:00
Harshitha Ramamurthy
0718e560a3 virtchnl: Add a macro to check the size of a union
This patch adds a macro to check if the size of a union is correct.
It throws a divide by zero error if the union is not of the correct
size.

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-02-14 09:43:22 -08:00
Harshitha Ramamurthy
eb09f1feb8 virtchnl: Add virtchl structures to support queue channels
This patch defines new structs in support of the virtchannel message
that the VF sends to the PF to create a queue channel specified by the
user via tc tool.

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-02-14 09:43:22 -08:00
David Ahern
89e58148fb net: Make atalk_ptr depend on ATALK or IRDA
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 11:55:33 -05:00
David Ahern
19ff13f2a4 net: Make ax25_ptr depend on CONFIG_AX25
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 11:55:33 -05:00
David Ahern
330c7272c4 net: Make dn_ptr depend on CONFIG_DECNET
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 11:55:33 -05:00
Linus Walleij
9b00bc7b90
spi: spi-gpio: Rewrite to use GPIO descriptors
This converts the bit-banged GPIO SPI driver to looking up and
using GPIO descriptors to get a handle on GPIO lines for SCK,
MOSI, MISO and all CS lines.

All existing board files are converted in one go to keep it all
consistent. With these conversions I rarely find any interrim
steps that makes any sense.

Device tree probing and GPIO handling should work like before
also after this patch.

For board files, we stop using controller data to pass the GPIO
line for chip select, instead we pass this as a GPIO descriptor
lookup like everything else.

In some s3c24xx machines the names of the SPI devices were set to
"spi-gpio" rather than "spi_gpio" which can never have worked, I
fixed it working (I guess) as part of this patch set. Sometimes
I wonder how this code got upstream in the first place, it
obviously is not tested.

mach-s3c64xx/mach-smartq.c has the same problem and additionally
defines the *same* GPIO line for MOSI and MISO which is not going
to be accepted by gpiolib. As the lines were number 1,2,2 I assumed
it was a typo and use lines 1,2,3. A comment gives awat that line 0
is chip select though no actual SPI device is provided for the LCD
supposed to be on this bit-banged SPI bus. I left it intact instead
of just deleting the bus though.

Kill off board file code that try to initialize the SPI lines
to the same values that they will later be set by the spi_gpio
driver anyways. Given the huge number of weird things in these
board files I do not think this code is very tested or put in
with much afterthought anyways.

In order to assert that we do not get performance regressions on
this crucial bing-banged driver, a ran a script like this dumping the
Ilitek ILI9322 regmap 10000 times (it has no caching obviously) on
an otherwise idle system in two iterations before and after the
patches:

 #!/bin/sh
 for run in `seq 10000`
 do
     cat /debug/regmap/spi0.0/registers > /dev/null
 done

Before the patch:

time test.sh
real    3m 41.03s
user    0m 29.41s
sys     3m 7.22s

time test.sh
real    3m 44.24s
user    0m 32.31s
sys     3m 7.60s

After the patch:

time test.sh
real    3m 41.32s
user    0m 28.92s
sys     3m 8.08s

time test.sh
real    3m 39.92s
user    0m 30.20s
sys     3m 5.56s

So any performance differences seems to be in the error margin.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-14 16:02:41 +00:00
Shameer Kolothum
8b4282e6b8 ACPI/IORT: Add msi address regions reservation helper
On some platforms msi parent address regions have to be excluded from
normal IOVA allocation in that they are detected and decoded in a HW
specific way by system components and so they cannot be considered normal
IOVA address space.

Add a helper function that retrieves ITS address regions - the msi
parent - through IORT device <-> ITS mappings and reserves it so that
these regions will not be translated by IOMMU and will be excluded from
IOVA allocations. The function checks for the smmu model number and
only applies the msi reservation if the platform requires it.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[For the ITS part]
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-14 15:15:41 +01:00
Kirill A. Shutemov
c65e774fb3 x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable
For boot-time switching between 4- and 5-level paging we need to be able
to fold p4d page table level at runtime. It requires variable
PGDIR_SHIFT and PTRS_PER_P4D.

The change doesn't affect the kernel image size much:

   text	   data	    bss	    dec	    hex	filename
8628091	4734304	1368064	14730459	 e0c4db	vmlinux.before
8628393	4734340	1368064	14730797	 e0c62d	vmlinux.after

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20180214111656.88514-7-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-14 13:11:14 +01:00
Jesper Dangaard Brouer
297dd12cb1 net: avoid including xdp.h in filter.h
If is sufficient with a forward declaration of struct xdp_rxq_info in
linux/filter.h, which avoids including net/xdp.h.  This was originally
suggested by John Fastabend during the review phase, but wasn't
included in the final patchset revision.  Thus, this followup.

Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-13 17:50:10 -08:00
Suravee Suthikulpanit
c5611a8751 iommu: Do not return error code for APIs with size_t return type
Currently, iommu_unmap, iommu_unmap_fast and iommu_map_sg return
size_t.  However, some of the return values are error codes (< 0),
which can be misinterpreted as large size. Therefore, returning size 0
instead to signify failure to map/unmap.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13 19:31:20 +01:00
Dmitry Safonov
b1d03c1d12 iommu/vt-d: Clean/document fault status flags
So one could decode them without opening the specification.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13 17:40:54 +01:00
Kirill Tkhai
1a57feb847 net: Introduce net_sem for protection of pernet_list
Currently, the mutex is mostly used to protect pernet operations
list. It orders setup_net() and cleanup_net() with parallel
{un,}register_pernet_operations() calls, so ->exit{,batch} methods
of the same pernet operations are executed for a dying net, as
were used to call ->init methods, even after the net namespace
is unlinked from net_namespace_list in cleanup_net().

But there are several problems with scalability. The first one
is that more than one net can't be created or destroyed
at the same moment on the node. For big machines with many cpus
running many containers it's very sensitive.

The second one is that it's need to synchronize_rcu() after net
is removed from net_namespace_list():

Destroy net_ns:
cleanup_net()
  mutex_lock(&net_mutex)
  list_del_rcu(&net->list)
  synchronize_rcu()                                  <--- Sleep there for ages
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_exit_list(ops, &net_exit_list)
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_free_list(ops, &net_exit_list)
  mutex_unlock(&net_mutex)

This primitive is not fast, especially on the systems with many processors
and/or when preemptible RCU is enabled in config. So, all the time, while
cleanup_net() is waiting for RCU grace period, creation of new net namespaces
is not possible, the tasks, who makes it, are sleeping on the same mutex:

Create net_ns:
copy_net_ns()
  mutex_lock_killable(&net_mutex)                    <--- Sleep there for ages

I observed 20-30 seconds hangs of "unshare -n" on ordinary 8-cpu laptop
with preemptible RCU enabled after CRIU tests round is finished.

The solution is to convert net_mutex to the rw_semaphore and add fine grain
locks to really small number of pernet_operations, what really need them.

Then, pernet_operations::init/::exit methods, modifying the net-related data,
will require down_read() locking only, while down_write() will be used
for changing pernet_list (i.e., when modules are being loaded and unloaded).

This gives signify performance increase, after all patch set is applied,
like you may see here:

%for i in {1..10000}; do unshare -n bash -c exit; done

*before*
real 1m40,377s
user 0m9,672s
sys 0m19,928s

*after*
real 0m17,007s
user 0m5,311s
sys 0m11,779

(5.8 times faster)

This patch starts replacing net_mutex to net_sem. It adds rw_semaphore,
describes the variables it protects, and makes to use, where appropriate.
net_mutex is still present, and next patches will kick it out step-by-step.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 10:36:04 -05:00
Tony Luck
fd0e786d9d x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
In the following commit:

  ce0fa3e56a ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")

... we added code to memory_failure() to unmap the page from the
kernel 1:1 virtual address space to avoid speculative access to the
page logging additional errors.

But memory_failure() may not always succeed in taking the page offline,
especially if the page belongs to the kernel.  This can happen if
there are too many corrected errors on a page and either mcelog(8)
or drivers/ras/cec.c asks to take a page offline.

Since we remove the 1:1 mapping early in memory_failure(), we can
end up with the page unmapped, but still in use. On the next access
the kernel crashes :-(

There are also various debug paths that call memory_failure() to simulate
occurrence of an error. Since there is no actual error in memory, we
don't need to map out the page for those cases.

Revert most of the previous attempt and keep the solution local to
arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:

	1) there is a real error
	2) memory_failure() succeeds.

All of this only applies to 64-bit systems. 32-bit kernel doesn't map
all of memory into kernel space. It isn't worth adding the code to unmap
the piece that is mapped because nobody would run a 32-bit kernel on a
machine that has recoverable machine checks.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert (Persistent Memory) <elliott@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: stable@vger.kernel.org #v4.14
Fixes: ce0fa3e56a ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 16:25:06 +01:00
Tycho Andersen
2dd6fd2e99 locking/semaphore: Update the file path in documentation
While reading this header I noticed that the locking stuff has moved to
kernel/locking/*, so update the path in semaphore.h to point to that.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180201114119.1090-1-tycho@tycho.ws
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 15:00:06 +01:00
Jia Zhang
595dd46ebf vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page
Commit:

  df04abfd18 ("fs/proc/kcore.c: Add bounce buffer for ktext data")

... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y.
However, accessing the vsyscall user page will cause an SMAP fault.

Replace memcpy() with copy_from_user() to fix this bug works, but adding
a common way to handle this sort of user page may be useful for future.

Currently, only vsyscall page requires KCORE_USER.

Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 09:15:58 +01:00
Bjorn Andersson
880f5b3882 remoteproc: Pass type of shutdown to subdev remove
remoteproc instances can be stopped either by invoking shutdown or by an
attempt to recover from a crash. For some subdev types it's expected to
clean up gracefully during a shutdown, but are unable to do so during a
crash - so pass this information to the subdev remove functions.

Acked-By: Chris Lew <clew@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2018-02-12 16:57:22 -08:00
Denys Vlasenko
9b2c45d479 net: make getname() functions return length rather than use int* parameter
Changes since v1:
Added changes in these files:
    drivers/infiniband/hw/usnic/usnic_transport.c
    drivers/staging/lustre/lnet/lnet/lib-socket.c
    drivers/target/iscsi/iscsi_target_login.c
    drivers/vhost/net.c
    fs/dlm/lowcomms.c
    fs/ocfs2/cluster/tcp.c
    security/tomoyo/network.c

Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.

"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.

None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.

This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.

Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.

Userspace API is not changed.

    text    data     bss      dec     hex filename
30108430 2633624  873672 33615726 200ef6e vmlinux.before.o
30108109 2633612  873672 33615393 200ee21 vmlinux.o

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-12 14:15:04 -05:00