Commit graph

68773 commits

Author SHA1 Message Date
Axel Lin
35d838ff98
regulator: Fix comment for csel_reg and csel_mask
The csel_reg and csel_mask fields in struct regulator_desc needs to
be generic for drivers. Not just for TPS65218.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2019-03-03 23:44:53 +00:00
YueHaibing
6377f787ae appletalk: Fix use-after-free in atalk_proc_exit
KASAN report this:

BUG: KASAN: use-after-free in pde_subdir_find+0x12d/0x150 fs/proc/generic.c:71
Read of size 8 at addr ffff8881f41fe5b0 by task syz-executor.0/2806

CPU: 0 PID: 2806 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xfa/0x1ce lib/dump_stack.c:113
 print_address_description+0x65/0x270 mm/kasan/report.c:187
 kasan_report+0x149/0x18d mm/kasan/report.c:317
 pde_subdir_find+0x12d/0x150 fs/proc/generic.c:71
 remove_proc_entry+0xe8/0x420 fs/proc/generic.c:667
 atalk_proc_exit+0x18/0x820 [appletalk]
 atalk_exit+0xf/0x5a [appletalk]
 __do_sys_delete_module kernel/module.c:1018 [inline]
 __se_sys_delete_module kernel/module.c:961 [inline]
 __x64_sys_delete_module+0x3dc/0x5e0 kernel/module.c:961
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb2de6b9c58 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000200001c0
RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb2de6ba6bc
R13: 00000000004bccaa R14: 00000000006f6bc8 R15: 00000000ffffffff

Allocated by task 2806:
 set_track mm/kasan/common.c:85 [inline]
 __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:496
 slab_post_alloc_hook mm/slab.h:444 [inline]
 slab_alloc_node mm/slub.c:2739 [inline]
 slab_alloc mm/slub.c:2747 [inline]
 kmem_cache_alloc+0xcf/0x250 mm/slub.c:2752
 kmem_cache_zalloc include/linux/slab.h:730 [inline]
 __proc_create+0x30f/0xa20 fs/proc/generic.c:408
 proc_mkdir_data+0x47/0x190 fs/proc/generic.c:469
 0xffffffffc10c01bb
 0xffffffffc10c0166
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 2806:
 set_track mm/kasan/common.c:85 [inline]
 __kasan_slab_free+0x130/0x180 mm/kasan/common.c:458
 slab_free_hook mm/slub.c:1409 [inline]
 slab_free_freelist_hook mm/slub.c:1436 [inline]
 slab_free mm/slub.c:2986 [inline]
 kmem_cache_free+0xa6/0x2a0 mm/slub.c:3002
 pde_put+0x6e/0x80 fs/proc/generic.c:647
 remove_proc_entry+0x1d3/0x420 fs/proc/generic.c:684
 0xffffffffc10c031c
 0xffffffffc10c0166
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8881f41fe500
 which belongs to the cache proc_dir_entry of size 256
The buggy address is located 176 bytes inside of
 256-byte region [ffff8881f41fe500, ffff8881f41fe600)
The buggy address belongs to the page:
page:ffffea0007d07f80 count:1 mapcount:0 mapping:ffff8881f6e69a00 index:0x0
flags: 0x2fffc0000000200(slab)
raw: 02fffc0000000200 dead000000000100 dead000000000200 ffff8881f6e69a00
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8881f41fe480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff8881f41fe500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8881f41fe580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                     ^
 ffff8881f41fe600: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 ffff8881f41fe680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

It should check the return value of atalk_proc_init fails,
otherwise atalk_exit will trgger use-after-free in pde_subdir_find
while unload the module.This patch fix error cleanup path of atalk_init

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:01:49 -08:00
David S. Miller
d5fa9c55e5 mlx5-updates-2019-03-01
This series adds multipath offload support and contains some small updates
 to mlx5 driver.
 
 Multipath offload support from Roi Dayan:
 
 We are going to track SW multipath route and related nexthops and reflect
 that as port affinity to the HW.
 
 1) Some patches are preparation.
 2) add the multipath mode and fib events handling.
 3) add support to handle offload failure for net error, i.e.
 port down.
 4) Small updates to match the behavior of multipath
 
 Two small updates from Eran Ben Elisha,
 5) Make a function static
 6) Update PCIe supported devices list.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJceZBCAAoJEEg/ir3gV/o+sC0H/RWg+QKByvv0L3gqouKRvQq6
 6IL8dacbYnGFiwhXiB/087oPn4g3rOImfvnyrH+d6/clkXGPqrgSCTLhISUc7KyP
 Ig837K0fbn9LdV7a4t6OkTMiH9XUWh/Q88LpMLn0abPHIvE+blm7plbHV1x+D6pA
 +O4QM7qHcDVUYh7lV4WqQJg/caEjNML6JHDouZ0nnqqfWM9C9Jk05sp3Nj80WdWt
 0vruJRlHNTiiKcEcYCoV6z8ZUuN7OhwmyxuPiAGkK5n3Jcpjp9EQ9LfCT+kDuNKO
 SLTk1JMtsv4ZCrB6geC2QIAGUu2lGcLPUUPohgp1nY/hEbK1Vg24sPFhEdk4/r4=
 =JM/g
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-03-01

This series adds multipath offload support and contains some small updates
to mlx5 driver.

Multipath offload support from Roi Dayan:

We are going to track SW multipath route and related nexthops and reflect
that as port affinity to the HW.

1) Some patches are preparation.
2) add the multipath mode and fib events handling.
3) add support to handle offload failure for net error, i.e.
port down.
4) Small updates to match the behavior of multipath

Two small updates from Eran Ben Elisha,
5) Make a function static
6) Update PCIe supported devices list.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 14:04:20 -08:00
David S. Miller
4e7df119d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next:

1) Add .release_ops to properly unroll .select_ops, use it from nft_compat.
   After this change, we can remove list of extensions too to simplify this
   codebase.

2) Update amanda conntrack helper to support v3.4, from Florian Tham.

3) Get rid of the obsolete BUGPRINT macro in ebtables, from
   Florian Westphal.

4) Merge IPv4 and IPv6 masquerading infrastructure into one single module.
   From Florian Westphal.

5) Patchset to remove nf_nat_l3proto structure to get rid of
   indirections, from Florian Westphal.

6) Skip unnecessary conntrack timeout updates in case the value is
   still the same, also from Florian Westphal.

7) Remove unnecessary 'fall through' comments in empty switch cases,
   from Li RongQing.

8) Fix lookup to fixed size hashtable sets on big endian with 32-bit keys.

9) Incorrect logic to deactivate path of fixed size hashtable sets,
   element was being tested to self.

10) Remove nft_hash_key(), the bitmap set is always selected for 16-bit
    keys.

11) Use boolean whenever possible in IPVS codebase, from Andrea Claudi.

12) Enter close state in conntrack if RST matches exact sequence number,
    from Florian Westphal.

13) Initialize dst_cache in tunnel extension, from wenxu.

14) Pass protocol as u16 to xt_check_match and xt_check_target, from
    Li RongQing.

15) SCTP header is granted to be in a linear area from IPVS NAT handler,
    from Xin Long.

16) Don't steal packets coming from slave VRF device from the
    ip_sabotage_in() path, from David Ahern.

17) Fix unsafe update of basechain stats, from Li RongQing.

18) Make sure CONNTRACK_LOCKS is power of 2 to let compiler optimize
    modulo operation as bitwise AND, from Li RongQing.

19) Use device_attribute instead of internal definition in the IDLETIMER
    target, from Sami Tolvanen.

20) Merge redir, masq and IPv4/IPv6 NAT chain types, from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 14:01:04 -08:00
David S. Miller
9eb359140c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-02 12:54:35 -08:00
Linus Torvalds
c93d9218ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix refcount leak in act_ipt during replace, from Davide Caratti.

 2) Set task state properly in tun during blocking reads, from Timur
    Celik.

 3) Leaked reference in DSA, from Wen Yang.

 4) NULL deref in act_tunnel_key, from Vlad Buslov.

 5) cipso_v4_erro can reference the skb IPCB in inappropriate contexts
    thus referencing garbage, from Nazarov Sergey.

 6) Don't accept RTA_VIA and RTA_GATEWAY in contexts where those
    attributes make no sense.

 7) Fix hung sendto in tipc, from Tung Nguyen.

 8) Out-of-bounds access in netlabel, from Paul Moore.

 9) Grant reference leak in xen-netback, from Igor Druzhinin.

10) Fix tx stalls with lan743x, from Bryan Whitehead.

11) Fix interrupt storm with mv88e6xxx, from Hein Kallweit.

12) Memory leak in sit on device registry failure, from Mao Wenan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  net: sit: fix memory leak in sit_init_net()
  net: dsa: mv88e6xxx: Fix statistics on mv88e6161
  geneve: correctly handle ipv6.disable module parameter
  net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode
  bpf: fix sanitation rewrite in case of non-pointers
  ipv4: Add ICMPv6 support when parse route ipproto
  MIPS: eBPF: Fix icache flush end address
  lan743x: Fix TX Stall Issue
  net: phy: phylink: fix uninitialized variable in phylink_get_mac_state
  net: aquantia: regression on cpus with high cores: set mode with 8 queues
  selftests: fixes for UDP GRO
  bpf: drop refcount if bpf_map_new_fd() fails in map_create()
  net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X
  net: dsa: mv88e6xxx: Fix u64 statistics
  xen-netback: don't populate the hash cache on XenBus disconnect
  xen-netback: fix occasional leak of grant ref mappings under memory pressure
  sctp: chunk.c: correct format string for size_t in printk
  net: netem: fix skb length BUG_ON in __skb_to_sgvec
  netlabel: fix out-of-bounds memory accesses
  ipv4: Pass original device to ip_rcv_finish_core
  ...
2019-03-02 08:46:34 -08:00
Michael Shych
9f03161a1b platform_data/mlxreg: additions for Mellanox watchdog driver.
There are two new fields added to mlxreg core structure:
features - supported features of device and
identity - device identity name.
Add new defines for watchdog features.

Signed-off-by: Michael Shych <michaelsh@mellanox.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-03-02 15:28:19 +01:00
Trond Myklebust
3eb86093ea NFSv4.2: Add client support for the generic 'layouterror' RPC call
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-01 16:20:16 -05:00
Trond Myklebust
a79f194aa4 NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated
If a layout segment gets invalidated while a pNFS I/O operation
is queued for transmission, then we ideally want to abort
immediately. This is particularly the case when there is a large
number of I/O related RPCs queued in the RPC layer, and the layout
segment gets invalidated due to an ENOSPC error, or an EACCES (because
the client was fenced). We may end up forced to spam the MDS with a
lot of otherwise unnecessary LAYOUTERRORs after that I/O fails.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-01 16:20:16 -05:00
Roi Dayan
6997b1c9ca net/mlx5: Emit port affinity event for multipath offloads
Under multipath offload scheme, as part of handling fib events, emit
mlx5 port affinity event on the enabled ports which will be handled by
the tc offloads code.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01 12:04:17 -08:00
Roi Dayan
724b509ca0 net/mlx5: Add multipath mode
In order to offload ecmp-on-host scheme where next-hop routes are used,
we will make use of HW LAG. Add accessor function to let upper layers
in the driver to realize if the lag acts in multi-path mode.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01 12:04:16 -08:00
Joe Perches
1c7cf3d5e1 wusb: Remove unnecessary static function ckhdid_printf
This static inline is unnecessary and can be removed
by using the vsprintf %ph extension.

This reduces overall object size by more than 2K.

Reported-by: Louis Taylor <louis@kragniz.eu>
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Louis Taylor <louis@kragniz.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-01 20:53:41 +01:00
Greg Kroah-Hartman
99f63620b4 1. Exopsing counters for state changes of channel ring buffers; this is
useful to investigate performance issues. By Kimberly Brown.
 
 2. Switching to the new generic UUID API, by Andy Shevchenko.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAlx5S9AACgkQ3qZv95d3
 LNwZ3Q//Z4LbaJf8WQjym1N5/akdhw3q9WL6RDn2o8tqNcaQ5OTpUoiYLjpMdtkr
 pr7YV3/7K0IJS2AnExQ937e+OJ0FyFeXBIGKbaMgj4Z99k0ZwL90aUobj9/4En4P
 9HsKR1nqpYr/cetp9MquO8b8J/kocfU4CJm50RnzxgURgsULmAXY+HsfVMAyPVjJ
 VLbXxPgcuRPTurUzhi5QvDNYHI3pmKJMK17TmeWxKeamLDBDvkfuFqo/U7c/k+cw
 H66gY1Nni+PiYGAvS42COtEexpnhGn3xVZKxSpak9ZlP40DotQc1T9TLM45j3T8/
 paGv20vI9c4yt9NDnTyEoTTdNBmnzq8+/rGJRv5aIeCeQfD3mL+c48fJrdhduK2l
 3253nUfDGRR8KI/rLsLnN80Iey33/YUCy5LuxZy7eRY6xuf9zhzwvP4rKRaWbrqc
 VbC4kNiwcOYcwqMnkfayDrmtc6M3HEoHAmXvtTzvOI7dntPTf2oDMszTzHzxtAHl
 A6bEGXxaMTM/3AjvalThZbjEofyJ26HeZVxcDFApdjrJD//3Puo3XRL8iTLP06rN
 iV6z21NcCjrCP5hmN691Jlg3mhvZsn1SKhPfnnv9IVJ9hc+KheGE2n1bcSYOVpig
 eaFwn7GQnvYdS9KLZ2vKBhGWNwcJqmoTAk6toC80gdxe4Zeny3A=
 =hDVS
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux into char-misc-next

Sasha writes:

1. Exopsing counters for state changes of channel ring buffers; this is
useful to investigate performance issues. By Kimberly Brown.

2. Switching to the new generic UUID API, by Andy Shevchenko.

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Expose counters for interrupts and full conditions
  vmbus: Switch to use new generic UUID API
2019-03-01 16:43:50 +01:00
Mauro Carvalho Chehab
e907bf3c98 media: include: fix several typos
Use codespell to fix lots of typos over frontends.

Manually verified to avoid false-positives.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Reviewed-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-03-01 09:45:52 -05:00
Arnd Bergmann
8ceb820b69 NXP/FSL SoC driver updates for v5.1 take4
DPIO driver
 - Add support for cache stashing and enable it in dpaa2-eth driver
 
 GUTS driver
 - Make fsl_guts_get_svr() API internal in favor of more generic
   soc_device_match()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEhb3UXAyxp6UQ0v6khtxQDvusFVQFAlx11WsACgkQhtxQDvus
 FVTxahAApjB6xxpMBkBlX4lZ+o3HdX/fyyjna6iZIdG5YPm2sGnlAQ7dZb/sLMCk
 1U+s52hc+aowfL23/x6XDu6vk0Oa05Qc8qmqT5zAyzw8el5r0lFr6S6KI3QmlSBI
 9m3CJTWMRMEJVPribWEIUb/DrrylKEHhZORRBeNqHBPV9J7yh/zvTc3zJiSRG4b5
 Pl1l+iCykTvxazcoz/mXlM7qIe5jinFU5odhAnbbrK+2oPs0NjblLqPCQe96l3/3
 NUVxO0pYqvHDDNc6SbIpylvLp8Yd3N3L+VmdfJiNDKIH1/19R47EBAXvNvWONcao
 DooHNVv3ltwJ3UB5aGcM9Pqr0qJyCtXzRcrfC5q5z7/kMJcTUxB1bbbfPlzsXqKi
 FvL2JpQwqAnF8IJh7Z81bdGIfHfl4YbbY3gp1x4keXo7zRyUtuS2+fkiE+Q/LaWL
 3reJSDoXcPpozZ0ds5m4szbEB1vlm6zzCYpVXaXgZfr8uuV8OXG9CZ6Cl+OYyofc
 GkoyampBIWd8OLNfodOpa5GqcdPDgMZ4Ha8eafN3wNIT+PUYS0Eixy4tk+BNYjiw
 Yu4JHwZzbsGzLzpsT/BgYvznMXg3rpOTpak3WdeUb9VSlOE+vWngUSDMLiNRLdFE
 1rJzCXBe43MOqYEfqc/FbffX+ddPpNWVUrRdMtKD5jYJm/luqKE=
 =tNfD
 -----END PGP SIGNATURE-----

Merge tag 'soc-fsl-next-v5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux into arm/drivers

NXP/FSL SoC driver updates for v5.1 take4

DPIO driver
- Add support for cache stashing and enable it in dpaa2-eth driver

GUTS driver
- Make fsl_guts_get_svr() API internal in favor of more generic
  soc_device_match()

* tag 'soc-fsl-next-v5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux:
  dpaa2-eth: configure the cache stashing amount on a queue
  soc: fsl: dpio: configure cache stashing destination
  soc: fsl: dpio: enable frame data cache stashing per software portal
  soc: fsl: guts: make fsl_guts_get_svr() static

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:05:54 +01:00
Arnd Bergmann
3473b71e21 OP-TEE driver
- dual license for optee_msg.h and optee_smc.h
 Generic
 - add cancellation support to client interface
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCgA4FiEEcK3MsDvGvFp6zV9ztbC4QZeP7NMFAlx32w8aHGplbnMud2lr
 bGFuZGVyQGxpbmFyby5vcmcACgkQtbC4QZeP7NNr9g//czpqt3B7e2pUF46+rjLF
 MZZAQwJw1KGjAH8YcUXXWlRI6HRIWKRMohZp4ixC4Xe/OQOpl7grwcWg6j69yI+k
 EWdV/SJf6yKmxf9MwgMmh7U2bijIvQAYNd11ODZL+PZfetDLJ6U8kcNlrAGLFTAP
 MtxM+wXcmnIT/CHD0wz8hH2B8ApYTCv4E5vkPXSfZEQ2mUU7Lns0MeUzXs74zPQU
 yJgCMZnLA2JFJ3xhdi1e2gdyI4NGtAomHAQ9oIzD6rO1OU1H51L2+yFHK1G+eu6I
 uMy/bSF0RBJfM0NN9k1XPssHtgg7JrJ6kHZh9Z2knuCi0KUf75bdZE1qC+9N9uu9
 9+Qpt6IyxsRPwCgtVuKNl4KEIsxwGALG0oUwe9sBPIL1dOqZtc/bPDNME7LTAoN5
 0JWxbz0YLgNKPpkIUfs48vzVrRqHhVBbZ/SuXmOLx9w+3wY/V1xH8pHmsqKJ/0b6
 rdWXByt4Vv4YTYXVUz7ChS0ax+ZrDqocxvPFATRYXCNVlE/in/GHmR+QJkpEZnN9
 IXhvvYyv582NVYBRAVf7DZBrUIgKT8SnT66PWhiCiP2z0rK5a3CFohV6Vdl/aqyE
 1hppXuREuCbxdRTRu563yJL+v4MBfzItRp7v5xW/R8dAMPVtUMIlUyn11a4DGeWz
 WYgNf9Po3M2JT7GEvNA2VKA=
 =h0Es
 -----END PGP SIGNATURE-----

Merge tag 'tee-misc-for-v5.1' of https://git.linaro.org/people/jens.wiklander/linux-tee into arm/drivers

OP-TEE driver
- dual license for optee_msg.h and optee_smc.h
Generic
- add cancellation support to client interface

* tag 'tee-misc-for-v5.1' of https://git.linaro.org/people/jens.wiklander/linux-tee:
  tee: optee: update optee_msg.h and optee_smc.h to dual license
  tee: add cancellation support to client interface

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:01:16 +01:00
Li RongQing
11d4dd0b20 netfilter: convert the proto argument from u8 to u16
The proto in struct xt_match and struct xt_target is u16, when
calling xt_check_target/match, their proto argument is u8,
and will cause truncation, it is harmless to ip packet, since
ip proto is u8

if a etable's match/target has proto that is u16, will cause
the check failure.

and convert be16 to short in bridge/netfilter/ebtables.c

Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-03-01 14:28:43 +01:00
Joerg Roedel
d05e4c8600 Merge branches 'iommu/fixes', 'arm/msm', 'arm/tegra', 'arm/mediatek', 'x86/vt-d', 'x86/amd', 'hyper-v' and 'core' into next 2019-03-01 11:24:51 +01:00
Christoph Hellwig
5b88a17cfd block: optimize bvec iteration in bvec_iter_advance
There is no need to only iterate in chunks of PAGE_SIZE or less in
bvec_iter_advance, given that the callers pass in the chunk length that
they are operating on - either that already is less than PAGE_SIZE
because they do classic page-based iteration, or it is larger because
the caller operates on multi-page bvecs.

This should help shaving off a few cycles of the I/O hot path.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 13:49:22 -07:00
Christoph Hellwig
221e1e0b01 of: mark early_init_dt_alloc_reserved_memory_arch static
This function is only used in of_reserved_mem.c, and never overridden
despite the __weak marker.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-02-28 11:40:49 -06:00
Jens Axboe
edafccee56 io_uring: add support for pre-mapped user IO buffers
If we have fixed user buffers, we can map them into the kernel when we
setup the io_uring. That avoids the need to do get_user_pages() for
each and every IO.

To utilize this feature, the application must call io_uring_register()
after having setup an io_uring instance, passing in
IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to
an iovec array, and the nr_args should contain how many iovecs the
application wishes to map.

If successful, these buffers are now mapped into the kernel, eligible
for IO. To use these fixed buffers, the application must use the
IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then
set sqe->index to the desired buffer index. sqe->addr..sqe->addr+seq->len
must point to somewhere inside the indexed buffer.

The application may register buffers throughout the lifetime of the
io_uring instance. It can call io_uring_register() with
IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of
buffers, and then register a new set. The application need not
unregister buffers explicitly before shutting down the io_uring
instance.

It's perfectly valid to setup a larger buffer, and then sometimes only
use parts of it for an IO. As long as the range is within the originally
mapped region, it will work just fine.

For now, buffers must not be file backed. If file backed buffers are
passed in, the registration will fail with -1/EOPNOTSUPP. This
restriction may be relaxed in the future.

RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat
arbitrary 1G per buffer size is also imposed.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe
091141a42e fs: add fget_many() and fput_many()
Some uses cases repeatedly get and put references to the same file, but
the only exposed interface is doing these one at the time. As each of
these entail an atomic inc or dec on a shared structure, that cost can
add up.

Add fget_many(), which works just like fget(), except it takes an
argument for how many references to get on the file. Ditto fput_many(),
which can drop an arbitrary number of references to a file.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe
2b188cc1bb Add io_uring IO interface
The submission queue (SQ) and completion queue (CQ) rings are shared
between the application and the kernel. This eliminates the need to
copy data back and forth to submit and complete IO.

IO submissions use the io_uring_sqe data structure, and completions
are generated in the form of io_uring_cqe data structures. The SQ
ring is an index into the io_uring_sqe array, which makes it possible
to submit a batch of IOs without them being contiguous in the ring.
The CQ ring is always contiguous, as completion events are inherently
unordered, and hence any io_uring_cqe entry can point back to an
arbitrary submission.

Two new system calls are added for this:

io_uring_setup(entries, params)
	Sets up an io_uring instance for doing async IO. On success,
	returns a file descriptor that the application can mmap to
	gain access to the SQ ring, CQ ring, and io_uring_sqes.

io_uring_enter(fd, to_submit, min_complete, flags, sigset, sigsetsize)
	Initiates IO against the rings mapped to this fd, or waits for
	them to complete, or both. The behavior is controlled by the
	parameters passed in. If 'to_submit' is non-zero, then we'll
	try and submit new IO. If IORING_ENTER_GETEVENTS is set, the
	kernel will wait for 'min_complete' events, if they aren't
	already available. It's valid to set IORING_ENTER_GETEVENTS
	and 'min_complete' == 0 at the same time, this allows the
	kernel to return already completed events without waiting
	for them. This is useful only for polling, as for IRQ
	driven IO, the application can just check the CQ ring
	without entering the kernel.

With this setup, it's possible to do async IO with a single system
call. Future developments will enable polled IO with this interface,
and polled submission as well. The latter will enable an application
to do IO without doing ANY system calls at all.

For IRQ driven IO, an application only needs to enter the kernel for
completions if it wants to wait for them to occur.

Each io_uring is backed by a workqueue, to support buffered async IO
as well. We will only punt to an async context if the command would
need to wait for IO on the device side. Any data that can be accessed
directly in the page cache is done inline. This avoids the slowness
issue of usual threadpools, since cached data is accessed as quickly
as a sync interface.

Sample application: http://git.kernel.dk/cgit/fio/plain/t/io_uring.c

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Ming Lei
594b9a89af block: introduce mp_bvec_for_each_page() for iterating over page
mp_bvec_for_each_segment() is a bit big for the iteration, so introduce
a light-weight helper for iterating over pages, then 32bytes stack
space can be saved.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 05:56:03 -07:00
Igor Opaniuk
4f062dc1b7 tee: add cancellation support to client interface
Add support of cancellation request to the TEE kernel internal
client interface. Can be used by software TPM drivers, that leverage
TEE under the hood (for instance TPM2.0 mobile profile), for requesting
cancellation of time-consuming operations (RSA key-pair generation etc.).

Signed-off-by: Igor Opaniuk <igor.opaniuk@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2019-02-28 13:49:29 +01:00
Greg Kroah-Hartman
f699f9f9ac mei-hdcp driver
mei driver for the me hdcp client, for use by drm/i915.
 
 Including the following prep work:
 - whitelist hdcp client in mei bus
 - merge to include char-misc-next
 - drm/i915 side of the mei_hdcp/i915 component interface
 - component prep work (including one patch touching i915&snd-hda)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAlx1qssACgkQTA9ye/CY
 qnGXwg/9EetBXi/7fqK5DXcaTJvBzlHqizXbImnQ3vj9xM9pJRjuvKHL+zAQCIZ5
 R9QA6kKVgtSmEp2MMBLaQHTQC6Vw3/BNZvbeeKXX9SboT4riRap4iXz8OdsXZTuq
 m2EWHBCqI+qe9j47HFLqYpgPZ7J9TR071uqZXmvlHgdaiXivDByDtUNszay40sEw
 GOq+bqjvMjf1hexS1YgImW7Qraw/s+dVfadxCxOk+ImpEtdH66gtn9Dthpml68VR
 Aeb+vG9TW/RFNNh0JE8A+IH9S4s0vlJtlz9bkq8NlJ8mASuiyhr9KN8CvEvS+vX4
 Qw3tJ8Fe1D9YRiGzveoIVP3MnuKPCh+RwIkospC0vIC9Lwrf4Xdkxkipu5lg+bD3
 53iNdbyAi9SZo/TNJn7Agu2XqnO4G8y3nGyAO6jPH74oEfVpZV4OhE5w01M/2Qy0
 gWDivKWP3EC1i9W7HA8LtxqE6NwhajHd6goHrshEPf2A5FL8f7IlYF+mwDv4GYUt
 /MNr3+iiSJLJUT48kk6iFQrxyk+8TfyMZx/k2MzxGKcHCERjdne+fSa3yW6d0Pv4
 Cod8qCNHy8zuKX3ZZx6Qh3xekwEAtFs9T+StYtBM0ua/PZpdCDSSIDbEG6XtrVI2
 nQaUYuWfVH+xYmQSwsnQNhzGZOPzAkM6KkuhWM0eE3b3RDlFZb8=
 =IvjR
 -----END PGP SIGNATURE-----

Merge tag 'topic/mei-hdcp-2019-02-26' of git://anongit.freedesktop.org/drm/drm-intel into char-misc-next

Daniel writes:

mei-hdcp driver

mei driver for the me hdcp client, for use by drm/i915.

Including the following prep work:
- whitelist hdcp client in mei bus
- merge to include char-misc-next
- drm/i915 side of the mei_hdcp/i915 component interface
- component prep work (including one patch touching i915&snd-hda)

* tag 'topic/mei-hdcp-2019-02-26' of git://anongit.freedesktop.org/drm/drm-intel: (23 commits)
  misc/mei/hdcp: Component framework for I915 Interface
  misc/mei/hdcp: Closing wired HDCP2.2 Tx Session
  misc/mei/hdcp: Enabling the HDCP authentication
  misc/mei/hdcp: Verify M_prime
  misc/mei/hdcp: Repeater topology verification and ack
  misc/mei/hdcp: Prepare Session Key
  misc/mei/hdcp: Verify L_prime
  misc/mei/hdcp: Initiate Locality check
  misc/mei/hdcp: Store the HDCP Pairing info
  misc/mei/hdcp: Verify H_prime
  misc/mei/hdcp: Verify Receiver Cert and prepare km
  misc/mei/hdcp: Initiate Wired HDCP2.2 Tx Session
  misc/mei/hdcp: Define ME FW interface for HDCP2.2
  misc/mei/hdcp: Client driver for HDCP application
  mei: bus: whitelist hdcp client
  drm/audio: declaration of struct device
  drm: helper functions for hdcp2 seq_num to from u32
  drm/i915: MEI interface definition
  drm/i915: header for i915 - MEI_HDCP interface
  drm/i915: enum port definition is moved into i915_drm.h
  ...
2019-02-28 12:55:40 +01:00
Sebastian Andrzej Siewior
ad01423aed kthread: Do not use TIMER_IRQSAFE
The TIMER_IRQSAFE usage was introduced in commit 22597dc3d9 ("kthread:
initial support for delayed kthread work") which modelled the delayed
kthread code after workqueue's code. The workqueue code requires the flag
TIMER_IRQSAFE for synchronisation purpose. This is not true for kthread's
delay timer since all operations occur under a lock.

Remove TIMER_IRQSAFE from the timer initialisation and use timer_setup()
for initialisation purpose which is the official function.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20190212162554.19779-2-bigeasy@linutronix.de
2019-02-28 11:18:38 +01:00
Julia Cartwright
fe99a4f4d6 kthread: Convert worker lock to raw spinlock
In order to enable the queuing of kthread work items from hardirq context
even when PREEMPT_RT_FULL is enabled, convert the worker spin_lock to a
raw_spin_lock.

This is only acceptable to do because the work performed under the lock is
well-bounded and minimal.

Reported-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Reported-by: Tim Sander <tim@krieglstein.org>
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: https://lkml.kernel.org/r/20190212162554.19779-1-bigeasy@linutronix.de
2019-02-28 11:18:38 +01:00
Kuppuswamy Sathyanarayanan
fff42928ad PCI/ATS: Add inline to pci_prg_resp_pasid_required()
Fix unused function warning when compiled with CONFIG_PCI_PASID
disabled.

Fixes: e5567f5f67 ("PCI/ATS: Add pci_prg_resp_pasid_required() interface.")
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-28 11:09:29 +01:00
David Howells
e7582e16a1 vfs: Implement logging through fs_context
Implement the ability for filesystems to log error, warning and
informational messages through the fs_context.  In the future, these will
be extractable by userspace by reading from an fd created by the fsopen()
syscall.

Error messages are prefixed with "e ", warnings with "w " and informational
messages with "i ".

In the future, inside the kernel, formatted messages will be malloc'd but
unformatted messages will not copied if they're either in the core .rodata
section or in the .rodata section of the filesystem module pinned by
fs_context::fs_type.  The messages will only be good till the fs_type is
released.

Note that the logging object will be shared between duplicated fs_context
structures.  This is so that such as NFS which do a mount within a mount
can get at least some of the errors from the inner mount.

Five logging functions are provided for this:

 (1) void logfc(struct fs_context *fc, const char *fmt, ...);

     This logs a message into the context.  If the buffer is full, the
     earliest message is discarded.

 (2) void errorf(fc, fmt, ...);

     This wraps logfc() to log an error.

 (3) void invalf(fc, fmt, ...);

     This wraps errorf() and returns -EINVAL for convenience.

 (4) void warnf(fc, fmt, ...);

     This wraps logfc() to log a warning.

 (5) void infof(fc, fmt, ...);

     This wraps logfc() to log an informational message.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:37 -05:00
David Howells
d911b4585e vfs: Remove kern_mount_data()
The kern_mount_data() isn't used any more so remove it.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:36 -05:00
David Howells
23bf1b6be9 kernfs, sysfs, cgroup, intel_rdt: Support fs_context
Make kernfs support superblock creation/mount/remount with fs_context.

This requires that sysfs, cgroup and intel_rdt, which are built on kernfs,
be made to support fs_context also.

Notes:

 (1) A kernfs_fs_context struct is created to wrap fs_context and the
     kernfs mount parameters are moved in here (or are in fs_context).

 (2) kernfs_mount{,_ns}() are made into kernfs_get_tree().  The extra
     namespace tag parameter is passed in the context if desired

 (3) kernfs_free_fs_context() is provided as a destructor for the
     kernfs_fs_context struct, but for the moment it does nothing except
     get called in the right places.

 (4) sysfs doesn't wrap kernfs_fs_context since it has no parameters to
     pass, but possibly this should be done anyway in case someone wants to
     add a parameter in future.

 (5) A cgroup_fs_context struct is created to wrap kernfs_fs_context and
     the cgroup v1 and v2 mount parameters are all moved there.

 (6) cgroup1 parameter parsing error messages are now handled by invalf(),
     which allows userspace to collect them directly.

 (7) cgroup1 parameter cleanup is now done in the context destructor rather
     than in the mount/get_tree and remount functions.

Weirdies:

 (*) cgroup_do_get_tree() calls cset_cgroup_from_root() with locks held,
     but then uses the resulting pointer after dropping the locks.  I'm
     told this is okay and needs commenting.

 (*) The cgroup refcount web.  This really needs documenting.

 (*) cgroup2 only has one root?

Add a suggestion from Thomas Gleixner in which the RDT enablement code is
placed into its own function.

[folded a leak fix from Andrey Vagin]

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Tejun Heo <tj@kernel.org>
cc: Li Zefan <lizefan@huawei.com>
cc: Johannes Weiner <hannes@cmpxchg.org>
cc: cgroups@vger.kernel.org
cc: fenghua.yu@intel.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:34 -05:00
Al Viro
0b52075ee6 introduce cloning of fs_context
new primitive: vfs_dup_fs_context().  Comes with fs_context
method (->dup()) for copying the filesystem-specific parts
of fs_context, along with LSM one (->fs_context_dup()) for
doing the same to LSM parts.

[needs better commit message, and change of Author:, anyway]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:27 -05:00
Al Viro
cb50b348c7 convenience helpers: vfs_get_super() and sget_fc()
the former is an analogue of mount_{single,nodev} for use in
->get_tree() instances, the latter - analogue of sget() for the
same.

These are fairly similar to the originals, but the callback signature
for sget_fc() is different from sget() ones, so getting bits and
pieces shared would be too convoluted; we might get around to that
later, but for now let's just remember to keep them in sync.  They
do live next to each other, and changes in either won't be hard
to spot.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:26 -05:00
David Howells
3e1aeb00e6 vfs: Implement a filesystem superblock creation/configuration context
[AV - unfuck kern_mount_data(); we want non-NULL ->mnt_ns on long-living
mounts]
[AV - reordering fs/namespace.c is badly overdue, but let's keep it
separate from that series]
[AV - drop simple_pin_fs() change]
[AV - clean vfs_kern_mount() failure exits up]

Implement a filesystem context concept to be used during superblock
creation for mount and superblock reconfiguration for remount.

The mounting procedure then becomes:

 (1) Allocate new fs_context context.

 (2) Configure the context.

 (3) Create superblock.

 (4) Query the superblock.

 (5) Create a mount for the superblock.

 (6) Destroy the context.

Rather than calling fs_type->mount(), an fs_context struct is created and
fs_type->init_fs_context() is called to set it up.  Pointers exist for the
filesystem and LSM to hang their private data off.

A set of operations has to be set by ->init_fs_context() to provide
freeing, duplication, option parsing, binary data parsing, validation,
mounting and superblock filling.

Legacy filesystems are supported by the provision of a set of legacy
fs_context operations that build up a list of mount options and then invoke
fs_type->mount() from within the fs_context ->get_tree() operation.  This
allows all filesystems to be accessed using fs_context.

It should be noted that, whilst this patch adds a lot of lines of code,
there is quite a bit of duplication with existing code that can be
eliminated should all filesystems be converted over.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:26 -05:00
David Howells
846e566218 vfs: Put security flags into the fs_context struct
Put security flags, such as SECURITY_LSM_NATIVE_LABELS, into the filesystem
context so that the filesystem can communicate them to the LSM more easily.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:25 -05:00
David Howells
da2441fdff vfs: Add LSM hooks for the new mount API
Add LSM hooks for use by the new mount API and filesystem context code.
This includes:

 (1) Hooks to handle allocation, duplication and freeing of the security
     record attached to a filesystem context.

 (2) A hook to snoop source specifications.  There may be multiple of these
     if the filesystem supports it.  They will to be local files/devices if
     fs_context::source_is_dev is true and will be something else, possibly
     remote server specifications, if false.

 (3) A hook to snoop superblock configuration options in key[=val] form.
     If the LSM decides it wants to handle it, it can suppress the option
     being passed to the filesystem.  Note that 'val' may include commas
     and binary data with the fsopen patch.

 (4) A hook to perform validation and allocation after the configuration
     has been done but before the superblock is allocated and set up.

 (5) A hook to transfer the security from the context to a newly created
     superblock.

 (6) A hook to rule on whether a path point can be used as a mountpoint.

These are intended to replace:

	security_sb_copy_data
	security_sb_kern_mount
	security_sb_mount
	security_sb_set_mnt_opts
	security_sb_clone_mnt_opts
	security_sb_parse_opts_str

[AV -- some of the methods being replaced are already gone, some of the
methods are not added for the lack of need]

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:23 -05:00
David Howells
31d921c7fb vfs: Add configuration parser helpers
Because the new API passes in key,value parameters, match_token() cannot be
used with it.  Instead, provide three new helpers to aid with parsing:

 (1) fs_parse().  This takes a parameter and a simple static description of
     all the parameters and maps the key name to an ID.  It returns 1 on a
     match, 0 on no match if unknowns should be ignored and some other
     negative error code on a parse error.

     The parameter description includes a list of key names to IDs, desired
     parameter types and a list of enumeration name -> ID mappings.

     [!] Note that for the moment I've required that the key->ID mapping
     array is expected to be sorted and unterminated.  The size of the
     array is noted in the fsconfig_parser struct.  This allows me to use
     bsearch(), but I'm not sure any performance gain is worth the hassle
     of requiring people to keep the array sorted.

     The parameter type array is sized according to the number of parameter
     IDs and is indexed directly.  The optional enum mapping array is an
     unterminated, unsorted list and the size goes into the fsconfig_parser
     struct.

     The function can do some additional things:

	(a) If it's not ambiguous and no value is given, the prefix "no" on
	    a key name is permitted to indicate that the parameter should
	    be considered negatory.

	(b) If the desired type is a single simple integer, it will perform
	    an appropriate conversion and store the result in a union in
	    the parse result.

	(c) If the desired type is an enumeration, {key ID, name} will be
	    looked up in the enumeration list and the matching value will
	    be stored in the parse result union.

	(d) Optionally generate an error if the key is unrecognised.

     This is called something like:

	enum rdt_param {
		Opt_cdp,
		Opt_cdpl2,
		Opt_mba_mpbs,
		nr__rdt_params
	};

	const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = {
		[Opt_cdp]	= { fs_param_is_bool },
		[Opt_cdpl2]	= { fs_param_is_bool },
		[Opt_mba_mpbs]	= { fs_param_is_bool },
	};

	const const char *const rdt_param_keys[nr__rdt_params] = {
		[Opt_cdp]	= "cdp",
		[Opt_cdpl2]	= "cdpl2",
		[Opt_mba_mpbs]	= "mba_mbps",
	};

	const struct fs_parameter_description rdt_parser = {
		.name		= "rdt",
		.nr_params	= nr__rdt_params,
		.keys		= rdt_param_keys,
		.specs		= rdt_param_specs,
		.no_source	= true,
	};

	int rdt_parse_param(struct fs_context *fc,
			    struct fs_parameter *param)
	{
		struct fs_parse_result parse;
		struct rdt_fs_context *ctx = rdt_fc2context(fc);
		int ret;

		ret = fs_parse(fc, &rdt_parser, param, &parse);
		if (ret < 0)
			return ret;

		switch (parse.key) {
		case Opt_cdp:
			ctx->enable_cdpl3 = true;
			return 0;
		case Opt_cdpl2:
			ctx->enable_cdpl2 = true;
			return 0;
		case Opt_mba_mpbs:
			ctx->enable_mba_mbps = true;
			return 0;
		}

		return -EINVAL;
	}

 (2) fs_lookup_param().  This takes a { dirfd, path, LOOKUP_EMPTY? } or
     string value and performs an appropriate path lookup to convert it
     into a path object, which it will then return.

     If the desired type was a blockdev, the type of the looked up inode
     will be checked to make sure it is one.

     This can be used like:

	enum foo_param {
		Opt_source,
		nr__foo_params
	};

	const struct fs_parameter_spec foo_param_specs[nr__foo_params] = {
		[Opt_source]	= { fs_param_is_blockdev },
	};

	const char *char foo_param_keys[nr__foo_params] = {
		[Opt_source]	= "source",
	};

	const struct constant_table foo_param_alt_keys[] = {
		{ "device",	Opt_source },
	};

	const struct fs_parameter_description foo_parser = {
		.name		= "foo",
		.nr_params	= nr__foo_params,
		.nr_alt_keys	= ARRAY_SIZE(foo_param_alt_keys),
		.keys		= foo_param_keys,
		.alt_keys	= foo_param_alt_keys,
		.specs		= foo_param_specs,
	};

	int foo_parse_param(struct fs_context *fc,
			    struct fs_parameter *param)
	{
		struct fs_parse_result parse;
		struct foo_fs_context *ctx = foo_fc2context(fc);
		int ret;

		ret = fs_parse(fc, &foo_parser, param, &parse);
		if (ret < 0)
			return ret;

		switch (parse.key) {
		case Opt_source:
			return fs_lookup_param(fc, &foo_parser, param,
					       &parse, &ctx->source);
		default:
			return -EINVAL;
		}
	}

 (3) lookup_constant().  This takes a table of named constants and looks up
     the given name within it.  The table is expected to be sorted such
     that bsearch() be used upon it.

     Possibly I should require the table be terminated and just use a
     for-loop to scan it instead of using bsearch() to reduce hassle.

     Tables look something like:

	static const struct constant_table bool_names[] = {
		{ "0",		false },
		{ "1",		true },
		{ "false",	false },
		{ "no",		false },
		{ "true",	true },
		{ "yes",	true },
	};

     and a lookup is done with something like:

	b = lookup_constant(bool_names, param->string, -1);

Additionally, optional validation routines for the parameter description
are provided that can be enabled at compile time.  A later patch will
invoke these when a filesystem is registered.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:28:53 -05:00
Avri Altman
bc47e2f6f9 mmc: core: Add discard support to sd
SD spec v5.1 adds discard support. The flows and commands are similar to
mmc, so just set the discard arg in CMD38.

A host which supports DISCARD shall check if the DISCARD_SUPPORT (b313)
is set in the SD_STATUS register.  If the card does not support discard,
the host shall not issue DISCARD command, but ERASE command instead.

Post the DISCARD operation, the card may de-allocate the discarded
blocks partially or completely. So the host mustn't make any assumptions
concerning the content of the discarded region. This is unlike ERASE
command, in which the region is guaranteed to contain either '0's or
'1's, depends on the content of DATA_STAT_AFTER_ERASE (b55) in the scr
register.

One more important difference compared to ERASE is the busy timeout
which we will address on the next patch.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-28 09:16:12 +01:00
Ingo Molnar
c978b9460f perf/core improvements and fixes:
perf annotate:
 
   Wei Li:
 
   - Fix getting source line failure
 
 perf script:
 
   Andi Kleen:
 
   - Handle missing fields with -F +...
 
 perf data:
 
   Jiri Olsa:
 
   - Prep work to support per-cpu files in a directory.
 
 Intel PT:
 
   Adrian Hunter:
 
   - Improve thread_stack__no_call_return()
 
   - Hide x86 retpolines in thread stacks.
 
   - exported SQL viewer refactorings, new 'top calls' report..
 
   Alexander Shishkin:
 
   - Copy parent's address filter offsets on clone
 
   - Fix address filters for vmas with non-zero offset. Applies to
     ARM's CoreSight as well.
 
 python scripts:
 
   Tony Jones:
 
   - Python3 support for several 'perf script' python scripts.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXHRYNwAKCRCyPKLppCJ+
 J8XmAQDKY7gb3GhkX+4aE8cGffFYB2YV5mD9Bbu4AM9tuFFBJwD+KAq87FMCy7m7
 h7xyWk3UILpz6y235AVdfOmgcNDkpAQ=
 =SJCG
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.1-20190225' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf annotate:

  Wei Li:

  - Fix getting source line failure

perf script:

  Andi Kleen:

  - Handle missing fields with -F +...

perf data:

  Jiri Olsa:

  - Prep work to support per-cpu files in a directory.

Intel PT:

  Adrian Hunter:

  - Improve thread_stack__no_call_return()

  - Hide x86 retpolines in thread stacks.

  - exported SQL viewer refactorings, new 'top calls' report..

  Alexander Shishkin:

  - Copy parent's address filter offsets on clone

  - Fix address filters for vmas with non-zero offset. Applies to
    ARM's CoreSight as well.

python scripts:

  Tony Jones:

  - Python3 support for several 'perf script' python scripts.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 08:29:50 +01:00
Ingo Molnar
9ed8f1a6e7 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 08:27:17 +01:00
Peter Zijlstra
28d49e2826 locking/lockdep: Shrink struct lock_class_key
Shrink struct lock_class_key; we never store anything in subkeys[], we
only use the addresses.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:53 +01:00
Bart Van Assche
669de8bda8 kernel/workqueue: Use dynamic lockdep keys for workqueues
The following commit:

  87915adc3f ("workqueue: re-add lockdep dependencies for flushing")

improved deadlock checking in the workqueue implementation. Unfortunately
that patch also introduced a few false positive lockdep complaints.

This patch suppresses these false positives by allocating the workqueue mutex
lockdep key dynamically.

An example of a false positive lockdep complaint suppressed by this patch
can be found below. The root cause of the lockdep complaint shown below
is that the direct I/O code can call alloc_workqueue() from inside a work
item created by another alloc_workqueue() call and that both workqueues
share the same lockdep key. This patch avoids that that lockdep complaint
is triggered by allocating the work queue lockdep keys dynamically.

In other words, this patch guarantees that a unique lockdep key is
associated with each work queue mutex.

  ======================================================
  WARNING: possible circular locking dependency detected
  4.19.0-dbg+ #1 Not tainted
  fio/4129 is trying to acquire lock:
  00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970

  but task is already holding lock:
  00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (&sb->s_type->i_mutex_key#14){+.+.}:
         down_write+0x3d/0x80
         __generic_file_fsync+0x77/0xf0
         ext4_sync_file+0x3c9/0x780
         vfs_fsync_range+0x66/0x100
         dio_complete+0x2f5/0x360
         dio_aio_complete_work+0x1c/0x20
         process_one_work+0x481/0x9f0
         worker_thread+0x63/0x5a0
         kthread+0x1cf/0x1f0
         ret_from_fork+0x24/0x30

  -> #1 ((work_completion)(&dio->complete_work)){+.+.}:
         process_one_work+0x447/0x9f0
         worker_thread+0x63/0x5a0
         kthread+0x1cf/0x1f0
         ret_from_fork+0x24/0x30

  -> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
         lock_acquire+0xc5/0x200
         flush_workqueue+0xf3/0x970
         drain_workqueue+0xec/0x220
         destroy_workqueue+0x23/0x350
         sb_init_dio_done_wq+0x6a/0x80
         do_blockdev_direct_IO+0x1f33/0x4be0
         __blockdev_direct_IO+0x79/0x86
         ext4_direct_IO+0x5df/0xbb0
         generic_file_direct_write+0x119/0x220
         __generic_file_write_iter+0x131/0x2d0
         ext4_file_write_iter+0x3fa/0x710
         aio_write+0x235/0x330
         io_submit_one+0x510/0xeb0
         __x64_sys_io_submit+0x122/0x340
         do_syscall_64+0x71/0x220
         entry_SYSCALL_64_after_hwframe+0x49/0xbe

  other info that might help us debug this:

  Chain exists of:
    (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&sb->s_type->i_mutex_key#14);
                                 lock((work_completion)(&dio->complete_work));
                                 lock(&sb->s_type->i_mutex_key#14);
    lock((wq_completion)"dio/%s"sb->s_id);

   *** DEADLOCK ***

  1 lock held by fio/4129:
   #0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

  stack backtrace:
  CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
  Call Trace:
   dump_stack+0x86/0xc5
   print_circular_bug.isra.32+0x20a/0x218
   __lock_acquire+0x1c68/0x1cf0
   lock_acquire+0xc5/0x200
   flush_workqueue+0xf3/0x970
   drain_workqueue+0xec/0x220
   destroy_workqueue+0x23/0x350
   sb_init_dio_done_wq+0x6a/0x80
   do_blockdev_direct_IO+0x1f33/0x4be0
   __blockdev_direct_IO+0x79/0x86
   ext4_direct_IO+0x5df/0xbb0
   generic_file_direct_write+0x119/0x220
   __generic_file_write_iter+0x131/0x2d0
   ext4_file_write_iter+0x3fa/0x710
   aio_write+0x235/0x330
   io_submit_one+0x510/0xeb0
   __x64_sys_io_submit+0x122/0x340
   do_syscall_64+0x71/0x220
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/20190214230058.196511-20-bvanassche@acm.org
[ Reworked the changelog a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:47 +01:00
Bart Van Assche
108c14858b locking/lockdep: Add support for dynamic keys
A shortcoming of the current lockdep implementation is that it requires
lock keys to be allocated statically. That forces all instances of lock
objects that occur in a given data structure to share a lock key. Since
lock dependency analysis groups lock objects per key sharing lock keys
can cause false positive lockdep reports. Make it possible to avoid
such false positive reports by allowing lock keys to be allocated
dynamically. Require that dynamically allocated lock keys are
registered before use by calling lockdep_register_key(). Complain about
attempts to register the same lock key pointer twice without calling
lockdep_unregister_key() between successive registration calls.

The purpose of the new lock_keys_hash[] data structure that keeps
track of all dynamic keys is twofold:

  - Verify whether the lockdep_register_key() and lockdep_unregister_key()
    functions are used correctly.

  - Avoid that lockdep_init_map() complains when encountering a dynamically
    allocated key.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-19-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:47 +01:00
Bart Van Assche
a0b0fd53e1 locking/lockdep: Free lock classes that are no longer in use
Instead of leaving lock classes that are no longer in use in the
lock_classes array, reuse entries from that array that are no longer in
use. Maintain a linked list of free lock classes with list head
'free_lock_class'. Only add freed lock classes to the free_lock_classes
list after a grace period to avoid that a lock_classes[] element would
be reused while an RCU reader is accessing it. Since the lockdep
selftests run in a context where sleeping is not allowed and since the
selftests require that lock resetting/zapping works with debug_locks
off, make the behavior of lockdep_free_key_range() and
lockdep_reset_lock() depend on whether or not these are called from
the context of the lockdep selftests.

Thanks to Peter for having shown how to modify get_pending_free()
such that that function does not have to sleep.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-12-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:43 +01:00
Bart Van Assche
cdc84d7949 locking/lockdep: Make it easy to detect whether or not inside a selftest
The patch that frees unused lock classes will modify the behavior of
lockdep_free_key_range() and lockdep_reset_lock() depending on whether
or not these functions are called from the context of the lockdep
selftests. Hence make it easy to detect whether or not lockdep code
is called from the context of a lockdep selftest.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-10-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:43 +01:00
Bart Van Assche
86cffb80a5 locking/lockdep: Make zap_class() remove all matching lock order entries
Make sure that all lock order entries that refer to a class are removed
from the list_entries[] array when a kernel module is unloaded.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-7-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:40 +01:00
Bart Van Assche
09329d1c20 locking/lockdep: Reorder struct lock_class members
This patch does not change any functionality but makes the patch that
frees lock classes that are no longer in use easier to read.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-6-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:40 +01:00
Peter Zijlstra
02e525b2af locking/percpu-rwsem: Remove preempt_disable variants
Effective revert commit:

  87709e28dc ("fs/locks: Use percpu_down_read_preempt_disable()")

This is causing major pain for PREEMPT_RT.

Sebastian did a lot of lockperf runs on 2 and 4 node machines with all
preemption modes (PREEMPT=n should be an obvious NOP for this patch
and thus serves as a good control) and no results showed significance
over 2-sigma (the PREEMPT=n results were almost empty at 1-sigma).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:55:37 +01:00
Ingo Molnar
0614621d89 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:50:39 +01:00