Commit graph

56729 commits

Author SHA1 Message Date
Matthew Minter
47a650f279 PCI: Add pci_assign_irq() function and have pci_fixup_irqs() use it
Here we delete the static pdev_fixup_irq() function which is currently what
pci_fixup_irqs() uses to actually assign the IRQs and replace it with the
pci_assign_irq() function which changes the interface and uses the new
function pointers stored in the host bridge structure.

Eventually this will allow pci_fixup_irqs() to be removed entirely and the
new deferred assignment code path will call pci_assign_irq() directly.
However to ensure current users continue to work, a new implementation of
pci_fixup_irqs() is introduced which simply wraps the functionality of
pci_assign_irq().

Signed-off-by: Matthew Minter <matt@masarand.com>
[lorenzo.pieralisi@arm.com: reworked comments/log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-02 16:14:28 -05:00
Matthew Minter
3aa8a41e0b PCI: Add IRQ mapping function pointers to pci_host_bridge struct
In order to defer IRQ assignment arches must be able to register functions
to map and swizzle interrupts.  These registered functions are stored in
the pci_host_bridge struct.

Signed-off-by: Matthew Minter <matt@masarand.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-02 16:14:28 -05:00
Lorenzo Pieralisi
9ee8a1c4a0 PCI: Remove pci_scan_root_bus_msi()
The pci_scan_root_bus_bridge() function allows passing a parameterized
struct pci_host_bridge and scanning the resulting PCI bus; since the struct
msi_controller is part of the struct pci_host_bridge and the struct
pci_host_bridge can now be passed to pci_scan_root_bus_bridge() explicitly,
there is no need for a scan interface with a MSI controller parameter.

With all PCI host controller drivers and platform code relying on
pci_scan_root_bus_msi() converted over to pci_scan_root_bus_bridge() the
pci_scan_root_bus_msi() becomes obsolete and can be removed.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-02 16:14:27 -05:00
Kees Cook
5d6dec6fba locking/refcount: Remove the half-implemented refcount_sub() API
CONFIG_REFCOUNT_FULL=y (correctly) does not provide a refcount_sub(),
which should not be part of proper refcount design patterns.

Remove the erroneous extern and the later !CONFIG_REFCOUNT_FULL
accidental implementation.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 29dee3c03a ("locking/refcounts: Out-of-line everything")
Link: http://lkml.kernel.org/r/20170701180129.GA17405@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-02 11:24:36 +02:00
Lawrence Brakmo
40304b2a15 bpf: BPF support for sock_ops
Created a new BPF program type, BPF_PROG_TYPE_SOCK_OPS, and a corresponding
struct that allows BPF programs of this type to access some of the
socket's fields (such as IP addresses, ports, etc.). It uses the
existing bpf cgroups infrastructure so the programs can be attached per
cgroup with full inheritance support. The program will be called at
appropriate times to set relevant connections parameters such as buffer
sizes, SYN and SYN-ACK RTOs, etc., based on connection information such
as IP addresses, port numbers, etc.

Alghough there are already 3 mechanisms to set parameters (sysctls,
route metrics and setsockopts), this new mechanism provides some
distinct advantages. Unlike sysctls, it can set parameters per
connection. In contrast to route metrics, it can also use port numbers
and information provided by a user level program. In addition, it could
set parameters probabilistically for evaluation purposes (i.e. do
something different on 10% of the flows and compare results with the
other 90% of the flows). Also, in cases where IPv6 addresses contain
geographic information, the rules to make changes based on the distance
(or RTT) between the hosts are much easier than route metric rules and
can be global. Finally, unlike setsockopt, it oes not require
application changes and it can be updated easily at any time.

Although the bpf cgroup framework already contains a sock related
program type (BPF_PROG_TYPE_CGROUP_SOCK), I created the new type
(BPF_PROG_TYPE_SOCK_OPS) beccause the existing type expects to be called
only once during the connections's lifetime. In contrast, the new
program type will be called multiple times from different places in the
network stack code.  For example, before sending SYN and SYN-ACKs to set
an appropriate timeout, when the connection is established to set
congestion control, etc. As a result it has "op" field to specify the
type of operation requested.

The purpose of this new program type is to simplify setting connection
parameters, such as buffer sizes, TCP's SYN RTO, etc. For example, it is
easy to use facebook's internal IPv6 addresses to determine if both hosts
of a connection are in the same datacenter. Therefore, it is easy to
write a BPF program to choose a small SYN RTO value when both hosts are
in the same datacenter.

This patch only contains the framework to support the new BPF program
type, following patches add the functionality to set various connection
parameters.

This patch defines a new BPF program type: BPF_PROG_TYPE_SOCKET_OPS
and a new bpf syscall command to load a new program of this type:
BPF_PROG_LOAD_SOCKET_OPS.

Two new corresponding structs (one for the kernel one for the user/BPF
program):

/* kernel version */
struct bpf_sock_ops_kern {
        struct sock *sk;
        __u32  op;
        union {
                __u32 reply;
                __u32 replylong[4];
        };
};

/* user version
 * Some fields are in network byte order reflecting the sock struct
 * Use the bpf_ntohl helper macro in samples/bpf/bpf_endian.h to
 * convert them to host byte order.
 */
struct bpf_sock_ops {
        __u32 op;
        union {
                __u32 reply;
                __u32 replylong[4];
        };
        __u32 family;
        __u32 remote_ip4;     /* In network byte order */
        __u32 local_ip4;      /* In network byte order */
        __u32 remote_ip6[4];  /* In network byte order */
        __u32 local_ip6[4];   /* In network byte order */
        __u32 remote_port;    /* In network byte order */
        __u32 local_port;     /* In host byte horder */
};

Currently there are two types of ops. The first type expects the BPF
program to return a value which is then used by the caller (or a
negative value to indicate the operation is not supported). The second
type expects state changes to be done by the BPF program, for example
through a setsockopt BPF helper function, and they ignore the return
value.

The reply fields of the bpf_sockt_ops struct are there in case a bpf
program needs to return a value larger than an integer.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 16:15:13 -07:00
David S. Miller
283131d20e NFC 4.13 pull request
This is the NFC pull requesy for 4.13. We have:
 
 - A conversion to unified device and GPIO APIs for the
   fdp, pn544, and st{21,-nci} drivers.
 - A fix for NFC device IDs allocation.
 - A fix for the nfcmrvl driver firmware download mechanism.
 - A trf7970a DT and GPIO cleanup and clock setting fix.
 - A few fixes for potential overflows in the digital and LLCP code.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZVYXGAAoJEIqAPN1PVmxKuA0P/RhaNOWIFLSpVSPOO2/bhLj0
 oxlTW9y+Asy9MPfu5t4vllM/GqtpCYYBpfl35SDGViHlfSIPgrSDqQA2fNRT+hk1
 AwQR8HcCcptfzJ5vkXhJl/9XOUmAalHk1e7KqvNioXP1dAZbCw4/2YK8skzBnoeR
 sRYYSmsJNF1iZR6FSbtlPHWTZqD/SLRSvebDzWiYnltXMqBZeiwXmuEfRPrL/uxO
 C2iAG9Ol9bkxpxpm4jgVTFrW6equymYX0iW4rYKmmd4Ej+29iS8NWAsgK/hCUtNO
 jGrc9+1O/7MM2PoxGt70vojYMQWZpUmPWF2dvRx32+XWTcj96Nk3qOHEM62cQHMs
 OhvgUh9BurXzjtlp+ZipwJhpylczxWZ3TEFxyDTV91cuk+hdQ8xtOWseGgx7XKBy
 3kaYwlYKtiwACYBZ3b/4nzr+vNQ3GzL8VZoeGWzEyimofeWtLUWGuxtxFL4OaSeG
 X3NOZM+g4ILLKH9IKG4DkqWjAn5RD70Jf01z6GfHsfjyz5Awax9Hpp+B8dih/pmb
 4C4g5hX4X7L840Zc+l4WXMRLpH6AgL64orJmLBDxGeYAiPRlWSBHcMTrWSJuN/I4
 pJ++rMpbL2dMXaX8LBgEchtPmzQzVCsKxTXB29NyIbdZE6ebNAnZ9Gur/PwiBQqD
 wnb8HwFOyXgjKBFiA57P
 =Ve8v
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz says:

====================
NFC 4.13 pull request

This is the NFC pull requesy for 4.13. We have:

- A conversion to unified device and GPIO APIs for the
  fdp, pn544, and st{21,-nci} drivers.
- A fix for NFC device IDs allocation.
- A fix for the nfcmrvl driver firmware download mechanism.
- A trf7970a DT and GPIO cleanup and clock setting fix.
- A few fixes for potential overflows in the digital and LLCP code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 14:30:39 -07:00
David S. Miller
ea23b42739 mlx5-fixes-2017-06-28
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZU3mtAAoJEEg/ir3gV/o+4k0IAKj5XCn3cviZlXMJRMHBvamt
 yWrMI90XgjoPhGPx3K9mf+bMhHOGiZR0Q2DFDJZa5U64DDBVPNvag7fy74GYgj1D
 Cet1zohkQ2xdb/R3jfML8tG2IVfvETWo3cgJGFtGUBlOULvpwinSK4A+8oUUGszc
 K1vAY0j3+Ncfjk+CZJ8hWqaIk1dyYtjtyn0ACOUOftqBa6+UZY7LbLTTOI7hOZoX
 3M35W7ntgGoBScONlxpDUXNUewia4ADTiQPWwHdT9+xNlwz1fzmCHlYi5pY+z9TC
 PKbbe1O4l1nsMftwqJVQNHrFnq+x/X69J5vlgobWkk0dQCRQWE9qanG8BfXPykY=
 =DUG8
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2017-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2017-06-28

This series contains some fixes for the mlx5 core and netdev driver.

Please pull and let me know if there's any problem.

For -stable:
("net/mlx5e: Fix TX carrier errors report in get stats ndo") Kernels >= v4.7

("net/mlx5: Cancel delayed recovery work when unloading the driver") Kernels >= v4.10
* When applied to net-next this will introduce a contextual conflict, it
should be easy to resolve, (a spin_lock was changed to spin_lock_irqsave in net-next),
if you need any help with this please let me know.

("net/mlx5: Fix driver load error flow when firmware is stuck") Kernels >= v4.4*
* This patch fixes: 6c780a0267 ("net/mlx5: Wait for FW readiness before initializing command interface")
which was submitted two weeks ago and queued up for v4.4.

Sorry about the mess, but other than the above, this series doesn't introduce
any conflict with the current mlx5 IPSec offload series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 14:11:48 -07:00
Xin Long
01a992bea5 sctp: remove the typedef sctp_init_chunk_t
This patch is to remove the typedef sctp_init_chunk_t, and replace
with struct sctp_init_chunk in the places where it's using this
typedef.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:42 -07:00
Xin Long
4ae70c0845 sctp: remove the typedef sctp_inithdr_t
This patch is to remove the typedef sctp_inithdr_t, and replace
with struct sctp_inithdr in the places where it's using this
typedef.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:42 -07:00
Xin Long
9f8d314715 sctp: remove the typedef sctp_data_chunk_t
This patch is to remove the typedef sctp_data_chunk_t, and replace
with struct sctp_data_chunk in the places where it's using this
typedef.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:42 -07:00
Xin Long
3583df1a3d sctp: remove the typedef sctp_datahdr_t
This patch is to remove the typedef sctp_datahdr_t, and replace with
struct sctp_datahdr in the places where it's using this typedef.

It is also to use izeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
0664ed4378 sctp: remove the typedef sctp_param_action_t
Remove this typedef, there is even no places using it.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
34b4e29b38 sctp: remove the typedef sctp_param_t
This patch is to remove the typedef sctp_param_t, and replace with
struct sctp_paramhdr in the places where it's using this typedef.

It is also to remove the useless declaration sctp_addip_addr_config
and fix the lack of params for some other functions' declaration.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
3c91870492 sctp: remove the typedef sctp_paramhdr_t
This patch is to remove the typedef sctp_paramhdr_t, and replace
with struct sctp_paramhdr in the places where it's using this
typedef.

It is also to fix some indents and  use sizeof(variable) instead
of sizeof(type).

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
ec431c2cd5 sctp: remove the typedef sctp_cid_action_t
Remove this typedef, there is even no places using it.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
6d85e68f4c sctp: remove the typedef sctp_cid_t
This patch is to remove the typedef sctp_cid_t, and replace
with struct sctp_cid in the places where it's using this
typedef.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
922dbc5be2 sctp: remove the typedef sctp_chunkhdr_t
This patch is to remove the typedef sctp_chunkhdr_t, and replace
with struct sctp_chunkhdr in the places where it's using this
typedef.

It is also to fix some indents and use sizeof(variable) instead
of sizeof(type)., especially in sctp_new.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Xin Long
ae146d9b76 sctp: remove the typedef sctp_sctphdr_t
This patch is to remove the typedef sctp_sctphdr_t, and replace
with struct sctphdr in the places where it's using this typedef.

It is also to fix some indents and use sizeof(variable) instead
of sizeof(type).

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 09:08:41 -07:00
Jerry Hoemann
7db5bb33ad libnvdimm, acpi, nfit: Add bus level dsm mask for pass thru.
Add a bus level dsm_mask to nvdimm_bus_descriptor to allow the passthru
calling mechanism to specify a different mask from the cmd_mask.

Populate bus_dsm_mask and use it to filter dsm calls that user can
make through the pass thru interface.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
[djbw: use command number constants instead of a magic mask value]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-07-01 08:49:59 -07:00
Reshetova, Elena
433cea4d9b net: convert netpoll_info.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:08 -07:00
Reshetova, Elena
7658b36f1b net: convert in_device.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:08 -07:00
Reshetova, Elena
8851ab5267 net: convert ip_mc_list.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:08 -07:00
Reshetova, Elena
14afee4b60 net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:08 -07:00
Reshetova, Elena
2638595afc net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:08 -07:00
Reshetova, Elena
633547973f net: convert sk_buff.users from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:07 -07:00
Reshetova, Elena
53869cebce net: convert nf_bridge_info.use from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:07 -07:00
Jakub Kicinski
dbd1877754 hashtable: remove repeated phrase from a comment
"in a rcu enabled hashtable" is repeated twice in a comment.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-30 13:49:53 -07:00
David S. Miller
b079115937 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
A set of overlapping changes in macvlan and the rocker
driver, nothing serious.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30 12:43:08 -04:00
David S. Miller
52a623bd61 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree. This batch contains connection tracking updates for the cleanup
iteration path, patches from Florian Westphal:

X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set
   dying bit to let the CPU release them.

X) Add nf_ct_iterate_destroy() to be used on module removal, to kill
   conntrack from all namespace.

X) Restart iteration on hashtable resizing, since both may occur at
   the same time.

X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT
   mapping on module removal.

X) Use nf_ct_iterate_destroy() to remove conntrack entries helper
   module removal, from Liping Zhang.

X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension
   if user requests this, also from Liping.

X) Add net_ns_barrier() and use it from FTP helper, so make sure
   no concurrent namespace removal happens at the same time while
   the helper module is being removed.

X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce
   module size. Same thing in nf_tables.

Updates for the nf_tables infrastructure:

X) Prepare usage of the extended ACK reporting infrastructure for
   nf_tables.

X) Remove unnecessary forward declaration in nf_tables hash set.

X) Skip set size estimation if number of element is not specified.

X) Changes to accomodate a (faster) unresizable hash set implementation,
   for anonymous sets and dynamic size fixed sets with no timeouts.

X) Faster lookup function for unresizable hash table for 2 and 4
   bytes key.

And, finally, a bunch of asorted small updates and cleanups:

X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe
   to device events and look up for index from the packet path, this
   is fixing an issue that is present since the very beginning, patch
   from Xin Long.

X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal.

X) Use ebt_invalid_target() whenever possible in the ebtables tree,
   from Gao Feng.

X) Calm down compilation warning in nf_dup infrastructure, patch from
   stephen hemminger.

X) Statify functions in nftables rt expression, also from stephen.

X) Update Makefile to use canonical method to specify nf_tables-objs.
   From Jike Song.

X) Use nf_conntrack_helpers_register() in amanda and H323.

X) Space cleanup for ctnetlink, from linzhang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30 06:27:09 -07:00
Paolo Bonzini
04a7ea04d5 KVM/ARM updates for 4.13
- vcpu request overhaul
 - allow timer and PMU to have their interrupt number
   selected from userspace
 - workaround for Cavium erratum 30115
 - handling of memory poisonning
 - the usual crop of fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAllWCM0VHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDjJ0QAI16x6+trKhH31lTSYekYfqm4hZ2
 Fp7IbALW9KNCaY35tZov2Zuh99qGRduxTh7ewqhKpON8kkU+UKj0F7zH22+vfN4m
 yas/+uNr8R9VLyvea4ysPsgx8Q8v1Ix9setohHYNZIL9/klVqtaHpYvArHVF/mzq
 p2j/NxRS2dlp9r2TtoMRMhA05u6r0wolhUuh+z9v2ipib0gfOBIG24jsqCTEcD9n
 5A/cVd+ztYshkrV95h3y9peahwt3zOA4QBGzrQ2K25jp0s54nqhmC7JTNSa8dtar
 YGW2MuAMoIFTwCFAlpwCzrwpOJFzF3Q6A8bOxei2fjclzjPMgT1xQxuhOoe4ntFa
 lTPxSHalm5W6dFTW90YSo2DBcPe+N7sQkhjR0cCeY3GYsOFhXMLTlOl5Pt1YK1or
 +3FAI74tFRKvVmb9mhZeGTvuzhDgRvtf3Qq5rjwlGzKc2BBOEgtMyj/Wgwo4N6Dz
 IjOnoRaUGELoBCWoTorMxLpsPBdPVSUxNyJTdAhqZ/ZtT1xqjhFNLZcrVWmOTzDM
 1cav+jZkla4sLmJSNDD54aCSvvtPHis0nZn9PRlh12xgOyYiAVx4K++MNuWP0P37
 hbh1gbPT+FcoVxPurUsX/pjNlTucPZcBwFytZDQlpwtPBpEFzJiImLYe/PldRb0f
 9WQOH1Y1+q14MF+N
 =6hNK
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM updates for 4.13

- vcpu request overhaul
- allow timer and PMU to have their interrupt number
  selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups

Conflicts:
	arch/s390/include/asm/kvm_host.h
2017-06-30 12:38:26 +02:00
Deepa Dinamani
c0edd7c9ac nanosleep: Use get_timespec64() and put_timespec64()
Usage of these apis and their compat versions makes
the syscalls: clock_nanosleep and nanosleep and
their compat implementations simpler.

This is a preparatory patch to isolate data conversions to
struct timespec64 at userspace boundaries. This helps contain
the changes needed to transition to new y2038 safe types.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-30 04:14:14 -04:00
Peter Oh
3cb57df37b ieee80211: update public action codes
Update Public Action field values as updated in IEEE Std 802.11-2016,
so that modules/drivers can refer it.

Signed-off-by: Peter Oh <peter.oh@bowerswilkins.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-06-30 09:47:24 +03:00
Al Viro
aa28de275a iov_iter/hardening: move object size checks to inlined part
There we actually have useful information about object sizes.
Note: this patch has them done for all iov_iter flavours.
Right now we do them twice in iovec case, but that'll change
very shortly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 22:21:22 -04:00
Al Viro
b0377fedb6 copy_{to,from}_user(): consolidate object size checks
... and move them into thread_info.h, next to check_object_size()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 22:21:21 -04:00
Al Viro
9c5f6908de copy_{from,to}_user(): move kasan checks and might_fault() out-of-line
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 22:21:20 -04:00
Christoph Hellwig
abbb65899a fs: implement vfs_iter_write using do_iter_write
De-dupliate some code and allow for passing the flags argument to
vfs_iter_write.  Additionally it now properly updates timestamps.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 17:49:23 -04:00
Christoph Hellwig
18e9710ee5 fs: implement vfs_iter_read using do_iter_read
De-dupliate some code and allow for passing the flags argument to
vfs_iter_read.  Additional it properly updates atime now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 17:49:23 -04:00
Vishal Verma
14e4945426 libnvdimm, btt: BTT updates for UEFI 2.7 format
The UEFI 2.7 specification defines an updated BTT metadata format,
bumping the revision to 2.0. Add support for the new format, while
retaining compatibility for the old 1.1 format.

Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29 13:50:38 -07:00
Martin KaFai Lau
14dc6f04f4 bpf: Add syscall lookup support for fd array and htab
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS.

The lookup returns a prog-id or map-id to the userspace.
The userspace can then use the BPF_PROG_GET_FD_BY_ID
or BPF_MAP_GET_FD_BY_ID to get a fd.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 13:13:25 -04:00
Dan Williams
0b277961f4 libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile region
The pmem driver attaches to both persistent and volatile memory ranges
advertised by the ACPI NFIT. When the region is volatile it is redundant
to spend cycles flushing caches at fsync(). Check if the hosting region
is volatile and do not set dax_write_cache() if it is.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29 09:29:50 -07:00
Dan Williams
6e0c90d691 libnvdimm, pmem, dax: export a cache control attribute
The dax_flush() operation can be turned into a nop on platforms where
firmware arranges for cpu caches to be flushed on a power-fail event.
The ACPI 6.2 specification defines a mechanism for the platform to
indicate this capability so the kernel can select the proper default.
However, for other platforms, the administrator must toggle this setting
manually.

Given this flush setting is a dax-specific mechanism we advertise it
through a 'dax' attribute group hanging off a host device. For example,
a 'pmem0' block-device gets a 'dax' sysfs-subdirectory with a
'write_cache' attribute to control response to dax cache flush requests.
This is similar to the 'queue/write_cache' attribute that appears under
block devices.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29 09:29:50 -07:00
Arnd Bergmann
ffe3744a59 Actions Semi SoC drivers for 4.13
This adds clock source and power domain drivers for S500/S900.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZTUdyAAoJEPou0S0+fgE/cxIQALU/QhWWSaHwbGnavU9gKbar
 lOwfoSmdwlrCR7hUCxH14o11fF03fXxWdEGz825DJjZghEIVn7VEW521DzXzA1dy
 bF6s95seZXYKUwtcPERgxbRP89itr/cTZiIGnEeKFfAIU1lmd5KNYHme05+e2bcQ
 G6St+8NbtqKzMaXSaDIHiHpA2ewiMmnsg9exOCcFFYCe2qUzKVOYFtbqeb+cldML
 yUIfgRQnGHiVqhbzx/aSwwkmZic+j7VCwScaZEL6THJ0pem+ayuTCBdvk9LYk/aM
 cksZf1Jc5RwFi6ErybpWkVBE0U2voSyNHoK10dmUy0kyS2hj3ttXjtu1yFHV/K/5
 /Ebe+aUP37sMoDQn/ZcAEY1+LV4RieL/8ePeUgH1dZ/+57T4Zapq2jE2iUazotF5
 VleN1Gn6UDUnPDTXUzwTSPaE2Kw85TMKVeubOy/mbJ71E1ggB02xzHkI8nhw9xMX
 SzRXykR/lIrWyDufmRDiyczMhNcClgiVrxcwU0VrkgJKXrWa5Nj2xyo06GsPjftd
 27P11Hi7aWFwZADw57k7h61yxVzJT9FpSp2ekYipDkgw6VAWiPvkbe90nolGI0zb
 3gzWWxOZnqxk1gdfY3eTdLxAhFXituC/iSxAFSmWeLtL6OcFVQqZaJgi5MDL7vPg
 afrLMF5xasvqA2+zudak
 =FpP/
 -----END PGP SIGNATURE-----

Merge tag 'actions-drivers-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/drivers

Pull "Actions Semi SoC drivers for 4.13" from Andreas Färber:

This adds clock source and power domain drivers for S500/S900.

* tag 'actions-drivers-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
  soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating
  soc: actions: Add Owl SPS
  dt-bindings: power: Add Owl SPS power domains
  clocksource: owl: Add S900 support
  clocksource: Add Owl timer
2017-06-29 17:34:57 +02:00
Arnd Bergmann
c070d6ba25 Actions Semi ARM SoC for v4.13 #2
This adds SMP code to bring up the remaining S500 CPU cores
 by reusing a helper factored out of the SPS power domains driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZTUjlAAoJEPou0S0+fgE/yxgP/1J/t1eSTX9l+NkP6bxmsKNk
 k4KKCwhch57Mw14iPTysOaezAn/2YNVLEYKSa0JNYOmrUA4Y2gyra0LhAejh7Jd8
 Y3FGdYbbgRHTY0DLAK+vMVnRjHrUu7O+Xgngf6bNRdwL4+qBGTxHMvdoFaWRdZZv
 6bjKAeDAkCKQ62xXTdT9EaKFSPKzLgYyT+7ey3JE5J6MlKVvzjxyG4HtFm9DYpH5
 LvuQ1oZdtVs2Ils8lOF8Z8VRPaKCP4XhnYIqQ4fP9tvaF6lL4R0xYXtMWG4aRpXT
 +g50M0+iE61BUyvCbBsT8eEUnasmmYxtI9eeAj9/HZ3vOem3zuZ4HxbUv+Ss39fS
 39IwQvallLpykKRpykbCXfIyLxIO1XKQk/UQwwSB0nD3QETPmbJB8pbel7LH3VFW
 /hdF7aDq8HlL5hvXp9PLC9/avNCkuZWrjhyj+qUZePpRaF6xi0VPDybpJLfoGfw2
 OPf6JNq317Bw+OII7TNzYMPclCb4UU0+n1wLQyMaDRfc0Riplec8hUQlMTg3aTsZ
 Oj+5g4+m19yahoQEWf6DyXRLk7JQEpbuKhmD2HzC5wJbsfU1COcx9WN2BZeRWGTS
 AD++5WYXwaNdskbGt5UpFvqipWmWq6UVAMolid29f5jN1jUyEJ6ychq307kakhWU
 nYJwn+KDIaICR9KmFy19
 =VNDJ
 -----END PGP SIGNATURE-----

Merge tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/soc

Pull "Actions Semi ARM SoC for v4.13 #2" from Andreas Färber:

This adds SMP code to bring up the remaining S500 CPU cores
by reusing a helper factored out of the SPS power domains driver.

* tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
  ARM: owl: smp: Implement SPS power-gating for CPU2 and CPU3
  soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating
  soc: actions: Add Owl SPS
  dt-bindings: power: Add Owl SPS power domains
2017-06-29 17:33:41 +02:00
Jacopo Mondi
425562429d pinctrl: generic: Add output-enable property
Add output-enable generic pin configuration property.
This properties allows enabling/disabling pin's output capabilities
without actually driving any value on the line.

Acked-by: Rob Herring <robh@kernel.org>
[Added inline elaborations on buffer enabling/disabling]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-06-29 14:30:49 +02:00
Linus Walleij
6183061967 Linux 4.12-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZUGOmAAoJEHm+PkMAQRiGhX8H/3fIhingPD01MBf98U0xGrJo
 yIXmhu6nFs7TM0lDVDcHsKgqLQIT69ll7PrSZrMkc1RGUIPINoCuJVuJqDre0kfB
 of5TX2KegqSx8h1vOWjGBCBjdYfPGyMdf9icf6KsGc/SlIdhN6WA99kglAjJA0Ve
 qPTNagF0ntUNg1lsXffxyfcHqFpyqw/Z/C4ie/byFsn9iJ1VG9mNlTWSud09vhuM
 3tvHzTUVAIWWuRrrgrvgqQpnwL+q5BfSDsXScMjBau0EK3RGGqG8EN6Kbkfa7VQ6
 aBoeboQjUijSJnVwvySdQ11MChTIOwZdfrNPra/1HD3WJNsSu4BIRt5JcAKcOhc=
 =qmSg
 -----END PGP SIGNATURE-----

Merge tag 'v4.12-rc7' into devel

Linux 4.12-rc7
2017-06-29 14:27:39 +02:00
Arnd Bergmann
0607512d0a ras: mark stub functions as 'inline'
With CONFIG_RAS disabled, we get two harmless warnings about
unused functions:

include/linux/ras.h:37:13: error: 'log_arm_hw_error' defined but not used [-Werror=unused-function]
 static void log_arm_hw_error(struct cper_sec_proc_arm *err) { return; }
include/linux/ras.h:33:13: error: 'log_non_standard_event' defined but not used [-Werror=unused-function]
 static void log_non_standard_event(const guid_t *sec_type,

Clearly these are meant to be 'inline', like the other stubs
in the same header.

Fixes: 297b64c743 ("ras: acpi / apei: generate trace event for unrecognized CPER section")
Fixes: e9279e83ad ("trace, ras: add ARM processor error trace event")
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-29 10:48:57 +01:00
Jens Axboe
9ae3b3f52c block: provide bio_uninit() free freeing integrity/task associations
Wen reports significant memory leaks with DIF and O_DIRECT:

"With nvme devive + T10 enabled, On a system it has 256GB and started
logging /proc/meminfo & /proc/slabinfo for every minute and in an hour
it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute
leaking.

/proc/meminfo | grep SUnreclaim...

SUnreclaim:      6752128 kB
SUnreclaim:      6874880 kB
SUnreclaim:      7238080 kB
....
SUnreclaim:     22307264 kB
SUnreclaim:     22485888 kB
SUnreclaim:     22720256 kB

When testcases with T10 enabled call into __blkdev_direct_IO_simple,
code doesn't free memory allocated by bio_integrity_alloc. The patch
fixes the issue. HTX has been run with +60 hours without failure."

Since __blkdev_direct_IO_simple() allocates the bio on the stack, it
doesn't go through the regular bio free. This means that any ancillary
data allocated with the bio through the stack is not freed. Hence, we
can leak the integrity data associated with the bio, if the device is
using DIF/DIX.

Fix this by providing a bio_uninit() and export it, so that we can use
it to free this data. Note that this is a minimal fix for this issue.
Any current user of bio's that are allocated outside of
bio_alloc_bioset() suffers from this issue, most notably some drivers.
We will fix those in a more comprehensive patch for 4.13. This also
means that the commit marked as being fixed by this isn't the real
culprit, it's just the most obvious one out there.

Fixes: 542ff7bf18 ("block: new direct I/O implementation")
Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28 15:30:13 -06:00
Christoph Hellwig
4b855ad371 blk-mq: Create hctx for each present CPU
Currently we only create hctx for online CPUs, which can lead to a lot
of churn due to frequent soft offline / online operations.  Instead
allocate one for each present CPU to avoid this and dramatically simplify
the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-block@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-28 23:00:07 +02:00
Lorenzo Pieralisi
cea9bc0be6 PCI: Make pci_register_host_bridge() PCI core internal
With the introduction of pci_scan_root_bus_bridge() there is no need to
export pci_register_host_bridge() to other kernel subsystems other than the
PCI compilation unit that needs it.

Make pci_register_host_bridge() static to its compilation unit and convert
the existing drivers usage over to pci_scan_root_bus_bridge().

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28 15:13:55 -05:00
Lorenzo Pieralisi
1228c4b6c1 PCI: Add pci_scan_root_bus_bridge() interface
The current pci_scan_root_bus() interface is made up of two main code
paths:

  - pci_create_root_bus()
  - pci_scan_child_bus()

pci_create_root_bus() is a wrapper function that allows to create a struct
pci_host_bridge structure, initialize it with the passed parameters and
register it with the kernel.

As the struct pci_host_bridge require additional struct members,
pci_create_root_bus() parameters list has grown in time, making it unwieldy
to add further parameters to it in case the struct pci_host_bridge gains
more members fields to augment its functionality.

Since PCI core code provides functions to allocate struct pci_host_bridge,
instead of forcing the pci_create_root_bus() interface to add new
parameters to cater for new struct pci_host_bridge functionality, it is
more suitable to add an interface in PCI core code to scan a PCI bus
straight from a struct pci_host_bridge created and customized by each
specific PCI host controller driver.

Add a pci_scan_root_bus_bridge() function to allow PCI host controller
drivers to create and initialize struct pci_host_bridge and scan the
resulting bus.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28 15:13:55 -05:00