Commit graph

915365 commits

Author SHA1 Message Date
Jann Horn
604dca5e3a bpf: Fix tnum constraints for 32-bit comparisons
The BPF verifier tried to track values based on 32-bit comparisons by
(ab)using the tnum state via 581738a681 ("bpf: Provide better register
bounds after jmp32 instructions"). The idea is that after a check like
this:

    if ((u32)r0 > 3)
      exit

We can't meaningfully constrain the arithmetic-range-based tracking, but
we can update the tnum state to (value=0,mask=0xffff'ffff'0000'0003).
However, the implementation from 581738a681 didn't compute the tnum
constraint based on the fixed operand, but instead derives it from the
arithmetic-range-based tracking. This means that after the following
sequence of operations:

    if (r0 >= 0x1'0000'0001)
      exit
    if ((u32)r0 > 7)
      exit

The verifier assumed that the lower half of r0 is in the range (0, 0)
and apply the tnum constraint (value=0,mask=0xffff'ffff'0000'0000) thus
causing the overall tnum to be (value=0,mask=0x1'0000'0000), which was
incorrect. Provide a fixed implementation.

Fixes: 581738a681 ("bpf: Provide better register bounds after jmp32 instructions")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200330160324.15259-3-daniel@iogearbox.net
2020-03-30 11:53:52 -07:00
Daniel Borkmann
f2d67fec0b bpf: Undo incorrect __reg_bound_offset32 handling
Anatoly has been fuzzing with kBdysch harness and reported a hang in
one of the outcomes:

  0: (b7) r0 = 808464432
  1: (7f) r0 >>= r0
  2: (14) w0 -= 808464432
  3: (07) r0 += 808464432
  4: (b7) r1 = 808464432
  5: (de) if w1 s<= w0 goto pc+0
   R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020;0x10000001f)) R1_w=invP808464432 R10=fp0
  6: (07) r0 += -2144337872
  7: (14) w0 -= -1607454672
  8: (25) if r0 > 0x30303030 goto pc+0
   R0_w=invP(id=0,umin_value=271581184,umax_value=271581311,var_off=(0x10300000;0x7f)) R1_w=invP808464432 R10=fp0
  9: (76) if w0 s>= 0x303030 goto pc+2
  12: (95) exit

  from 8 to 9: safe

  from 5 to 6: R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020;0x10000001f)) R1_w=invP808464432 R10=fp0
  6: (07) r0 += -2144337872
  7: (14) w0 -= -1607454672
  8: (25) if r0 > 0x30303030 goto pc+0
   R0_w=invP(id=0,umin_value=271581184,umax_value=271581311,var_off=(0x10300000;0x7f)) R1_w=invP808464432 R10=fp0
  9: safe

  from 8 to 9: safe
  verification time 589 usec
  stack depth 0
  processed 17 insns (limit 1000000) [...]

The underlying program was xlated as follows:

  # bpftool p d x i 9
   0: (b7) r0 = 808464432
   1: (7f) r0 >>= r0
   2: (14) w0 -= 808464432
   3: (07) r0 += 808464432
   4: (b7) r1 = 808464432
   5: (de) if w1 s<= w0 goto pc+0
   6: (07) r0 += -2144337872
   7: (14) w0 -= -1607454672
   8: (25) if r0 > 0x30303030 goto pc+0
   9: (76) if w0 s>= 0x303030 goto pc+2
  10: (05) goto pc-1
  11: (05) goto pc-1
  12: (95) exit

The verifier rewrote original instructions it recognized as dead code with
'goto pc-1', but reality differs from verifier simulation in that we're
actually able to trigger a hang due to hitting the 'goto pc-1' instructions.

Taking different examples to make the issue more obvious: in this example
we're probing bounds on a completely unknown scalar variable in r1:

  [...]
  5: R0_w=inv1 R1_w=inv(id=0) R10=fp0
  5: (18) r2 = 0x4000000000
  7: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R10=fp0
  7: (18) r3 = 0x2000000000
  9: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R10=fp0
  9: (18) r4 = 0x400
  11: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R10=fp0
  11: (18) r5 = 0x200
  13: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  13: (2d) if r1 > r2 goto pc+4
   R0_w=inv1 R1_w=inv(id=0,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  14: R0_w=inv1 R1_w=inv(id=0,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  14: (ad) if r1 < r3 goto pc+3
   R0_w=inv1 R1_w=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0
  15: R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  15: (2e) if w1 > w4 goto pc+2
   R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  16: R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  16: (ae) if w1 < w5 goto pc+1
   R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0
  [...]

We're first probing lower/upper bounds via jmp64, later we do a similar
check via jmp32 and examine the resulting var_off there. After fall-through
in insn 14, we get the following bounded r1 with 0x7fffffffff unknown marked
bits in the variable section.

Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000:

  max: 0b100000000000000000000000000000000000000 / 0x4000000000
  var: 0b111111111111111111111111111111111111111 / 0x7fffffffff
  min: 0b010000000000000000000000000000000000000 / 0x2000000000

Now, in insn 15 and 16, we perform a similar probe with lower/upper bounds
in jmp32.

Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000 and
                    w1 <= 0x400        and w1 >= 0x200:

  max: 0b100000000000000000000000000000000000000 / 0x4000000000
  var: 0b111111100000000000000000000000000000000 / 0x7f00000000
  min: 0b010000000000000000000000000000000000000 / 0x2000000000

The lower/upper bounds haven't changed since they have high bits set in
u64 space and the jmp32 tests can only refine bounds in the low bits.

However, for the var part the expectation would have been 0x7f000007ff
or something less precise up to 0x7fffffffff. A outcome of 0x7f00000000
is not correct since it would contradict the earlier probed bounds
where we know that the result should have been in [0x200,0x400] in u32
space. Therefore, tests with such info will lead to wrong verifier
assumptions later on like falsely predicting conditional jumps to be
always taken, etc.

The issue here is that __reg_bound_offset32()'s implementation from
commit 581738a681 ("bpf: Provide better register bounds after jmp32
instructions") makes an incorrect range assumption:

  static void __reg_bound_offset32(struct bpf_reg_state *reg)
  {
        u64 mask = 0xffffFFFF;
        struct tnum range = tnum_range(reg->umin_value & mask,
                                       reg->umax_value & mask);
        struct tnum lo32 = tnum_cast(reg->var_off, 4);
        struct tnum hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32);

        reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
  }

In the above walk-through example, __reg_bound_offset32() as-is chose
a range after masking with 0xffffffff of [0x0,0x0] since umin:0x2000000000
and umax:0x4000000000 and therefore the lo32 part was clamped to 0x0 as
well. However, in the umin:0x2000000000 and umax:0x4000000000 range above
we'd end up with an actual possible interval of [0x0,0xffffffff] for u32
space instead.

In case of the original reproducer, the situation looked as follows at
insn 5 for r0:

  [...]
  5: R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x0; 0x1ffffffff)) R1_w=invP808464432 R10=fp0
                               0x30303030           0x13030302f
  5: (de) if w1 s<= w0 goto pc+0
   R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020; 0x10000001f)) R1_w=invP808464432 R10=fp0
                             0x30303030           0x13030302f
  [...]

After the fall-through, we similarly forced the var_off result into
the wrong range [0x30303030,0x3030302f] suggesting later on that fixed
bits must only be of 0x30303020 with 0x10000001f unknowns whereas such
assumption can only be made when both bounds in hi32 range match.

Originally, I was thinking to fix this by moving reg into a temp reg and
use proper coerce_reg_to_size() helper on the temp reg where we can then
based on that define the range tnum for later intersection:

  static void __reg_bound_offset32(struct bpf_reg_state *reg)
  {
        struct bpf_reg_state tmp = *reg;
        struct tnum lo32, hi32, range;

        coerce_reg_to_size(&tmp, 4);
        range = tnum_range(tmp.umin_value, tmp.umax_value);
        lo32 = tnum_cast(reg->var_off, 4);
        hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32);
        reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
  }

In the case of the concrete example, this gives us a more conservative unknown
section. Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000 and
                             w1 <= 0x400        and w1 >= 0x200:

  max: 0b100000000000000000000000000000000000000 / 0x4000000000
  var: 0b111111111111111111111111111111111111111 / 0x7fffffffff
  min: 0b010000000000000000000000000000000000000 / 0x2000000000

However, above new __reg_bound_offset32() has no effect on refining the
knowledge of the register contents. Meaning, if the bounds in hi32 range
mismatch we'll get the identity function given the range reg spans
[0x0,0xffffffff] and we cast var_off into lo32 only to later on binary
or it again with the hi32.

Likewise, if the bounds in hi32 range match, then we mask both bounds
with 0xffffffff, use the resulting umin/umax for the range to later
intersect the lo32 with it. However, _prior_ called __reg_bound_offset()
did already such intersection on the full reg and we therefore would only
repeat the same operation on the lo32 part twice.

Given this has no effect and the original commit had false assumptions,
this patch reverts the code entirely which is also more straight forward
for stable trees: apparently 581738a681 got auto-selected by Sasha's
ML system and misclassified as a fix, so it got sucked into v5.4 where
it should never have landed. A revert is low-risk also from a user PoV
since it requires a recent kernel and llc to opt-into -mcpu=v3 BPF CPU
to generate jmp32 instructions. A proper bounds refinement would need a
significantly more complex approach which is currently being worked, but
no stable material [0]. Hence revert is best option for stable. After the
revert, the original reported program gets rejected as follows:

  1: (7f) r0 >>= r0
  2: (14) w0 -= 808464432
  3: (07) r0 += 808464432
  4: (b7) r1 = 808464432
  5: (de) if w1 s<= w0 goto pc+0
   R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x0; 0x1ffffffff)) R1_w=invP808464432 R10=fp0
  6: (07) r0 += -2144337872
  7: (14) w0 -= -1607454672
  8: (25) if r0 > 0x30303030 goto pc+0
   R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x3fffffff)) R1_w=invP808464432 R10=fp0
  9: (76) if w0 s>= 0x303030 goto pc+2
   R0=invP(id=0,umax_value=3158063,var_off=(0x0; 0x3fffff)) R1=invP808464432 R10=fp0
  10: (30) r0 = *(u8 *)skb[808464432]
  BPF_LD_[ABS|IND] uses reserved fields
  processed 11 insns (limit 1000000) [...]

  [0] https://lore.kernel.org/bpf/158507130343.15666.8018068546764556975.stgit@john-Precision-5820-Tower/T/

Fixes: 581738a681 ("bpf: Provide better register bounds after jmp32 instructions")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200330160324.15259-2-daniel@iogearbox.net
2020-03-30 11:53:52 -07:00
David S. Miller
2d39eab45b Merge branch 'split-phylink-PCS-operations'
Russell King says:

====================
split phylink PCS operations

This series splits the phylink_mac_ops structure so that PCS can be
supported separately with their own PCS operations, separating them
from the MAC layer.  This may need adaption later as more users come
along.

v2: change pcs_config() and associated called function prototypes to
only pass the information that is required, and add some documention.

v3: change phylink_create() prototype
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:52:28 -07:00
Russell King
4c0d6d3a7a net: phylink: add separate pcs operations structure
Add a separate set of PCS operations, which MAC drivers can use to
couple phylink with their associated MAC PCS layer.  The PCS
operations include:

- pcs_get_state() - reads the link up/down, resolved speed, duplex
   and pause from the PCS.
- pcs_config() - configures the PCS for the specified mode, PHY
   interface type, and setting the advertisement.
- pcs_an_restart() - restarts 802.3 in-band negotiation with the
   link partner
- pcs_link_up() - informs the PCS that link has come up, and the
   parameters of the link. Link parameters are used to program the
   PCS for fixed speed and non-inband modes.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:52:27 -07:00
Russell King
e7765d634a net: phylink: rename 'ops' to 'mac_ops'
Rename the bland 'ops' member of struct phylink to be a more
descriptive 'mac_ops' - this is necessary as we're about to introduce
another set of operations.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:52:27 -07:00
Russell King
0bd274060a net: phylink: change phylink_mii_c22_pcs_set_advertisement() prototype
Change phylink_mii_c22_pcs_set_advertisement() to take only the PHY
interface and advertisement mask, rather than the full phylink state.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:52:27 -07:00
Heiner Kallweit
b8447abc4c r8169: factor out rtl8169_tx_map
Factor out mapping the tx skb to a new function rtl8169_tx_map(). This
allows to remove redundancies, and rtl8169_get_txd_opts1() has only
one user left, so it can be inlined.
As a result rtl8169_xmit_frags() is significantly simplified, and in
rtl8169_start_xmit() the code is simplified and better readable.
No functional change intended.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:50:20 -07:00
David S. Miller
033c6f3b78 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2020-03-29

Here are a few more Bluetooth patches for the 5.7 kernel:

 - Fix assumption of encryption key size when reading fails
 - Add support for DEFER_SETUP with L2CAP Enhanced Credit Based Mode
 - Fix issue with auto-connected devices
 - Fix suspend handling when entering the state fails
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:49:14 -07:00
Yuval Basson
8063f761cd qed: Fix use after free in qed_chain_free
The qed_chain data structure was modified in
commit 1a4a69751f ("qed: Chain support for external PBL") to support
receiving an external pbl (due to iWARP FW requirements).
The pages pointed to by the pbl are allocated in qed_chain_alloc
and their virtual address are stored in an virtual addresses array to
enable accessing and freeing the data. The physical addresses however
weren't stored and were accessed directly from the external-pbl
during free.

Destroy-qp flow, leads to freeing the external pbl before the chain is
freed, when the chain is freed it tries accessing the already freed
external pbl, leading to a use-after-free. Therefore we need to store
the physical addresses in additional to the virtual addresses in a
new data structure.

Fixes: 1a4a69751f ("qed: Chain support for external PBL")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:45:18 -07:00
Heiner Kallweit
4abc3c0481 r8169: improve handling of TD_MSS_MAX
If the mtu is greater than TD_MSS_MAX, then TSO is disabled, see
rtl8169_fix_features(). Because mss is less than mtu, we can't have
the case mss > TD_MSS_MAX in the TSO path.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:53 -07:00
David S. Miller
3288dffc5d Merge branch 'Port-and-flow-policers-for-DSA'
Vladimir Oltean says:

====================
Port and flow policers for DSA (SJA1105, Felix/Ocelot)

This series adds support for 2 types of policers:
 - port policers, via tc matchall filter
 - flow policers, via tc flower filter
for 2 DSA drivers:
 - sja1105
 - felix/ocelot

First we start with ocelot/felix. Prior to this patch, the ocelot core
library currently only supported:
- Port policers
- Flow-based dropping and trapping
But the felix wrapper could not actually use the port policers due to
missing linkage and support in the DSA core. So one of the patches
addresses exactly that limitation by adding the missing support to the
DSA core. The other patch for felix flow policers (via the VCAP IS2
engine) is actually in the ocelot library itself, since the linkage with
the ocelot flower classifier has already been done in an earlier patch
set.

Then with the newly added .port_policer_add and .port_policer_del, we
can also start supporting the L2 policers on sja1105.

Then, for full functionality of these L2 policers on sja1105, we also
implement a more limited set of flow-based policing keys for this
switch, namely for broadcast and VLAN PCP.

Series version 1 was submitted here:
https://patchwork.ozlabs.org/cover/1263353/

Nothing functional changed in v2, only a rebase.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:01 -07:00
Vladimir Oltean
a6af77637a net: dsa: sja1105: add broadcast and per-traffic class policers
This patch adds complete support for manipulating the L2 Policing Tables
from this switch. There are 45 table entries, one entry per each port
and traffic class, and one dedicated entry for broadcast traffic for
each ingress port.

Policing entries are shareable, and we use this functionality to support
shared block filters.

We are modeling broadcast policers as simple tc-flower matches on
dst_mac. As for the traffic class policers, the switch only deduces the
traffic class from the VLAN PCP field, so it makes sense to model this
as a tc-flower match on vlan_prio.

How to limit broadcast traffic coming from all front-panel ports to a
cumulated total of 10 Mbit/s:

tc qdisc add dev sw0p0 ingress_block 1 clsact
tc qdisc add dev sw0p1 ingress_block 1 clsact
tc qdisc add dev sw0p2 ingress_block 1 clsact
tc qdisc add dev sw0p3 ingress_block 1 clsact
tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
	action police rate 10mbit burst 64k

How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
100 Mbit/s on port 0 only:

tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
	vlan_prio 0 action police rate 100mbit burst 64k

The broadcast, VLAN PCP and port policers are compatible with one
another (can be installed at the same time on a port).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:01 -07:00
Vladimir Oltean
a7cc081cab net: dsa: sja1105: add configuration of port policers
This adds partial configuration support for the L2 Policing Table. Out
of the 45 policing entries, only 5 are used (one for each port), in a
shared manner. All 8 traffic classes, and the broadcast policer, are
redirected to a common instance which belongs to the ingress port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:00 -07:00
Vladimir Oltean
fc411eaac8 net: dsa: felix: add port policers
This patch is a trivial passthrough towards the ocelot library, which
support port policers since commit 2c1d029a01 ("net: mscc: ocelot:
Implement port policers via tc command").

Some data structure conversion between the DSA core and the Ocelot
library is necessary, for policer parameters.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:00 -07:00
Vladimir Oltean
342971766c net: dsa: add port policers
The approach taken to pass the port policer methods on to drivers is
pragmatic. It is similar to the port mirroring implementation (in that
the DSA core does all of the filter block interaction and only passes
simple operations for the driver to implement) and dissimilar to how
flow-based policers are going to be implemented (where the driver has
full control over the flow_cls_offload data structure).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:00 -07:00
Vladimir Oltean
e13c207528 net: dsa: refactor matchall mirred action to separate function
Make room for other actions for the matchall filter by keeping the
mirred argument parsing self-contained in its own function.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:00 -07:00
Xiaoliang Yang
c9a7fe1238 net: mscc: ocelot: add action of police on vcap_is2
Ocelot has 384 policers that can be allocated to ingress ports,
QoS classes per port, and VCAP IS2 entries. ocelot_police.c
supports to set policers which can be allocated to police action
of VCAP IS2. We allocate policers from maximum pol_id, and
decrease the pol_id when add a new vcap_is2 entry which is
police action.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:00 -07:00
Linus Torvalds
1592614838 for-5.7/drivers-2020-03-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl6BJDYQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgplhMD/95jd4nlVetHAo54z+Zk2ExE13+yDamRKyh
 vc7t2tz1reqFOimtVr5aVuTXCTgOx4CpiIox5qcn6qAExN4JtCChOBRGize/0u8S
 ckxnhHbN2C0rfnGldvrYYeNRonFI+7QKimnurWUSYYGN0xqbo21BxJ7dFaohMseo
 q4K8sIW0ctE6AOlw28Jerkg614s2NDGZ7q1laheXnYHn5c9f1m0NaKN/jyTGgr0X
 TLBiLbX2yRrAuvpctBj6Fna6YN7Vdd9jsf2Bt6ipUI1XgHQoVUGMxQNhWPyjsbSv
 GzRQUNAfVcasLzCP/Mj/47144OkUtDDpn2mjeXDaFljLDGFULD+jp/SsOmLCxkPC
 gI7G2yfBvF96/SOyT0JXrLyMcBd1R2vRoASbc5tPu82mZhx7YJZH5WYtOB9h2gra
 RTYo3xcm0EoN6yeMaH+xOuXxTWWInIrgKPONW4H8s7hxEiMt5oFNVBI7vqPr4LVp
 tpfxiKZDavKOofKXogNV4W7mSMP/Ir5Q9Ha4g5SXHBGp0z/PHmnQ0xDGNq0KDnU4
 eNO0UYCFNCNa+0AOhpNxaVuVm9LjrgvyXRjePgOZQ4akhohwHO6DLrHK1f8Hb1vD
 8Ih6uR+F5zZlKsouWro8HLGYm5w40Wq9tbCI8QbPYH6nkGoDmzpPv9jbAeWgJU5c
 KqP/5TBSLA==
 =Bs4E
 -----END PGP SIGNATURE-----

Merge tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:

 - floppy driver cleanup series from Willy

 - NVMe updates and fixes (Various)

 - null_blk trace improvements (Chaitanya)

 - bcache fixes (Coly)

 - md fixes (via Song)

 - loop block size change optimizations (Martijn)

 - scnprintf() use (Takashi)

* tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block: (81 commits)
  null_blk: add trace in null_blk_zoned.c
  null_blk: add tracepoint helpers for zoned mode
  block: add a zone condition debug helper
  nvme: cleanup namespace identifier reporting in nvme_init_ns_head
  nvme: rename __nvme_find_ns_head to nvme_find_ns_head
  nvme: refactor nvme_identify_ns_descs error handling
  nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl
  nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
  nvme: Fix controller creation races with teardown flow
  nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl
  nvme: Fix ctrl use-after-free during sysfs deletion
  nvme-pci: Re-order nvme_pci_free_ctrl
  nvme: Remove unused return code from nvme_delete_ctrl_sync
  nvme: Use nvme_state_terminal helper
  nvme: release ida resources
  nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO
  nvmet-tcp: optimize tcp stack TX when data digest is used
  nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow
  nvme-multipath: do not reset on unknown status
  nvmet-rdma: allocate RW ctxs according to mdts
  ...
2020-03-30 11:43:51 -07:00
David S. Miller
0d5d6045a7 Merge branch 'ionic-support-for-firmware-upgrade'
Shannon Nelson says:

====================
ionic support for firmware upgrade

The Pensando Distributed Services Card can get firmware upgrades from
the off-host centralized management suite, and can be upgraded without a
host reboot or driver reload.  This patchset sets up the support for fw
upgrade in the Linux driver.

When the upgrade begins, the DSC first brings the link down, then stops
the firmware.  The driver will notice this and quiesce itself by stopping
the queues and releasing DMA resources, then monitoring for firmware to
start back up.  When the upgrade is finished the firmware is restarted
and link is brought up, and the driver rebuilds the queues and restarts
traffic flow.

First we separate the Link state from the netdev state, then reorganize a
few things to prepare for partial tear-down of the queues.  Next we fix
up the state machine so that we take the Tx and Rx queues down and back
up when we get LINK_DOWN and LINK_UP events.  Lastly, we add handling of
the FW reset itself by tearing down the lif internals and rebuilding them
with the new FW setup.

v2: This changes the design from (ab)using the full .ndo_stop and
    .ndo_open routines to getting a better separation between the
    alloc and the init functions so that we can keep our resource
    allocations as long as possible.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
c672412f61 ionic: remove lifs on fw reset
When the FW RESET event comes to the driver from the firmware,
or the fw_status goes to 0 (stopped) or to 0xff (no PCI
connection), then shut down the driver activity.  This event
signals a FW upgrade where we need to quiesce all operations and
wait for the FW to restart.  The FW will continue the update
process once it sees all the LIFs are reset.  When the update
process is done it will set the fw_status back to RUNNING.
Meanwhile, the heartbeat check continues and when the fw_status
is seen as set to running we can restart the driver operations.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
49d3b49367 ionic: disable the queues on link down
When the link goes down, we need to disable the queues on the
NIC in addition to stopping the netdev stack.  This lets the
FW know that the driver has stopped queue activity, and then
the FW can do internal reconfiguration work, whether actually
Link related, or for other internal FW needs.  To do this,
we pull out the queue enable and disable from ionic_open()
and ionic_stop() so they can be used by other routines.

To help keep things sane, we swap the queue enables so that
the rx queue and its napi are enabled before the tx queue
which rides on the rx queues napi.

We also drop the ionic_lif_quiesce() as it doesn't do anything
more than what the queue disable has already taken care of.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
d5eddde5ec ionic: check for queues before deleting
Make sure the queue structures exist before trying
to delete them.  This addresses a couple of error
recovery issues.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
f9c00e2cf2 ionic: clean tx queue of unfinished requests
Clean out tx requests that didn't get finished before
shutting down the queue.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
0b0641009b ionic: move irq request to qcq alloc
Move the irq request and free out of the qcq_init and deinit
and into the alloc and free routines where they belong for
better resource management.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
2a8c2c1a02 ionic: move debugfs add/delete to match alloc/free
Move the qcq debugfs add to the queue alloc, and likewise move
the debugfs delete to the queue free.  The LIF debugfs add
also needs to be moved, but the del is already in the LIF free.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
987c0871e8 ionic: check for linkup in watchdog
Add a link_status_check to the heartbeat watchdog.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:50 -07:00
Shannon Nelson
aa47b540b7 ionic: decouple link message from netdev state
Rearrange the link_up/link_down messages so that we announce
link up when we first notice that the link is up when the
driver loads, and decouple the link_up/link_down messages from
the UP and DOWN netdev state.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:40:49 -07:00
Ido Schimmel
ea315c5507 mlxsw: spectrum_ptp: Fix build warnings
Cited commit extended the enums 'hwtstamp_tx_types' and
'hwtstamp_rx_filters' with values that were not accounted for in the
switch statements, resulting in the build warnings below.

Fix by adding a default case.

drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c: In function ‘mlxsw_sp_ptp_get_message_types’:
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:915:2: warning: enumeration value ‘__HWTSTAMP_TX_CNT’ not handled in switch [-Wswitch]
  915 |  switch (tx_type) {
      |  ^~~~~~
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:927:2: warning: enumeration value ‘__HWTSTAMP_FILTER_CNT’ not handled in switch [-Wswitch]
  927 |  switch (rx_filter) {
      |  ^~~~~~

Fixes: f76510b458 ("ethtool: add timestamping related string sets")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:20:33 -07:00
Linus Torvalds
10f36b1e80 for-5.7/block-2020-03-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl6BJCoQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpvziEACqQC+QRKiqR6X5yaPWJ9LqjKE7lfI1PUb7
 0a1z1mKuf8d6z0qNleUwdSOEaS5zJiswou2K8GLvEtTQH41QYsQkxc9GLjAyTveK
 szAyzZaa3BNUy9hkczm9i2arv3fI8XoTE3JvRM0e9wL8fBJDYCtKtHFJvF4hisOQ
 ydaJlU6tcwzd9bdV7K5dLwBxu3AeAJjzS3Tyfw25u9N9O/btUxJ91RTqBb2+Xeoz
 AVasfRlAqf/CzdjxCCmDgWE2QM4852pAeQ7UJJBGISNWNoiwkezMg+6HD0jEOLee
 bQ8uDyQdihIWTY+/zQasotX8/71uLV8QgtjWLXR9zrjrubIBWHGzoWSQ4kPg5DfQ
 bJmKO0VvWN2sshZEpWvzzAFGYxZViNphbK2Pb4hKOcv7jtMcC8mmEogh/7EqbD/n
 KB3IM9qVoXM8INm5o0dTy5uDRJxiHiHYkqsZaKz55BB/R4Geym5TINT3nXgxhQrn
 JoSwp4zdm3/NJOySruDi2eETqWJC2bsz3FsQSyCQTPOuP0nLtFKBb1UKHpmYTCXG
 H4LCyCKFJ6s006qBcdaNPZBw1mrSNwoxEulHnpYA4BFfPeXi72yrnMZQkdwWONpW
 LIVuD0hBm8X/pulbvEEdjzXBqZVkqK3xFX+uX5+bnwwaUKddXAC/h9SQKpBP2Mbb
 AeZToMklKw==
 =6Glq
 -----END PGP SIGNATURE-----

Merge tag 'for-5.7/block-2020-03-29' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - Online capacity resizing (Balbir)

 - Number of hardware queue change fixes (Bart)

 - null_blk fault injection addition (Bart)

 - Cleanup of queue allocation, unifying the node/no-node API
   (Christoph)

 - Cleanup of genhd, moving code to where it makes sense (Christoph)

 - Cleanup of the partition handling code (Christoph)

 - disk stat fixes/improvements (Konstantin)

 - BFQ improvements (Paolo)

 - Various fixes and improvements

* tag 'for-5.7/block-2020-03-29' of git://git.kernel.dk/linux-block: (72 commits)
  block: return NULL in blk_alloc_queue() on error
  block: move bio_map_* to blk-map.c
  Revert "blkdev: check for valid request queue before issuing flush"
  block: simplify queue allocation
  bcache: pass the make_request methods to blk_queue_make_request
  null_blk: use blk_mq_init_queue_data
  block: add a blk_mq_init_queue_data helper
  block: move the ->devnode callback to struct block_device_operations
  block: move the part_stat* helpers from genhd.h to a new header
  block: move block layer internals out of include/linux/genhd.h
  block: move guard_bio_eod to bio.c
  block: unexport get_gendisk
  block: unexport disk_map_sector_rcu
  block: unexport disk_get_part
  block: mark part_in_flight and part_in_flight_rw static
  block: mark block_depr static
  block: factor out requeue handling from dispatch code
  block/diskstats: replace time_in_queue with sum of request times
  block/diskstats: accumulate all per-cpu counters in one pass
  block/diskstats: more accurate approximation of io_ticks for slow disks
  ...
2020-03-30 11:20:13 -07:00
David S. Miller
307b4e0b37 Merge branch 'Devlink-health-auto-attributes-refactor'
Eran Ben Elisha says:

====================
Devlink health auto attributes refactor

This patchset refactors the auto-recover health reporter flag to be
explicitly set by the devlink core.
In addition, add another flag to control auto-dump attribute, also
to be explicitly set by the devlink core.

For that, patch 0001 changes the auto-recover default value of
netdevsim dummy reporter.

After reporter registration, both flags can be altered be administrator
only.

Changes since v1:
- Change default behaviour of netdevsim dummy reporter
- Move initialization of DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:17:40 -07:00
Eran Ben Elisha
48bb52c80b devlink: Add auto dump flag to health reporter
On low memory system, run time dumps can consume too much memory. Add
administrator ability to disable auto dumps per reporter as part of the
error flow handle routine.

This attribute is not relevant while executing
DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET.

By default, auto dump is activated for any reporter that has a dump method,
as part of the reporter registration to devlink.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:17:34 -07:00
Eran Ben Elisha
ba7d16c779 devlink: Implicitly set auto recover flag when registering health reporter
When health reporter is registered to devlink, devlink will implicitly set
auto recover if and only if the reporter has a recover method. No reason
to explicitly get the auto recover flag from the driver.

Remove this flag from all drivers that called
devlink_health_reporter_create.

All existing health reporters set auto recovery to true if they have a
recover method.

Yet, administrator can unset auto recover via netlink command as prior to
this patch.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:17:34 -07:00
Eran Ben Elisha
c7f0d4c898 netdevsim: Change dummy reporter auto recover default
Health reporters should be registered with auto recover set to true.
Align dummy reporter behaviour with that, as in later patch the option to
set auto recover behaviour will be removed.

In addition, align netdevsim selftest to the new default value.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:17:34 -07:00
Richard Cochran
62582a7ee7 ptp: Avoid deadlocks in the programmable pin code.
The PTP Hardware Clock (PHC) subsystem offers an API for configuring
programmable pins.  User space sets or gets the settings using ioctls,
and drivers verify dialed settings via a callback.  Drivers may also
query pin settings by calling the ptp_find_pin() method.

Although the core subsystem protects concurrent access to the pin
settings, the implementation places illogical restrictions on how
drivers may call ptp_find_pin().  When enabling an auxiliary function
via the .enable(on=1) callback, drivers may invoke the pin finding
method, but when disabling with .enable(on=0) drivers are not
permitted to do so.  With the exception of the mv88e6xxx, all of the
PHC drivers do respect this restriction, but still the locking pattern
is both confusing and unnecessary.

This patch changes the locking implementation to allow PHC drivers to
freely call ptp_find_pin() from their .enable() and .verify()
callbacks.

V2 ChangeLog:
- fixed spelling in the kernel doc
- add Vladimir's tested by tag

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reported-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:16:38 -07:00
Linus Torvalds
3a0eb192c0 for-5.7/libata-2020-03-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl6BJIgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgph2iEADcreBhOOXF8EMkC7Ck5QDNXWOhkru2LXO6
 fdc6eCx11fCN8/PqMZXv7cvKUCAhdZqEhhfqDhQ7tpKuh1WDTC9NGLGM+6z+S0GV
 P1R/PXOMOb4pnR96xlc+UYe7IgKD5GTcazIMFZ3HbbbI65LL0dCT5JvI4Cc4VVF1
 r/kXs7MRplz+fItEY3aJ/zOXCAICGcL8Wn/MAInD3Jr4NvWNGtWDpfsm7M1IoX4K
 JLGY5OY7AMmUFVBcxDhtyf6LXdERRfjAJ+95AgzF0WLcz8SczhsPXhqeOaSRtNoG
 aF4a9vvuz8jdEAyIvXgqoYlDzt989ZEv4AdWhXs4Y+UX5+PzTQ5eoGthYFqenQju
 uxQUPnrYTKtX1HMN68ih5u75mktXn97NeQ764bcFV1+5rELSxHMQxBaZ2y5Zy7sY
 EK4owLsWUhyZNsVbAqRz8sy+UJF5AwfCa0rXEVsoFIpi6r90kZWh58zlNkq82p4K
 WHnRDikD6dZ/vbWVaRWLZIF9PiJ4Jy9mdefcEHDZuqAKHeehkBFU+iGG9AtHA8bm
 3CCHWPqd6cwMKPzKL9nLvNl0PwDTfTpZBO9waW3Xh9RAyNA/dNJ+vexSgh8WQbC5
 o0UFT82orpMTMDwmqsT+dc3aIUWIXXZVSDeeXZKZ1SPAkLX55L969l2AK+Z0U9ar
 Yj6VGE4U2g==
 =vn2E
 -----END PGP SIGNATURE-----

Merge tag 'for-5.7/libata-2020-03-29' of git://git.kernel.dk/linux-block

Pull libata updates from Jens Axboe:

 - Series from Bart, making the libata code smaller on PATA only setups.
   This is useful for smaller/embedded use cases, and will help us move
   some of those off drivers/ide.

 - Kill unused BPRINTK() (Hannes)

 - Add various Comet Lake ahci PCI ids (Kai-Heng, Mika)

 - Fix for a double scsi_host_put() in error handling (John)

 - Use scnprintf (Takashi)

 - Assign OF node to the SCSI device (Linus Walleij)

* tag 'for-5.7/libata-2020-03-29' of git://git.kernel.dk/linux-block: (36 commits)
  ata: make "libata.force" kernel parameter optional
  ata: move ata_eh_analyze_ncq_error() & co. to libata-sata.c
  ata: start separating SATA specific code from libata-eh.c
  ata: move ata_sas_*() to libata-sata.c
  ata: start separating SATA specific code from libata-scsi.c
  ata: move sata_deb_timing_*() to libata-sata.c
  ata: move ata_qc_complete_multiple() to libata-sata.c
  ata: move sata_link_hardreset() to libata-sata.c
  ata: move sata_link_{debounce,resume}() to libata-sata.c
  ata: move *sata_set_spd*() to libata-sata.c
  ata: move sata_scr_*() to libata-sata.c
  ata: start separating SATA specific code from libata-core.c
  ata: let compiler optimize out ata_eh_set_lpm() on non-SATA hosts
  ata: let compiler optimize out ata_dev_config_ncq() on non-SATA hosts
  ata: add CONFIG_SATA_HOST=n version of ata_ncq_enabled()
  ata: separate PATA timings code from libata-core.c
  ata: fix CodingStyle issues in PATA timings code
  ata: remove EXPORT_SYMBOL_GPL()s not used by modules
  ata: move EXPORT_SYMBOL_GPL()s close to exported code
  ata: optimize ata_scsi_rbuf[] size
  ...
2020-03-30 11:10:08 -07:00
Jiri Pirko
054eae8253 net: devlink: use NL_SET_ERR_MSG_MOD instead of NL_SET_ERR_MSG
The rest of the devlink code sets the extack message using
NL_SET_ERR_MSG_MOD. Change the existing appearances of NL_SET_ERR_MSG
to NL_SET_ERR_MSG_MOD.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:07:39 -07:00
David S. Miller
6e2345c197 Merge branch 'net-sched-expose-HW-stats-types-per-action-used-by-drivers'
Jiri Pirko says:

====================
net: sched: expose HW stats types per action used by drivers

The first patch is just adding a helper used by the second patch too.
The second patch is exposing HW stats types that are used by drivers.

Example:

$ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop
$ tc -s filter show dev enp3s0np1 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.1.1
  in_hw in_hw_count 2
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 10 sec used 10 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        used_hw_stats immediate     <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:06:49 -07:00
Jiri Pirko
93a129eb8c net: sched: expose HW stats types per action used by drivers
It may be up to the driver (in case ANY HW stats is passed) to select
which type of HW stats he is going to use. Add an infrastructure to
expose this information to user.

$ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop
$ tc -s filter show dev enp3s0np1 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.1.1
  in_hw in_hw_count 2
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 10 sec used 10 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        used_hw_stats immediate     <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:06:49 -07:00
Jiri Pirko
8953b0770f net: introduce nla_put_bitfield32() helper and use it
Introduce a helper to pass value and selector to. The helper packs them
into struct and puts them into netlink message.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:06:49 -07:00
YueHaibing
b4d8ddf835 RDMA/bnxt_re: make bnxt_re_ib_init static
Fix sparse warning:

drivers/infiniband/hw/bnxt_re/main.c:1313:5:
 warning: symbol 'bnxt_re_ib_init' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20200330110219.24448-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-30 15:03:19 -03:00
Linus Torvalds
c03cb66464 * Fix driver auto-probing related issues
* Stop using the deprecated i2c_new_device() function
 * Replace zero-length array with flexible-array member
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKmCqpbOU668PNA69Ze02AX4ItwAFAl6AX4QACgkQZe02AX4I
 twCodw/9F+m/IPY7dFTIjQeJvFW/fbERMmtPBDeSW9x9Pu6ikOxz4kEieswXi8jR
 gWPscKoZLDd9cQe0iMC4EZV8TEOk6pQt+J2PLxniIvw4WUngjRwrMHjgSz0QCKp0
 jge03ltjlxqsZ0TY+IrgYFXC+LaGhjvZ8LT9vLLk1z4BvVasuT5JOPk2QC4kPKrn
 0oymnf/ROR9QWAYDP7GMET2buCJtGkDgZFxMBy1wDNOt4r5i8t4jq4ZtNfun7uA5
 5yHEEnh5smFV5/8/L+yAjwK64CYtQwgukpSjJnNfVSgoOWMtcK13BDQ5YNKhxW9O
 mPBAzwS/kfGU96k/v7xuaxbT50YrG7RHpi9tX2peUowV2wbaRIu1dZIVHYQPsmp0
 RMBoQCvK62jPJFJdCCiLS164C6JHM/3/wk2Z4OSDy27XEGiOSyiDpTI/KlmwdVNC
 R6am3TnCwwVB1bXsrn9kMVxThQeuKJJEQp1xCbR9A+HmGO2W24L5GpoxWIdfkuc/
 SrvQt2fOzqMzY02/PqQXrqdJlyqfHA/ARe/QEz8a6YU2kf+uGl/ebmH9vJm04yV3
 JdpAGa0+EfVmS4/WyblUHPRq/jnTAB9opBRnNi6pHJBuWr6SeZaGdVhk5TTM3yck
 eJipGlzYAWB6N7/tHYYMV6qg1ta526i0xrYpvxJ+ECJXU0uez/U=
 =79Ge
 -----END PGP SIGNATURE-----

Merge tag 'i3c/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull i3c updates from Boris Brezillon:

 - Fix driver auto-probing related issues

 - Stop using the deprecated i2c_new_device() function

 - Replace zero-length array with flexible-array member

* tag 'i3c/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: convert to use i2c_new_client_device()
  i3c: master: Replace zero-length array with flexible-array member
  i3c: Simplify i3c_device_match_id()
  i3c: Generate aliases for i3c modules
  i3c: Add a modalias sysfs attribute
  i3c: Fix MODALIAS uevents
  i3c: master: no need to iterate master device twice
2020-03-30 11:03:19 -07:00
David S. Miller
acc086bfb9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2020-03-28

1) Use kmem_cache_zalloc() instead of kmem_cache_alloc()
   in xfrm_state_alloc(). From Huang Zijiang.

2) esp_output_fill_trailer() is the same in IPv4 and IPv6,
   so share this function to avoide code duplcation.
   From Raed Salem.

3) Add offload support for esp beet mode.
   From Xin Long.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:59:20 -07:00
David S. Miller
0141317611 Merge branch 'hns3-fixes'
Huazhong Tan says:

====================
net: hns3: fixes for -net

This patchset includes some bugfixes for the HNS3 ethernet driver.

[patch 1] removes flag WQ_MEM_RECLAIM flag when allocating WE,
since it will cause a warning when the reset task flushes a IB's WQ.

[patch 2] adds a new DESC_TYPE_FRAGLIST_SKB type to handle the
linear data of the fraglist SKB, since it is different with the frag
data.

[patch 3] adds different handings for RSS configuration when load
or reset.

[patch 4] fixes a link ksetting issue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:57:53 -07:00
Guangbin Huang
a9775bb64a net: hns3: fix set and get link ksettings issue
When device is not open, the service task which update the port
information per second is not running. In this case, the port
capabilities, including speed ability, autoneg ability, media type,
may be incorrect. Then get/set link ksetting may fail.

This patch fixes it by updating the port information before getting/
setting link ksettings when device is not open, and start timer
task immediately by setting delay time to 0 when device opens.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:57:53 -07:00
Guojia Liao
944de4847a net: hns3: fix RSS config lost after VF reset.
Currently, VF's RSS configuration would be set to default
after VF reset, the the user's one will loss.

To fix it, this patch separates hclgevf_rss_init_hw() into
two parts, one sets up the default RSS configuration and
just be called when driver loading, one configures the hardware
and be called by driver loading or reset.

Fixes: d97b307213 ("net: hns3: Add RSS tuples support for VF")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:57:53 -07:00
Huazhong Tan
74ef402e13 net: hns3: fix for fraglist SKB headlen not handling correctly
When the fraglist SKB headlen is larger than zero, current code
still handle the fraglist SKB linear data as frag data, which may
cause TX error.

This patch adds a new DESC_TYPE_FRAGLIST_SKB type to handle the
mapping and unmapping of the fraglist SKB linear data buffer.

Fixes: 8ae10cfb50 ("net: hns3: support tx-scatter-gather-fraglist feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:57:53 -07:00
Yunsheng Lin
16deaef205 net: hns3: drop the WQ_MEM_RECLAIM flag when allocating WQ
The WQ in hns3 driver is allocated with WQ_MEM_RECLAIM flag
in order to guarantee forward progress, which may cause hns3'
WQ_MEM_RECLAIM WQ flushing infiniband' !WQ_MEM_RECLAIM WQ
warning:

[11246.200168] hns3 0000:bd:00.1: Reset done, hclge driver initialization finished.
[11246.209979] hns3 0000:bd:00.1 eth7: net open
[11246.227608] ------------[ cut here ]------------
[11246.237370] workqueue: WQ_MEM_RECLAIM hclge:hclge_service_task [hclge] is flushing !WQ_MEM_RECLAIM infiniband:0x0
[11246.237391] WARNING: CPU: 50 PID: 2279 at ./kernel/workqueue.c:2605 check_flush_dependency+0xcc/0x140
[11246.260412] Modules linked in: hclgevf hns_roce_hw_v2 rdma_test(O) hns3 xt_CHECKSUM iptable_mangle xt_conntrack ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter vfio_iommu_type1 vfio_pci vfio_virqfd vfio ib_isert iscsi_target_mod ib_ipoib ib_umad rpcrdma ib_iser libiscsi scsi_transport_iscsi aes_ce_blk crypto_simd cryptd aes_ce_cipher sunrpc nls_iso8859_1 crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce joydev input_leds hid_generic usbkbd usbmouse sbsa_gwdt usbhid usb_storage hid ses hclge hisi_zip hisi_hpre hisi_sec2 hnae3 hisi_qm ahci hisi_trng_v2 evbug uacce rng_core gpio_dwapb autofs4 hisi_sas_v3_hw megaraid_sas hisi_sas_main libsas scsi_transport_sas [last unloaded: hns_roce_hw_v2]
[11246.325742] CPU: 50 PID: 2279 Comm: kworker/50:0 Kdump: loaded Tainted: G           O      5.4.0-rc4+ #1
[11246.335181] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 2280-V2 CS V3.B140.01 12/18/2019
[11246.344802] Workqueue: hclge hclge_service_task [hclge]
[11246.350007] pstate: 60c00009 (nZCv daif +PAN +UAO)
[11246.354779] pc : check_flush_dependency+0xcc/0x140
[11246.359549] lr : check_flush_dependency+0xcc/0x140
[11246.364317] sp : ffff800268a73990
[11246.367618] x29: ffff800268a73990 x28: 0000000000000001
[11246.372907] x27: ffffcbe4f5868000 x26: ffffcbe4f5541000
[11246.378196] x25: 00000000000000b8 x24: ffff002fdd0ff868
[11246.383483] x23: ffff002fdd0ff800 x22: ffff2027401ba600
[11246.388770] x21: 0000000000000000 x20: ffff002fdd0ff800
[11246.394059] x19: ffff202719293b00 x18: ffffcbe4f5541948
[11246.399347] x17: 000000006f8ad8dd x16: 0000000000000002
[11246.404634] x15: ffff8002e8a734f7 x14: 6c66207369205d65
[11246.409922] x13: 676c63685b206b73 x12: 61745f6563697672
[11246.415208] x11: 65735f65676c6368 x10: 3a65676c6368204d
[11246.420494] x9 : 49414c4345525f4d x8 : 6e6162696e69666e
[11246.425782] x7 : 69204d49414c4345 x6 : ffffcbe4f5765145
[11246.431068] x5 : 0000000000000000 x4 : 0000000000000000
[11246.436355] x3 : 0000000000000030 x2 : 00000000ffffffff
[11246.441642] x1 : 3349eb1ac5310100 x0 : 0000000000000000
[11246.446928] Call trace:
[11246.449363]  check_flush_dependency+0xcc/0x140
[11246.453785]  flush_workqueue+0x110/0x410
[11246.457691]  ib_cache_cleanup_one+0x54/0x468
[11246.461943]  __ib_unregister_device+0x70/0xa8
[11246.466279]  ib_unregister_device+0x2c/0x40
[11246.470455]  hns_roce_exit+0x34/0x198 [hns_roce_hw_v2]
[11246.475571]  __hns_roce_hw_v2_uninit_instance.isra.56+0x3c/0x58 [hns_roce_hw_v2]
[11246.482934]  hns_roce_hw_v2_reset_notify+0xd8/0x210 [hns_roce_hw_v2]
[11246.489261]  hclge_notify_roce_client+0x84/0xe0 [hclge]
[11246.494464]  hclge_reset_rebuild+0x60/0x730 [hclge]
[11246.499320]  hclge_reset_service_task+0x400/0x5a0 [hclge]
[11246.504695]  hclge_service_task+0x54/0x698 [hclge]
[11246.509464]  process_one_work+0x15c/0x458
[11246.513454]  worker_thread+0x144/0x520
[11246.517186]  kthread+0xfc/0x128
[11246.520314]  ret_from_fork+0x10/0x18
[11246.523873] ---[ end trace eb980723699c2585 ]---
[11246.528710] hns3 0000:bd:00.2: Func clear success after reset.
[11246.528747] hns3 0000:bd:00.0: Func clear success after reset.
[11246.907710] hns3 0000:bd:00.1 eth7: link up

According to [1] and [2]:

There seems to be no specific guidance about how to handling the
forward progress guarantee of network device's WQ yet, and other
network device's WQ seem to be marked with WQ_MEM_RECLAIM without
a clear reason.

So this patch removes the WQ_MEM_RECLAIM flag when allocating WQ
to aviod the above warning.

1. https://www.spinics.net/lists/netdev/msg631646.html
2. https://www.spinics.net/lists/netdev/msg632097.html

Fixes: 0ea6890225 ("net: hns3: allocate WQ with WQ_MEM_RECLAIM flag")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:57:53 -07:00
Linus Torvalds
0f75139634 tpmdd updates for Linux v5.7
-----BEGIN PGP SIGNATURE-----
 
 iJYEABYIAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCXm6wmiAcamFya2tvLnNh
 a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0txuAP9WLoapeGoI+jnq
 5SowWPRdHaKEX3Ez7CTZJll03pAoOAEAjX99WhbVZZe4CK3QLYRryZQobm2myk+/
 z192BJwtnA8=
 =vd/3
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-20200316' of git://git.infradead.org/users/jjs/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:
 "tpmdd updates for Linux v5.7"

* tag 'tpmdd-next-20200316' of git://git.infradead.org/users/jjs/linux-tpmdd:
  KEYS: reaching the keys quotas correctly
  tpm: ibmvtpm: Add support for TPM2
  tpm: ibmvtpm: Wait for buffer to be set before proceeding
  tpm: of: Handle IBM,vtpm20 case when getting log parameters
  MAINTAINERS: adjust to trusted keys subsystem creation
  tpm: tpm_tis_spi_cr50: use new structure for SPI transfer delays
  tpm_tis_spi: use new 'delay' structure for SPI transfer delays
  tpm: tpm2_bios_measurements_next should increase position index
  tpm: tpm1_bios_measurements_next should increase position index
  tpm: Don't make log failures fatal
2020-03-30 10:57:32 -07:00
Christophe JAILLET
ee91a83e08 net: dsa: Simplify 'dsa_tag_protocol_to_str()'
There is no point in preparing the module name in a buffer. The format
string can be passed diectly to 'request_module()'.

This axes a few lines of code and cleans a few things:
   - max len for a driver name is MODULE_NAME_LEN wich is ~ 60 chars,
     not 128. It would be down-sized in 'request_module()'
   - we should pass the total size of the buffer to 'snprintf()', not the
     size minus 1

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:54:57 -07:00
YueHaibing
32109c7065 net: ena: Make some functions static
Fix sparse warnings:

drivers/net/ethernet/amazon/ena/ena_netdev.c:460:6: warning: symbol 'ena_xdp_exchange_program_rx_in_range' was not declared. Should it be static?
drivers/net/ethernet/amazon/ena/ena_netdev.c:481:6: warning: symbol 'ena_xdp_exchange_program' was not declared. Should it be static?
drivers/net/ethernet/amazon/ena/ena_netdev.c:1555:5: warning: symbol 'ena_xdp_handle_buff' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 10:53:40 -07:00