The eq.[c|h] files are under major rewrite. so use this opportunity and
update their copyright and license texts.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The users of EQ are running their code on different CPUs and with
various affinity patterns. Move the cpumask setting close to their
actual usage.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Introduce new API that will allow IRQs users to hold a pointer to
mlx5_irq.
In the end of this series, IRQs will be allocated on demand. Hence,
this will allow us to properly manage and use IRQs.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Shared IRQ are consumed by multiple EQ users and in order to properly
initialize and later release such IRQs, we add kref counting of IRQ
structure.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Lag is used to combine two PCI functions of the same HCA into a single
logical unit. This is a core functionality and as such should be managed by
the core driver. Currently this isn't the case. While we store the lag
software structure inside the lower device, its lifetime (creation /
destruction) is dictated by the mlx5e part. Change the ownership model so
lag is tied to the lifetime of the lower level driver instead to the
mlx5e part.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
If MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV is set it means the device is going
down and mlx5_rescan_drivers_locked() shouldn't be called.
With this patch and the previous one in the series, unbinding a PCI
function when its netdev is part of a bond works and leaves the system in a
working state.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When a net device is removed (can happen if the PCI function is unbound
from the system) it's not enough to destroy the hardware lag. The system
should recreate the original devices that were present before the lag.
As the same flow is done when a net device is removed from the bond
refactor and reuse the code.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Documentation/admin-guide/cputopology.rst is the wrong place to describe
sysfs ABI. So move the cputopology ABI things to
Documentation/ABI/stable/sysfs-devices-system-cpu and add a reference to
ABI doc in Documentation/admin-guide/cputopology.rst.
Link: https://lkml.kernel.org/r/20210319041618.14316-1-song.bao.hua@hisilicon.com
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Link: https://lore.kernel.org/r/20210611052249.25776-1-song.bao.hua@hisilicon.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Fault injection uses debugfs in a way that the provided values via sysfs
are interpreted as u64. Providing negative numbers results in an error:
/sys/kernel/debug/fail_function# echo -1 > times
sh: write error: Invalid argument
Update the docs and examples to use "printf %#x <val>" in these cases.
For "retval", reword the paragraph a little and fix a typo.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20210603125841.27436-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Since LLVM commit fc018eb, the '-warn-stack-size' flag has been dropped
[1], leading to the following error message when building with Clang-13
and LLD-13:
ld.lld: error: -plugin-opt=-: ld.lld: Unknown command line argument
'-warn-stack-size=2048'. Try: 'ld.lld --help'
ld.lld: Did you mean '--asan-stack=2048'?
In the same way as with commit 2398ce8015 ("x86, lto: Pass
-stack-alignment only on LLD < 13.0.0") , make '-warn-stack-size'
conditional on LLD < 13.0.0.
[1] https://reviews.llvm.org/D103928
Fixes: 24845dcb17 ("Makefile: LTO: have linker check -Wframe-larger-than")
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/7631bab7-a8ab-f884-ab54-f4198976125c@mailbox.org
Add and document more checkpatch message types. About 50% of all
message types are documented now.
In addition to this:
- Create a new subsection 'Indentation and Line Breaks'.
- Rename subsection 'Comment style' to simply 'Comments'.
- Refactor some of the existing types to appropriate subsections.
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Link: https://lore.kernel.org/r/20210614141132.6881-1-dwaipayanray1@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Update the function prototype of mhi_ndo_xmit to match
ndo_start_xmit. This otherwise leads to run time failures when
CFI is enabled in kernel.
Fixes: 3ffec6a14f ("net: Add mhi-net driver")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In almost all cases from test_verifier that have been changed in here, we've
had an unreachable path with a load from a register which has an invalid
address on purpose. This was basically to make sure that we never walk this
path and to have the verifier complain if it would otherwise. Change it to
match on the right error for unprivileged given we now test these paths
under speculative execution.
There's one case where we match on exact # of insns_processed. Due to the
extra path, this will of course mismatch on unprivileged. Thus, restrict the
test->insn_processed check to privileged-only.
In one other case, we result in a 'pointer comparison prohibited' error. This
is similarly due to verifying an 'invalid' branch where we end up with a value
pointer on one side of the comparison.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
The verifier only enumerates valid control-flow paths and skips paths that
are unreachable in the non-speculative domain. And so it can miss issues
under speculative execution on mispredicted branches.
For example, a type confusion has been demonstrated with the following
crafted program:
// r0 = pointer to a map array entry
// r6 = pointer to readable stack slot
// r9 = scalar controlled by attacker
1: r0 = *(u64 *)(r0) // cache miss
2: if r0 != 0x0 goto line 4
3: r6 = r9
4: if r0 != 0x1 goto line 6
5: r9 = *(u8 *)(r6)
6: // leak r9
Since line 3 runs iff r0 == 0 and line 5 runs iff r0 == 1, the verifier
concludes that the pointer dereference on line 5 is safe. But: if the
attacker trains both the branches to fall-through, such that the following
is speculatively executed ...
r6 = r9
r9 = *(u8 *)(r6)
// leak r9
... then the program will dereference an attacker-controlled value and could
leak its content under speculative execution via side-channel. This requires
to mistrain the branch predictor, which can be rather tricky, because the
branches are mutually exclusive. However such training can be done at
congruent addresses in user space using different branches that are not
mutually exclusive. That is, by training branches in user space ...
A: if r0 != 0x0 goto line C
B: ...
C: if r0 != 0x0 goto line D
D: ...
... such that addresses A and C collide to the same CPU branch prediction
entries in the PHT (pattern history table) as those of the BPF program's
lines 2 and 4, respectively. A non-privileged attacker could simply brute
force such collisions in the PHT until observing the attack succeeding.
Alternative methods to mistrain the branch predictor are also possible that
avoid brute forcing the collisions in the PHT. A reliable attack has been
demonstrated, for example, using the following crafted program:
// r0 = pointer to a [control] map array entry
// r7 = *(u64 *)(r0 + 0), training/attack phase
// r8 = *(u64 *)(r0 + 8), oob address
// [...]
// r0 = pointer to a [data] map array entry
1: if r7 == 0x3 goto line 3
2: r8 = r0
// crafted sequence of conditional jumps to separate the conditional
// branch in line 193 from the current execution flow
3: if r0 != 0x0 goto line 5
4: if r0 == 0x0 goto exit
5: if r0 != 0x0 goto line 7
6: if r0 == 0x0 goto exit
[...]
187: if r0 != 0x0 goto line 189
188: if r0 == 0x0 goto exit
// load any slowly-loaded value (due to cache miss in phase 3) ...
189: r3 = *(u64 *)(r0 + 0x1200)
// ... and turn it into known zero for verifier, while preserving slowly-
// loaded dependency when executing:
190: r3 &= 1
191: r3 &= 2
// speculatively bypassed phase dependency
192: r7 += r3
193: if r7 == 0x3 goto exit
194: r4 = *(u8 *)(r8 + 0)
// leak r4
As can be seen, in training phase (phase != 0x3), the condition in line 1
turns into false and therefore r8 with the oob address is overridden with
the valid map value address, which in line 194 we can read out without
issues. However, in attack phase, line 2 is skipped, and due to the cache
miss in line 189 where the map value is (zeroed and later) added to the
phase register, the condition in line 193 takes the fall-through path due
to prior branch predictor training, where under speculation, it'll load the
byte at oob address r8 (unknown scalar type at that point) which could then
be leaked via side-channel.
One way to mitigate these is to 'branch off' an unreachable path, meaning,
the current verification path keeps following the is_branch_taken() path
and we push the other branch to the verification stack. Given this is
unreachable from the non-speculative domain, this branch's vstate is
explicitly marked as speculative. This is needed for two reasons: i) if
this path is solely seen from speculative execution, then we later on still
want the dead code elimination to kick in in order to sanitize these
instructions with jmp-1s, and ii) to ensure that paths walked in the
non-speculative domain are not pruned from earlier walks of paths walked in
the speculative domain. Additionally, for robustness, we mark the registers
which have been part of the conditional as unknown in the speculative path
given there should be no assumptions made on their content.
The fix in here mitigates type confusion attacks described earlier due to
i) all code paths in the BPF program being explored and ii) existing
verifier logic already ensuring that given memory access instruction
references one specific data structure.
An alternative to this fix that has also been looked at in this scope was to
mark aux->alu_state at the jump instruction with a BPF_JMP_TAKEN state as
well as direction encoding (always-goto, always-fallthrough, unknown), such
that mixing of different always-* directions themselves as well as mixing of
always-* with unknown directions would cause a program rejection by the
verifier, e.g. programs with constructs like 'if ([...]) { x = 0; } else
{ x = 1; }' with subsequent 'if (x == 1) { [...] }'. For unprivileged, this
would result in only single direction always-* taken paths, and unknown taken
paths being allowed, such that the former could be patched from a conditional
jump to an unconditional jump (ja). Compared to this approach here, it would
have two downsides: i) valid programs that otherwise are not performing any
pointer arithmetic, etc, would potentially be rejected/broken, and ii) we are
required to turn off path pruning for unprivileged, where both can be avoided
in this work through pushing the invalid branch to the verification stack.
The issue was originally discovered by Adam and Ofek, and later independently
discovered and reported as a result of Benedict and Piotr's research work.
Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Reported-by: Adam Morrison <mad@cs.tau.ac.il>
Reported-by: Ofek Kirzner <ofekkir@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
... in such circumstances, we do not want to mark the instruction as seen given
the goal is still to jmp-1 rewrite/sanitize dead code, if it is not reachable
from the non-speculative path verification. We do however want to verify it for
safety regardless.
With the patch as-is all the insns that have been marked as seen before the
patch will also be marked as seen after the patch (just with a potentially
different non-zero count). An upcoming patch will also verify paths that are
unreachable in the non-speculative domain, hence this extension is needed.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Instead of relying on current env->pass_cnt, use the seen count from the
old aux data in adjust_insn_aux_data(), and expand it to the new range of
patched instructions. This change is valid given we always expand 1:n
with n>=1, so what applies to the old/original instruction needs to apply
for the replacement as well.
Not relying on env->pass_cnt is a prerequisite for a later change where we
want to avoid marking an instruction seen when verified under speculative
execution path.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
- Fix crash on SMP when debug is enabled
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmDHvYgZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKWg3D/4vIihzTyD9Hjr5/RVyFz0Z
1NRsCzEjXbnAu2bBh0YyXmLW4b6Wkxlcs4/P393bEpFK4fur3al5MXEE2dBqWDjE
FqDlSbLgkfjVkQlTi+FKqAnt4NaHtOS6356h8a8ZVQOFPMXszt1etbp/Le+93OQZ
6vsFCv90ugArIjjEqdZHl6KjZMZywltDmJWI8xtjB7/hpeHF5ukAmA4UAkb0kUuz
ZSJz73cjfb5PMzlOd6rJIkoaMxT5aKgE9YeVAh2HhMoMU7TdDGGoMADVFnpcABa6
q/EJL0URziqYT6xrrYTPI5gEkXvIaJmB4uFlIR/eYvBuFUsmoRf/bPcsAaID/Zcg
LK0zNR/RM6NQTgkR4u9lziavs4z/T4VfOIrIIXgnKi9sYSC2sur3UklvylwODzHT
El/uF/5swifBRCVzgEcdrntNLPHYWdskvPPo+Z0dOsHk/FjcwjaBlbNtaKpUB7bY
xnP7nyTpWvLqJy6GdKyqHUf9Va35Yn3Hv7BzMI9fhEhdNNrCeyn3zJnJaweZkROk
Ivl1pf3RiDLLblRDHh3Zj5w/XhypLiSj9aMfF65MpX6/AtAc0UlOb65b3D9wx6mZ
8uDfhwXc/rpKOGDiglKQ7ytKB+UrZsg2bm836KayfSarexfoK/Z3sNAEdFKt0RMB
auLejRsqudsxpftZ4MFxYg==
=5x19
-----END PGP SIGNATURE-----
Merge tag 'for-net-2021-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix crash on SMP when debug is enabled
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix an issue where fairness is decreased since cfs_rq's can end up not
being decayed properly. For two sibling control groups with the same
priority, this can often lead to a load ratio of 99/1 (!!).
This happens because when a cfs_rq is throttled, all the descendant
cfs_rq's will be removed from the leaf list. When they initial cfs_rq
is unthrottled, it will currently only re add descendant cfs_rq's if
they have one or more entities enqueued. This is not a perfect
heuristic.
Instead, we insert all cfs_rq's that contain one or more enqueued
entities, or it its load is not completely decayed.
Can often lead to situations like this for equally weighted control
groups:
$ ps u -C stress
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 10009 88.8 0.0 3676 100 pts/1 R+ 11:04 0:13 stress --cpu 1
root 10023 3.0 0.0 3676 104 pts/1 R+ 11:04 0:00 stress --cpu 1
Fixes: 31bc6aeaab ("sched/fair: Optimize update_blocked_averages()")
[vingo: !SMP build fix]
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20210612112815.61678-1-odin@uged.al
The list indentation should always be 2 spaces more than the preceding
keyword. A few of the items are only using 1 space, and resulting in
warnings with dt_binding_check. Fix these.
Fixes: cf107e98d0 ("dt-bindings: remoteproc: qcom: pas: Convert binding to YAML")
Signed-off-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20210603142639.8335-1-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
The allwinner,sun6i-a31-hwspinlock.yaml binding has a mismatched
$id and fails to compile the example due to undefined args specifier
values for clocks and resets. Fix both of these issues.
Acked-by: Rob Herring <robh@kernel.org>
Fixes: f9e784dcb6 ("dt-bindings: hwlock: add sun6i_hwspinlock")
Signed-off-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20210603144216.10327-1-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Early exits from for_each_available_child_of_node() should decrement the
node reference counter. Reported by Coccinelle:
drivers/thermal/sprd_thermal.c:387:1-23: WARNING:
Function "for_each_child_of_node" should have of_node_put() before goto around lines 391.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Acked-by: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210614192230.19248-2-krzysztof.kozlowski@canonical.com
Early exits from for_each_available_child_of_node() should decrement the
node reference counter. Reported by Coccinelle:
drivers/thermal/imx_sc_thermal.c:93:1-33: WARNING:
Function "for_each_available_child_of_node" should have of_node_put() before return around line 97.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210614192230.19248-1-krzysztof.kozlowski@canonical.com
There is not strong reason to have both WWAN and WWAN_CORE symbols,
Let's build the WWAN core framework when WWAN is selected, in the
same way as for other subsystems.
This fixes issue with mhi_net selecting WWAN_CORE without WWAN and
reported by kernel test robot:
Kconfig warnings: (for reference only)
WARNING: unmet direct dependencies detected for WWAN_CORE
Depends on NETDEVICES && WWAN
Selected by
- MHI_NET && NETDEVICES && NET_CORE && MHI_BUS
Fixes: 9a44c1cc63 ("net: Add a WWAN subsystem")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When receiving a new connection pchan->conn won't be initialized so the
code cannot use bt_dev_dbg as the pointer to hci_dev won't be
accessible.
Fixes: 2e1614f7d6 ("Bluetooth: SMP: Convert BT_ERR/BT_DBG to bt_dev_err/bt_dev_dbg")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
After the blamed patch, __skb_flow_dissect() on the DSA master stopped
adjusting for the length of the DSA headers. This is because it was told
to adjust only if the needed_headroom is zero, aka if there is no DSA
header. Of course, the adjustment should be done only if there _is_ a
DSA header.
Modify the comment too so it is clearer.
Fixes: 4e50025129 ("net: dsa: generalize overhead for taggers that use both headers and trailers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The struct sja1105_regs tables are not modified during the runtime of
the driver, so they can be made constant. In fact, struct sja1105_info
already holds a const pointer to these.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current AXI GPIO driver checking for interrupt data in suspend and
resume path and reporting as error in case of no interrupt connection.
As per AXI GPIO IP specification interrupt connection is optional,
driver logic is updated in suspend and resume calls by reporting
debug message and enable/disable clock in case of no connection.
Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Vladimir Oltean says:
====================
Fixes and improvements to TJA1103 PHY driver
This series contains:
- an erratum workaround for the TJA1103 PHY integrated in SJA1110
- an adaptation of the driver so it prints less unnecessary information
when probing on SJA1110
- a PTP RX timestamping bug fix and a clarification patch
Targeting net-next since the PHY support is currently in net-next only.
Changes in v3:
Added one more patch which improves the readability of
nxp_c45_reconstruct_ts.
Changes in v2:
Added a comment to the hardware workaround procedure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The SJA1110 switch integrates TJA1103 PHYs, but in SJA1110 switch rev B
silicon, there is a bug in that the registers for selecting the 100base-T1
autoneg master/slave roles are not writable.
To enable write access to the master/slave registers, these additional
PHY writes are necessary during initialization.
The issue has been corrected in later SJA1110 silicon versions and is
not present in the standalone PHY variants, but applying the workaround
unconditionally in the driver should not do any harm.
Suggested-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reconstruction procedure for partial timestamps reads the current
PTP time and fills in the low 2 bits of the second portion, as well as
the nanoseconds portion, from the actual hardware packet timestamp.
Critically, the reconstruction procedure works because it assumes that
the current PTP time is strictly larger than the hardware timestamp was:
it detects a 2-bit wraparound of the 'seconds' portion by checking whether
the 'seconds' portion of the partial hardware timestamp is larger than
the 'seconds' portion of the current time. That can only happen if the
hardware timestamp was captured by the PHY during the last phase of a
'modulo 4 seconds' interval, and the current PTP time was read by the
driver during the initial phase of the next 'modulo 4 seconds' interval.
The partial RX timestamps are added to priv->rx_queue in
nxp_c45_rxtstamp() and they are processed potentially in parallel by the
aux worker thread in nxp_c45_do_aux_work(). This means that it is
possible for nxp_c45_do_aux_work() to process more than one RX timestamp
during the same schedule.
There is one premature optimization that will cause issues: for RX
timestamping, the driver reads the current time only once, and it uses
that to reconstruct all PTP RX timestamps in the queue. For the second
and later timestamps, this will be an issue if we are processing two RX
timestamps which are to the left and to the right, respectively, of a
4-bit wraparound of the 'seconds' portion of the PTP time, and the
current PTP time is also pre-wraparound.
0.000000000 4.000000000 8.000000000 12.000000000
|..................|..................|..................|............>
^ ^ ^ ^ time
| | | |
| | | process hwts 1 and hwts 2
| | |
| | hwts 2
| |
| read current PTP time
|
hwts 1
What will happen in that case is that hwts 2 (post-wraparound) will use
a stale current PTP time that is pre-wraparound.
But nxp_c45_reconstruct_ts will not detect this condition, because it is
not coded up for it, so it will reconstruct hwts 2 with a current time
from the previous 4 second interval (i.e. 0.something instead of
4.something).
This is solvable by making sure that the full 64-bit current time is
always read after the PHY has taken the partial RX timestamp. We do this
by reading the current PTP time for every timestamp in the RX queue.
Fixes: 514def5dd3 ("phy: nxp-c45-tja11xx: add timestamping support")
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
nxp_c45_reconstruct_ts() takes a partial hardware timestamp in @hwts,
with 2 bits of the 'seconds' portion, and a full PTP time in @ts.
It patches in the lower bits of @hwts into @ts, and to ensure that the
reconstructed timestamp is correct, it checks whether the lower 2 bits
of @hwts are not in fact higher than the lower 2 bits of @ts. This is
not logically possible because, according to the calling convention, @ts
was collected later in time than @hwts, but due to two's complement
arithmetic it can actually happen, because the current PTP time might
have wrapped around between when @hwts was collected and when @ts was,
yielding the lower 2 bits of @ts smaller than those of @hwts.
To correct for that situation which is expected to happen under normal
conditions, the driver subtracts exactly one wraparound interval from
the reconstructed timestamp, since the upper bits of that need to
correspond to what the upper bits of @hwts were, not to what the upper
bits of @ts were.
Readers might be confused because the driver denotes the amount of bits
that the partial hardware timestamp has to offer as TS_SEC_MASK
(timestamp mask for seconds). But it subtracts a seemingly unrelated
BIT(2), which is in fact more subtle: if the hardware timestamp provides
2 bits of partial 'seconds' timestamp, then the wraparound interval is
2^2 == BIT(2).
But nonetheless, it is better to express the wraparound interval in
terms of a definition we already have, so replace BIT(2) with
1 + GENMASK(1, 0) which produces the same result but is clearer.
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SJA1110 switch integrates these PHYs, and they do not have support
for timestamping. This message becomes quite overwhelming:
[ 10.056596] NXP C45 TJA1103 spi1.0-base-t1:01: the phy does not support PTP
[ 10.112625] NXP C45 TJA1103 spi1.0-base-t1:02: the phy does not support PTP
[ 10.167461] NXP C45 TJA1103 spi1.0-base-t1:03: the phy does not support PTP
[ 10.223510] NXP C45 TJA1103 spi1.0-base-t1:04: the phy does not support PTP
[ 10.278239] NXP C45 TJA1103 spi1.0-base-t1:05: the phy does not support PTP
[ 10.332663] NXP C45 TJA1103 spi1.0-base-t1:06: the phy does not support PTP
[ 15.390828] NXP C45 TJA1103 spi1.2-base-t1:01: the phy does not support PTP
[ 15.445224] NXP C45 TJA1103 spi1.2-base-t1:02: the phy does not support PTP
[ 15.499673] NXP C45 TJA1103 spi1.2-base-t1:03: the phy does not support PTP
[ 15.554074] NXP C45 TJA1103 spi1.2-base-t1:04: the phy does not support PTP
[ 15.608516] NXP C45 TJA1103 spi1.2-base-t1:05: the phy does not support PTP
[ 15.662996] NXP C45 TJA1103 spi1.2-base-t1:06: the phy does not support PTP
So reduce its log level to debug.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhou Yanjie says:
====================
Add Ingenic SoCs MAC support.
v2->v3:
1.Add "ingenic,mac.yaml" for Ingenic SoCs.
2.Change tx clk delay and rx clk delay from hardware value to ps.
3.return -EINVAL when a unsupported value is encountered when
parsing the binding.
4.Simplify the code of the RGMII part of X2000 SoC according to
Andrew Lunn’s suggestion.
5.Follow the example of "dwmac-mediatek.c" to improve the code
that handles delays according to Andrew Lunn’s suggestion.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for Ingenic SoC MAC glue layer support for the stmmac
device driver. This driver is used on for the MAC ethernet controller
found in the JZ4775 SoC, the X1000 SoC, the X1600 SoC, the X1830 SoC,
and the X2000 SoC.
Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the dwmac bindings for the JZ4775 SoC, the X1000 SoC,
the X1600 SoC, the X1830 SoC and the X2000 SoC from Ingenic.
Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksandr Mazur says:
====================
Marvell Prestera driver implementation of devlink functionality.
This patch series implement Prestera Switchdev driver devlink traps,
that are registered within the driver, as well as extend current devlink
functionality by adding new hard drop statistics counter, that could be
retrieved on-demand: the counter shows number of packets that have been
dropped by the underlying device and haven't been passed to the devlink
subsystem.
The core prestera-devlink functionality is implemented in the prestera_devlink.c.
The patch series also extends the existing devlink kernel API:
- devlink: add trap_drop_counter_get callback for driver to register - make it possible
to keep track of how many packets have been dropped (hard) by the switch device, before
the packets even made it to the devlink subsystem (e.g. dropped due to RXDMA buffer
overflow).
The core features that extend current functionality of prestera Switchdev driver:
- add logic for driver traps and drops registration (also traps with DROP action).
- add documentation for prestera driver traps and drops group.
PATCH v2:
1) Rebase whole series on top of latest mater;
2) Remove storm control-related patches, as they're out of devlink
scope;
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add documentation for the devlink feature prestera switchdev driver supports:
add description for the support of the driver-specific devlink traps
(include both traps with action TRAP and action DROP);
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add traps that have init_action being set to DROP.
Add 'trap_drop_counter_get' (devlink API) callback implementation,
that is used to get number of packets that have been dropped by the HW
(traps with action 'DROP').
Add new FW command CPU_CODE_COUNTERS_GET.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devlink traps registration (with corresponding groups) for
all the traffic types that driver traps to the CPU;
prestera_rxtx: report each packet trapped to the CPU (RX) to the
prestera_devlink;
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add hard drop counter check testcase, to make sure netdevsim driver
properly handles the devlink hard drop counters get/set callbacks.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever query statistics is issued for trap with DROP action,
devlink subsystem would also fill-in statistics 'dropped' field.
In case if device driver did't register callback for hard drop
statistics querying, 'dropped' field will be omitted and not filled.
Add trap_drop_counter_get callback implementation to the netdevsim.
Add new test cases for netdevsim, to test both the callback
functionality, as well as drop statistics alteration check.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devlink_trap_drop_packets_get function, as well as test that are
used to verify devlink (hard) dropped stats functionality works.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever query statistics is issued for trap, devlink subsystem
would also fill-in statistics 'dropped' field. This field indicates
the number of packets HW dropped and failed to report to the device driver,
and thus - to the devlink subsystem itself.
In case if device driver didn't register callback for hard drop
statistics querying, 'dropped' field will be omitted and not filled.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Syzbot reported slab-out-of-bounds Read in
qrtr_endpoint_post. The problem was in wrong
_size_ type:
if (len != ALIGN(size, 4) + hdrlen)
goto err;
If size from qrtr_hdr is 4294967293 (0xfffffffd), the result of
ALIGN(size, 4) will be 0. In case of len == hdrlen and size == 4294967293
in header this check won't fail and
skb_put_data(skb, data + hdrlen, size);
will read out of bound from data, which is hdrlen allocated block.
Fixes: 194ccc8829 ("net: qrtr: Support decoding incoming v2 packets")
Reported-and-tested-by: syzbot+1917d778024161609247@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Author forgot to remove that flag.
Fixes: f7af616c63 ("net: iosm: infrastructure")
Reported-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 3rd argument is u32 by function definition while it is __be32
by function declaration.
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>