There is a small race between copy_process() and sched_fork()
where child->sched_task_group points to an already freed pointer.
        parent doing fork()      | someone moving the parent
                                 | to another cgroup
  -------------------------------+-------------------------------
  copy_process()
      + dup_task_struct()<1>
                                  parent move to another cgroup,
                                  and free the old cgroup. <2>
      + sched_fork()
        + __set_task_cpu()<3>
        + task_fork_fair()
          + sched_slice()<4>
In the worst case, this bug can lead to a use-after-free and
cause a panic, following the sequence shown above:
(1) the parent copies its sched_task_group to the child at <1>;
(2) someone moves the parent to another cgroup and frees the old
cgroup at <2>;
(3) the sched_task_group and cfs_rq that belong to the old cgroup
are accessed at <3> and <4>, which causes a panic:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0
[] Oops: 0000 [#1] SMP PTI
[] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G OE --------- - - 4.18.0.x86_64+ #1
[] RIP: 0010:sched_slice+0x84/0xc0
[] Call Trace:
[] task_fork_fair+0x81/0x120
[] sched_fork+0x132/0x240
[] copy_process.part.5+0x675/0x20e0
[] ? __handle_mm_fault+0x63f/0x690
[] _do_fork+0xcd/0x3b0
[] do_syscall_64+0x5d/0x1d0
[] entry_SYSCALL_64_after_hwframe+0x65/0xca
[] RIP: 0033:0x7f04418cd7e1
Between cgroup_can_fork() and cgroup_post_fork(), the cgroup
membership, and thus sched_task_group, can't change. So update the child's
sched_task_group in sched_post_fork() and move task_fork() and
__set_task_cpu() (which access sched_task_group) from sched_fork()
to sched_post_fork().
Fixes: 8323f26ce3 ("sched: Fix race in task_group")
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lkml.kernel.org/r/20210915064030.2231-1-zhangqiao22@huawei.com
numa_distance in cpu_attach_domain() was introduced in
commit b5b217346d ("sched/topology: Warn when NUMA diameter > 2")
to warn the user when the NUMA diameter is > 2, because the scheduler
topology structures were misrepresented in that case. This was
fixed by Barry in commit 585b6d2723 ("sched/topology: fix the issue
groups don't span domain->span for NUMA diameter > 2") and
numa_distance is now unused. So remove it.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20210915063158.80639-1-yangyicong@hisilicon.com
Fix a few comments to help understand them better.
Signed-off-by: Bharata B Rao <bharata@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20211004105706.3669-4-bharata@amd.com
numa_group::fault_cpus is actually a pointer to the region
in numa_group::faults[] where NUMA_CPU stats are located.
Remove this redundant member and use numa_group::faults[NUMA_CPU]
directly, as is done for similar per-process NUMA fault stats.
There is no functional change due to this commit.
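The idea of addressing the CPU stats as a region inside one flat array can be sketched in plain C. This is only an illustration: the names (fault_stat, FAULT_TYPES, faults_idx) are hypothetical, and the layout assumed here (one region per stat, each holding two fault types per node) is an assumption, not a copy of the scheduler's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the stat regions of a flat faults[] array. */
enum fault_stat { STAT_MEM = 0, STAT_CPU = 1 };

/* Assumed: two fault types (shared/private) are stored per node. */
#define FAULT_TYPES 2

/*
 * Compute the index of (stat, node, type) in the flat array. The start
 * of the CPU region is simply faults_idx(STAT_CPU, nr_nodes, 0, 0), so
 * no separate pointer to that region needs to be stored.
 */
static size_t faults_idx(enum fault_stat s, size_t nr_nodes, size_t nid,
			 int priv)
{
	assert(nid < nr_nodes);
	assert(priv == 0 || priv == 1);
	return FAULT_TYPES * ((size_t)s * nr_nodes + nid) + (size_t)priv;
}
```

With four nodes, the CPU region in this sketch starts at index 8, right after the 4 * 2 memory entries.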
Signed-off-by: Bharata B Rao <bharata@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20211004105706.3669-3-bharata@amd.com
While allocating group fault stats, task_numa_group()
uses a hard-coded 4. Replace it with
NR_NUMA_HINT_FAULT_STATS.
No functional change in this commit.
Signed-off-by: Bharata B Rao <bharata@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20211004105706.3669-2-bharata@amd.com
Make sure to prod idle CPUs so they call klp_update_patch_state().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com> # on s390
Link: https://lkml.kernel.org/r/20210929151723.162004989@infradead.org
The .opd section contains function descriptors used to locate
functions in the kernel. Anyone able to modify a function
descriptor can make the kernel run an arbitrary function in
place of the intended one.
To avoid that, move the .opd section into read-only memory.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3cd40b682fb6f75bb40947b55ca0bac20cb3f995.1634136222.git.christophe.leroy@csgroup.eu
On power9 and earlier platforms, the default events used for cycles and
instructions are PM_CYC (0x0001e) and PM_INST_CMPL (0x00002)
respectively. These events use two programmable PMCs and by default
count irrespective of the run latch state (idle state). But since they
use programmable PMCs, these events can lead to multiplexing with other
events, because there are only 4 programmable PMCs. Hence on power10, the
performance monitoring unit (PMU) driver uses performance monitor
counter 5 (PMC5) and performance monitor counter 6 (PMC6) for counting
instructions and cycles.
Currently on power10, the event used for cycles is PM_RUN_CYC (0x600F4)
and the event used for instructions is PM_RUN_INST_CMPL (0x500fa). But
counting of these events in idle state is controlled by the CC56RUN bit
setting in Monitor Mode Control Register0 (MMCR0). If the CC56RUN bit is
zero, PMC5/6 will not count when CTRL[RUN] (run latch) is zero. This could
lead to missing some counts if a thread is in idle state during system
wide profiling.
To fix it, set the CC56RUN bit in MMCR0 for power10, which makes PMC5
and PMC6 count instructions and cycles regardless of the run latch
state. Since this change makes PMC5/6 count as PM_INST_CMPL/PM_CYC,
rename the event code 0x600f4 as PM_CYC instead of PM_RUN_CYC and event
code 0x500fa as PM_INST_CMPL instead of PM_RUN_INST_CMPL. The changes
are only for PMC5/6 event codes and will not affect the behaviour of
PM_RUN_CYC/PM_RUN_INST_CMPL if programmed in other PMCs.
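The effect of the CC56RUN bit described above can be modelled with a few lines of C. This is a toy model, not driver code: the helper names are hypothetical and the bit position chosen for DEMO_MMCR0_CC56RUN is illustrative only, not taken from the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mask only; the real bit position is not stated here. */
#define DEMO_MMCR0_CC56RUN (UINT64_C(1) << 8)

/* Return the register value with the CC56RUN bit set. */
static uint64_t demo_set_cc56run(uint64_t mmcr0)
{
	return mmcr0 | DEMO_MMCR0_CC56RUN;
}

/*
 * Model of the rule in the text: PMC5/6 count when the run latch is
 * set, or unconditionally when CC56RUN is set.
 */
static int demo_pmc56_counts(uint64_t mmcr0, int run_latch)
{
	assert(run_latch == 0 || run_latch == 1);
	return run_latch || (mmcr0 & DEMO_MMCR0_CC56RUN) != 0;
}
```

In this model an idle thread (run latch 0) is only counted once the bit has been set, which is the case the change is meant to cover.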
Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.cm>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Tweak change log wording for style and consistency]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211007075121.28497-1-atrajeev@linux.vnet.ibm.com
The number of correctable errors is displayed as uncorrectable
errors because the "SBE" error count is passed to both calls of
edac_mc_handle_error().
Pass the correct uncorrectable error count to the second
edac_mc_handle_error() call when logging uncorrectable errors.
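The shape of the mistake and of the fix can be sketched with a stand-in logger. Everything here is hypothetical (struct demo_err_log, demo_log_error, the dbe_count name); it does not use the EDAC API and only shows that each call must receive the count matching its error type.

```c
#include <assert.h>

enum demo_err_type { DEMO_ERR_CORRECTED, DEMO_ERR_UNCORRECTED };

struct demo_err_log {
	unsigned int corrected;
	unsigned int uncorrected;
};

/* Stand-in for the error handler: accumulate per-type counts. */
static void demo_log_error(struct demo_err_log *log,
			   enum demo_err_type type, unsigned int count)
{
	assert(log != 0);
	if (type == DEMO_ERR_CORRECTED)
		log->corrected += count;
	else
		log->uncorrected += count;
}

/* Pass each call its own count instead of reusing the SBE count. */
static void demo_report(struct demo_err_log *log,
			unsigned int sbe_count, unsigned int dbe_count)
{
	if (sbe_count)
		demo_log_error(log, DEMO_ERR_CORRECTED, sbe_count);
	if (dbe_count)
		demo_log_error(log, DEMO_ERR_UNCORRECTED, dbe_count);
}
```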
[ bp: Massage commit message. ]
Fixes: 7f6998a412 ("ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC")
Signed-off-by: Hans Potsch <hans.potsch@nokia.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211006121332.58788-1-hans.potsch@nokia.com
The VLV/CHV sideband code is pretty distinct from the rest of the
sideband code. Split it out to new vlv_sideband.[ch].
Pure code movement with relevant #include changes, and a tiny checkpatch
fix on top.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/755ebbbaf01fc6d306b763b6ef60f45e671ba290.1634119597.git.jani.nikula@intel.com
When `echo $cmd > control` contains multiple queries, extra query
separators (;\n) can parse as empty statements. This is normal, and a
vpr-info on an empty command is just noise.
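How empty statements arise from extra separators can be shown with a small userspace splitter. This is a sketch with a hypothetical helper name (demo_count_queries), not the dynamic debug parser; it only assumes that ';' and '\n' both separate queries.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the non-empty statements in a multi-query string. Statements
 * that are empty or whitespace-only are skipped silently.
 */
static size_t demo_count_queries(const char *input)
{
	size_t n = 0;

	assert(input != NULL);
	while (*input) {
		size_t len = strcspn(input, ";\n");
		int blank = 1;
		size_t i;

		for (i = 0; i < len; i++) {
			if (!isspace((unsigned char)input[i])) {
				blank = 0;
				break;
			}
		}
		if (!blank)
			n++;
		input += len;
		if (*input)
			input++;	/* step over the separator */
	}
	return n;
}
```

A string such as "module a +p;\nmodule b +p;\n" contains four separators but only two real queries, so two of the split results are empty.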
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211013220726.1280565-4-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On qemu --smp 3 runs, remove-module can get called 3 times.
So don't print on entry; instead print "removed" after the entry is
found and removed, so that it is printed just once.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211013220726.1280565-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When adding some stuff to the header file we must not rely on
implicit dependencies that happen by luck or through bugs in other
headers. Hence fwnode.h needs to include bits.h directly.
Fixes: c2c724c868 ("driver core: Add fw_devlink_parse_fwtree()")
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211013143707.80222-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jim pointed out that using $module.dyndbg= is always a more flexible
choice for using dynamic debug on the command line. The $module.dyndbg
style is checked at boot and handles if $module is a builtin. If it is
actually a loadable module, it is handled again later when the module is
loaded.
If you just use dyndbg="module $module +p" dynamic debug is only enabled
when $module is a builtin.
It was recommended to illustrate wildcard usage as well.
Suggested-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/1634139622-20667-4-git-send-email-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This param has been deprecated for a very long time now, let's rip it
out.
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/1634139622-20667-3-git-send-email-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Right now dyndbg shows up as an unknown parameter if used on boot:
Unknown command line parameters: dyndbg=+p
That's because it is unknown: it doesn't sit in the __param
section, so the processing done to warn users supplying an unknown
parameter doesn't think it is legitimate.
Install a dummy handler to register it. dynamic debug needs to search
the whole command line for modules listed that are currently builtin,
so there's no real work to be done in this callback.
Fixes: 86d1919a4f ("init: print out unknown kernel parameters")
Tested-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/1634139622-20667-2-git-send-email-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The different port@ entries of the adv7482 nodes shall be encapsulated
in a ports node; add one. This change does not change how the driver
parses the DT and no driver change is needed.
The change however makes it possible to validate the source files with a
correct json-schema.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20211012183431.718691-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
The V3U has 32 VIN, 4 CSI-2 and 4 ISP nodes that interact with each
other for video capture. Add all nodes and record how they are
interconnected.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20211012100038.375289-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
The 'microchip,24c02' compatible does not match the at24 driver, so
add this generic fallback to the device node compatible string to
make the device match the driver using the OF device ID table.
Also set this EEPROM to read-only mode because it stores the MAC
address of the onboard USB network card.
Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20211010135017.6855-2-amadeus@jmu.edu.cn
Decrease the reference count of the char device during char device
deletion in order to fix a memory leak. Add a release callback for the
device associated with the chardev and move ida_simple_remove() into the
release function.
Fixes: 2637baed78 ("nvme: introduce generic per-namespace chardev")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Javier González <javier@javigon.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add a check to validate the compound response buffer.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The DataOffset and Length validation is subject to a potential 32-bit
overflow. Fix it.
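The class of bug can be illustrated in isolation. In the sketch below (hypothetical helper names, not the ksmbd code), adding two 32-bit values directly can wrap around and pass a bounds check, while widening to 64 bits before the addition does not.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy form: the 32-bit sum can wrap and look small. */
static int demo_range_ok_wrapping(uint32_t data_offset, uint32_t length,
				  uint32_t buf_len)
{
	return (uint32_t)(data_offset + length) <= buf_len;
}

/* Safer form: widen first so the sum cannot wrap. */
static int demo_range_ok(uint32_t data_offset, uint32_t length,
			 uint32_t buf_len)
{
	uint64_t end = (uint64_t)data_offset + length;

	return end <= buf_len;
}

/* Small demonstration of the difference between the two checks. */
static void demo_overflow_example(void)
{
	/* 0xFFFFFFF0 + 0x20 wraps to 0x10 in 32 bits. */
	assert(demo_range_ok_wrapping(0xFFFFFFF0u, 0x20u, 0x100u) == 1);
	assert(demo_range_ok(0xFFFFFFF0u, 0x20u, 0x100u) == 0);
}
```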
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
* Requests except READ, WRITE, IOCTL, INFO, QUERY
DIRECTORY, CANCEL must consume one credit.
* If the client's granted credits are insufficient,
refuse to handle requests.
* Windows Server 2016 or later grants up to 8192
credits to clients at once.
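The first two rules above can be sketched as a small accounting helper. This is an assumption-laden illustration: struct demo_conn and demo_charge_credits are hypothetical names, and treating a charge of 0 as one credit is an assumption made for the sketch, not something taken from the server code.

```c
#include <assert.h>

/* Hypothetical per-connection state: credits granted to the client. */
struct demo_conn {
	unsigned int granted_credits;
};

/*
 * Charge a request against the granted credits. Returns 0 when the
 * request may be handled and -1 when it must be refused.
 */
static int demo_charge_credits(struct demo_conn *conn, unsigned int charge)
{
	assert(conn != 0);
	if (charge == 0)
		charge = 1;	/* assumed: every request costs one credit */
	if (charge > conn->granted_credits)
		return -1;	/* insufficient credits: refuse */
	conn->granted_credits -= charge;
	return 0;
}
```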
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Add validation of the request/response buffer size in smb2_ioctl().
Make fsctl_copychunk() take a copychunk_ioctl_req pointer and the other
arguments instead of the smb2_ioctl_req structure, and remove the unused
smb2_ioctl_req argument of fsctl_validate_negotiate_info().
Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
I got a null-ptr-deref report:
KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
...
RIP: 0010:regulator_enable+0x84/0x260
...
Call Trace:
ahci_platform_enable_regulators+0xae/0x320
ahci_platform_enable_resources+0x1a/0x120
ahci_probe+0x4f/0x1b9
platform_probe+0x10b/0x280
...
entry_SYSCALL_64_after_hwframe+0x44/0xae
If devm_regulator_get() in ahci_platform_get_resources() fails,
hpriv->phy_regulator is set to NULL, and a null-ptr-deref occurs when
it is later enabled or disabled.
ahci_probe()
    ahci_platform_get_resources()
        devm_regulator_get(, "phy") // failed, let phy_regulator = NULL
    ahci_platform_enable_resources()
        ahci_platform_enable_regulators()
            regulator_enable(hpriv->phy_regulator) // null-ptr-deref
Commit 962399bb7f ("ata: libahci_platform: Fix regulator_get_optional()
misuse") replaced devm_regulator_get_optional() with devm_regulator_get(),
but, unlike for the AHCI regulator, it did not delete
"hpriv->phy_regulator = NULL;" for the PHY regulator.
Delete it, as was done for the AHCI regulator, to fix this bug.
Fixes: 962399bb7f ("ata: libahci_platform: Fix regulator_get_optional() misuse")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Missed a few asics.
v2: update comment
Fixes: 82d05736c4 ("drm/amdgpu/amdgpu_psp: convert to IP version checking")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VEGA20 is 11.0.2, but it's handled by powerplay, not
swsmu.
Fixes: a8967967f6 ("drm/amdgpu/amdgpu_smu: convert to IP version checking")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When creating a new unregistered svm range to recover from a retry fault,
avoid overlapping the new svm range with ranges or userptr ranges managed
by TTM; otherwise svm migration will trigger TTM or userptr eviction and
evict user queues unexpectedly.
Change helper amdgpu_ttm_tt_affect_userptr to return the userptr which is
inside the range. Add helper svm_range_check_vm_userptr to scan all
userptrs of the vm and return the start and last of the overlapping
userptr bo.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Size can be any value and is user controlled, resulting in overwriting the
40-byte array wr_buf with an arbitrary length of data from buf.
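A bounded copy of the kind this problem calls for can be sketched in userspace C. The helper name and the choice to reject oversized input (rather than truncate it) are assumptions made for illustration; only the 40-byte wr_buf and the caller-controlled size come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_WR_BUF_SIZE 40

/*
 * Copy at most DEMO_WR_BUF_SIZE - 1 bytes into wr_buf and terminate it.
 * Returns 0 on success and -1 when size would not fit in the buffer.
 */
static int demo_copy_bounded(char wr_buf[DEMO_WR_BUF_SIZE],
			     const char *buf, size_t size)
{
	assert(wr_buf != NULL && buf != NULL);
	if (size >= DEMO_WR_BUF_SIZE)
		return -1;	/* reject instead of overflowing wr_buf */
	memcpy(wr_buf, buf, size);
	wr_buf[size] = '\0';
	return 0;
}
```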
Signed-off-by: Thelford Williams <tdwilliamsiv@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Merge tag 'clk-imx-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/abelvesa/linux into clk-imx
Pull i.MX clk driver changes from Abel Vesa:
- Remove unused helpers from i.MX specific clock header
- Rework all clk based helpers to use clk_hw based ones
- Rework gate/mux/divider wrappers
- Rework imx_clk_hw_composite and imx_clk_hw_pll14xx wrappers
- Add i.MX8ULP clock driver and related bindings
- Update i.MX pllv4 and composite clocks to support i.MX8ULP
- Disable i.MX7ULP composite clock during initialization
- Add CLK_SET_RATE_NO_REPARENT flag to the i.MX7ULP composite
- Disable the pfd when setting the pfdv2 clock rate
- Add support for i.MX8ULP in pfdv2
- Add the pcc reset controller support on i.MX8ULP
- Fix the build break when clk-imx8ulp is built as module
- Move csi_sel mux to correct base register in i.MX6UL clock driver
- Fix csi clk gate register in i.MX6UL clock driver
- Fix the build by making CLK_IMX8ULP select MXC_CLK
* tag 'clk-imx-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/abelvesa/linux: (21 commits)
clk: imx: Make CLK_IMX8ULP select MXC_CLK
clk: imx: imx6ul: Fix csi clk gate register
clk: imx: imx6ul: Move csi_sel mux to correct base register
clk: imx: Fix the build break when clk-imx8ulp build as module
clk: imx: Add the pcc reset controller support on imx8ulp
clk: imx: Add clock driver for imx8ulp
clk: imx: Update the pfdv2 for 8ulp specific support
clk: imx: disable the pfd when set pfdv2 clock rate
clk: imx: Add 'CLK_SET_RATE_NO_REPARENT' for composite-7ulp
clk: imx: disable i.mx7ulp composite clock during initialization
clk: imx: Update the compsite driver to support imx8ulp
clk: imx: Update the pllv4 to support imx8ulp
dt-bindings: clock: Add imx8ulp clock support
clk: imx: Rework imx_clk_hw_pll14xx wrapper
clk: imx: Rework all imx_clk_hw_composite wrappers
clk: imx: Rework all clk_hw_register_divider wrappers
clk: imx: Rework all clk_hw_register_mux wrappers
clk: imx: Rework all clk_hw_register_gate2 wrappers
clk: imx: Rework all clk_hw_register_gate wrappers
clk: imx: Make mux/mux2 clk based helpers use clk_hw based ones
...
The implementation of the function netdev_all_upper_get_next_dev_rcu()
was removed in:
commit f1170fd462 ("net: Remove all_adj_list and its references")
so delete the redundant declaration from the header file.
Fixes: f1170fd462 ("net: Remove all_adj_list and its references")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Link: https://lore.kernel.org/r/20211013094702.3931071-1-chenwandun@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel says:
====================
mlxsw: Show per-band ECN-marked counter on qdisc
The RED qdisc can expose the number of packets that it has marked through
the prob_marked counter (shown in iproute2 as "marked"). This counter
currently shows only the number of packets marked in the SW datapath,
which in a switch deployment likely means zero.
Spectrum-3 does support per-TC counters, and with this patchset mlxsw
supports this RED statistic properly.
====================
Link: https://lore.kernel.org/r/20211013103748.492531-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a variant of the ECN test that uses the qdisc marked counter (supported
on Spectrum-3 and above) instead of the aggregate ethtool ecn_marked counter.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Qdisc code in mlxsw used to report a number of packets ECN-marked on a
port. Because reporting a per-port value as a per-TC value was misleading,
this was removed in commit 8a29581eb0 ("mlxsw: spectrum: Move the
ECN-marked packet counter to ethtool").
On Spectrum-3, a per-TC number of ECN-marked packets is available in per-TC
congestion counter group. Add a new array for the ECN counter, fetch the
values from the per-TC congestion group, and pick the value indicated by
tclass_num as appropriate.
On Spectrum-1 and Spectrum-2, this per-TC value is not available, and
zeroes will be reported, as they currently are.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The PPCNT register retrieves per port performance counters. The
ecn_marked_tc field in per-TC Congestion counter group contains a count of
packets marked as ECN or potentially marked as ECN.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The name does not make sense as it is. Clearly there is a typo and the
suffix should have been _CNT, like the other enumerators. Fix accordingly.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is no such thing as "traffic group". The group that this is a heading
of is "per traffic class counters". Fix the heading.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
'skb' is allocated in digital_in_send_sdd_req(), but not freed when
digital_in_send_cmd() fails, which causes a memory leak. Fix it
by freeing 'skb' if digital_in_send_cmd() returns an error.
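The error-path pattern can be shown with a userspace stand-in. All names below (demo_buf, demo_send_cmd, demo_send_req, live_bufs) are hypothetical; the sketch assumes, as the description implies, that the buffer still belongs to the caller when the send call fails.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_buf {
	unsigned char data[16];
};

/* Number of buffers currently allocated, to make leaks observable. */
static int live_bufs;

static struct demo_buf *demo_alloc(void)
{
	struct demo_buf *buf = calloc(1, sizeof(*buf));

	if (buf)
		live_bufs++;
	return buf;
}

static void demo_free(struct demo_buf *buf)
{
	assert(buf != NULL && live_bufs > 0);
	free(buf);
	live_bufs--;
}

/* Stand-in for the send call: on failure the caller still owns buf. */
static int demo_send_cmd(struct demo_buf *buf, int simulate_failure)
{
	if (simulate_failure)
		return -1;
	demo_free(buf);	/* stand-in for the lower layer consuming it */
	return 0;
}

static int demo_send_req(int simulate_failure)
{
	struct demo_buf *buf = demo_alloc();
	int rc;

	if (!buf)
		return -1;
	rc = demo_send_cmd(buf, simulate_failure);
	if (rc)
		demo_free(buf);	/* the missing free on the error path */
	return rc;
}
```

Without the free on the error path, live_bufs would stay at 1 after a failed send, which is the leak in miniature.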
Fixes: 2c66daecc4 ("NFC Digital: Add NFC-A technology support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
'params' is allocated in digital_tg_listen_mdaa(), but not freed when
digital_send_cmd() fails, which causes a memory leak. Fix it by
freeing 'params' if digital_send_cmd() returns an error.
Fixes: 1c7a4c24fb ("NFC Digital: Add target NFC-DEP support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the NFC proto id is already in use, nfc_proto_register() returns the
-EBUSY error code but forgets to unregister the proto. Fix it by adding
proto_unregister() to the error handling path.
Fixes: c7fe3b52c1 ("NFC: add NFC socket family")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20211013034932.2833737-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The removal of eprobes was broken and missed in testing. Add various ways
to remove eprobes that are considered acceptable to the testing process to
catch when/if they break again.
Link: https://lkml.kernel.org/r/20211013205533.836644549@goodmis.org
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
When an event probe is removed via the same dynamic events API that
created it, an -ENOENT error is returned.
This is because the removal of the event probe does not expect to see the
event system and name that the event probe is attached to, even though
that's part of the API to create it. The removal of probes should use
the same API as their creation.
In fact, the removal is not consistent with the kprobes and uprobes
removal. Fix that by allowing various ways to remove the eprobe.
The eprobe is created with:
  e:[GROUP/]NAME SYSTEM/EVENT [OPTIONS]
Have it get removed by echoing in the following into dynamic_events:
  # Remove all eprobes with NAME
  echo '-:NAME' >> dynamic_events
  # Remove a specific eprobe
  echo '-:GROUP/NAME' >> dynamic_events
  echo '-:GROUP/NAME SYSTEM/EVENT' >> dynamic_events
  echo '-:NAME SYSTEM/EVENT' >> dynamic_events
  echo '-:GROUP/NAME SYSTEM/EVENT OPTIONS' >> dynamic_events
  echo '-:NAME SYSTEM/EVENT OPTIONS' >> dynamic_events
Link: https://lkml.kernel.org/r/20211012081925.0e19cc4f@gandalf.local.home
Link: https://lkml.kernel.org/r/20211013205533.630722129@goodmis.org
Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 7491e2c442 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>