linux-xiaomi-chiron

Author	SHA1	Message	Date
Daniel Bristot de Oliveira	039a602db3	trace/hwlat: Protect kdata->kthread with get/put_online_cpus In preparation to the hotplug support, protect kdata->kthread with get/put_online_cpus() to avoid concurrency with hotplug operations. Link: https://lore.kernel.org/linux-doc/20210621134636.5b332226@oasis.local.home/ Link: https://lkml.kernel.org/r/8bdb2a56f46abfd301d6fffbf43448380c09a6f5.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 19:57:24 -04:00
Daniel Bristot de Oliveira	a955d7eac1	trace: Add timerlat tracer The timerlat tracer aims to help the preemptive kernel developers to found souces of wakeup latencies of real-time threads. Like cyclictest, the tracer sets a periodic timer that wakes up a thread. The thread then computes a wakeup latency value as the difference between the current time and the absolute time that the timer was set to expire. The main goal of timerlat is tracing in such a way to help kernel developers. Usage Write the ASCII text "timerlat" into the current_tracer file of the tracing system (generally mounted at /sys/kernel/tracing). For example: [root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo timerlat > current_tracer It is possible to follow the trace by reading the trace trace file: [root@f32 tracing]# cat trace # tracer: timerlat # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\| / # \|\|\|\| ACTIVATION # TASK-PID CPU# \|\|\|\| TIMESTAMP ID CONTEXT LATENCY # \| \| \| \|\|\|\| \| \| \| \| <idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns <...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns <idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns <...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns <idle>-0 [000] d.h1 54.030328: #2 context irq timer_latency 769 ns <...>-867 [000] .... 54.030330: #2 context thread timer_latency 3070 ns <idle>-0 [001] d.h1 54.030344: #2 context irq timer_latency 935 ns <...>-868 [001] .... 54.030347: #2 context thread timer_latency 4351 ns The tracer creates a per-cpu kernel thread with real-time priority that prints two lines at every activation. The first is the timer latency observed at the hardirq context before the activation of the thread. The second is the timer latency observed by the thread, which is the same level that cyclictest reports. The ACTIVATION ID field serves to relate the irq execution to its respective thread execution. The irq/thread splitting is important to clarify at which context the unexpected high value is coming from. The irq context can be delayed by hardware related actions, such as SMIs, NMIs, IRQs or by a thread masking interrupts. Once the timer happens, the delay can also be influenced by blocking caused by threads. For example, by postponing the scheduler execution via preempt_disable(), by the scheduler execution, or by masking interrupts. Threads can also be delayed by the interference from other threads and IRQs. The timerlat can also take advantage of the osnoise: traceevents. For example: [root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo timerlat > current_tracer [root@f32 tracing]# echo osnoise > set_event [root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us [root@f32 tracing]# tail -10 trace cc1-87882 [005] d..h... 548.771078: #402268 context irq timer_latency 1585 ns cc1-87882 [005] dNLh1.. 548.771082: irq_noise: local_timer:236 start 548.771077442 duration 4597 ns cc1-87882 [005] dNLh2.. 548.771083: irq_noise: reschedule:253 start 548.771083017 duration 56 ns cc1-87882 [005] dNLh2.. 548.771086: irq_noise: call_function_single:251 start 548.771083811 duration 2048 ns cc1-87882 [005] dNLh2.. 548.771088: irq_noise: call_function_single:251 start 548.771086814 duration 1495 ns cc1-87882 [005] dNLh2.. 548.771091: irq_noise: call_function_single:251 start 548.771089194 duration 1558 ns cc1-87882 [005] dNLh2.. 548.771094: irq_noise: call_function_single:251 start 548.771091719 duration 1932 ns cc1-87882 [005] dNLh2.. 548.771096: irq_noise: call_function_single:251 start 548.771094696 duration 1050 ns cc1-87882 [005] d...3.. 548.771101: thread_noise: cc1:87882 start 548.771078243 duration 10909 ns timerlat/5-1035 [005] ....... 548.771103: #402268 context thread timer_latency 25960 ns For further information see: Documentation/trace/timerlat-tracer.rst Link: https://lkml.kernel.org/r/71f18efc013e1194bcaea1e54db957de2b19ba62.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 19:57:24 -04:00
Daniel Bristot de Oliveira	bce29ac9ce	trace: Add osnoise tracer In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the system. Moreover, hardware-related jobs can also cause noise, for example, via SMIs. The osnoise tracer leverages the hwlat_detector by running a similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing all the sources of osnoise during its execution. Using the same approach of hwlat, osnoise takes note of the entry and exit point of any source of interferences, increasing a per-cpu interference counter. The osnoise tracer also saves an interference counter for each source of interference. The interference counter for NMI, IRQs, SoftIRQs, and threads is increased anytime the tool observes these interferences' entry events. When a noise happens without any interference from the operating system level, the hardware noise counter increases, pointing to a hardware-related noise. In this way, osnoise can account for any source of interference. At the end of the period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources. Usage Write the ASCII text "osnoise" into the current_tracer file of the tracing system (generally mounted at /sys/kernel/tracing). For example:: [root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo osnoise > current_tracer It is possible to follow the trace by reading the trace trace file:: [root@f32 tracing]# cat trace # tracer: osnoise # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth MAX # \|\| / SINGLE Interference counters: # \|\|\|\| RUNTIME NOISE % OF CPU NOISE +-----------------------------+ # TASK-PID CPU# \|\|\|\| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD # \| \| \| \|\|\|\| \| \| \| \| \| \| \| \| \| \| <...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1 <...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3 <...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21 <...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0 <...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41 <...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2 <...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1 <...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19 In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the tracer prints a message at the end of each period for each CPU that is running an osnoise/CPU thread. The osnoise specific fields report: - The RUNTIME IN USE reports the amount of time in microseconds that the osnoise thread kept looping reading the time. - The NOISE IN US reports the sum of noise in microseconds observed by the osnoise tracer during the associated runtime. - The % OF CPU AVAILABLE reports the percentage of CPU available for the osnoise thread during the runtime window. - The MAX SINGLE NOISE IN US reports the maximum single noise observed during the runtime window. - The Interference counters display how many each of the respective interference happened during the runtime window. Note that the example above shows a high number of HW noise samples. The reason being is that this sample was taken on a virtual machine, and the host interference is detected as a hardware interference. Tracer options The tracer has a set of options inside the osnoise directory, they are: - osnoise/cpus: CPUs at which a osnoise thread will execute. - osnoise/period_us: the period of the osnoise thread. - osnoise/runtime_us: how long an osnoise thread will look for noise. - osnoise/stop_tracing_us: stop the system tracing if a single noise higher than the configured value happens. Writing 0 disables this option. - osnoise/stop_tracing_total_us: stop the system tracing if total noise higher than the configured value happens. Writing 0 disables this option. - tracing_threshold: the minimum delta between two time() reads to be considered as noise, in us. When set to 0, the default value will be used, which is currently 5 us. Additional Tracing In addition to the tracer, a set of tracepoints were added to facilitate the identification of the osnoise source. - osnoise:sample_threshold: printed anytime a noise is higher than the configurable tolerance_ns. - osnoise:nmi_noise: noise from NMI, including the duration. - osnoise:irq_noise: noise from an IRQ, including the duration. - osnoise:softirq_noise: noise from a SoftIRQ, including the duration. - osnoise:thread_noise: noise from a thread, including the duration. Note that all the values are net values. For example, if while osnoise is running, another thread preempts the osnoise thread, it will start a thread_noise duration at the start. Then, an IRQ takes place, preempting the thread_noise, starting a irq_noise. When the IRQ ends its execution, it will compute its duration, and this duration will be subtracted from the thread_noise, in such a way as to avoid the double accounting of the IRQ execution. This logic is valid for all sources of noise. Here is one example of the usage of these tracepoints:: osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8723 ns interferences 2 In this example, a noise sample of 8 microseconds was reported in the last line, pointing to two interferences. Looking backward in the trace, the two previous entries were about the migration thread running after a timer IRQ execution. The first event is not part of the noise because it took place one millisecond before. It is worth noticing that the sum of the duration reported in the tracepoints is smaller than eight us reported in the sample_threshold. The reason roots in the overhead of the entry and exit code that happens before and after any interference execution. This justifies the dual approach: measuring thread and tracing. Link: https://lkml.kernel.org/r/e649467042d60e7b62714c9c6751a56299d15119.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> [ Made the following functions static: trace_irqentry_callback() trace_irqexit_callback() trace_intel_irqentry_callback() trace_intel_irqexit_callback() Added to include/trace.h: osnoise_arch_register() osnoise_arch_unregister() Fixed define logic for LATENCY_FS_NOTIFY Reported-by: kernel test robot <lkp@intel.com> ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 19:57:01 -04:00
Fabien Dessenne	db0f032512	pinctrl: stm32: check for IRQ MUX validity during alloc() Considering the following irq_domain_ops call chain: - .alloc() is called when a clients calls platform_get_irq() or gpiod_to_irq() - .activate() is called next, when the clients calls request_threaded_irq() Check for the IRQ MUX conflict during the first stage (alloc instead of activate). This avoids to provide the client with an IRQ that can't be used. Signed-off-by: Fabien Dessenne <fabien.dessenne@foss.st.com> Link: https://lore.kernel.org/r/20210617144602.2557619-1-fabien.dessenne@foss.st.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2021-06-26 01:52:35 +02:00
Steven Rostedt (VMware)	6880c987e4	tracing: Add LATENCY_FS_NOTIFY to define if latency_fsnotify() is defined With the coming addition of the osnoise tracer, the configs needed to include the latency_fsnotify() has become more complex, and to keep the declaration in the header file the same as in the C file, just have the logic needed to define it in one place, and that defines LATENCY_FS_NOTIFY which will be used in the C code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 19:47:33 -04:00
Sai Krishna Potthuri	fa99e70138	pinctrl: zynqmp: some code cleanups Some minor code cleanups and updates which includes - Mention module name under help in Kconfig. - Remove extra lines and duplicate Pin range checks. - Replace 'return ret' with 'return 0' in success path. - Copyright year update. - use devm_pinctrl_register() instead pinctrl_register() in probe. Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com> Link: https://lore.kernel.org/r/1624273214-66849-1-git-send-email-lakshmi.sai.krishna.potthuri@xilinx.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2021-06-26 01:44:19 +02:00
Christophe Leroy	767e6e7130	powerpc/interrupt: Also use exit_must_hard_disable() on PPC32 Reduce #ifdefs a bit by making exit_must_hard_disable() return true on PPC32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/52531029563c1fc823b790058e799d0ca71b028c.1624631463.git.christophe.leroy@csgroup.eu	2021-06-26 09:43:34 +10:00
Alexandru Ardelean	2f0d67bf4c	clk: tegra: clk-tegra124-dfll-fcpu: don't use devm functions for regulator The purpose of the device-managed functions is to bind the life-time of an object to that of a parent device object. This is not the case for the 'vdd-cpu' regulator in this driver. A reference is obtained via devm_regulator_get() and immediately released with devm_regulator_put(). In this case, the usage of devm_ functions is slightly excessive, as the un-managed versions of these functions is a little cleaner (and slightly more economical in terms of allocation). This change converts the devm_regulator_{get,put}() to regulator_{get,put}() in the get_alignment_from_regulator() function of this driver. Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com> Link: https://lore.kernel.org/r/20210624084737.42336-1-aardelean@deviqon.com Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2021-06-25 16:23:07 -07:00
Christophe JAILLET	b9ec1c1f9c	clk: zynqmp: pll: Remove some dead code 'clk_hw_set_rate_range()' does not return any error code and 'ret' is known to be 0 at this point, so this message can never be displayed. Remove it. Fixes: `3fde0e16d0` ("drivers: clk: Add ZynqMP clock driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/71a9fed5f762a71248b8ac73c0a15af82f3ce1e2.1619867987.git.christophe.jaillet@wanadoo.fr Reviewed-by: Michael Tretter <m.tretter@pengutronix.de> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2021-06-25 16:08:18 -07:00
Michal Simek	6c9feabc2c	clk: zynqmp: fix compile testing without ZYNQMP_FIRMWARE When the firmware code is disabled, the incomplete error handling in the clk driver causes compile-time warnings: drivers/clk/zynqmp/pll.c: In function 'zynqmp_pll_recalc_rate': drivers/clk/zynqmp/pll.c:147:29: error: 'fbdiv' is used uninitialized [-Werror=uninitialized] 147 \| rate = parent_rate * fbdiv; \| ~~~~~~~~~~~~^~~~~~~ In function 'zynqmp_pll_get_mode', inlined from 'zynqmp_pll_recalc_rate' at drivers/clk/zynqmp/pll.c:148:6: drivers/clk/zynqmp/pll.c:61:27: error: 'ret_payload' is used uninitialized [-Werror=uninitialized] 61 \| return ret_payload[1]; \| ~~~~~~~~~~~^~~ drivers/clk/zynqmp/pll.c: In function 'zynqmp_pll_recalc_rate': drivers/clk/zynqmp/pll.c:53:13: note: 'ret_payload' declared here 53 \| u32 ret_payload[PAYLOAD_ARG_CNT]; \| ^~~~~~~~~~~ drivers/clk/zynqmp/clk-mux-zynqmp.c: In function 'zynqmp_clk_mux_get_parent': drivers/clk/zynqmp/clk-mux-zynqmp.c:57:16: error: 'val' is used uninitialized [-Werror=uninitialized] 57 \| return val; \| ^~~ As it was apparently intentional to support this for compile testing purposes, change the code to have just enough error handling for the compiler to not notice the remaining bugs. Fixes: `21f2375346` ("clk: zynqmp: Drop dependency on ARCH_ZYNQMP") Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Link: https://lore.kernel.org/r/f1c4e8c903fe2d5df5413421920a56890a46387a.1624356908.git.michal.simek@xilinx.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2021-06-25 16:04:54 -07:00
Linus Torvalds	e2f527b58e	SCSI fixes on 20210625 Two small fixes, both in upper layer drivers (scsi disk and cdrom). The sd one is fixing a commit changing revalidation that came from the block tree a while ago (5.10) and the sr one adds handling of a condition we didn't previously handle for manually removed media. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYNZTQSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSp6AP9Q8jdJ 29ditYcnr23nsLgtzK0koo7ip3TM6WjEf3SSIAEA45wPiEdC8G35wfQFzefy3UfO LGbq6eNGwPDakldysAs= =NKGv -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, both in upper layer drivers (scsi disk and cdrom). The sd one is fixing a commit changing revalidation that came from the block tree a while ago (5.10) and the sr one adds handling of a condition we didn't previously handle for manually removed media" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Call sd_revalidate_disk() for ioctl(BLKRRPART) scsi: sr: Return appropriate error code when disk is ejected	2021-06-25 15:59:14 -07:00
Bjorn Andersson	aef6a521e5	remoteproc: qcom: pas: Add SC8180X adsp, cdsp and mpss The Qualcomm SC8180X has the typical ADSP, CDSP and MPSS remote processors operated using the PAS interface, add support for these. Attempts to configuring mss.lvl is failing, so a new adsp_data is provided that skips this resource, for now. Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20210608174944.2045215-2-bjorn.andersson@linaro.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>	2021-06-25 17:43:35 -05:00
Bjorn Andersson	4865ed1360	dt-bindings: remoteproc: qcom: pas: Add SC8180X adsp, cdsp and mpss Add compatibles for the Audio DSP, Compute DSP and Modem subsystem found in the Qualcomm SC8180x to the Peripheral Authentication Service remoteproc binding. Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210608174944.2045215-1-bjorn.andersson@linaro.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>	2021-06-25 17:43:32 -05:00
Geert Uytterhoeven	feb29cc744	dt-bindings: clock: gpio-mux-clock: Convert to json-schema Convert the simple GPIO clock multiplexer Device Tree binding documentation to json-schema. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/14cb3b4da446f26a4780e0bd1b58788eb6085d05.1623414619.git.geert+renesas@glider.be Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2021-06-25 15:41:58 -07:00
Steven Rostedt	62de4f29e9	trace: Add __print_ns_to_secs() and __print_ns_without_secs() helpers To have nanosecond output displayed in a more human readable format, its nicer to convert it to a seconds format (XXX.YYYYYYYYY). The problem is that to do so, the numbers must be divided by NSEC_PER_SEC, and moded too. But as these numbers are 64 bit, this can not be done simply with '/' and '%' operators, but must use do_div() instead. Instead of performing the expensive do_div() in the hot path of the tracepoint, it is more efficient to perform it during the output phase. But passing in do_div() can confuse the parser, and do_div() doesn't work exactly like a normal C function. It modifies the number in place, and we don't want to modify the actual values in the ring buffer. Two helper functions are now created: __print_ns_to_secs() and __print_ns_without_secs() They both take a value of nanoseconds, and the former will return that number divided by NSEC_PER_SEC, and the latter will mod it with NSEC_PER_SEC giving a way to print a nice human readable format: __print_fmt("time=%llu.%09u", __print_ns_to_secs(REC->nsec_val), __print_ns_without_secs(REC->nsec_val)) Link: https://lkml.kernel.org/r/e503b903045496c4ccde52843e1e318b422f7a56.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 18:26:12 -04:00
Daniel Bristot de Oliveira	aa892f8c88	trace/hwlat: Remove printk from sampling loop hwlat has some time operation checks on the sample loop, and it is currently using pr_err (printk) to report them. The problem is that this can lead the system to an unresponsible state due to an overflow of printk messages. This problem can be mitigated by writing the error message to the trace buffer. Remove the printk messages from the sampling loop, switching the to messages in the trace buffer. No functional change. Link: https://lkml.kernel.org/r/9d77c34869748aa105e965c769d24642914eea3a.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 18:26:12 -04:00
Daniel Bristot de Oliveira	f27a1c9e1b	trace/hwlat: Use trace_min_max_param for width and window params Use the trace_min_max_param to reduce code duplication. No functional change. Link: https://lkml.kernel.org/r/b91accd5a7c6c14ea02d3379aae974ba22b47dd6.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 18:26:12 -04:00
Daniel Bristot de Oliveira	bc87cf0a08	trace: Add a generic function to read/write u64 values from tracefs The hwlat detector and (in preparation for) the osnoise/timerlat tracers have a set of u64 parameters that the user can read/write via tracefs. For instance, we have hwlat_detector's window and width. To reduce the code duplication, hwlat's window and width share the same read function. However, they do not share the write functions because they do different parameter checks. For instance, the width needs to be smaller than the window, while the window needs to be larger than the window. The same pattern repeats on osnoise/timerlat, and a large portion of the code was devoted to the write function. Despite having different checks, the write functions have the same structure: read a user-space buffer take the lock that protects the value check for minimum and maximum acceptable values save the value release the lock return success or error To reduce the code duplication also in the write functions, this patch provides a generic read and write implementation for u64 values that need to be within some minimum and/or maximum parameters, while (potentially) being protected by a lock. To use this interface, the structure trace_min_max_param needs to be filled: struct trace_min_max_param { struct mutex lock; u64 val; u64 min; u64 max; }; The desired value is stored on the variable pointed by val. If min points to a minimum acceptable value, it will be checked during the write operation. Likewise, if max points to a maximum allowable value, it will be checked during the write operation. Finally, if lock points to a mutex, it will be taken at the beginning of the operation and released at the end. The definition of a trace_min_max_param needs to passed as the (private) data for tracefs_create_file(), and the trace_min_max_fops (added by this patch) as the fops file_operations. Link: https://lkml.kernel.org/r/3e35760a7c8b5c55f16ae5ad5fc54a0e71cbe647.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 18:26:12 -04:00
Daniel Bristot de Oliveira	f46b16520a	trace/hwlat: Implement the per-cpu mode Implements the per-cpu mode in which a sampling thread is created for each cpu in the "cpus" (and tracing_mask). The per-cpu mode has the potention to speed up the hwlat detection by running on multiple CPUs at the same time, at the cost of higher cpu usage with irqs disabled. Use with care. [ Changed get_cpu_data() to static. Reported-by: kernel test robot <lkp@intel.com> ] Link: https://lkml.kernel.org/r/ec06d0ab340e8460d293772faba19ad8a5c371aa.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-06-25 18:23:22 -04:00
Dave Chinner	1effb72a81	xfs: don't wait on future iclogs when pushing the CIL The iclogbuf ring attached to the struct xlog is circular, hence the first and last iclogs in the ring can only be determined by comparing them against the log->l_iclog pointer. In xfs_cil_push_work(), we want to wait on previous iclogs that were issued so that we can flush them to stable storage with the commit record write, and it simply waits on the previous iclog in the ring. This, however, leads to CIL push hangs in generic/019 like so: task:kworker/u33:0 state:D stack:12680 pid: 7 ppid: 2 flags:0x00004000 Workqueue: xfs-cil/pmem1 xlog_cil_push_work Call Trace: __schedule+0x30b/0x9f0 schedule+0x68/0xe0 xlog_wait_on_iclog+0x121/0x190 ? wake_up_q+0xa0/0xa0 xlog_cil_push_work+0x994/0xa10 ? _raw_spin_lock+0x15/0x20 ? xfs_swap_extents+0x920/0x920 process_one_work+0x1ab/0x390 worker_thread+0x56/0x3d0 ? rescuer_thread+0x3c0/0x3c0 kthread+0x14d/0x170 ? __kthread_bind_mask+0x70/0x70 ret_from_fork+0x1f/0x30 With other threads blocking in either xlog_state_get_iclog_space() waiting for iclog space or xlog_grant_head_wait() waiting for log reservation space. The problem here is that the previous iclog on the ring might actually be a future iclog. That is, if log->l_iclog points at commit_iclog, commit_iclog is the first (oldest) iclog in the ring and there are no previous iclogs pending as they have all completed their IO and been activated again. IOWs, commit_iclog->ic_prev points to an iclog that will be written in the future, not one that has been written in the past. Hence, in this case, waiting on the ->ic_prev iclog is incorrect behaviour, and depending on the state of the future iclog, we can end up with a circular ABA wait cycle and we hang. The fix is made more complex by the fact that many iclogs states cannot be used to determine if the iclog is a past or future iclog. Hence we have to determine past iclogs by checking the LSN of the iclog rather than their state. A past ACTIVE iclog will have a LSN of zero, while a future ACTIVE iclog will have a LSN greater than the current iclog. We don't wait on either of these cases. Similarly, a future iclog that hasn't completed IO will have an LSN greater than the current iclog and so we don't wait on them. A past iclog that is still undergoing IO completion will have a LSN less than the current iclog and those are the only iclogs that we need to wait on. Hence we can use the iclog LSN to determine what iclogs we need to wait on here. Fixes: 5fd9256ce156 ("xfs: separate CIL commit record IO") Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-06-25 14:02:02 -07:00
Mikulas Patocka	95b88f4d71	dm writecache: pause writeback if cache full and origin being written directly Implementation reuses dm_io_tracker, that until now was only used by dm-cache, to track if any writes were issued directly to the origin (due to cache being full) within the last second. If so writeback is paused for a second. This change improves performance for when the cache is full and IO is issued directly to the origin device (rather than through the cache). Depends-on: `d53f1fafec` ("dm writecache: do direct write if the cache is full") Suggested-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 16:04:01 -04:00
Mike Snitzer	dc4fa29fe4	dm io tracker: factor out IO tracker Allow other code to use dm_io_tracker. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 15:28:59 -04:00
Hou Tao	b6e58b5466	dm btree remove: assign new_root only when removal succeeds remove_raw() in dm_btree_remove() may fail due to IO read error (e.g. read the content of origin block fails during shadowing), and the value of shadow_spine::root is uninitialized, but the uninitialized value is still assign to new_root in the end of dm_btree_remove(). For dm-thin, the value of pmd->details_root or pmd->root will become an uninitialized value, so if trying to read details_info tree again out-of-bound memory may occur as showed below: general protection fault, probably for non-canonical address 0x3fdcb14c8d7520 CPU: 4 PID: 515 Comm: dmsetup Not tainted 5.13.0-rc6 Hardware name: QEMU Standard PC RIP: 0010:metadata_ll_load_ie+0x14/0x30 Call Trace: sm_metadata_count_is_more_than_one+0xb9/0xe0 dm_tm_shadow_block+0x52/0x1c0 shadow_step+0x59/0xf0 remove_raw+0xb2/0x170 dm_btree_remove+0xf4/0x1c0 dm_pool_delete_thin_device+0xc3/0x140 pool_message+0x218/0x2b0 target_message+0x251/0x290 ctl_ioctl+0x1c4/0x4d0 dm_ctl_ioctl+0xe/0x20 __x64_sys_ioctl+0x7b/0xb0 do_syscall_64+0x40/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixing it by only assign new_root when removal succeeds Signed-off-by: Hou Tao <houtao1@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 15:25:24 -04:00
Damien Le Moal	28436ba34b	dm zone: fix dm_revalidate_zones() memory allocation Make sure that the zone write pointer offset array is allocated with a vmalloc in dm_zone_revalidate_cb() by passing GFP_KERNEL gfp flag to kvcalloc(). However, since we do not want to trigger IOs while revalidating zones, change dm_revalidate_zones() to have the zone scan done in GFP_NOIO context using memalloc_noio_save/restore calls. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `bb37d77239` ("dm: introduce zone append emulation") Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 15:25:23 -04:00
Colin Ian King	326dbde2e0	dm ps io affinity: remove redundant continue statement The continue statement at the end of a for-loop has no effect, remove it. Addresses-Coverity: ("Continue has no effect") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 15:25:22 -04:00
Mikulas Patocka	611c3e168b	dm writecache: add optional "metadata_only" parameter Add a "metadata_only" parameter that when present: only metadata is promoted to the cache. This option improves performance for heavier REQ_META workloads (e.g. device-mapper-test-suite's "git clone and checkout" benchmark improves from 341s to 312s). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 15:25:21 -04:00
Mike Snitzer	cd039afa0a	dm writecache: add "cleaner" and "max_age" to Documentation Backfill missing Documentation. Fixes: `93de44eb3f` ("dm writecache: implement the "cleaner" policy") Fixes: `3923d4854e` ("dm writecache: implement gradual cleanup") Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-06-25 15:25:19 -04:00
Steve French	0fa757b5d3	smb3: prevent races updating CurrentMid There was one place where we weren't locking CurrentMid, and although likely to be safe since even without the lock since it is during negotiate protocol, it is more consistent to lock it in this last remaining place, and avoids confusing Coverity warning. Addresses-Coverity: 1486665 ("Data race condition") Signed-off-by: Steve French <stfrench@microsoft.com>	2021-06-25 14:02:26 -05:00
David S. Miller	ff8744b5eb	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-06-25 This series contains updates to ice driver only. Jesse adds support for tracepoints to aide in debugging. Maciej adds support for PTP auxiliary pin support. Victor removes the VSI info from the old aggregator when moving the VSI to another aggregator. Tony removes an unnecessary VSI assignment. Christophe Jaillet fixes a memory leak for failed allocation in ice_pf_dcb_cfg(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:59:11 -07:00
Guvenc Gulce	17081633e2	net/smc: Ensure correct state of the socket in send path When smc_sendmsg() is called before the SMC socket initialization has completed, smc_tx_sendmsg() will access un-initialized fields of the SMC socket which results in a null-pointer dereference. Fix this by checking the socket state first in smc_tx_sendmsg(). Fixes: `e0e4b8fa53` ("net/smc: Add SMC statistics support") Reported-by: syzbot+5dda108b672b54141857@syzkaller.appspotmail.com Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:53:51 -07:00
David S. Miller	4e3db44a24	wireless-drivers-next patches for v5.14 Second, and most likely the last, set of patches for v5.14. mt76 and iwlwifi have most patches in this round, but rtw88 also has some new features. Nothing special really standing out. mt76 * mt7915 MSI support * disable ASPM on mt7915 * mt7915 tx status reporting * mt7921 decap offload rtw88 * beacon filter support * path diversity support * firmware crash information via devcoredump * quirks for disabling pci capabilities mt7601u * add USB ID for a XiaoDu WiFi Dongle ath11k * enable support for QCN9074 PCI devices brcmfmac * support parse country code map from DeviceTree iwlwifi * support for new hardware * support for BIOS control of 11ax enablement in Russia * support UNII4 band enablement from BIOS -----BEGIN PGP SIGNATURE----- iQFJBAABCgAzFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmDVyOMVHGt2YWxvQGNv ZGVhdXJvcmEub3JnAAoJEG4XJFUm622bXtsH/RB2SKxPY4Rgb7+afjpJIeGDBq8T 7p5/Xe/PaKMIuusFuJ1gYbTShm5FjdA16F7ZMYSgzpf2b9XBv6hDw7xsPW3qPPtJ MJynfnyvpNH9/NWcSrqOMLeFGv1ot5861BLESCXIaWNud+8OMUSw+qf8P6Z80RCD tasOayIJVQ/6xLpzbqYoLXfXGeIwlWTiQnQROA8Joq7vQziHSnHTqC7RyTF6SrFn AUnGJ10t62hsGt64tt1QeaCV65SCTPjxI2KKpUpvhD/yQtdcEq2f+1vCfnCk+NFz VWxCAI6U7s6gf0Lu4/+cJhHt/ZCtLZ2d06g93sfplSWHBImYxG55cO3tHzU= =pRFB -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-next-2021-06-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.14 Second, and most likely the last, set of patches for v5.14. mt76 and iwlwifi have most patches in this round, but rtw88 also has some new features. Nothing special really standing out. mt76 * mt7915 MSI support * disable ASPM on mt7915 * mt7915 tx status reporting * mt7921 decap offload rtw88 * beacon filter support * path diversity support * firmware crash information via devcoredump * quirks for disabling pci capabilities mt7601u * add USB ID for a XiaoDu WiFi Dongle ath11k * enable support for QCN9074 PCI devices brcmfmac * support parse country code map from DeviceTree iwlwifi * support for new hardware * support for BIOS control of 11ax enablement in Russia * support UNII4 band enablement from BIOS ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:50:02 -07:00
Marcin Wojtas	ac53c26433	net: mdiobus: withdraw fwnode_mdbiobus_register The newly implemented fwnode_mdbiobus_register turned out to be problematic - in case the fwnode_/of_/acpi_mdio are built as modules, a dependency cycle can be observed during the depmod phase of modules_install, eg.: depmod: ERROR: Cycle detected: fwnode_mdio -> of_mdio -> fwnode_mdio depmod: ERROR: Found 2 modules in dependency cycles! OR: depmod: ERROR: Cycle detected: acpi_mdio -> fwnode_mdio -> acpi_mdio depmod: ERROR: Found 2 modules in dependency cycles! A possible solution could be to rework fwnode_mdiobus_register, so that to merge the contents of acpi_mdiobus_register and of_mdiobus_register. However feasible, such change would be very intrusive and affect huge amount of the of_mdiobus_register users. Since there are currently 2 users of ACPI and MDIO (xgmac_mdio and mvmdio), withdraw the fwnode_mdbiobus_register and roll back to a simple 'if' condition in affected drivers. Fixes: `62a6ef6a99` ("net: mdiobus: Introduce fwnode_mdbiobus_register()") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:46:29 -07:00
Petr Oros	d6765985a4	Revert "be2net: disable bh with spin_lock in be_process_mcc" Patch was based on wrong presumption that be_poll can be called only from bh context. It reintroducing old regression (also reverted) and causing deadlock when we use netconsole with benet in bonding. Old revert: commit `072a9c4860` ("netpoll: revert `6bdb7fe310` and fix be_poll() instead") [ 331.269715] bond0: (slave enp0s7f0): Releasing backup interface [ 331.270121] CPU: 4 PID: 1479 Comm: ifenslave Not tainted 5.13.0-rc7+ #2 [ 331.270122] Call Trace: [ 331.270122] [c00000001789f200] [c0000000008c505c] dump_stack+0x100/0x174 (unreliable) [ 331.270124] [c00000001789f240] [c008000001238b9c] be_poll+0x64/0xe90 [be2net] [ 331.270125] [c00000001789f330] [c000000000d1e6e4] netpoll_poll_dev+0x174/0x3d0 [ 331.270127] [c00000001789f400] [c008000001bc167c] bond_poll_controller+0xb4/0x130 [bonding] [ 331.270128] [c00000001789f450] [c000000000d1e624] netpoll_poll_dev+0xb4/0x3d0 [ 331.270129] [c00000001789f520] [c000000000d1ed88] netpoll_send_skb+0x448/0x470 [ 331.270130] [c00000001789f5d0] [c0080000011f14f8] write_msg+0x180/0x1b0 [netconsole] [ 331.270131] [c00000001789f640] [c000000000230c0c] console_unlock+0x54c/0x790 [ 331.270132] [c00000001789f7b0] [c000000000233098] vprintk_emit+0x2d8/0x450 [ 331.270133] [c00000001789f810] [c000000000234758] vprintk+0xc8/0x270 [ 331.270134] [c00000001789f850] [c000000000233c28] printk+0x40/0x54 [ 331.270135] [c00000001789f870] [c000000000ccf908] __netdev_printk+0x150/0x198 [ 331.270136] [c00000001789f910] [c000000000ccfdb4] netdev_info+0x68/0x94 [ 331.270137] [c00000001789f950] [c008000001bcbd70] __bond_release_one+0x188/0x6b0 [bonding] [ 331.270138] [c00000001789faa0] [c008000001bcc6f4] bond_do_ioctl+0x42c/0x490 [bonding] [ 331.270139] [c00000001789fb60] [c000000000d0d17c] dev_ifsioc+0x17c/0x400 [ 331.270140] [c00000001789fbc0] [c000000000d0db70] dev_ioctl+0x390/0x890 [ 331.270141] [c00000001789fc10] [c000000000c7c76c] sock_do_ioctl+0xac/0x1b0 [ 331.270142] [c00000001789fc90] [c000000000c7ffac] sock_ioctl+0x31c/0x6e0 [ 331.270143] [c00000001789fd60] [c0000000005b9728] sys_ioctl+0xf8/0x150 [ 331.270145] [c00000001789fdb0] [c0000000000336c0] system_call_exception+0x160/0x2f0 [ 331.270146] [c00000001789fe10] [c00000000000d35c] system_call_common+0xec/0x278 [ 331.270147] --- interrupt: c00 at 0x7fffa6c6ec00 [ 331.270147] NIP: 00007fffa6c6ec00 LR: 0000000105c4185c CTR: 0000000000000000 [ 331.270148] REGS: c00000001789fe80 TRAP: 0c00 Not tainted (5.13.0-rc7+) [ 331.270148] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 28000428 XER: 00000000 [ 331.270155] IRQMASK: 0 [ 331.270156] GPR00: 0000000000000036 00007fffd494d5b0 00007fffa6d57100 0000000000000003 [ 331.270158] GPR04: 0000000000008991 00007fffd494d6d0 0000000000000008 00007fffd494f28c [ 331.270161] GPR08: 0000000000000003 0000000000000000 0000000000000000 0000000000000000 [ 331.270164] GPR12: 0000000000000000 00007fffa6dfa220 0000000000000000 0000000000000000 [ 331.270167] GPR16: 0000000105c44880 0000000000000000 0000000105c60088 0000000105c60318 [ 331.270170] GPR20: 0000000105c602c0 0000000105c44560 0000000000000000 0000000000000000 [ 331.270172] GPR24: 00007fffd494dc50 00007fffd494d6a8 0000000105c60008 00007fffd494d6d0 [ 331.270175] GPR28: 00007fffd494f27e 0000000105c6026c 00007fffd494f284 0000000000000000 [ 331.270178] NIP [00007fffa6c6ec00] 0x7fffa6c6ec00 [ 331.270178] LR [0000000105c4185c] 0x105c4185c [ 331.270179] --- interrupt: c00 This reverts commit `d0d006a43e`. Fixes: `d0d006a43e` ("be2net: disable bh with spin_lock in be_process_mcc") Signed-off-by: Petr Oros <poros@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:44:16 -07:00
Christophe JAILLET	b81c191c46	ice: Fix a memory leak in an error handling path in 'ice_pf_dcb_cfg()' If this 'kzalloc()' fails we must free some resources as in all the other error handling paths of this function. Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 11:30:50 -07:00
Tony Nguyen	70fa0a0780	ice: remove unnecessary VSI assignment ice_get_vf_vsi() is being called twice for the same VSI. Remove the unnecessary call/assignment. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-06-25 11:30:50 -07:00
Victor Raj	37c592062b	ice: remove the VSI info from previous agg Remove the VSI info from previous aggregator after moving the VSI to a new aggregator. Signed-off-by: Victor Raj <victor.raj@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 11:30:49 -07:00
Maciej Machnikowski	172db5f91d	ice: add support for auxiliary input/output pins The E810 device supports programmable pins for enabling both input and output events related to the PTP hardware clock. This includes both output signals with programmable period, as well as timestamping of events on input pins. Add support for enabling these using the CONFIG_PTP_1588_CLOCK interface. This allows programming the software defined pins to take advantage of the hardware clock features. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 11:30:49 -07:00
Bailey Forrest	1db1a862a0	gve: Fix swapped vars when fetching max queues Fixes: `893ce44df5` ("gve: Add basic driver framework for Compute Engine Virtual NIC") Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:22:04 -07:00
Dave Chinner	a1bb8505e9	xfs: Fix a CIL UAF by getting get rid of the iclog callback lock The iclog callback chain has it's own lock. That was added way back in 2008 by myself to alleviate severe lock contention on the icloglock in commit `114d23aae5` ("[XFS] Per iclog callback chain lock"). This was long before delayed logging took the icloglock out of the hot transaction commit path and removed all contention on it. Hence the separate ic_callback_lock doesn't serve any scalability purpose anymore, and hasn't for close on a decade. Further, we only attach callbacks to iclogs in one place where we are already taking the icloglock soon after attaching the callbacks. We also have to drop the icloglock to run callbacks and grab it immediately afterwards again. So given that the icloglock is no longer hot, making it cover callbacks again doesn't really change the locking patterns very much at all. We also need to extend the icloglock to cover callback addition to fix a zero-day UAF in the CIL push code. This occurs when shutdown races with xlog_cil_push_work() and the shutdown runs the callbacks before the push releases the iclog. This results in the CIL context structure attached to the iclog being freed by the callback before the CIL push has finished referencing it, leading to UAF bugs. Hence, to avoid this UAF, we need the callback attachment to be atomic with post processing of the commit iclog and references to the structures being attached to the iclog. This requires holding the icloglock as that's the only way to serialise iclog state against a shutdown in progress. The result is we need to be using the icloglock to protect the callback list addition and removal and serialise them with shutdown. That makes the ic_callback_lock redundant and so it can be removed. Fixes: `71e330b593` ("xfs: Introduce delayed logging core code") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-06-25 11:21:39 -07:00
Dave Chinner	b6903358c2	xfs: remove callback dequeue loop from xlog_state_do_iclog_callbacks If we are processing callbacks on an iclog, nothing can be concurrently adding callbacks to the loop. We only add callbacks to the iclog when they are in ACTIVE or WANT_SYNC state, and we explicitly do not add callbacks if the iclog is already in IOERROR state. The only way to have a dequeue racing with an enqueue is to be processing a shutdown without a direct reference to an iclog in ACTIVE or WANT_SYNC state. As the enqueue avoids this race condition, we only ever need a single dequeue operation in xlog_state_do_iclog_callbacks(). Hence we can remove the loop. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-06-25 11:21:34 -07:00
Dave Chinner	6be001021f	xfs: don't nest icloglock inside ic_callback_lock It's completely unnecessary because callbacks are added to iclogs without holding the icloglock, hence no amount of ordering between the icloglock and ic_callback_lock will order the removal of callbacks from the iclog. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-06-25 11:21:00 -07:00
David Thompson	f92e1869d7	Add Mellanox BlueField Gigabit Ethernet driver This patch adds build and driver logic for the "mlxbf_gige" Ethernet driver from Mellanox Technologies. The second generation BlueField SoC from Mellanox supports an out-of-band GigaBit Ethernet management port to the Arm subsystem. This driver supports TCP/IP network connectivity for that port, and provides back-end routines to handle basic ethtool requests. The driver interfaces to the Gigabit Ethernet block of BlueField SoC via MMIO accesses to registers, which contain control information or pointers describing transmit and receive resources. There is a single transmit queue, and the port supports transmit ring sizes of 4 to 256 entries. There is a single receive queue, and the port supports receive ring sizes of 32 to 32K entries. The transmit and receive rings are allocated from DMA coherent memory. There is a 16-bit producer and consumer index per ring to denote software ownership and hardware ownership, respectively. The main driver logic such as probe(), remove(), and netdev ops are in "mlxbf_gige_main.c". Logic in "mlxbf_gige_rx.c" and "mlxbf_gige_tx.c" handles the packet processing for receive and transmit respectively. The logic in "mlxbf_gige_ethtool.c" supports the handling of some basic ethtool requests: get driver info, get ring parameters, get registers, and get statistics. The logic in "mlxbf_gige_mdio.c" is the driver controlling the Mellanox BlueField hardware that interacts with a PHY device via MDIO/MDC pins. This driver does the following: - At driver probe time, it configures several BlueField MDIO parameters such as sample rate, full drive, voltage and MDC - It defines functions to read and write MDIO registers and registers the MDIO bus. - It defines the phy interrupt handler reporting a link up/down status change - This driver's probe is invoked from the main driver logic while the phy interrupt handler is registered in ndo_open. Driver limitations - Only supports 1Gbps speed - Only supports GMII protocol - Supports maximum packet size of 2KB - Does not support scatter-gather buffering Testing - Successful build of kernel for ARM64, ARM32, X86_64 - Tested ARM64 build on FastModels & Palladium - Tested ARM64 build on several Mellanox boards that are built with the BlueField-2 SoC. The testing includes coverage in the areas of networking (e.g. ping, iperf, ifconfig, route), file transfers (e.g. SCP), and various ethtool options relevant to this driver. Signed-off-by: David Thompson <davthompson@nvidia.com> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> Reviewed-by: Liming Sun <limings@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:20:23 -07:00
Allison Henderson	d3a3340b6a	xfs: Initialize error in xfs_attr_remove_iter A recent bug report generated a warning that a code path in xfs_attr_remove_iter could potentially return error uninitialized in the case of XFS_DAS_RM_SHRINK state. Fix this by initializing error. Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-06-25 11:19:58 -07:00
Linus Torvalds	7ce32ac6fb	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "24 patches, based on `4a09d388f2`. Subsystems affected by this patch series: mm (thp, vmalloc, hugetlb, memory-failure, and pagealloc), nilfs2, kthread, MAINTAINERS, and mailmap" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits) mailmap: add Marek's other e-mail address and identity without diacritics MAINTAINERS: fix Marek's identity again mm/page_alloc: do bulk array bounds check after checking populated elements mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array mm/hwpoison: do not lock page again when me_huge_page() successfully recovers mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned mm/memory-failure: use a mutex to avoid memory_failure() races mm, futex: fix shared futex pgoff on shmem huge page kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync() kthread_worker: split code for canceling the delayed work timer mm/vmalloc: unbreak kasan vmalloc support KVM: s390: prepare for hugepage vmalloc mm/vmalloc: add vmalloc_no_huge nilfs2: fix memory leak in nilfs_sysfs_delete_device_group mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk() mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes mm: page_vma_mapped_walk(): get vma_address_end() earlier mm: page_vma_mapped_walk(): use goto instead of while (1) mm: page_vma_mapped_walk(): add a level of indentation mm: page_vma_mapped_walk(): crossing page table boundary ...	2021-06-25 11:05:03 -07:00
Nicolas Dichtel	ff70202b2d	dev_forward_skb: do not scrub skb mark within the same name space The goal is to keep the mark during a bpf_redirect(), like it is done for legacy encapsulation / decapsulation, when there is no x-netns. This was initially done in commit `213dd74aee` ("skbuff: Do not scrub skb mark within the same name space"). When the call to skb_scrub_packet() was added in dev_forward_skb() (commit `8b27f27797` ("skb: allow skb_scrub_packet() to be used by tunnels")), the second argument (xnet) was set to true to force a call to skb_orphan(). At this time, the mark was always cleanned up by skb_scrub_packet(), whatever xnet value was. This call to skb_orphan() was removed later in commit `9c4c325252` ("skbuff: preserve sock reference when scrubbing the skb."). But this 'true' stayed here without any real reason. Let's correctly set xnet in ____dev_forward_skb(), this function has access to the previous interface and to the new interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-25 11:04:39 -07:00
Gleb Fotengauer-Malinovskiy	808e9df477	userfaultfd: uapi: fix UFFDIO_CONTINUE ioctl request definition This ioctl request reads from uffdio_continue structure written by userspace which justifies _IOC_WRITE flag. It also writes back to that structure which justifies _IOC_READ flag. See NOTEs in include/uapi/asm-generic/ioctl.h for more information. Fixes: `f619147104` ("userfaultfd: add UFFDIO_CONTINUE ioctl") Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-06-25 10:53:26 -07:00
Daniel Latypov	1d71307a6f	kunit: add unit test for filtering suites by names This adds unit tests for kunit_filter_subsuite() and kunit_filter_suites(). Note: what the executor means by "subsuite" is the array of suites corresponding to each test file. This patch lightly refactors executor.c to avoid the use of global variables to make it testable. It also includes a clever `kfree_at_end()` helper that makes this test easier to write than it otherwise would have been. Tested by running just the new tests using itself $ ./tools/testing/kunit/kunit.py run 'exec' Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Acked-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2021-06-25 11:44:37 -06:00
Linus Torvalds	55fcd4493d	Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Three more driver bugfixes and an annotation fix for the core" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: robotfuzz-osif: fix control-request directions i2c: dev: Add __user annotation i2c: cp2615: check for allocation failure in cp2615_i2c_recv() i2c: i801: Ensure that SMBHSTSTS_INUSE_STS is cleared when leaving i801_access	2021-06-25 10:44:03 -07:00
Marco Elver	40eb5cf4cc	kasan: test: make use of kunit_skip() Make use of the recently added kunit_skip() to skip tests, as it permits TAP parsers to recognize if a test was deliberately skipped. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2021-06-25 11:31:03 -06:00
David Gow	d99ea67514	kunit: test: Add example tests which are always skipped Add two new tests to the example test suite, both of which are always skipped. This is used as an example for how to write tests which are skipped, and to demonstrate the difference between kunit_skip() and kunit_mark_skipped(). Note that these tests are enabled by default, so a default run of KUnit will have two skipped tests. Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2021-06-25 11:31:03 -06:00

... 71 72 73 74 75 ...

1031296 commits