linux-xiaomi-chiron

Author	SHA1	Message	Date
David S. Miller	d00f26b623	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-05-14 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Merged tag 'perf-for-bpf-2020-05-06' from tip tree that includes CAP_PERFMON. 2) support for narrow loads in bpf_sock_addr progs and additional helpers in cg-skb progs, from Andrey. 3) bpf benchmark runner, from Andrii. 4) arm and riscv JIT optimizations, from Luke. 5) bpf iterator infrastructure, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 20:31:21 -07:00
Andre Przywara	59ffe4ed07	dt-bindings: ehci/ohci: Allow iommus property A OHCI/EHCI controller could be behind an IOMMU, in which case an iommus property assigns the stream ID for this device. Allow that property in the DT bindings to fix a complaint about the Arm Juno board's DTS file. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Rob Herring <robh@kernel.org>	2020-05-14 22:17:06 -05:00
Andre Przywara	17b53ce330	dt-bindings: mali-midgard: Allow dma-coherent Add the boolean dma-coherent property to the list of allowed properties, since some boards (Arm Juno) integrate the GPU this way. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Rob Herring <robh@kernel.org>	2020-05-14 22:16:29 -05:00
Andre Przywara	61efb56e30	dt-bindings: arm: gic: Allow combining arm,gic-400 compatible strings The arm,gic-400 compatible is probably the best matching string for the GIC in most modern SoCs, but was only introduced later into the kernel. For historic reasons and to keep compatibility, some SoC DTs were thus using a combination of this name and one of the older strings, which currently the binding denies. Add a stanza to the DT binding to allow "arm,gic-400", followed by either "arm,cortex-a15-gic" or "arm,cortex-a7-gic". This fixes binding compliance for quite some SoC .dtsi files in the kernel tree. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Rob Herring <robh@kernel.org>	2020-05-14 22:16:08 -05:00
Yoshihiro Kaneko	0be4ae7488	dt-bindings: irqchip: renesas-intc-irqpin: Convert to json-schema Convert the Renesas Interrupt Controller (INTC) for external pins Device Tree binding documentation to json-schema. Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Co-developed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> [robh: drop allOf] Signed-off-by: Rob Herring <robh@kernel.org>	2020-05-14 21:48:36 -05:00
Dave Airlie	27db6f7b0a	Merge tag 'drm-intel-fixes-2020-05-13-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Handle idling during i915_gem_evict_something busy loops (Chris) - Mark current submissions with a weak-dependency (Chris) - Propagate error from completed fences (Chris) - Fixes on execlist to avoid GPU hang situation (Chris) - Fixes couple deadlocks (Chris) - Timeslice preemption fixes (Chris) - Fix Display Port interrupt handling on Tiger Lake (Imre) - Reduce debug noise around Frame Buffer Compression (Peter) - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan) - Avoid dereferencing a dead context (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514040235.GA2164266@intel.com	2020-05-15 12:29:01 +10:00
Dave Airlie	1493bddcca	Merge tag 'drm-misc-next-2020-05-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.8: UAPI Changes: Cross-subsystem Changes: * dma-buf: use atomic64_fetch_add() for context id * Documentation: document bindings for ASUS ZOOT TM5P5, BOE NV133FHM-N62, hpd-gpios Core Changes: Driver Changes: * drm/ast: fix suspend; cleanups * drm/i2c: cleanups * drm/panel: add MODULE_LICENSE to panel-visinox-rm69299; add support for ASUS TM5P5i, BOE NV133FHM-N62i; fix size and bpp of BOE NV133FHM-N61 add hpd-gpio to panel-simple * drm/mcde: fix return value check in mcde_dsi_bind() * drm/mgag200: use managed drmm_mode_config_init(); cleanups * fbdev/pxa168fb: cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200514070819.GA6930@linux-uq9g	2020-05-15 12:23:25 +10:00
Michael Ellerman	93900337b9	drivers/macintosh: Fix memleak in windfarm_pm112 driver create_cpu_loop() calls smu_sat_get_sdb_partition() which does kmalloc() and returns the allocated buffer. In fact it's called twice, and neither buffer is freed. This results in a memory leak as reported by Erhard: unreferenced object 0xc00000047081f840 (size 32): comm "kwindfarm", pid 203, jiffies 4294880630 (age 5552.877s) hex dump (first 32 bytes): c8 06 02 7f ff 02 ff 01 fb bf 00 41 00 20 00 00 ...........A. .. 00 07 89 37 00 a0 00 00 00 00 00 00 00 00 00 00 ...7............ backtrace: [<0000000083f0a65c>] .smu_sat_get_sdb_partition+0xc4/0x2d0 [windfarm_smu_sat] [<000000003010fcb7>] .pm112_wf_notify+0x104c/0x13bc [windfarm_pm112] [<00000000b958b2dd>] .notifier_call_chain+0xa8/0x180 [<0000000070490868>] .blocking_notifier_call_chain+0x64/0x90 [<00000000131d8149>] .wf_thread_func+0x114/0x1a0 [<000000000d54838d>] .kthread+0x13c/0x190 [<00000000669b72bc>] .ret_from_kernel_thread+0x58/0x64 unreferenced object 0xc0000004737089f0 (size 16): comm "kwindfarm", pid 203, jiffies 4294880879 (age 5552.050s) hex dump (first 16 bytes): c4 04 01 7f 22 11 e0 e6 ff 55 7b 12 ec 11 00 00 ...."....U{..... backtrace: [<0000000083f0a65c>] .smu_sat_get_sdb_partition+0xc4/0x2d0 [windfarm_smu_sat] [<00000000b94ef7e1>] .pm112_wf_notify+0x1294/0x13bc [windfarm_pm112] [<00000000b958b2dd>] .notifier_call_chain+0xa8/0x180 [<0000000070490868>] .blocking_notifier_call_chain+0x64/0x90 [<00000000131d8149>] .wf_thread_func+0x114/0x1a0 [<000000000d54838d>] .kthread+0x13c/0x190 [<00000000669b72bc>] .ret_from_kernel_thread+0x58/0x64 Fix it by rearranging the logic so we deal with each buffer separately, which then makes it easy to free the buffer once we're done with it. Fixes: `ac171c4666` ("[PATCH] powerpc: Thermal control for dual core G5s") Cc: stable@vger.kernel.org # v2.6.16+ Reported-by: Erhard F. <erhard_f@mailbox.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Erhard F. <erhard_f@mailbox.org> Link: https://lore.kernel.org/r/20200423060038.3308530-1-mpe@ellerman.id.au	2020-05-15 11:58:56 +10:00
Michael Ellerman	7481cad474	selftests/powerpc: Add a test of counting larx/stcx This is based on the count_instructions test. However this one also counts the number of failed stcx's, and in conjunction with knowing the size of the stcx loop, can calculate the total number of instructions executed even in the face of non-deterministic stcx failures. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200426114410.3917383-1-mpe@ellerman.id.au	2020-05-15 11:58:55 +10:00
Michael Ellerman	24ac99e97f	powerpc: Drop unneeded cast in task_pt_regs() There's no need to cast in task_pt_regs() as tsk->thread.regs should already be a struct pt_regs. If someone's using task_pt_regs() on something that's not a task but happens to have a thread.regs then we'll deal with them later. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200428123152.73566-1-mpe@ellerman.id.au	2020-05-15 11:58:55 +10:00
Michael Ellerman	7ffa8b7dc1	powerpc/64: Don't initialise init_task->thread.regs Aneesh increased the size of struct pt_regs by 16 bytes and started seeing this WARN_ON: smp: Bringing up secondary CPUs ... ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/process.c:455 giveup_all+0xb4/0x110 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+ #318 NIP: c00000000001a2b4 LR: c00000000001a29c CTR: c0000000031d0000 REGS: c0000000026d3980 TRAP: 0700 Not tainted (5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+) MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48048224 XER: 00000000 CFAR: c000000000019cc8 IRQMASK: 1 GPR00: c00000000001a264 c0000000026d3c20 c0000000026d7200 800000000280b033 GPR04: 0000000000000001 0000000000000000 0000000000000077 30206d7372203164 GPR08: 0000000000002000 0000000002002000 800000000280b033 3230303030303030 GPR12: 0000000000008800 c0000000031d0000 0000000000800050 0000000002000066 GPR16: 000000000309a1a0 000000000309a4b0 000000000309a2d8 000000000309a890 GPR20: 00000000030d0098 c00000000264da40 00000000fd620000 c0000000ff798080 GPR24: c00000000264edf0 c0000001007469f0 00000000fd620000 c0000000020e5e90 GPR28: c00000000264edf0 c00000000264d200 000000001db60000 c00000000264d200 NIP [c00000000001a2b4] giveup_all+0xb4/0x110 LR [c00000000001a29c] giveup_all+0x9c/0x110 Call Trace: [c0000000026d3c20] [c00000000001a264] giveup_all+0x64/0x110 (unreliable) [c0000000026d3c90] [c00000000001ae34] __switch_to+0x104/0x480 [c0000000026d3cf0] [c000000000e0b8a0] __schedule+0x320/0x970 [c0000000026d3dd0] [c000000000e0c518] schedule_idle+0x38/0x70 [c0000000026d3df0] [c00000000019c7c8] do_idle+0x248/0x3f0 [c0000000026d3e70] [c00000000019cbb8] cpu_startup_entry+0x38/0x40 [c0000000026d3ea0] [c000000000011bb0] rest_init+0xe0/0xf8 [c0000000026d3ed0] [c000000002004820] start_kernel+0x990/0x9e0 [c0000000026d3f90] [c00000000000c49c] start_here_common+0x1c/0x400 Which was unexpected. 
The warning is checking the thread.regs->msr value of the task we are switching from: usermsr = tsk->thread.regs->msr; ... WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC))); ie. if MSR_VSX is set then both of MSR_FP and MSR_VEC are also set. Dumping tsk->thread.regs->msr we see that it's: 0x1db60000 Which is not a normal looking MSR, in fact the only valid bit is MSR_VSX, all the other bits are reserved in the current definition of the MSR. We can see from the oops that it was swapper/0 that we were switching from when we hit the warning, ie. init_task. So its thread.regs points to the base (high addresses) in init_stack. Dumping the content of init_task->thread.regs, with the members of pt_regs annotated (the 16 bytes larger version), we see: 0000000000000000 c000000002780080 gpr[0] gpr[1] 0000000000000000 c000000002666008 gpr[2] gpr[3] c0000000026d3ed0 0000000000000078 gpr[4] gpr[5] c000000000011b68 c000000002780080 gpr[6] gpr[7] 0000000000000000 0000000000000000 gpr[8] gpr[9] c0000000026d3f90 0000800000002200 gpr[10] gpr[11] c000000002004820 c0000000026d7200 gpr[12] gpr[13] 000000001db60000 c0000000010aabe8 gpr[14] gpr[15] c0000000010aabe8 c0000000010aabe8 gpr[16] gpr[17] c00000000294d598 0000000000000000 gpr[18] gpr[19] 0000000000000000 0000000000001ff8 gpr[20] gpr[21] 0000000000000000 c00000000206d608 gpr[22] gpr[23] c00000000278e0cc 0000000000000000 gpr[24] gpr[25] 000000002fff0000 c000000000000000 gpr[26] gpr[27] 0000000002000000 0000000000000028 gpr[28] gpr[29] 000000001db60000 0000000004750000 gpr[30] gpr[31] 0000000002000000 000000001db60000 nip msr 0000000000000000 0000000000000000 orig_r3 ctr c00000000000c49c 0000000000000000 link xer 0000000000000000 0000000000000000 ccr softe 0000000000000000 0000000000000000 trap dar 0000000000000000 0000000000000000 dsisr result 0000000000000000 0000000000000000 ppr kuap 0000000000000000 0000000000000000 pad[2] pad[3] This looks suspiciously like stack frames, not a pt_regs. 
If we look closely we can see return addresses from the stack trace above, c000000002004820 (start_kernel) and c00000000000c49c (start_here_common). init_task->thread.regs is setup at build time in processor.h: #define INIT_THREAD { \ .ksp = INIT_SP, \ .regs = (struct pt_regs *)INIT_SP - 1, /* XXX bogus, I think */ \ The early boot code where we setup the initial stack is: LOAD_REG_ADDR(r3,init_thread_union) /* set up a stack pointer */ LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) add r1,r3,r1 li r0,0 stdu r0,-STACK_FRAME_OVERHEAD(r1) Which creates a stack frame of size 112 bytes (STACK_FRAME_OVERHEAD). Which is far too small to contain a pt_regs. So the result is init_task->thread.regs is pointing at some stack frames on the init stack, not at a pt_regs. We have gotten away with this for so long because with pt_regs at its current size the MSR happens to point into the first frame, at a location that is not written to by the early asm. With the 16 byte expansion the MSR falls into the second frame, which is used by the compiler, and collides with a saved register that tends to be non-zero. As far as I can see this has been wrong since the original merge of 64-bit ppc support, back in 2002. Conceptually swapper should have no regs, it never entered from userspace, and in fact that's what we do on 32-bit. It's also presumably what the "bogus" comment is referring to. So I think the right fix is to just not-initialise regs at all. I'm slightly worried this will break some code that isn't prepared for a NULL regs, but we'll have to see. Remove the comment in head_64.S which refers to us setting up the regs (even though we never did), and is otherwise not really accurate any more. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200428123130.73078-1-mpe@ellerman.id.au	2020-05-15 11:58:54 +10:00
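The WARN_ON condition quoted in the commit above can be checked by hand against the two MSR values it mentions. The following is a minimal userspace sketch, not kernel code. The bit positions (MSR_FP at bit 13, MSR_VSX at bit 23, MSR_VEC at bit 25) are assumptions taken from my reading of the powerpc register definitions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSR bit positions; verify against asm/reg.h before relying on them. */
#define MSR_FP  (UINT64_C(1) << 13)
#define MSR_VSX (UINT64_C(1) << 23)
#define MSR_VEC (UINT64_C(1) << 25)

/* Mirrors the quoted WARN_ON condition: true means VSX is set
 * without both FP and VEC, i.e. the warning would fire. */
bool msr_vsx_inconsistent(uint64_t usermsr)
{
    return (usermsr & MSR_VSX) &&
           !((usermsr & MSR_FP) && (usermsr & MSR_VEC));
}

/* Checks the two values from the oops: the bogus thread.regs->msr
 * and the MSR printed in the register dump. */
void msr_self_check(void)
{
    assert(msr_vsx_inconsistent(UINT64_C(0x1db60000)));
    assert(!msr_vsx_inconsistent(UINT64_C(0x800000000282b033)));
}
```

Under these assumed bit positions, 0x1db60000 has MSR_VSX set but neither MSR_FP nor MSR_VEC, which matches the commit's observation that the value is not a plausible MSR.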
Gustavo A. R. Silva	02bddf21c3	powerpc/mm: Replace zero-length array with flexible-array The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200507185755.GA15014@embeddedor	2020-05-15 11:58:54 +10:00
Gustavo A. R. Silva	0f6be41c60	powerpc: Replace zero-length array with flexible-array The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200507185749.GA14994@embeddedor	2020-05-15 11:58:54 +10:00
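The flexible-array pattern described in the two commits above can be sketched in plain C as follows. This is a userspace illustration only; `foo_alloc` and `foo_sum` are hypothetical helpers, and `struct boo` is given a single field just so the example compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct boo {
    int value;
};

struct foo {
    int stuff;            /* here: number of trailing elements */
    struct boo array[];   /* flexible array member, must be the last member */
};

/* Allocate a struct foo with room for n trailing elements.
 * sizeof(*f) does not include the flexible array, so the element
 * storage is added explicitly. */
struct foo *foo_alloc(size_t n)
{
    struct foo *f = calloc(1, sizeof(*f) + n * sizeof(f->array[0]));

    if (f)
        f->stuff = (int)n;
    return f;
}

/* Sum the trailing elements. */
int foo_sum(const struct foo *f)
{
    int sum = 0;

    assert(f != NULL);
    for (int i = 0; i < f->stuff; i++)
        sum += f->array[i].value;
    return sum;
}
```

The size computation here does not guard against overflow of `n * sizeof(...)`; kernel code would normally use the `struct_size()` helper for that.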
Nicholas Piggin	4e0e45b07d	powerpc: Use trap metadata to prevent double restart rather than zeroing trap It's not very nice to zero trap for this, because then system calls no longer have trap_is_syscall(regs) invariant, and we can't distinguish between sc and scv system calls (in a later patch). Take one last unused bit from the low bits of the pt_regs.trap word for this instead. There is not a really good reason why it should be in trap as opposed to another field, but trap has some concept of flags and it exists. Ideally I think we would move trap to 2-byte field and have 2 more bytes available independently. Add a selftests case for this, which can be seen to fail if trap_norestart() is changed to return false. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Make them static inlines] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200507121332.2233629-4-mpe@ellerman.id.au	2020-05-15 11:58:54 +10:00
Nicholas Piggin	912237ea16	powerpc: trap_is_syscall() helper to hide syscall trap number A new system call interrupt will be added with a new trap number. Hide the explicit 0xc00 test behind an accessor to reduce churn in callers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Make it a static inline] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200507121332.2233629-3-mpe@ellerman.id.au	2020-05-15 11:58:54 +10:00
Nicholas Piggin	db30144b5c	powerpc: Use set_trap() and avoid open-coding trap masking The pt_regs.trap field keeps 4 low bits for some metadata about the trap or how it was handled, which is masked off in order to test the architectural trap number. Add a set_trap() accessor to set this, equivalent to TRAP() for returning it. This is actually not quite the equivalent of TRAP() because it always clears the low bits, which may be harmless if it can only be updated via ptrace syscall, but it seems dangerous. In fact setting TRAP from ptrace doesn't seem like a great idea so maybe it's better deleted. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Make it a static inline rather than a shouty macro] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200507121332.2233629-2-mpe@ellerman.id.au	2020-05-15 11:58:54 +10:00
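The idea running through the three trap commits above (keep the architectural trap number intact and record "do not restart" in a low metadata bit) can be illustrated with a small sketch. This is not the kernel's code: `struct regs_sketch` is a hypothetical stand-in for pt_regs, the 4-bit mask follows the commit text, 0xc00 is the syscall trap number it mentions, and the specific flag bit chosen for `TRAP_FLAG_NORESTART` is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct pt_regs. */
struct regs_sketch {
    unsigned long trap;
};

#define TRAP_FLAGS_MASK     0xFUL   /* low 4 bits hold metadata */
#define TRAP_FLAG_NORESTART 0x8UL   /* assumed bit, illustration only */
#define TRAP_SYSCALL        0xc00UL /* syscall trap number from the commit text */

/* Architectural trap number with the metadata bits masked off. */
unsigned long trap_number(const struct regs_sketch *regs)
{
    assert(regs != NULL);
    return regs->trap & ~TRAP_FLAGS_MASK;
}

bool trap_is_syscall(const struct regs_sketch *regs)
{
    return trap_number(regs) == TRAP_SYSCALL;
}

/* Mark the syscall as not restartable without touching the trap number. */
void set_trap_norestart(struct regs_sketch *regs)
{
    assert(regs != NULL);
    regs->trap |= TRAP_FLAG_NORESTART;
}

bool trap_norestart(const struct regs_sketch *regs)
{
    assert(regs != NULL);
    return (regs->trap & TRAP_FLAG_NORESTART) != 0;
}
```

With this layout, `trap_is_syscall()` stays true after `set_trap_norestart()`, which is the invariant the older "zero the trap field" approach broke.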
Nicholas Piggin	feb9df3462	powerpc/64s: Always has full regs, so remove remnant checks Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200507121332.2233629-1-mpe@ellerman.id.au	2020-05-15 11:58:53 +10:00
Alexei Starovoitov	b92d44b5c2	Merge branch 'expand-cg_skb-helpers' Andrey Ignatov says: ==================== v2->v3: - better documentation for bpf_sk_cgroup_id in uapi (Yonghong Song) - save/restore errno in network helpers (Yonghong Song) - cleanup leftover after switching selftest to skeleton (Yonghong Song) - switch from map to skel->bss in selftest (Yonghong Song) v1->v2: - switch selftests to skeleton. This patch set allows a bunch of existing sk lookup and skb cgroup id helpers, and adds two new bpf_sk_{,ancestor_}cgroup_id helpers to be used in cgroup skb programs. It fills the gap to cover a use-case to apply intra-host cgroup-bpf network policy based on a source cgroup a packet comes from. For example, there can be multiple containers A, B, C running on a host. Every such container runs in its own cgroup that can have multiple sub-cgroups. But all these containers can share some IP addresses. At the same time container A wants to have a policy for a server S running in it so that only clients from this same container can connect to S, but not from other containers (such as B, C). Source IP address can't be used to decide whether to allow or deny a packet, but it looks reasonable to filter by cgroup id. The patch set allows to implement the following policy: * when an ingress packet comes to container's cgroup, lookup peer (client) socket this packet comes from; * having peer socket, get its cgroup id; * compare peer cgroup id with self cgroup id and allow packet only if they match, i.e. it comes from same cgroup; * the "sub-cgroup" part of the story can be addressed by getting not direct cgroup id of the peer socket, but ancestor cgroup id on specified level, similar to existing "ancestor" flavors of cgroup id helpers. A newly introduced selftest implements such a policy in its basic form to provide a better idea on the use-case. Patch 1 allows existing sk lookup helpers in cgroup skb. Patch 2 allows skb_ancestor_cgroup_id in cgroup skb. Patch 3 introduces two new helpers to get cgroup id of socket. Patch 4 extends network helpers to use them in the next patch. Patch 5 adds selftest / example of use-case. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-05-14 18:42:02 -07:00
Andrey Ignatov	68e916bc8d	selftests/bpf: Test for sk helpers in cgroup skb Test bpf_sk_lookup_tcp, bpf_sk_release, bpf_sk_cgroup_id and bpf_sk_ancestor_cgroup_id helpers from cgroup skb program. The test creates a testing cgroup, starts a TCPv6 server inside the cgroup and creates two client sockets: one inside testing cgroup and one outside. Then it attaches cgroup skb program to the cgroup that checks all TCP segments coming to the server and allows only those coming from the cgroup of the server. If a segment comes from a peer outside of the cgroup, it'll be dropped. Finally the test checks that client from inside testing cgroup can successfully connect to the server, but client outside the cgroup fails to connect by timeout. The main goal of the test is to check newly introduced bpf_sk_{,ancestor_}cgroup_id helpers. It also checks a couple of socket lookup helpers (tcp & release), but lookup helpers were introduced much earlier and covered by other tests. Here it's mostly checked that they can be called from cgroup skb. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/171f4c5d75e8ff4fe1c4e8c1c12288b5240a4549.1589486450.git.rdna@fb.com	2020-05-14 18:41:08 -07:00
Andrey Ignatov	383724e17a	selftests/bpf: Add connect_fd_to_fd, connect_wait net helpers Add two new network helpers. connect_fd_to_fd connects an already created client socket fd to address of server fd. Sometimes it's useful to separate client socket creation and connecting this socket to a server, e.g. if client socket has to be created in a cgroup different from that of server cgroup. Additionally connect_to_fd is now implemented using connect_fd_to_fd, both helpers don't treat EINPROGRESS as an error and let caller decide how to proceed with it. connect_wait is a helper to work with non-blocking client sockets so that if connect_to_fd or connect_fd_to_fd returned -1 with errno == EINPROGRESS, caller can wait for connect to finish or for connection timeout. The helper returns -1 on error, 0 on timeout (1sec, hard-coded), and positive number on success. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/1403fab72300f379ca97ead4820ae43eac4414ef.1589486450.git.rdna@fb.com	2020-05-14 18:41:08 -07:00
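The return convention described for connect_wait above (-1 on error, 0 on timeout, positive on success) could be realised roughly as below. This is a sketch under assumptions, not the selftest's implementation: it uses poll(), the name `connect_wait_sketch` is hypothetical, and only the 1 second timeout is taken from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

#define CONNECT_TIMEOUT_MS 1000 /* 1 sec, hard-coded as in the commit text */

/* Wait for a non-blocking connect() on fd to finish.
 * Returns -1 on error (errno set), 0 on timeout, 1 once connected. */
int connect_wait_sketch(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int err = 0;
    socklen_t len = sizeof(err);
    int n;

    assert(fd >= 0); /* poll() silently ignores negative descriptors */

    n = poll(&pfd, 1, CONNECT_TIMEOUT_MS);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;

    /* The socket became writable (or failed): fetch the connect result. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err) {
        errno = err;
        return -1;
    }
    return 1;
}
```

A caller would typically invoke this only after connect() returned -1 with errno == EINPROGRESS, and treat a return of 0 as a connection timeout.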
Andrey Ignatov	f307fa2cb4	bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers With having ability to lookup sockets in cgroup skb programs it becomes useful to access cgroup id of retrieved sockets so that policies can be implemented based on origin cgroup of such socket. For example, a container running in a cgroup can have cgroup skb ingress program that can lookup peer socket that is sending packets to a process inside the container and decide whether those packets should be allowed or denied based on cgroup id of the peer. More specifically such ingress program can implement intra-host policy "allow incoming packets only from this same container and not from any other container on same host" w/o relying on source IP addresses since quite often it can be the case that containers share same IP address on the host. Introduce two new helpers for this use-case: bpf_sk_cgroup_id() and bpf_sk_ancestor_cgroup_id(). These helpers are similar to existing bpf_skb_{,ancestor_}cgroup_id helpers with the only difference that sk is used to get cgroup id instead of skb, and share code with them. See documentation in UAPI for more details. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/f5884981249ce911f63e9b57ecd5d7d19154ff39.1589486450.git.rdna@fb.com	2020-05-14 18:41:07 -07:00
Andrey Ignatov	06d3e4c9f1	bpf: Allow skb_ancestor_cgroup_id helper in cgroup skb cgroup skb programs already can use bpf_skb_cgroup_id. Allow bpf_skb_ancestor_cgroup_id as well so that container policies can be implemented for a container that can have sub-cgroups dynamically created, but policies should still be implemented based on cgroup id of container itself not on an id of a sub-cgroup. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/8874194d6041eba190356453ea9f6071edf5f658.1589486450.git.rdna@fb.com	2020-05-14 18:41:07 -07:00
Andrey Ignatov	d56c2f95ad	bpf: Allow sk lookup helpers in cgroup skb Currently sk lookup helpers are allowed in tc, xdp, sk skb, and cgroup sock_addr programs. But they would be useful in cgroup skb as well so that for example cgroup skb ingress program can lookup a peer socket a packet comes from on same host and make a decision whether to allow or deny this packet based on the properties of that socket, e.g. cgroup that peer socket belongs to. Allow the following sk lookup helpers in cgroup skb: * bpf_sk_lookup_tcp; * bpf_sk_lookup_udp; * bpf_sk_release; * bpf_skc_lookup_tcp. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/f8c7ee280f1582b586629436d777b6db00597d63.1589486450.git.rdna@fb.com	2020-05-14 18:41:07 -07:00
Colin Ian King	5b0004d92b	selftest/bpf: Fix spelling mistake "SIGALARM" -> "SIGALRM" There is a spelling mistake in an error message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200514121529.259668-1-colin.king@canonical.com	2020-05-14 18:39:06 -07:00
Andrii Nakryiko	c70f34a8ac	bpf: Fix bpf_iter's task iterator logic task_seq_get_next might stop prematurely if get_pid_task() fails to get task_struct. Failure to do so doesn't mean that there are no more tasks with higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c) does a retry in such case. After this fix, instead of stopping prematurely after about 300 tasks on my server, bpf_iter program now returns >4000, which sounds much closer to reality. Fixes: `eaaacd2391` ("bpf: Add task and task/file iterator targets") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200514055137.1564581-1-andriin@fb.com	2020-05-14 18:37:32 -07:00
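The difference between stopping at the first failed lookup and retrying with the next pid, as described in the task iterator fix above, can be shown with a toy model. Everything here is hypothetical: the boolean array stands in for whether a task lookup succeeds for a given pid slot, and neither function is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Old behaviour: a failed lookup ends the iteration, so later
 * tasks are never visited. */
size_t count_stop_on_failure(const bool *lookup_ok, size_t npids)
{
    size_t seen = 0;

    assert(lookup_ok != NULL || npids == 0);
    for (size_t pid = 0; pid < npids; pid++) {
        if (!lookup_ok[pid])
            break;
        seen++;
    }
    return seen;
}

/* Fixed behaviour: a failed lookup only skips that pid and the
 * iteration retries with the next one. */
size_t count_retry_on_failure(const bool *lookup_ok, size_t npids)
{
    size_t seen = 0;

    assert(lookup_ok != NULL || npids == 0);
    for (size_t pid = 0; pid < npids; pid++) {
        if (!lookup_ok[pid])
            continue;
        seen++;
    }
    return seen;
}
```

With one failing slot in the middle of the table, the first function undercounts while the second visits every task that can still be looked up.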
Andrey Ignatov	0645f7eb6f	selftests/bpf: Test narrow loads for bpf_sock_addr.user_port Test 1,2,4-byte loads from bpf_sock_addr.user_port in sock_addr programs. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/e5c734a58cca4041ab30cb5471e644246f8cdb5a.1589420814.git.rdna@fb.com	2020-05-14 18:30:57 -07:00
Andrey Ignatov	7aebfa1b38	bpf: Support narrow loads from bpf_sock_addr.user_port bpf_sock_addr.user_port supports only 4-byte load and it leads to ugly code in BPF programs, like: volatile __u32 user_port = ctx->user_port; __u16 port = bpf_ntohs(user_port); Since otherwise clang may optimize the load to be 2-byte and it's rejected by verifier. Add support for 1- and 2-byte loads same way as it's supported for other fields in bpf_sock_addr like user_ip4, msg_src_ip4, etc. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/c1e983f4c17573032601d0b2b1f9d1274f24bc16.1589420814.git.rdna@fb.com	2020-05-14 18:30:57 -07:00
Zhou Wang	528443e32a	arm64: defconfig: Enable UACCE/PCI PASID/SEC2/HPRE configs Enable configs for UACCE, PCI PASID, HiSilicon SEC2 and HPRE drivers. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Wei Xu <xuwei5@hisilicon.com>	2020-05-15 09:29:47 +08:00
Lorenzo Bianconi	6a09815428	samples/bpf: xdp_redirect_cpu: Set MAX_CPUS according to NR_CPUS xdp_redirect_cpu is currently failing in bpf_prog_load_xattr() allocating cpu_map map if CONFIG_NR_CPUS is less than 64 since cpu_map_alloc() requires max_entries to be less than NR_CPUS. Set cpu_map max_entries according to NR_CPUS in xdp_redirect_cpu_kern.c and get currently running cpus in xdp_redirect_cpu_user.c Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/374472755001c260158c4e4b22f193bdd3c56fb7.1589300442.git.lorenzo@kernel.org	2020-05-14 18:27:00 -07:00
David S. Miller	207b584d0a	MAINTAINERS: Mark networking drivers as Maintained. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 18:04:41 -07:00
Heiner Kallweit	9b65d2ffe8	r8169: don't include linux/moduleparam.h `93882c6f21` ("r8169: switch from netif_xxx message functions to netdev_xxx") removed the last module parameter from the driver, therefore there's no need any longer to include linux/moduleparam.h. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 18:03:01 -07:00
Heiner Kallweit	aa443b3f8f	r8169: remove not needed checks in rtl8169_set_eee After `9de5d235b6` ("net: phy: fix aneg restart in phy_ethtool_set_eee") we don't need the check for aneg being enabled any longer, and as discussed with Russell configuring the EEE advertisement should be supported even if we're in a half-duplex mode currently. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 18:03:01 -07:00
Stanley Chu	f48b285ae6	scsi: ufs-mediatek: Customize WriteBooster flush policy Change the WriteBooster policy to keep VCC on during runtime suspend if available WriteBooster buffer is less than 80%. Link: https://lore.kernel.org/r/20200509093716.21010-5-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 21:02:22 -04:00
Stanley Chu	d14734ae3a	scsi: ufs: Customize flush threshold for WriteBooster Allow flush threshold for WriteBooster to be customizable by vendors. To achieve this, make the value a variable in struct ufs_hba_variant_params. Also introduce UFS_WB_BUF_REMAIN_PERCENT() macro to provide a more flexible way to specify WriteBooster available buffer values. Link: https://lore.kernel.org/r/20200509093716.21010-4-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 21:02:22 -04:00
Stanley Chu	90b8491c00	scsi: ufs: Introduce ufs_hba_variant_params to group customizable parameters The UFS driver is growing more and more customizable parameters. Collect them in one place. Link: https://lore.kernel.org/r/20200509093716.21010-2-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 21:02:22 -04:00
Colin Ian King	b014d0430b	net: dsa: felix: fix incorrect clamp calculation for burst Currently burst is clamping on rate and not burst, the assignment of burst from the clamping discards the previous assignment of burst. This looks like a cut-n-paste error from the previous clamping calculation on ramp. Fix this by replacing ramp with burst. Addresses-Coverity: ("Unused value") Fixes: `0fbabf875d` ("net: dsa: felix: add support Credit Based Shaper(CBS) for hardware offload") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 18:02:02 -07:00
Amol Grover	7013908c2d	ipmr: Add lockdep expression to ipmr_for_each_table macro During the initialization process, ipmr_new_table() is called to create new tables which in turn calls ipmr_get_table() which traverses net->ipv4.mr_tables without holding the writer lock. However, this is safe to do so as no tables exist at this time. Hence add a suitable lockdep expression to silence the following false-positive warning: ============================= WARNING: suspicious RCU usage 5.7.0-rc3-next-20200428-syzkaller #0 Not tainted ----------------------------- net/ipv4/ipmr.c:136 RCU-list traversed in non-reader section!! ipmr_get_table+0x130/0x160 net/ipv4/ipmr.c:136 ipmr_new_table net/ipv4/ipmr.c:403 [inline] ipmr_rules_init net/ipv4/ipmr.c:248 [inline] ipmr_net_init+0x133/0x430 net/ipv4/ipmr.c:3089 Fixes: `f0ad0860d0` ("ipv4: ipmr: support multiple tables") Reported-by: syzbot+1519f497f2f9f08183c6@syzkaller.appspotmail.com Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 18:01:07 -07:00
Amol Grover	a14fbcd4f1	ipmr: Fix RCU list debugging warning ipmr_for_each_table() macro uses list_for_each_entry_rcu() for traversing outside of an RCU read side critical section but under the protection of rtnl_mutex. Hence, add the corresponding lockdep expression to silence the following false-positive warning at boot: [ 4.319347] ============================= [ 4.319349] WARNING: suspicious RCU usage [ 4.319351] 5.5.4-stable #17 Tainted: G E [ 4.319352] ----------------------------- [ 4.319354] net/ipv4/ipmr.c:1757 RCU-list traversed in non-reader section!! Fixes: `f0ad0860d0` ("ipv4: ipmr: support multiple tables") Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 18:01:07 -07:00
Bartosz Golaszewski	140ad6c8c6	net: phy: mdio-moxart: remove unneeded include mdio-moxart doesn't use regulators in the driver code. We can remove the regulator include. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:58:46 -07:00
Dan Murphy	74ac28f164	dt-bindings: dp83867: Convert DP83867 to yaml Convert the dp83867 binding to yaml. Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:58:02 -07:00
Dan Murphy	e90b651e7b	dt-bindings: net: dp83869: Update licensing info Add BSD 2 Clause to the licensing. CC: Rob Herring <robh@kernel.org> Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:54:41 -07:00
Damien Le Moal	0bd735df76	scsi: sd: Signal drive managed SMR disks Print a message indicating that a disk is a drive-managed SMR model when such drive is found using the ZONED field of the Block Device Characteristics VPD page (IDENTIFY data on ATA side). [mkp: typo] Link: https://lore.kernel.org/r/20200514081953.1252087-1-damien.lemoal@wdc.com Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 20:48:53 -04:00
ChenTao	21d2b76831	scsi: ufs-mediatek: Make ufs_mtk_fixup_dev_quirks static Fix the following warning: drivers/scsi/ufs/ufs-mediatek.c:585:6: warning: symbol 'ufs_mtk_fixup_dev_quirks' was not declared. Should it be static? Link: https://lore.kernel.org/r/20200514012655.127202-1-chentao107@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: ChenTao <chentao107@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 20:47:23 -04:00
Madhuparna Bhowmik	95f59bf88b	drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c This patch fixes the following warning: ============================= WARNING: suspicious RCU usage 5.7.0-rc5-next-20200514-syzkaller #0 Not tainted ----------------------------- drivers/net/hamradio/bpqether.c:149 RCU-list traversed in non-reader section!! Since rtnl lock is held, pass this cond in list_for_each_entry_rcu(). Reported-by: syzbot+bb82cafc737c002d11ca@syzkaller.appspotmail.com Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:43:45 -07:00
Kevin Lo	cc8a677a76	net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810 Set the correct bit when checking for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54810 PHY. Fixes: `0ececcfc92` ("net: phy: broadcom: Allow BCM54810 to use bcm54xx_adjust_rxrefclk()") Signed-off-by: Kevin Lo <kevlo@kevlo.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:40:06 -07:00
Luo bin	3f044d26f8	hinic: update huawei ethernet driver maintainer update huawei ethernet driver maintainer from aviad to Bin luo Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:38:30 -07:00
Luo bin	bcab67822d	hinic: add set_ringparam ethtool_ops support support to change TX/RX queue depth with ethtool -G Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:38:19 -07:00
Jakub Kicinski	5a46b062e2	devlink: refactor end checks in devlink_nl_cmd_region_read_dumpit Clean up after recent fixes, move address calculations around and change the variable init, so that we can have just one start_offset == end_offset check. Make the check a little stricter to preserve the -EINVAL error if requested start offset is larger than the region itself. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:36:25 -07:00
David S. Miller	c7ad365761	Merge branch 'am65-cpsw-add-taprio-EST-offload-support' Murali Karicheri says: ==================== am65-cpsw: add taprio/EST offload support AM65 CPSW h/w supports Enhanced Scheduled Traffic (EST – defined in P802.1Qbv/D2.2 that later got included in IEEE 802.1Q-2018) configuration. EST allows express queue traffic to be scheduled (placed) on the wire at specific repeatable time intervals. In Linux kernel, EST configuration is done through tc command and the taprio scheduler in the net core implements a software only scheduler (SCH_TAPRIO). If the NIC is capable of EST configuration, user indicate "flag 2" in the command which is then parsed by taprio scheduler in net core and indicate that the command is to be offloaded to h/w. taprio then offloads the command to the driver by calling ndo_setup_tc() ndo ops. This patch implements ndo_setup_tc() as well as other changes required to offload EST configuration to CPSW h/w. For more details please refer to patch 2/2. This series is based on original work done by Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> to add taprio offload support to AM65 CPSW 2G. 1. Example configuration 3 Gates ifconfig eth0 down ethtool -L eth0 tx 3 ethtool --set-priv-flags eth0 p0-rx-ptype-rrobin off ifconfig eth0 192.168.2.20 tc qdisc replace dev eth0 parent root handle 100 taprio \ num_tc 3 \ map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 \ base-time 0000 \ sched-entry S 4 125000 \ sched-entry S 2 125000 \ sched-entry S 1 250000 \ flags 2 2. Example configuration 8 Gates ifconfig eth0 down ethtool -L eth0 tx 8 ethtool --set-priv-flags eth0 p0-rx-ptype-rrobin off ifconfig eth0 192.168.2.20 tc qdisc replace dev eth0 parent root handle 100 taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0000 \ sched-entry S 80 125000 \ sched-entry S 40 125000 \ sched-entry S 20 125000 \ sched-entry S 10 125000 \ sched-entry S 08 125000 \ sched-entry S 04 125000 \ sched-entry S 02 125000 \ sched-entry S 01 125000 \ flags 2 Classify frames to particular priority using skbedit so that they land at a specific queue in cpsw h/w which is Gated by the EST gate which opens based on the sched-entry. tc qdisc add dev eth0 clsact In the below for example an iperf3 session with destination port 5007 will go through Q7. tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5007 0xffff action skbedit priority 7 tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5006 0xffff action skbedit priority 6 tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5005 0xffff action skbedit priority 5 tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5004 0xffff action skbedit priority 4 tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5003 0xffff action skbedit priority 3 tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5002 0xffff action skbedit priority 2 tc filter add dev eth0 egress protocol ip prio 1 u32 match ip dport 5001 0xffff action skbedit priority 1 iperf3 -c 192.168.2.10 -u -l1470 -b32M -t1 -p 5007 Testing was done by capturing frames at the PC using wireshark and checking for the burst interval or cycle time of UDP frames with a specific port number. Verified that the distance between first frame of a burst (cycle-time) is 1 millisecond and burst duration is within 125 usec based on the received packet timestamp shown in wireshark packet display. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:33:30 -07:00
Ivan Khoronzhuk	8127224c27	ethernet: ti: am65-cpsw-qos: add TAPRIO offload support AM65 CPSW h/w supports Enhanced Scheduled Traffic (EST – defined in P802.1Qbv/D2.2 that later got included in IEEE 802.1Q-2018) configuration. EST allows express queue traffic to be scheduled (placed) on the wire at specific repeatable time intervals. In Linux kernel, EST configuration is done through tc command and the taprio scheduler in the net core implements a software only scheduler (SCH_TAPRIO). If the NIC is capable of EST configuration, user indicate "flag 2" in the command which is then parsed by taprio scheduler in net core and indicate that the command is to be offloaded to h/w. taprio then offloads the command to the driver by calling ndo_setup_tc() ndo ops. This patch implements ndo_setup_tc() to offload EST configuration to CPSW h/w. Currently driver supports only SetGateStates operation. EST operates on a repeating time interval generated by the CPTS EST function generator. Each Ethernet port has a global EST fetch RAM that can be configured as 2 buffers, each of 64 locations or one large buffer of 128 locations. In 2 buffer configuration, a ping pong mechanism is used to hold the active schedule (oper) in one buffer and new (admin) command in the other. Each 22-bit fetch command consists of a 14-bit fetch count (14 MSB’s) and an 8-bit priority fetch allow (8 LSB’s) that will be applied for the fetch count time in wireside clocks. Driver processes each sched-entry in the offload command and updates the fetch RAM. Driver configures duration in sched-entry into the fetch count and Gate mask into the priority fetch bits of the RAM. Then configures the CPTS EST function generator to activate the schedule. Currently driver supports only 2 buffer configuration which means driver supports a max cycle time of ~8 msec. CPSW supports a configurable number of priority queues (up to 8) and needs to be switched to this mode from the default round robin mode before EST can be offloaded. User configures these through ethtool commands (-L for changing number of queues and --set-priv-flags to disable round robin mode). Driver doesn't enable EST if pf_p0_rx_ptype_rrobin private flag is set. The flag is common for all ports, and so can't be just overridden by taprio configuration w/o user involvement. Command fails if pf_p0_rx_ptype_rrobin is already set in the driver. Scheds (commands) configuration depends on interface speed so driver translates the duration to the fetch count based on link speed. Each schedule can be constructed with several command entries in fetch RAM depending on interval. For example if each sched has timer interval < ~130us on 1000 Mb link then each sched consumes one command and have 1:1 mapping. When Ethernet link goes down, driver purges the configuration if link is down for more than 1 second. The patch allows to update the timer and scheds memory only if it's really needed, and skip cases required the user to stop timer by configuring only scheds memory. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-14 17:33:30 -07:00
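
The fetch-command layout described in the am65-cpsw-qos commit message above (a 14-bit fetch count in the upper bits, an 8-bit priority fetch allow mask in the lower bits) can be sketched in C roughly as follows. This is an illustration only and not the driver's code: every macro and function name below is hypothetical, and the 8 ns wireside clock period used for the 1000 Mb/s estimate is an assumption (one clock per byte time), chosen because it reproduces the ~130 us and ~8 msec figures quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names. Layout taken from the commit message:
 * 22-bit command = 14-bit fetch count (MSBs) | 8-bit fetch allow (LSBs). */
#define EST_ALLOW_BITS   8u
#define EST_COUNT_BITS   14u
#define EST_ALLOW_MASK   ((1u << EST_ALLOW_BITS) - 1u)   /* 0xff */
#define EST_COUNT_MAX    ((1u << EST_COUNT_BITS) - 1u)   /* 16383 */
#define EST_BUF_ENTRIES  64u   /* one buffer in the 2-buffer configuration */

/* Pack a duration (in wireside clocks) and a gate mask into one command. */
static uint32_t est_pack_cmd(uint32_t fetch_count, uint8_t gate_mask)
{
	assert(fetch_count <= EST_COUNT_MAX);
	return (fetch_count << EST_ALLOW_BITS) | gate_mask;
}

/* Extract the 14-bit fetch count from a packed command. */
static uint32_t est_cmd_count(uint32_t cmd)
{
	return (cmd >> EST_ALLOW_BITS) & EST_COUNT_MAX;
}

/* Extract the 8-bit gate mask from a packed command. */
static uint8_t est_cmd_gates(uint32_t cmd)
{
	return (uint8_t)(cmd & EST_ALLOW_MASK);
}

/* Longest cycle one 64-entry buffer can hold, in ns, for a given
 * wireside clock period (the period itself is an assumed input). */
static uint64_t est_max_cycle_ns(uint32_t clk_period_ns)
{
	return (uint64_t)EST_BUF_ENTRIES * EST_COUNT_MAX * clk_period_ns;
}
```

Under the assumed 8 ns clock, a single command covers at most 16383 * 8 ns, about 131 us, and est_max_cycle_ns(8) evaluates to 8388096 ns, about 8.4 msec for a full 64-entry buffer.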


932869 commits