Commit graph

1059881 commits

Author SHA1 Message Date
J. Bruce Fields
80479eb862 nfsd4: remove obselete comment
Mandatory locking has been removed.  And the rest of this comment is
redundant with the code.

Reported-by: Jeff layton <jlayton@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-11-01 17:17:14 -04:00
Dave Marchevsky
6ac22d036f perf bpf: Pull in bpf_program__get_prog_info_linear()
To prepare for impending deprecation of libbpf's bpf_program__get_prog_info_linear(),
pull in the function and associated helpers into the perf codebase and migrate
existing uses to the perf copy.

Since libbpf's deprecated definitions will still be visible to perf, it is necessary
to rename perf's definitions.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211011082031.4148337-4-davemarchevsky@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01 18:16:40 -03:00
Joanne Koong
7a67087250 selftests/bpf: Add bloom map success test for userspace calls
This patch has two changes:
1) Adds a new function "test_success_cases" to test
successfully creating + adding + looking up a value
in a bloom filter map from the userspace side.

2) Use bpf_create_map instead of bpf_create_map_xattr in
the "test_fail_cases" and test_inner_map to make the
code look cleaner.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-4-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Joanne Koong
8845b4681b bpf: Add alignment padding for "map_extra" + consolidate holes
This patch makes 2 changes regarding alignment padding
for the "map_extra" field.

1) In the kernel header, "map_extra" and "btf_value_type_id"
are rearranged to consolidate the hole.

Before:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u32		map_flags;	/*    40     4	*/

        /* XXX 4 bytes hole, try to pack */

        u64		map_extra;	/*    48     8	*/
        int		spin_lock_off;	/*    56     4	*/
        int		timer_off;	/*    60     4	*/
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32		id;		/*    64     4	*/
        int		numa_node;	/*    68     4	*/
	...
        bool		frozen;		/*   117     1	*/

        /* XXX 10 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
	struct mutex	freeze_mutex;	/*   216   144 	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt; 	/*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 2, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 10 */

} __attribute__((__aligned__(64)));

After:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u64		map_extra;	/*    40     8 	*/
        u32		map_flags;	/*    48     4	*/
        int		spin_lock_off;	/*    52     4	*/
        int		timer_off;	/*    56     4	*/
        u32		id;		/*    60     4	*/

        /* --- cacheline 1 boundary (64 bytes) --- */
        int		numa_node;	/*    64     4	*/
	...
	bool		frozen		/*   113     1  */

        /* XXX 14 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
        struct mutex	freeze_mutex;	/*   216   144	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt;       /*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 1, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 14 */

} __attribute__((__aligned__(64)));

2) Add alignment padding to the bpf_map_info struct
More details can be found in commit 36f9814a49 ("bpf: fix uapi hole
for 32 bit compat applications")

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-3-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Joanne Koong
6fdc348006 bpf: Bloom filter map naming fixups
This patch has two changes in the kernel bloom filter map
implementation:

1) Change the names of map-ops functions to include the
"bloom_map" prefix.

As Martin pointed out on a previous patchset, having generic
map-ops names may be confusing in tracing and in perf-report.

2) Drop the "& 0xF" when getting nr_hash_funcs, since we
already ascertain that no other bits in map_extra beyond the
first 4 bits can be set.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-2-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Alexei Starovoitov
f27a6fad14 Merge branch 'introduce dummy BPF STRUCT_OPS'
Hou Tao says:

====================

Hi,

Currently the test of BPF STRUCT_OPS depends on the specific bpf
implementation (e.g, tcp_congestion_ops), but it can not cover all
basic functionalities (e.g, return value handling), so introduce
a dummy BPF STRUCT_OPS for test purpose.

Instead of loading a userspace-implemeted bpf_dummy_ops map into
kernel and calling the specific function by writing to sysfs provided
by bpf_testmode.ko, only loading bpf_dummy_ops related prog into
kernel and calling these prog by bpf_prog_test_run(). The latter
is more flexible and has no dependency on extra kernel module.

Now the return value handling is supported by test_1(...) ops,
and passing multiple arguments is supported by test_2(...) ops.
If more is needed, test_x(...) ops can be added afterwards.

Comments are always welcome.
Regards,
Hou

Change Log:
v4:
 * add Acked-by tags in patch 1~4
 * patch 2: remove unncessary comments and update commit message
            accordingly
 * patch 4: remove unnecessary nr checking in dummy_ops_init_args()

v3: https://www.spinics.net/lists/bpf/msg48303.html
 * rebase on bpf-next
 * address comments for Martin, mainly include: merge patch 3 &
   patch 4 in v2, fix names of btf ctx access check helpers,
   handle CONFIG_NET, fix leak in dummy_ops_init_args(), and
   simplify bpf_dummy_init()
 * patch 4: use a loop to check args in test_dummy_multiple_args()

v2: https://www.spinics.net/lists/bpf/msg47948.html
 * rebase on bpf-next
 * add test_2(...) ops to test the passing of multiple arguments
 * a new patch (patch #2) is added to factor out ctx access helpers
 * address comments from Martin & Andrii

v1: https://www.spinics.net/lists/bpf/msg46787.html

RFC: https://www.spinics.net/lists/bpf/msg46117.html
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-01 14:10:00 -07:00
Hou Tao
31122b2f76 selftests/bpf: Add test cases for struct_ops prog
Running a BPF_PROG_TYPE_STRUCT_OPS prog for dummy_st_ops::test_N()
through bpf_prog_test_run(). Four test cases are added:
(1) attach dummy_st_ops should fail
(2) function return value of bpf_dummy_ops::test_1() is expected
(3) pointer argument of bpf_dummy_ops::test_1() works as expected
(4) multiple arguments passed to bpf_dummy_ops::test_2() are correct

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-5-houtao1@huawei.com
2021-11-01 14:10:00 -07:00
Hou Tao
c196906d50 bpf: Add dummy BPF STRUCT_OPS for test purpose
Currently the test of BPF STRUCT_OPS depends on the specific bpf
implementation of tcp_congestion_ops, but it can not cover all
basic functionalities (e.g, return value handling), so introduce
a dummy BPF STRUCT_OPS for test purpose.

Loading a bpf_dummy_ops implementation from userspace is prohibited,
and its only purpose is to run BPF_PROG_TYPE_STRUCT_OPS program
through bpf(BPF_PROG_TEST_RUN). Now programs for test_1() & test_2()
are supported. The following three cases are exercised in
bpf_dummy_struct_ops_test_run():

(1) test and check the value returned from state arg in test_1(state)
The content of state is copied from userspace pointer and copied back
after calling test_1(state). The user pointer is saved in an u64 array
and the array address is passed through ctx_in.

(2) test and check the return value of test_1(NULL)
Just simulate the case in which an invalid input argument is passed in.

(3) test multiple arguments passing in test_2(state, ...)
5 arguments are passed through ctx_in in form of u64 array. The first
element of array is userspace pointer of state and others 4 arguments
follow.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-4-houtao1@huawei.com
2021-11-01 14:10:00 -07:00
Hou Tao
35346ab641 bpf: Factor out helpers for ctx access checking
Factor out two helpers to check the read access of ctx for raw tp
and BTF function. bpf_tracing_ctx_access() is used to check
the read access to argument is valid, and bpf_tracing_btf_ctx_access()
checks whether the btf type of argument is valid besides the checking
of argument read. bpf_tracing_btf_ctx_access() will be used by the
following patch.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-3-houtao1@huawei.com
2021-11-01 14:10:00 -07:00
Hou Tao
31a645aea4 bpf: Factor out a helper to prepare trampoline for struct_ops prog
Factor out a helper bpf_struct_ops_prepare_trampoline() to prepare
trampoline for BPF_PROG_TYPE_STRUCT_OPS prog. It will be used by
.test_run callback in following patch.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-2-houtao1@huawei.com
2021-11-01 14:10:00 -07:00
Linus Torvalds
8cb1ae19bf x86/fpu updates:
- Cleanup of extable fixup handling to be more robust, which in turn
    allows to make the FPU exception fixups more robust as well.
 
  - Change the return code for signal frame related failures from explicit
    error codes to a boolean fail/success as that's all what the calling
    code evaluates.
 
  - A large refactoring of the FPU code to prepare for adding AMX support:
 
    - Distangle the public header maze and remove especially the misnomed
      kitchen sink internal.h which is despite it's name included all over
      the place.
 
    - Add a proper abstraction for the register buffer storage (struct
      fpstate) which allows to dynamically size the buffer at runtime by
      flipping the pointer to the buffer container from the default
      container which is embedded in task_struct::tread::fpu to a
      dynamically allocated container with a larger register buffer.
 
    - Convert the code over to the new fpstate mechanism.
 
    - Consolidate the KVM FPU handling by moving the FPU related code into
      the FPU core which removes the number of exports and avoids adding
      even more export when AMX has to be supported in KVM. This also
      removes duplicated code which was of course unnecessary different and
      incomplete in the KVM copy.
 
    - Simplify the KVM FPU buffer handling by utilizing the new fpstate
      container and just switching the buffer pointer from the user space
      buffer to the KVM guest buffer when entering vcpu_run() and flipping
      it back when leaving the function. This cuts the memory requirements
      of a vCPU for FPU buffers in half and avoids pointless memory copy
      operations.
 
      This also solves the so far unresolved problem of adding AMX support
      because the current FPU buffer handling of KVM inflicted a circular
      dependency between adding AMX support to the core and to KVM.  With
      the new scheme of switching fpstate AMX support can be added to the
      core code without affecting KVM.
 
    - Replace various variables with proper data structures so the extra
      information required for adding dynamically enabled FPU features (AMX)
      can be added in one place
 
  - Add AMX (Advanved Matrix eXtensions) support (finally):
 
     AMX is a large XSTATE component which is going to be available with
     Saphire Rapids XEON CPUs. The feature comes with an extra MSR (MSR_XFD)
     which allows to trap the (first) use of an AMX related instruction,
     which has two benefits:
 
     1) It allows the kernel to control access to the feature
 
     2) It allows the kernel to dynamically allocate the large register
        state buffer instead of burdening every task with the the extra 8K
        or larger state storage.
 
     It would have been great to gain this kind of control already with
     AVX512.
 
     The support comes with the following infrastructure components:
 
     1) arch_prctl() to
        - read the supported features (equivalent to XGETBV(0))
        - read the permitted features for a task
        - request permission for a dynamically enabled feature
 
        Permission is granted per process, inherited on fork() and cleared
        on exec(). The permission policy of the kernel is restricted to
        sigaltstack size validation, but the syscall obviously allows
        further restrictions via seccomp etc.
 
     2) A stronger sigaltstack size validation for sys_sigaltstack(2) which
        takes granted permissions and the potentially resulting larger
        signal frame into account. This mechanism can also be used to
        enforce factual sigaltstack validation independent of dynamic
        features to help with finding potential victims of the 2K
        sigaltstack size constant which is broken since AVX512 support was
        added.
 
     3) Exception handling for #NM traps to catch first use of a extended
        feature via a new cause MSR. If the exception was caused by the use
        of such a feature, the handler checks permission for that
        feature. If permission has not been granted, the handler sends a
        SIGILL like the #UD handler would do if the feature would have been
        disabled in XCR0. If permission has been granted, then a new fpstate
        which fits the larger buffer requirement is allocated.
 
        In the unlikely case that this allocation fails, the handler sends
        SIGSEGV to the task. That's not elegant, but unavoidable as the
        other discussed options of preallocation or full per task
        permissions come with their own set of horrors for kernel and/or
        userspace. So this is the lesser of the evils and SIGSEGV caused by
        unexpected memory allocation failures is not a fundamentally new
        concept either.
 
        When allocation succeeds, the fpstate properties are filled in to
        reflect the extended feature set and the resulting sizes, the
        fpu::fpstate pointer is updated accordingly and the trap is disarmed
        for this task permanently.
 
     4) Enumeration and size calculations
 
     5) Trap switching via MSR_XFD
 
        The XFD (eXtended Feature Disable) MSR is context switched with the
        same life time rules as the FPU register state itself. The mechanism
        is keyed off with a static key which is default disabled so !AMX
        equipped CPUs have zero overhead. On AMX enabled CPUs the overhead
        is limited by comparing the tasks XFD value with a per CPU shadow
        variable to avoid redundant MSR writes. In case of switching from a
        AMX using task to a non AMX using task or vice versa, the extra MSR
        write is obviously inevitable.
 
        All other places which need to be aware of the variable feature sets
        and resulting variable sizes are not affected at all because they
        retrieve the information (feature set, sizes) unconditonally from
        the fpstate properties.
 
     6) Enable the new AMX states
 
   Note, this is relatively new code despite the fact that AMX support is in
   the works for more than a year now.
 
   The big refactoring of the FPU code, which allowed to do a proper
   integration has been started exactly 3 weeks ago. Refactoring of the
   existing FPU code and of the original AMX patches took a week and has
   been subject to extensive review and testing. The only fallout which has
   not been caught in review and testing right away was restricted to AMX
   enabled systems, which is completely irrelevant for anyone outside Intel
   and their early access program. There might be dragons lurking as usual,
   but so far the fine grained refactoring has held up and eventual yet
   undetected fallout is bisectable and should be easily addressable before
   the 5.16 release. Famous last words...
 
   Many thanks to Chang Bae and Dave Hansen for working hard on this and
   also to the various test teams at Intel who reserved extra capacity to
   follow the rapid development of this closely which provides the
   confidence level required to offer this rather large update for inclusion
   into 5.16-rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/NkITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodDkEADH4+/nN/QoSUHIuuha5Zptj3g2b16a
 /3TxT9fhwPen/kzMGsUk70s3iWJMA+I5dCfkSZexJ2hfhcRe9cBzZIa1HCawKwf3
 YCISTsO/M+LpeORuZ+TpfFLJKnxNr1SEOl+EYffGhq0AkCjifb9Cnr0JZuoMUzGU
 jpfJZ2bj28ri5lG812DtzSMBM9E3SAwgJv+GNjmZbxZKb9mAfhbAMdBUXHirX7Ej
 jmx6koQjYOKwYIW8w1BrdC270lUKQUyJTbQgdRkN9Mh/HnKyFixQ18JqGlgaV2cT
 EtYePUfTEdaHdAhUINLIlEug1MfOslHU+HyGsdywnoChNB4GHPQuePC5Tz60VeFN
 RbQ9aKcBUu8r95rjlnKtAtBijNMA4bjGwllVxNwJ/ZoA9RPv1SbDZ07RX3qTaLVY
 YhVQl8+shD33/W24jUTJv1kMMexpHXIlv0gyfMryzpwI7uzzmGHRPAokJdbYKctC
 dyMPfdE90rxTiMUdL/1IQGhnh3awjbyfArzUhHyQ++HyUyzCFh0slsO0CD18vUy8
 FofhCugGBhjuKw3XwLNQ+KsWURz5qHctSzBc3qMOSyqFHbAJCVRANkhsFvWJo2qL
 75+Z7OTRebtsyOUZIdq26r4roSxHrps3dupWTtN70HWx2NhQG1nLEw986QYiQu1T
 hcKvDmehQLrUvg==
 =x3WL
 -----END PGP SIGNATURE-----

Merge tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu updates from Thomas Gleixner:

 - Cleanup of extable fixup handling to be more robust, which in turn
   allows to make the FPU exception fixups more robust as well.

 - Change the return code for signal frame related failures from
   explicit error codes to a boolean fail/success as that's all what the
   calling code evaluates.

 - A large refactoring of the FPU code to prepare for adding AMX
   support:

      - Distangle the public header maze and remove especially the
        misnomed kitchen sink internal.h which is despite it's name
        included all over the place.

      - Add a proper abstraction for the register buffer storage (struct
        fpstate) which allows to dynamically size the buffer at runtime
        by flipping the pointer to the buffer container from the default
        container which is embedded in task_struct::tread::fpu to a
        dynamically allocated container with a larger register buffer.

      - Convert the code over to the new fpstate mechanism.

      - Consolidate the KVM FPU handling by moving the FPU related code
        into the FPU core which removes the number of exports and avoids
        adding even more export when AMX has to be supported in KVM.
        This also removes duplicated code which was of course
        unnecessary different and incomplete in the KVM copy.

      - Simplify the KVM FPU buffer handling by utilizing the new
        fpstate container and just switching the buffer pointer from the
        user space buffer to the KVM guest buffer when entering
        vcpu_run() and flipping it back when leaving the function. This
        cuts the memory requirements of a vCPU for FPU buffers in half
        and avoids pointless memory copy operations.

        This also solves the so far unresolved problem of adding AMX
        support because the current FPU buffer handling of KVM inflicted
        a circular dependency between adding AMX support to the core and
        to KVM. With the new scheme of switching fpstate AMX support can
        be added to the core code without affecting KVM.

      - Replace various variables with proper data structures so the
        extra information required for adding dynamically enabled FPU
        features (AMX) can be added in one place

 - Add AMX (Advanced Matrix eXtensions) support (finally):

   AMX is a large XSTATE component which is going to be available with
   Saphire Rapids XEON CPUs. The feature comes with an extra MSR
   (MSR_XFD) which allows to trap the (first) use of an AMX related
   instruction, which has two benefits:

    1) It allows the kernel to control access to the feature

    2) It allows the kernel to dynamically allocate the large register
       state buffer instead of burdening every task with the the extra
       8K or larger state storage.

   It would have been great to gain this kind of control already with
   AVX512.

   The support comes with the following infrastructure components:

    1) arch_prctl() to
        - read the supported features (equivalent to XGETBV(0))
        - read the permitted features for a task
        - request permission for a dynamically enabled feature

       Permission is granted per process, inherited on fork() and
       cleared on exec(). The permission policy of the kernel is
       restricted to sigaltstack size validation, but the syscall
       obviously allows further restrictions via seccomp etc.

    2) A stronger sigaltstack size validation for sys_sigaltstack(2)
       which takes granted permissions and the potentially resulting
       larger signal frame into account. This mechanism can also be used
       to enforce factual sigaltstack validation independent of dynamic
       features to help with finding potential victims of the 2K
       sigaltstack size constant which is broken since AVX512 support
       was added.

    3) Exception handling for #NM traps to catch first use of a extended
       feature via a new cause MSR. If the exception was caused by the
       use of such a feature, the handler checks permission for that
       feature. If permission has not been granted, the handler sends a
       SIGILL like the #UD handler would do if the feature would have
       been disabled in XCR0. If permission has been granted, then a new
       fpstate which fits the larger buffer requirement is allocated.

       In the unlikely case that this allocation fails, the handler
       sends SIGSEGV to the task. That's not elegant, but unavoidable as
       the other discussed options of preallocation or full per task
       permissions come with their own set of horrors for kernel and/or
       userspace. So this is the lesser of the evils and SIGSEGV caused
       by unexpected memory allocation failures is not a fundamentally
       new concept either.

       When allocation succeeds, the fpstate properties are filled in to
       reflect the extended feature set and the resulting sizes, the
       fpu::fpstate pointer is updated accordingly and the trap is
       disarmed for this task permanently.

    4) Enumeration and size calculations

    5) Trap switching via MSR_XFD

       The XFD (eXtended Feature Disable) MSR is context switched with
       the same life time rules as the FPU register state itself. The
       mechanism is keyed off with a static key which is default
       disabled so !AMX equipped CPUs have zero overhead. On AMX enabled
       CPUs the overhead is limited by comparing the tasks XFD value
       with a per CPU shadow variable to avoid redundant MSR writes. In
       case of switching from a AMX using task to a non AMX using task
       or vice versa, the extra MSR write is obviously inevitable.

       All other places which need to be aware of the variable feature
       sets and resulting variable sizes are not affected at all because
       they retrieve the information (feature set, sizes) unconditonally
       from the fpstate properties.

    6) Enable the new AMX states

   Note, this is relatively new code despite the fact that AMX support
   is in the works for more than a year now.

   The big refactoring of the FPU code, which allowed to do a proper
   integration has been started exactly 3 weeks ago. Refactoring of the
   existing FPU code and of the original AMX patches took a week and has
   been subject to extensive review and testing. The only fallout which
   has not been caught in review and testing right away was restricted
   to AMX enabled systems, which is completely irrelevant for anyone
   outside Intel and their early access program. There might be dragons
   lurking as usual, but so far the fine grained refactoring has held up
   and eventual yet undetected fallout is bisectable and should be
   easily addressable before the 5.16 release. Famous last words...

   Many thanks to Chang Bae and Dave Hansen for working hard on this and
   also to the various test teams at Intel who reserved extra capacity
   to follow the rapid development of this closely which provides the
   confidence level required to offer this rather large update for
   inclusion into 5.16-rc1

* tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
  Documentation/x86: Add documentation for using dynamic XSTATE features
  x86/fpu: Include vmalloc.h for vzalloc()
  selftests/x86/amx: Add context switch test
  selftests/x86/amx: Add test cases for AMX state management
  x86/fpu/amx: Enable the AMX feature in 64-bit mode
  x86/fpu: Add XFD handling for dynamic states
  x86/fpu: Calculate the default sizes independently
  x86/fpu/amx: Define AMX state components and have it used for boot-time checks
  x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers
  x86/fpu/xstate: Add fpstate_realloc()/free()
  x86/fpu/xstate: Add XFD #NM handler
  x86/fpu: Update XFD state where required
  x86/fpu: Add sanity checks for XFD
  x86/fpu: Add XFD state to fpstate
  x86/msr-index: Add MSRs for XFD
  x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit
  x86/fpu: Reset permission and fpstate on exec()
  x86/fpu: Prepare fpu_clone() for dynamically enabled features
  x86/fpu/signal: Prepare for variable sigframe length
  x86/signal: Use fpu::__state_user_size for sigalt stack validation
  ...
2021-11-01 14:03:56 -07:00
Linus Torvalds
7d20dd3294 x86/apic related update:
- A single commit which reduces cacheline misses in
     __x2apic_send_IPI_mask() significantly by converting
     x86_cpu_to_logical_apicid() to an array instead of using per CPU
     storage. This reduces the cost for a full broadcast on a dual socket
     system with 256 CPUs from 33 down to 11 microseconds.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/Ia8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeL1D/0fna8gVHWaVGUexB7HDF7G+WTuTnl0
 7A7n7Yt3fqNirxfxgPtWpWHSbJfhB1jSxbjEirTOCDHGC47au6Hh1HICtpRqOMoY
 oH5ZXpDdiQnpl3MnGys6J5StVZCyWPz/pIIqmqO99fd5Ykolbf9UNjXiBW+1jBfO
 dS7+av9dAhTbVnbJfTDR5fhl81nM04eyVIaMn7rq3Z6VDfFFaugvv/HmpzfBRwxC
 o/hiN/e6T3DMtz2zXOXxrw+xdBp62sAhlrx1FDB3OYVcBb1cVK1gWTMoY+GOUd5y
 5SiDoklzZXp4s/MT1qBxRvo7PRcL6SwwWtePBgYCL1oLfHE5Ctx/xu1jhfrH15IT
 jgykscSsyX4cRZfNXZcFLu8/EZmD/hT5xRQn4VkG2gnjm/SJ/U5m7cBOF8YWdiKs
 YRChHnpHN1fvxsKtLw3ZtgNT9HgqvIl4AvhweNe64fHFuQNefxVgvNAxgnNe03dr
 OGMbBPoX8AfSr2cxkhnprByMbcLa7aYAyAu0WkbrVar8ZB4uOPApnnIfMSvYgRtn
 0pP9G+kyRQMc57Wq0NzQWLnhgbQuZlAo5zzvVjju0FGUjFDWOG3G6I58AqsgYA0i
 YDbBIOSVCnXVCFFV7i99M8U2Z4n4Cj4Paf8UjRFi6H9164AyVYVZ0kMBQqX2Luj1
 8UYlESTByyO7Xw==
 =7l4b
 -----END PGP SIGNATURE-----

Merge tag 'x86-apic-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/apic update from Thomas Gleixner:
 "A single commit which reduces cache misses in __x2apic_send_IPI_mask()
  significantly by converting x86_cpu_to_logical_apicid() to an array
  instead of using per CPU storage.

  This reduces the cost for a full broadcast on a dual socket system
  with 256 CPUs from 33 down to 11 microseconds"

* tag 'x86-apic-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Reduce cache line misses in __x2apic_send_IPI_mask()
2021-11-01 14:01:35 -07:00
Linus Torvalds
9a7e0a90a4 Scheduler updates:
- Revert the printk format based wchan() symbol resolution as it can leak
    the raw value in case that the symbol is not resolvable.
 
  - Make wchan() more robust and work with all kind of unwinders by
    enforcing that the task stays blocked while unwinding is in progress.
 
  - Prevent sched_fork() from accessing an invalid sched_task_group
 
  - Improve asymmetric packing logic
 
  - Extend scheduler statistics to RT and DL scheduling classes and add
    statistics for bandwith burst to the SCHED_FAIR class.
 
  - Properly account SCHED_IDLE entities
 
  - Prevent a potential deadlock when initial priority is assigned to a
    newly created kthread. A recent change to plug a race between cpuset and
    __sched_setscheduler() introduced a new lock dependency which is now
    triggered. Break the lock dependency chain by moving the priority
    assignment to the thread function.
 
  - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
 
  - Improve idle balancing in general and especially for NOHZ enabled
    systems.
 
  - Provide proper interfaces for live patching so it does not have to
    fiddle with scheduler internals.
 
  - Add cluster aware scheduling support.
 
  - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
    scheduler options and delaying mmdrop)
 
  - The usual small tweaks and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
 iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
 /1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
 aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
 3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
 ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
 vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
 mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
 V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
 s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
 i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
 d2qWG7UvoseT+g==
 =fgtS
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Thomas Gleixner:

 - Revert the printk format based wchan() symbol resolution as it can
   leak the raw value in case that the symbol is not resolvable.

 - Make wchan() more robust and work with all kind of unwinders by
   enforcing that the task stays blocked while unwinding is in progress.

 - Prevent sched_fork() from accessing an invalid sched_task_group

 - Improve asymmetric packing logic

 - Extend scheduler statistics to RT and DL scheduling classes and add
   statistics for bandwith burst to the SCHED_FAIR class.

 - Properly account SCHED_IDLE entities

 - Prevent a potential deadlock when initial priority is assigned to a
   newly created kthread. A recent change to plug a race between cpuset
   and __sched_setscheduler() introduced a new lock dependency which is
   now triggered. Break the lock dependency chain by moving the priority
   assignment to the thread function.

 - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.

 - Improve idle balancing in general and especially for NOHZ enabled
   systems.

 - Provide proper interfaces for live patching so it does not have to
   fiddle with scheduler internals.

 - Add cluster aware scheduling support.

 - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
   scheduler options and delaying mmdrop)

 - The usual small tweaks and improvements all over the place

* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
  sched/fair: Cleanup newidle_balance
  sched/fair: Remove sysctl_sched_migration_cost condition
  sched/fair: Wait before decaying max_newidle_lb_cost
  sched/fair: Skip update_blocked_averages if we are defering load balance
  sched/fair: Account update_blocked_averages in newidle_balance cost
  x86: Fix __get_wchan() for !STACKTRACE
  sched,x86: Fix L2 cache mask
  sched/core: Remove rq_relock()
  sched: Improve wake_up_all_idle_cpus() take #2
  irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
  irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
  irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
  sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
  sched: Add cluster scheduler level for x86
  sched: Add cluster scheduler level in core and related Kconfig for ARM64
  topology: Represent clusters of CPUs within a die
  sched: Disable -Wunused-but-set-variable
  sched: Add wrapper for get_wchan() to keep task blocked
  x86: Fix get_wchan() to support the ORC unwinder
  proc: Use task_is_running() for wchan in /proc/$pid/stat
  ...
2021-11-01 13:48:52 -07:00
Linus Torvalds
57a315cd71 Time, timers and timekeeping updates:
- No core updates
 
   - No new clocksource/event driver
 
   - A large rework of the ARM architected timer driver to prepare for the
     support of the upcoming ARMv8.6 support
 
   - Fix Kconfig options for Exynos MCT, Samsung PWM and TI DM timers
 
   - Address a namespace collison in the ARC sp804 timer driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/IPsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoXztEACE15PRW/s3dsKjsX4HgZocPS3WSeBx
 Nim90LHLWA8hiN3LdY5PmyB5lWHFPojCNtnnnA3hAAasalxtxu00XUwMx11Ewf7C
 HDpYvW4Pp3QHFH9i45nxVJ/2bhiL7g8cpvYNrXuk2rVkZnBf13LGqD4IUJbts/gY
 X+T0NTDHFNjJjbabuD4LVsfKkerPPJawYtL6EnFn8hkZwMKVY2Mq0xw6fCzA7qGy
 JbBkt0Or5k7imruq3uob9V+IscmkcPRxrP1JOgJLbzCbax+bpz2SL1nZxm5NAeol
 Hdxx+bP2PnpoCsq6GnN3q17nvr8PEOeEQ02AcZ/5IpExIfBnIDahJ61KA8Ao6WGm
 xvVY4fIG6nqul6LMqoAyVc0BeG4Mc7u3D12f+1/UE9pmYbMEM5TcHRbxg0QgTauF
 R78xc5P5v+B2LcUTE9/czJlaCZQuAF+4N60Lnt+aW0Ahsa0h+O2vrX1rBOC3d4US
 12xexkAimMS8VvmuD5wSPE9JrBMDZraJuDM3G+5cwOjXIG8VFXOSloOn/9BhWFEM
 c0scefsGFxENDwkd3d6PmGfrtxSOnqAzhvxgRyJ9zql9wX6LDOa6s2HXKxdSXEL8
 N18WTWsC0Ps60XLlH9CQrBj/T0YnKpOazMx0LIFLPFqEkoSkEjq1j3KgDnSQQ4uE
 J4c2UU4M8FnMlg==
 =SVx3
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Time, timers and timekeeping updates:

   - No core updates

   - No new clocksource/event driver

   - A large rework of the ARM architected timer driver to prepare for
     the support of the upcoming ARMv8.6 support

   - Fix Kconfig options for Exynos MCT, Samsung PWM and TI DM timers

   - Address a namespace collison in the ARC sp804 timer driver"

* tag 'timers-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/timer-ti-dm: Select TIMER_OF
  clocksource/drivers/exynosy: Depend on sub-architecture for Exynos MCT and Samsung PWM
  clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
  clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
  clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
  clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
  clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
  clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
  clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
  clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
  clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names
  clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
  clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
  clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
  clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
  clocksource/drivers/arc_timer: Eliminate redefined macro error
2021-11-01 13:44:55 -07:00
Ville Syrjälä
1977e8eb40 drm/i915: Fix type1 DVI DP dual mode adapter heuristic for modern platforms
Looks like we never updated intel_bios_is_port_dp_dual_mode() when
the VBT port mapping became erratic on modern platforms. This
is causing us to look up the wrong child device and thus throwing
the heuristic off (ie. we might end looking at a child device for
a genuine DP++ port when we were supposed to look at one for a
native HDMI port).

Fix it up by not using the outdated port_mapping[] in
intel_bios_is_port_dp_dual_mode() and rely on
intel_bios_encoder_data_lookup() instead.

Cc: stable@vger.kernel.org
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4138
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211025142147.23897-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 32c2bc89c7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-01 16:39:18 -04:00
Ville Syrjälä
99bac3063e drm/i915: Extend the async flip VT-d w/a to skl/bxt
Looks like skl/bxt/derivatives also need the plane stride
stretch w/a when using async flips and VT-d is enabled, or
else we get corruption on screen. To my surprise this was
even documented in bspec, but only as a note on the
CHICHKEN_PIPESL register description rather than on the
w/a list.

So very much the same thing as on HSW/BDW, except the bits
moved yet again.

Cc: stable@vger.kernel.org
Cc: Karthik B S <karthik.b.s@intel.com>
Fixes: 55ea1cb178 ("drm/i915: Enable async flips in i915")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930190943.17547-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit d08df3b0bd)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit b2d73debfd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-01 16:38:58 -04:00
Zhi A Wang
af6c83ae25 drm/i915/gvt: fix the usage of ww lock in gvt scheduler.
As the APIs related to ww lock in i915 was changed recently, the usage of
ww lock in GVT-g scheduler needs to be changed accrodingly. We noticed a
deadlock when GVT-g scheduler submits the workload to i915. After some
investigation, it seems the way of how to use ww lock APIs has been
changed. Releasing a ww now requires a explicit i915_gem_ww_ctx_fini().

Fixes: 67f1120381 ("drm/i915/gvt: Introduce per object locking in GVT scheduler.")
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhi A Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210826143834.25410-1-zhi.a.wang@intel.com
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit d168cd7979)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-01 16:38:12 -04:00
Linus Torvalds
43aa0a195f objtool updates:
- Improve retpoline code patching by separating it from alternatives which
    reduces memory footprint and allows to do better optimizations in the
    actual runtime patching.
 
  - Add proper retpoline support for x86/BPF
 
  - Address noinstr warnings in x86/kvm, lockdep and paravirtualization code
 
  - Add support to handle pv_opsindirect calls in the noinstr analysis
 
  - Classify symbols upfront and cache the result to avoid redundant
    str*cmp() invocations.
 
  - Add a CFI hash to reduce memory consumption which also reduces runtime
    on a allyesconfig by ~50%
 
  - Adjust XEN code to make objtool handling more robust and as a side
    effect to prevent text fragmentation due to placement of the hypercall
    page.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GFgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoc1JD/0Sz6seP2OUMxbMT3gCcFo9sMvYTdsM
 7WuGFbBbnCIo7g8JH7k0zRRBigptMp2eUtQXKkgaaIbWN4JbuVKf8KxN5/qXxLi4
 fJ12QnNTGH9N2jtzl5wKmpjaKJnnJMD9D10XwoR+T6gn6NHd+AgLEs7GxxuQUlgo
 eC9oEXhNHC8uNhiZc38EwfwmItI1bRgaLrnZWIL4rYGSMxfCK1/cEOpWrFfX9wmj
 /diB6oqMyPXZXMCtgpX7TniUr5XOTCcUkeO9mQv5bmyq/YM/8hrTbcVSJlsVYLvP
 EsBnUSHAcfLFiHXwa1RNiIGdbiPjbN+UYeXGAvqF58f3e5dTIHtN/UmWo7OH93If
 9rLMVNcMpsfPx7QRk2IxEPumLCkyfwjzfKrVDM6P6TKEIUzD1og4IK9gTlfykVsh
 56G5XiCOC/X2x8IMxKTLGuBiAVLFHXK/rSwoqhvNEWBFKDbP13QWs0LurBcW09Sa
 /kQI9pIBT1xFA/R+OY5Xy1cqNVVK1Gxmk8/bllCijA9pCFSCFM4hLZE5CevdrBCV
 h5SdqEK5hIlzFyypXfsCik/4p/+rfvlGfUKtFsPctxx29SPe+T0orx+l61jiWQok
 rZOflwMawK5lDuASHrvNHGJcWaTwoo3VcXMQDnQY0Wulc43J5IFBaPxkZzgyd+S1
 4lktHxatrCMUgw==
 =pfZi
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Thomas Gleixner:

 - Improve retpoline code patching by separating it from alternatives
   which reduces memory footprint and allows to do better optimizations
   in the actual runtime patching.

 - Add proper retpoline support for x86/BPF

 - Address noinstr warnings in x86/kvm, lockdep and paravirtualization
   code

 - Add support to handle pv_opsindirect calls in the noinstr analysis

 - Classify symbols upfront and cache the result to avoid redundant
   str*cmp() invocations.

 - Add a CFI hash to reduce memory consumption which also reduces
   runtime on a allyesconfig by ~50%

 - Adjust XEN code to make objtool handling more robust and as a side
   effect to prevent text fragmentation due to placement of the
   hypercall page.

* tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  bpf,x86: Respect X86_FEATURE_RETPOLINE*
  bpf,x86: Simplify computing label offsets
  x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
  x86/alternative: Add debug prints to apply_retpolines()
  x86/alternative: Try inline spectre_v2=retpoline,amd
  x86/alternative: Handle Jcc __x86_indirect_thunk_\reg
  x86/alternative: Implement .retpoline_sites support
  x86/retpoline: Create a retpoline thunk array
  x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h
  x86/asm: Fixup odd GEN-for-each-reg.h usage
  x86/asm: Fix register order
  x86/retpoline: Remove unused replacement symbols
  objtool,x86: Replace alternatives with .retpoline_sites
  objtool: Shrink struct instruction
  objtool: Explicitly avoid self modifying code in .altinstr_replacement
  objtool: Classify symbols
  objtool: Support pv_opsindirect calls for noinstr
  x86/xen: Rework the xen_{cpu,irq,mmu}_opsarrays
  x86/xen: Mark xen_force_evtchn_callback() noinstr
  x86/xen: Make irq_disable() noinstr
  ...
2021-11-01 13:24:43 -07:00
Linus Torvalds
595b28fb0c Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into
    seperate files to make integration of sys_futex_waitv() simpler.
 
  - Add a new sys_futex_waitv() syscall which allows to wait on multiple
    futexes. The main use case is emulating Windows' WaitForMultipleObjects
    which allows Wine to improve the performance of Windows Games. Also
    native Linux games can benefit from this interface as this is a common
    wait pattern for this kind of applications.
 
  - Add context to ww_mutex_trylock() to provide a path for i915 to rework
    their eviction code step by step without making lockdep upset until the
    final steps of rework are completed. It's also useful for regulator and
    TTM to avoid dropping locks in the non contended path.
 
  - Lockdep and might_sleep() cleanups and improvements
 
  - A few improvements for the RT substitutions.
 
  - The usual small improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
 At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
 vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
 h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
 XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
 clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
 F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
 BvHj68zteAtlYg==
 =RZ4x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:

 - Move futex code into kernel/futex/ and split up the kitchen sink into
   seperate files to make integration of sys_futex_waitv() simpler.

 - Add a new sys_futex_waitv() syscall which allows to wait on multiple
   futexes.

   The main use case is emulating Windows' WaitForMultipleObjects which
   allows Wine to improve the performance of Windows Games. Also native
   Linux games can benefit from this interface as this is a common wait
   pattern for this kind of applications.

 - Add context to ww_mutex_trylock() to provide a path for i915 to
   rework their eviction code step by step without making lockdep upset
   until the final steps of rework are completed. It's also useful for
   regulator and TTM to avoid dropping locks in the non contended path.

 - Lockdep and might_sleep() cleanups and improvements

 - A few improvements for the RT substitutions.

 - The usual small improvements and cleanups.

* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  locking: Remove spin_lock_flags() etc
  locking/rwsem: Fix comments about reader optimistic lock stealing conditions
  locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
  locking/rwsem: Disable preemption for spinning region
  docs: futex: Fix kernel-doc references
  futex: Fix PREEMPT_RT build
  futex2: Documentation: Document sys_futex_waitv() uAPI
  selftests: futex: Test sys_futex_waitv() wouldblock
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Add sys_futex_waitv() test
  futex,arm: Wire up sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex: Implement sys_futex_waitv()
  futex: Simplify double_lock_hb()
  futex: Split out wait/wake
  futex: Split out requeue
  futex: Rename mark_wake_futex()
  futex: Rename: match_futex()
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Split out PI futex
  ...
2021-11-01 13:15:36 -07:00
Linus Torvalds
91e1c99e17 perf updates:
core:
 
   - Allow ftrace to instrument parts of the perf core code
 
   - Add a new mem_hops field to perf_mem_data_src which allows to represent
     intra-node/package or inter-node/off-package details to prepare for
     next generation systems which have more hieararchy within the
     node/pacakge level.
 
  tools:
 
   - Update for the new mem_hops field in perf_mem_data_src
 
  arch:
 
   - A set of constraints fixes for the Intel uncore PMU
 
   - The usual set of small fixes and improvements for x86 and PPC
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GkQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaD8D/wLhXR8RxtF4W9HJmHA+5XFsPtg+isp
 ZNU2kOs4gZskFx75NQaRv5ikA8y68TKdIx+NuQvRLYItaMveTToLSsJ55bfGMxIQ
 JHqDvANUNxBmAACnbYQlqf9WgB0i/3fCUHY5lpmN0waKjaswz7WNpycv4ccShVZr
 PKbgEjkeFBhplCqqOF0X5H3V+4q85+nZONm1iSNd4S7/3B6OCxOf1u78usL1bbtW
 yJAMSuTeOVUZCJm7oVywKW/ZlCscT135aKr6xe5QTrjlPuRWzuLaXNezdMnMyoVN
 HVv8a0ClACb8U5KiGfhvaipaIlIAliWJp2qoiNjrspDruhH6Yc+eNh1gUhLbtNpR
 4YZR5jxv4/mS13kzMMQg00cCWQl7N4whPT+ZE9pkpshGt+EwT+Iy3U+v13wDfnnp
 MnDggpWYGEkAck13t/T6DwC3qBIsVujtpiG+tt/ERbTxiuxi1ccQTGY3PDjtHV3k
 tIMH5n7l4jEpfl8VmoSUgz/2h1MLZnQUWp41GXkjkaOt7uunQZen+nAwqpTm28KV
 7U6U0h1q6r7HxOZRxkPPe4HSV+aBNH3H1LeNBfEd3hDCFGf6MY6vLow+2BE9ybk7
 Y6LPbRqq0SN3sd5MND0ZvQEt5Zgol8CMlX+UKoLEEv7RognGbIxkgpK7exv5pC9w
 nWj7TaMfpRzPgw==
 =Oj0G
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Thomas Gleixner:
 "Core:

   - Allow ftrace to instrument parts of the perf core code

   - Add a new mem_hops field to perf_mem_data_src which allows to
     represent intra-node/package or inter-node/off-package details to
     prepare for next generation systems which have more hieararchy
     within the node/pacakge level.

  Tools:

   - Update for the new mem_hops field in perf_mem_data_src

  Arch:

   - A set of constraints fixes for the Intel uncore PMU

   - The usual set of small fixes and improvements for x86 and PPC"

* tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
  powerpc/perf: Fix data source encodings for L2.1 and L3.1 accesses
  tools/perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove an extra line
  perf/core: Allow ftrace for functions in kernel/event/core.c
  perf/x86: Add new event for AUX output counter index
  perf/x86: Add compiler barrier after updating BTS
  perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
  perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
  perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
  perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
  perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
  perf/x86/intel/uncore: Fix invalid unit check
  perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
2021-11-01 13:12:15 -07:00
Linus Torvalds
5a47ebe98e Updates for the interrupt subsystem:
Core changes:
 
   - Prevent a potential deadlock when initial priority is assigned to a
     newly created interrupt thread. A recent change to plug a race between
     cpuset and __sched_setscheduler() introduced a new lock dependency
     which is now triggered. Break the lock dependency chain by moving the
     priority assignment to the thread function.
 
   - A couple of small updates to make the irq core RT safe.
 
   - Confine the irq_cpu_online/offline() API to the only left unfixable
     user Cavium Octeon so that it does not grow new usage.
 
   - A small documentation update
 
  Driver changes:
 
   - A large cross architecture rework to move irq_enter/exit() into the
     architecture code to make addressing the NOHZ_FULL/RCU issues simpler.
 
   - The obligatory new irq chip driver for Microchip EIC
 
   - Modularize a few irq chip drivers
 
   - Expand usage of devm_*() helpers throughout the driver code
 
   - The usual small fixes and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF+8BUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoWs2EACeNbL93aIFokd2/RllRSr4VvMjKNyW
 PpA0RYDOz1Jh4ldK+7b/EYapKgAkR3yyOtz+jyjRE7jsQK0pQeLtYNLd3cTzsD7K
 LCvl8rq6cbRqyFoSC15UKKNbQ/f+o/3LeGPoipr5NQZRMepxk2J/yBCNRXHvIbe6
 oLMQJUgw7KKtvCrCUX9OSei4F09T1qsNrIYb7QafP5+v0zndAT7uKNivWrKGFrsh
 Uk9epoH3hIkvQERkpmzwJEJaq6oyqhoYQy7ZRGayEPwIdCyivJGZrVX0mZk1LX58
 uc8u5grIslX9MqZEQWBweR5y7nISB494NGKmoCInu66U/+3DSOg3AGH2Rfw8PNFZ
 lMKdXzYoDgv2y6LeiLtTUKV4K1NBRXo0BhwSGbPw0o6C03/x003kG824Y+/naU75
 6q05BZSia1PagPV3e0UAm0A2Rnjj/5uso2fEk0eGBSGM27jf9SQcSE8DVrEiLRd1
 2N5uAXbMdfu4xACsEI1Uxu1KNOSQnUhBCy0X6Ppj1a083kLG7jg/126ebb05R8G4
 MF79PFt+xUPSzmuKc/xwCdANtW+zzoyjYl5w6mwELBJ9veNbPShokGBTN/qzjXKZ
 vdr3/pXx95lRAzFnGOnETesm3IyObruU4K8NbMKd2b+eYa0w1WuZCKnutGLfsqxg
 byhCEw459e3P2g==
 =r6ln
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates for the interrupt subsystem:

  Core changes:

   - Prevent a potential deadlock when initial priority is assigned to a
     newly created interrupt thread. A recent change to plug a race
     between cpuset and __sched_setscheduler() introduced a new lock
     dependency which is now triggered. Break the lock dependency chain
     by moving the priority assignment to the thread function.

   - A couple of small updates to make the irq core RT safe.

   - Confine the irq_cpu_online/offline() API to the only left unfixable
     user Cavium Octeon so that it does not grow new usage.

   - A small documentation update

  Driver changes:

   - A large cross architecture rework to move irq_enter/exit() into the
     architecture code to make addressing the NOHZ_FULL/RCU issues
     simpler.

   - The obligatory new irq chip driver for Microchip EIC

   - Modularize a few irq chip drivers

   - Expand usage of devm_*() helpers throughout the driver code

   - The usual small fixes and improvements all over the place"

* tag 'irq-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  h8300: Fix linux/irqchip.h include mess
  dt-bindings: irqchip: renesas-irqc: Document r8a774e1 bindings
  MIPS: irq: Avoid an unused-variable error
  genirq: Hide irq_cpu_{on,off}line() behind a deprecated option
  irqchip/mips-gic: Get rid of the reliance on irq_cpu_online()
  MIPS: loongson64: Drop call to irq_cpu_offline()
  irq: remove handle_domain_{irq,nmi}()
  irq: remove CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY
  irq: riscv: perform irqentry in entry code
  irq: openrisc: perform irqentry in entry code
  irq: csky: perform irqentry in entry code
  irq: arm64: perform irqentry in entry code
  irq: arm: perform irqentry in entry code
  irq: add a (temporary) CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY
  irq: nds32: avoid CONFIG_HANDLE_DOMAIN_IRQ
  irq: arc: avoid CONFIG_HANDLE_DOMAIN_IRQ
  irq: add generic_handle_arch_irq()
  irq: unexport handle_irq_desc()
  irq: simplify handle_domain_{irq,nmi}()
  irq: mips: simplify do_domain_IRQ()
  ...
2021-11-01 13:09:10 -07:00
John Johansen
dc155617fa apparmor: Fix internal policy capable check for policy management
The check was incorrectly treating a returned error as a boolean.

Fixes: 31ec99e133 ("apparmor: switch to apparmor to internal capable check for policy management")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2021-11-01 13:05:40 -07:00
Linus Torvalds
037c50bfbe for-5.16-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmF/7PAACgkQxWXV+ddt
 WDtp6A//SbVYeuHWpsXkhBiOpJt2PpS1K8VY5LIJc3brua5EZm8IarlR57X9IqYu
 89ZlWnuANrw4d5RRiIO+NYhc+DR6+ydxHesJG+I2B+o5OnR0Ynb06gLhsP1tSK6y
 lYZORQFJZP051ODU/uEc8A0KZN7DySIUmqezAibfyxepF6oPEap0nFp17/B80tWp
 sKdMp2TBN5ymZwsdSK1nZ7ws1ZL57HgkFDPqp8m8CuPTkneG4CtNol6yUpuPExpL
 QzvQsqTygmiFoy0uNTG7Rg7IlKqEuhbR7lwfkmcBZCV66JmhFco5QhxN13QIn42s
 +YSug52SMWc8YVHIEj16xtBgHEqZXWYey8d2ewhc0tDSGDm0HmXCNjcn1vYr0NJr
 5bW/7/3bpkHYejasy1wDEK5P8Uo2xsgpRyAvuEReGoRi8ze66EohahvP3o7YJi/Q
 o0pROXdCT89JbM/T4MTvN/5MUlCSM7rnexXZ39ldGNacPgn9FAUCPw6KtzKKyVRe
 DF19nPOUXSg6SLECbVkRQUwcOjxOTFP+T0Jx61Um8bomFskYJJnmr4SD3pqlzgp7
 NxV5ad0+r7zU0x9MADkyqboObo0ROAfD4hthcZiRN+0UIK+Gq5nATTD5ur6/nwsT
 0PJGOXDPz7cmfqUdmvpA0ctRxbFEqpaz6sDh7nq/iUSmaGITcUM=
 =HvYu
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "The updates this time are more under the hood and enhancing existing
  features (subpage with compression and zoned namespaces).

  Performance related:

   - misc small inode logging improvements (+3% throughput, -11% latency
     on sample dbench workload)

   - more efficient directory logging: bulk item insertion, less tree
     searches and locking

   - speed up bulk insertion of items into a b-tree, which is used when
     logging directories, when running delayed items for directories
     (fsync and transaction commits) and when running the slow path
     (full sync) of an fsync (bulk creation run time -4%, deletion -12%)

  Core:

   - continued subpage support
      - make defragmentation work
      - make compression write work

   - zoned mode
      - support ZNS (zoned namespaces), zone capacity is number of
        usable blocks in each zone
      - add dedicated block group (zoned) for relocation, to prevent
        out of order writes in some cases
      - greedy block group reclaim, pick the ones with least usable
        space first

   - preparatory work for send protocol updates

   - error handling improvements

   - cleanups and refactoring

  Fixes:

   - lockdep warnings
      - in show_devname callback, on seeding device
      - device delete on loop device due to conversions to workqueues

   - fix deadlock between chunk allocation and chunk btree modifications

   - fix tracking of missing device count and status"

* tag 'for-5.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (140 commits)
  btrfs: remove root argument from check_item_in_log()
  btrfs: remove root argument from add_link()
  btrfs: remove root argument from btrfs_unlink_inode()
  btrfs: remove root argument from drop_one_dir_item()
  btrfs: clear MISSING device status bit in btrfs_close_one_device
  btrfs: call btrfs_check_rw_degradable only if there is a missing device
  btrfs: send: prepare for v2 protocol
  btrfs: fix comment about sector sizes supported in 64K systems
  btrfs: update device path inode time instead of bd_inode
  fs: export an inode_update_time helper
  btrfs: fix deadlock when defragging transparent huge pages
  btrfs: sysfs: convert scnprintf and snprintf to sysfs_emit
  btrfs: make btrfs_super_block size match BTRFS_SUPER_INFO_SIZE
  btrfs: update comments for chunk allocation -ENOSPC cases
  btrfs: fix deadlock between chunk allocation and chunk btree modifications
  btrfs: zoned: use greedy gc for auto reclaim
  btrfs: check-integrity: stop storing the block device name in btrfsic_dev_state
  btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls
  btrfs: add a btrfs_get_dev_args_from_path helper
  btrfs: handle device lookup with btrfs_dev_lookup_args
  ...
2021-11-01 12:48:25 -07:00
Linus Torvalds
2cf3f8133b btrfs: fix lzo_decompress_bio() kmap leakage
Commit ccaa66c8dd reinstated the kmap/kunmap that had been dropped in
commit 8c945d32e6 ("btrfs: compression: drop kmap/kunmap from lzo").

However, it seems to have done so incorrectly due to the change not
reverting cleanly, and lzo_decompress_bio() ended up not having a
matching "kunmap()" to the "kmap()" that was put back.

Also, any assert that the page pointer is not NULL should be before the
kmap() of said pointer, since otherwise you'd just oops in the kmap()
before the assert would even trigger.

I noticed this when trying to verify my btrfs merge, and things not
adding up.  I'm doing this fixup before re-doing my merge, because this
commit needs to also be backported to 5.15 (after verification from the
btrfs people).

Fixes: ccaa66c8dd ("Revert 'btrfs: compression: drop kmap/kunmap from lzo'")
Cc: David Sterba <dsterba@suse.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-01 12:46:47 -07:00
J. Bruce Fields
6d91929a6f nfsd: document server-to-server-copy parameters
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-11-01 15:18:39 -04:00
Linus Walleij
c738888032 watchdog: db8500_wdt: Rename symbols
For conistency and clarity, rename all symbols and strings from
ux500 to db8500 in the driver.

Cc: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210922230947.1864357-3-linus.walleij@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-11-01 20:18:09 +01:00
Linus Walleij
d0305aac8e watchdog: db8500_wdt: Rename driver
This driver is named after the ambition to support more SoCs than
the DB8500. Those were never produced, so cut down the scope and
rename the driver accordingly. Since the Kconfig for the watchdog
defaults to y this will still be built by default.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210922230947.1864357-2-linus.walleij@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-11-01 20:18:08 +01:00
Linus Walleij
74128d801b watchdog: ux500_wdt: Drop platform data
Drop the platform data passing from the PRCMU driver. This platform
data was part of the ambition to support more SoCs, which in turn
were never mass produced.

Only a name remains of the MFD cell so switch to MFD_CELL_NAME().

Cc: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210922230947.1864357-1-linus.walleij@linaro.org
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-11-01 20:18:08 +01:00
Linus Torvalds
9c6e8d52a7 Description for this pull request:
- Fix ->i_blocks truncation issue caused by wrong 32bit mask.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE6NzKS6Uv/XAAGHgyZwv7A1FEIQgFAmF/KeYWHGxpbmtpbmpl
 b25Aa2VybmVsLm9yZwAKCRBnC/sDUUQhCHlKEADF2SpjT2NJb9H+H8bwLwLA8uF9
 00dDUjcZOwqiEjpeVXwRf/6dIdZWv40+xtl3r0t7I9q0yh83nDgu9PvDmHNdN4/4
 iLZOGh5Y6OfAm/uN1D5ZQ39XPZ9z5G0f3Ht+Hn2L+0Cyb8Nn9SjhIrbARK3f4wr2
 1YbbeYe0NzFCwveeTBCfHQRzr3+6wsc0qL3lNxJ20m3f3KFGzscQZRrNF5KNeZRW
 S4JyK0IxfcVQGLOVu5ytpNtMmE2wErJ1wNnhj9Qiqd05FchNd3ciRg5Sg/fbcBbE
 IhMZM9VFcIwJyolqTosSWk9bC6h/mc68h1AUYv5mm3+WDqVFuXmGHP/gDYvqjGSp
 Ks2/7elrTrj+IsxTmt0On9bQCpzEm47kBapsltBPSDupMGIZ+k7JHRMV+2FnVztC
 A812kJnsy9mtDA2UmCDpeKb8KQchpTUIILjP3nn4uL2tA4QOKVr8M7GLxErqbaii
 gE9NnvZ1A0Jh/U833icpZrvpDG3+PMizciMc5h7XZCK00N++As/QrimHvfkr+mpv
 s0oM+xVgre8RGpV21yJdUmiYh4EaFmHAe4dBrVoG2x/lsmGvS/8VeGdXKsQPcLsP
 dwNwJ2XRf6yqcubVHkBQ/EIYwaYzst7haTB1xBl8LFVFnlWZlSD1MVENVJK2n8kG
 cEUWhbvqfmxm1sDj3A==
 =QC9p
 -----END PGP SIGNATURE-----

Merge tag 'exfat-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat fix from Namjae Jeon:
 "Fix ->i_blocks truncation issue caused by wrong 32bit mask"

* tag 'exfat-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix incorrect loading of i_blocks for large files
2021-11-01 11:45:04 -07:00
Oscar Carter
ebe4560ed5 firewire: Remove function callback casts
In 1394 OHCI specification, Isochronous Receive DMA context has several
modes. One of mode is 'BufferFill' and Linux FireWire stack uses it to
receive isochronous packets for multiple isochronous channel as
FW_ISO_CONTEXT_RECEIVE_MULTICHANNEL.

The mode is not used by in-kernel driver, while it's available for
userspace. The character device driver in firewire-core includes
cast of function callback for the mode since the type of callback
function is different from the other modes. The case is inconvenient
to effort of Control Flow Integrity builds due to
-Wcast-function-type warning.

This commit removes the cast. A static helper function is newly added
to initialize isochronous context for the mode. The helper function
arranges isochronous context to assign specific callback function
after call of existent kernel API. It's noticeable that the number of
isochronous channel, speed, and the size of header are not required for
the mode. The helper function is used for the mode by character device
driver instead of direct call of existent kernel API.

The same goal can be achieved (in the ioctl_create_iso_context function)
without this helper function as follows:
- Call the fw_iso_context_create function passing NULL to the callback
  parameter.
- Then setting the context->callback.sc or context->callback.mc
  variables based on the a->type value.

However using the helper function created in this patch makes code more
clear and declarative. This way avoid the call to a function with one
purpose to achieved another one.

Co-developed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Co-developed-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Oscar Carter <oscar.carter@gmx.com>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Testeb-by: Takashi Sakamoto<o-takashi@sakamocchi.jp>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-11-01 13:44:26 -05:00
Linus Torvalds
67a135b80e Changes since last update:
- support multiple devices for multi-layer container images;
 
  - support the secondary compression head;
 
  - support readmore decompression strategy;
 
  - support new LZMA algorithm (specifically called MicroLZMA);
 
  - some bugfixes & cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYX8j7hEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBE+SAQChAmAUav03OQujm8PvVNX7VUGusGNvww8E
 qu5+zasC8wEArypW2Z75ZZ3IZNPCk6QWFlaC2I5Xnz7NNl0OGPKOCAg=
 =DZQ4
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "There are some new features available for this cycle. Firstly, EROFS
  LZMA algorithm support, specifically called MicroLZMA, is available as
  an option for embedded devices, LiveCDs and/or as the secondary
  auxiliary compression algorithm besides the primary algorithm in one
  file.

  In order to better support the LZMA fixed-sized output compression,
  especially for 4KiB pcluster size (which has lowest memory pressure
  thus useful for memory-sensitive scenarios), Lasse introduced a new
  LZMA header/container format called MicroLZMA to minimize the original
  LZMA1 header (for example, we don't need to waste 4-byte dictionary
  size and another 8-byte uncompressed size, which can be calculated by
  fs directly, for each pcluster) and enable EROFS fixed-sized output
  compression.

  Note that MicroLZMA can also be later used by other things in addition
  to EROFS too where wasting minimal amount of space for headers is
  important and it can be only compiled by enabling XZ_DEC_MICROLZMA.
  MicroLZMA has been supported by the latest upstream XZ embedded [1] &
  XZ utils [2], apply the latest related XZ embedded upstream patches by
  the XZ author Lasse here.

  Secondly, multiple device is also supported in this cycle, which is
  designed for multi-layer container images. By working together with
  inter-layer data deduplication and compression, we can achieve the
  next high-performance container image solution. Our team will announce
  the new Nydus container image service [3] implementation with new RAFS
  v6 (EROFS-compatible) format in Open Source Summit 2021 China [4]
  soon.

  Besides, the secondary compression head support and readmore
  decompression strategy are also included in this cycle. There are also
  some minor bugfixes and cleanups, as always.

  Summary:

   - support multiple devices for multi-layer container images;

   - support the secondary compression head;

   - support readmore decompression strategy;

   - support new LZMA algorithm (specifically called MicroLZMA);

   - some bugfixes & cleanups"

* tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: don't trigger WARN() when decompression fails
  erofs: get rid of ->lru usage
  erofs: lzma compression support
  erofs: rename some generic methods in decompressor
  lib/xz, lib/decompress_unxz.c: Fix spelling in comments
  lib/xz: Add MicroLZMA decoder
  lib/xz: Move s->lzma.len = 0 initialization to lzma_reset()
  lib/xz: Validate the value before assigning it to an enum variable
  lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
  erofs: introduce readmore decompression strategy
  erofs: introduce the secondary compression head
  erofs: get compression algorithms directly on mapping
  erofs: add multiple device support
  erofs: decouple basic mount options from fs_context
  erofs: remove the fast path of per-CPU buffer decompression
2021-11-01 11:39:22 -07:00
Linus Torvalds
cd3e8ea847 fscrypt updates for 5.16
Some cleanups for fs/crypto/:
 
 - Allow 256-bit master keys with AES-256-XTS
 
 - Improve documentation and comments
 
 - Remove unneeded field fscrypt_operations::max_namelen
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYX8U4hQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyXYAP0d7BNuKsMyw6qlzLMxbaO5wdTg2HaD
 04ApVeHM6qp7IQEA/Ve2Mr+BcPOZ7E6io8haZtXs0MrRMYeessKWcWMCdQ0=
 =2WNZ
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt updates from Eric Biggers:
 "Some cleanups for fs/crypto/:

   - Allow 256-bit master keys with AES-256-XTS

   - Improve documentation and comments

   - Remove unneeded field fscrypt_operations::max_namelen"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: improve a few comments
  fscrypt: allow 256-bit master keys with AES-256-XTS
  fscrypt: improve documentation for inline encryption
  fscrypt: clean up comments in bio.c
  fscrypt: remove fscrypt_operations::max_namelen
2021-11-01 11:36:35 -07:00
Jason Gunthorpe
6a463bc9d9 Merge branch 'for-rc' into rdma.git for-next
Patches held over for a possible rc8.

* for-rc:
  RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
  RDMA/hns: Modify the value of MAX_LP_MSG_LEN to meet hardware compatibility
  RDMA/hns: Fix initial arm_st of CQ

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 15:08:19 -03:00
Jason Gunthorpe
a2a2a69d14 Linux 5.15
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF/AjYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1hkIAJ6sFDbvb4M4LMwf
 Slh2NVL9o5sLMBDzVwnVlyMSKDbMn1WBKreGssaLgZjGDc74lxsdSmw5l9MZm0JN
 xlq95Q6XFiuu+0qDHPWwfDz3JFO4TqW2ZLLPWk9NnkNbRXqccSrlVRi1RpgE1t3/
 NUtS8CQLu6A2BYMc6mkk3aV6IwSNKOkWbM5eBHSvU4j8B6lLbNQop0AfO/wyY1xB
 U6LiVE1RpN/b7Yv+75ITtNzuHzVIBx6305FvSnOlKbMKKvIClt96Vd2OeuoEkK+6
 wGU8JraB1+fc0GckAhynNrjWQWdvi0MAhFWWEJxjS20OGcV1rXDduNfkVNauO1Zn
 +dNyJ3s=
 =g9fz
 -----END PGP SIGNATURE-----

Merge tag 'v5.15' into rdma.git for-next

Pull in the accepted for-rc patches as the next merge needs a newer base.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 14:49:20 -03:00
Zhu Yanjun
9ed8110c9b RDMA/irdma: optimize rx path by removing unnecessary copy
In the function irdma_post_recv, the function irdma_copy_sg_list is
not needed since the struct irdma_sge and ib_sge have the similar
member variables. The struct irdma_sge can be replaced with the
struct ib_sge totally.

This can increase the rx performance of irdma.

Link: https://lore.kernel.org/r/20211030104226.253346-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 14:39:33 -03:00
Michał Mirosław
7552750d04 dm table: log table creation error code
Help debugging table creation errors by adding the error name in the log.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:52 -04:00
Michał Mirosław
c7c879eedc dm: make workqueue names device-specific
Add device number to kdmflush workqueue name to help debugging CPU usage.

Resulting `ps axfu` snippet:

root        3791  0.0  0.0      0     0 ?        I<   paź19   0:00  \_ [kdmflush/253:7]
root        3792  0.0  0.0      0     0 ?        I<   paź19   0:00  \_ [kcryptd_io/253:7]
root        3793  0.0  0.0      0     0 ?        I<   paź19   0:00  \_ [kcryptd/253:7]
root        3794  0.0  0.0      0     0 ?        S    paź19   0:00  \_ [dmcrypt_write/253:7]
root        3814  0.0  0.0      0     0 ?        I<   paź19   0:00  \_ [kdmflush/253:8]
root        3815  0.0  0.0      0     0 ?        I<   paź19   0:00  \_ [kdmflush/253:9]
root        3816  0.0  0.0      0     0 ?        I<   paź19   0:00  \_ [kdmflush/253:10]

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:51 -04:00
Cai Huoqing
f635237a9b dm writecache: Make use of the helper macro kthread_run()
Replace kthread_create/wake_up_process() with kthread_run()
to simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:50 -04:00
Cai Huoqing
a5217c1105 dm crypt: Make use of the helper macro kthread_run()
Replace kthread_create/wake_up_process() with kthread_run()
to simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:49 -04:00
Christoph Hellwig
30495e688d dm verity: use bvec_kmap_local in verity_for_bv_block
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:48 -04:00
Christoph Hellwig
27db271708 dm log writes: use memcpy_from_bvec in log_writes_map
Use memcpy_from_bvec instead of open coding the logic.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:47 -04:00
Christoph Hellwig
25058d1c72 dm integrity: use bvec_kmap_local in __journal_read_write
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:46 -04:00
Christoph Hellwig
c12d205dae dm integrity: use bvec_kmap_local in integrity_metadata
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:45 -04:00
Luis Chamberlain
089975379d dm: add add_disk() error handling
We never checked for errors on add_disk() as this function returned
void. Now that this is fixed, use the shiny new error handling.

There are two calls to dm_setup_md_queue() which can fail then, one on
dm_early_create() and we can easily see that the error path there
calls dm_destroy in the error path. The other use case is on the ioctl
table_load case. If that fails userspace needs to call the
DM_DEV_REMOVE_CMD to cleanup the state - similar to any other
failure.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:44 -04:00
Christophe JAILLET
ea3dba3052 dm: Remove redundant flush_workqueue() calls
destroy_workqueue() already drains the queue before destroying it, so
there is no need to flush it explicitly.

Remove the redundant flush_workqueue() calls.

This was generated with coccinelle:

@@
expression E;
@@
- 	flush_workqueue(E);
	destroy_workqueue(E);

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:43 -04:00
Randy Dunlap
603bdf5d6c kernel-doc: support DECLARE_PHY_INTERFACE_MASK()
Support the DECLARE_PHY_INTERFACE_MASK() macro that is used to declare
a bitmap by converting the macro to DECLARE_BITMAP(), as has been done
for the __ETHTOOL_DECLARE_LINK_MODE_MASK() macro.

This fixes a 'make htmldocs' warning:

include/linux/phylink.h:82: warning: Function parameter or member 'DECLARE_PHY_INTERFACE_MASK(supported_interfaces' not described in 'phylink_config'

that was introduced by commit
  38c310eb46 ("net: phylink: add MAC phy_interface_t bitmap")

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Russell King (Oracle) <linux@armlinux.org.uk>
Link: https://lore.kernel.org/r/45934225-7942-4326-f883-a15378939db9@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-01 11:28:35 -06:00
Yanteng Si
75ca80e4c4 docs/zh_CN: add core-api xarray translation
Translate Documentation/core-api/xarray.rst into Chinese

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Reviewed-by: Alex Shi <alexs@kernel.org>
Link: https://lore.kernel.org/r/2a125bcb3220e7c1b72ae87bcad1b225dd950338.1634358018.git.siyanteng@loongson.cn
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-01 11:25:33 -06:00
Yanteng Si
5876a638c8 docs/zh_CN: add core-api assoc_array translation
Translate Documentation/core-api/assoc_array.rst into Chinese.

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Reviewed-by: Alex Shi <alexs@kernel.org>
Link: https://lore.kernel.org/r/860ac85d9a2a83c2b63eb8d1be929ad64280d7b2.1634358018.git.siyanteng@loongson.cn
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-01 11:25:33 -06:00
Linus Torvalds
19901165d9 for-5.16/inode-sync-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8MEkQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkWyEACBp3TltQu/jvyFlCzuOQJqpIqVw6ZeRn9h
 0cYZaYsRzNBTzIOKogpmhT3lWYOMxIbFMq6RyzLCPaQz6juEP+tmQIdLdPMxC5ON
 XdzItF0bMaLzoW0IRK21/aF1s/7UFcr1OLT0BT8F0umeQQXcEOOSim4kZuK9u6mS
 4pOvh61yXeB7UZxDOpMqH3aVlwrLjIr51j0ECGx/Qz1OZtXREQSeptlRUKEKVTXB
 uYPCB9FLL6ZWFyiDAuaiO4Gi//dhpoOe7Yich9m0tbtfei8gl74TqgzeaCBu+gFj
 aRyfwhyvFcm69MJqPGmRBDVxtXVC6ofjd4G6PSG8R/cAuAgPFywL/s0ETmjUJBvY
 HqnExUnMcr8FUHGIfYHmX7EWCAtD+FbpUSnCgWH2ulUhziKFR/LLE/ZYayPbhrgL
 aA89BYpeDS/POc94KXJJON/Ux612vGwhJxVsngYBEboYNeiP7YwsaQapU9RsKp0o
 YTlhz8zFuToUPEh6BQLYuOZek5AsEue5o7525Aj0vdjpxH/qH6JhjE790c7yWhL+
 hbxlTAAdqdVO2Xxrr3qdMXBUI3wnFKKu8Z6+oqi7ujQRKJZmLnXYn4ZkNRs6C858
 3NEW0mySPHxNRCZrt2M7zWmoq/eZtcJIzPy4JMW3xkQgqgdImuT1z7PrgRDw6/h8
 GB382CO2AQ==
 =AKpp
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/inode-sync-2021-10-29' of git://git.kernel.dk/linux-block

Pull block inode sync updates from Jens Axboe:
 "This contains improvements to how bdev inode syncing is handled,
  unifying the API"

* tag 'for-5.16/inode-sync-2021-10-29' of git://git.kernel.dk/linux-block:
  block: simplify the block device syncing code
  ntfs3: use sync_blockdev_nowait
  fat: use sync_blockdev_nowait
  btrfs: use sync_blockdev
  xen-blkback: use sync_blockdev
  block: remove __sync_blockdev
  fs: remove __sync_filesystem
2021-11-01 10:25:27 -07:00
Colin Ian King
d64fbe9f50 speakup: Fix typo in documentation "boo" -> "boot"
There is a typo in the speakup documentation. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Link: https://lore.kernel.org/r/20211028182319.613315-1-colin.i.king@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-01 11:17:21 -06:00