Commit graph

84385 commits

Author SHA1 Message Date
Song Liu
fe736565ef bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
Introduce bpf_arch_text_invalidate and use it to fill unused part of the
bpf_prog_pack with illegal instructions when a BPF program is freed.

Fixes: 57631054fa ("bpf: Introduce bpf_prog_pack allocator")
Fixes: 33c9805860 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220520235758.1858153-4-song@kernel.org
2022-05-23 23:08:11 +02:00
Linus Torvalds
5dc921868c for-5.19/drivers-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKrTcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgph/REAC0/7odRfJeTJ1PkJhSKFc7dhyS7rK4du2s
 3+z+H6Yeua2yVIJb0mYYGEJcOUUQ9nD2T9424n3NzDOw88U4y8Vg2YEH+UiJBuj4
 AJoxPNkQdxL7WzmwHmRNLCcOOFhISLqWiCJSr45d+LP1f6aO24Q9lewYWxtNA4TW
 mqb7Ne7e3Z77m9rmsCsZ26bzQHg1EEQ6qgjZM9tqMhOeTqYhmrqfrD9KtG8TIkpK
 N8277E5QcequHf7v6VpKqEOzf3d2kx55JaZdu+oxLPVMED3wJJFwcYF1/xmM7Fgx
 tp7xCjqqUHXwKvJNCFJpnvw+cXu0Ct7cWOIG4ROCvaTD4vBI1KzZLc0gO7pKFW0Y
 hNIlMXr4n8PmonS81tMV4TqmRWxedX/jxuaeJCVNr89PqYU4luPpigJZqv7rlGry
 KZUlktQot22M/7FC2MS6KhgbQKLPrRGTAEyY/JNwBHckCZiduWQFlmKLQ926xQIJ
 6vdjSzHK5MrT/d+yow3bGFxAJWloGJ+L+RsH0b+WikF81+6ic9P3AoStgbVilfKD
 6sbjcju8SShDlQ+W/Ocm0rHC+i/RDKT3QqItXgfhA/1FfMPODQGc/xcZg+AdTswn
 VSnUIkvk9/mTO0StilVfNJDfG1QkSpJ5Ilvs/DnIahZj6IG4QbJvtnVNbmQX6ptz
 AUB4DdGwXg==
 =geQL
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:
 "Here are the driver updates queued up for 5.19. This contains:

   - NVMe pull requests via Christoph:
       - tighten the PCI presence check (Stefan Roese)
       - fix a potential NULL pointer dereference in an error path (Kyle
         Miller Smith)
       - fix interpretation of the DMRSL field (Tom Yan)
       - relax the data transfer alignment (Keith Busch)
       - verbose error logging improvements (Max Gurtovoy, Chaitanya
         Kulkarni)
       - misc cleanups (Chaitanya Kulkarni, Christoph)
       - set non-mdts limits in nvme_scan_work (Chaitanya Kulkarni)
       - add support for TP4084 - Time-to-Ready Enhancements (Christoph)

   - MD pull request via Song:
       - Improve annotation in raid5 code, by Logan Gunthorpe
       - Support MD_BROKEN flag in raid-1/5/10, by Mariusz Tkaczyk
       - Other small fixes/cleanups

   - null_blk series making the configfs side much saner (Damien)

   - Various minor drbd cleanups and fixes (Haowen, Uladzislau, Jiapeng,
     Arnd, Cai)

   - Avoid using the system workqueue (and hence flushing it) in rnbd
     (Jack)

   - Avoid using the system workqueue (and hence flushing it) in aoe
     (Tetsuo)

   - Series fixing discard_alignment issues in drivers (Christoph)

   - Small series fixing drivers poking at disk->part0 for openers
     information (Christoph)

   - Series fixing deadlocks in loop (Christoph, Tetsuo)

   - Remove loop.h and add SPDX headers (Christoph)

   - Various fixes and cleanups (Julia, Xie, Yu)"

* tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-block: (72 commits)
  mtip32xx: fix typo in comment
  nvme: set non-mdts limits in nvme_scan_work
  nvme: add support for TP4084 - Time-to-Ready Enhancements
  nvme: split the enum used for various register constants
  nbd: Fix hung on disconnect request if socket is closed before
  nvme-fabrics: add a request timeout helper
  nvme-pci: harden drive presence detect in nvme_dev_disable()
  nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags
  nvme: mark internal passthru request RQF_QUIET
  nvme: remove unneeded include from constants file
  nvme: add missing status values to verbose logging
  nvme: set dma alignment to dword
  nvme: fix interpretation of DMRSL
  loop: remove most the top-of-file boilerplate comment from the UAPI header
  loop: remove most the top-of-file boilerplate comment
  loop: add a SPDX header
  loop: remove loop.h
  block: null_blk: Improve device creation with configfs
  block: null_blk: Cleanup messages
  block: null_blk: Cleanup device creation and deletion
  ...
2022-05-23 14:04:14 -07:00
Linus Torvalds
115cd47132 for-5.19/block-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKrUsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpgDjD/44hY9h0JsOLoRH1IvFtuaH6n718JXuqG17
 hHCfmnAUVqj2jT00IUbVlUTd905bCGpfrodBL3PAmPev1zZHOUd/MnJKrSynJ+/s
 NJEMZQaHxLmocNDpJ1sZo7UbAFErsZXB0gVYUO8cH2bFYNu84H1mhRCOReYyqmvQ
 aIAASX5qRB/ciBQCivzAJl2jTdn4WOn5hWi9RLidQB7kSbaXGPmgKAuN88WI4H7A
 zQgAkEl2EEquyMI5tV1uquS7engJaC/4PsenF0S9iTyrhJLjneczJBJZKMLeMR8d
 sOm6sKJdpkrfYDyaA4PIkgmLoEGTtwGpqGHl4iXTyinUAxJoca5tmPvBb3wp66GE
 2Mr7pumxc1yJID2VHbsERXlOAX3aZNCowx2gum2MTRIO8g11Eu3aaVn2kv37MBJ2
 4R2a/cJFl5zj9M8536cG+Yqpy0DDVCCQKUIqEupgEu1dyfpznyWH5BTAHXi1E8td
 nxUin7uXdD0AJkaR0m04McjS/Bcmc1dc6I8xvkdUFYBqYCZWpKOTiEpIBlHg0XJA
 sxdngyz5lSYTGVA4o4QCrdR0Tx1n36A1IYFuQj0wzxBJYZ02jEZuII/A3dd+8hiv
 EY+VeUQeVIXFFuOcY+e0ScPpn7Nr17hAd1en/j2Hcoe4ZE8plqG2QTcnwgflcbis
 iomvJ4yk0Q==
 =0Rw1
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
 "Here are the core block changes for 5.19. This contains:

   - blk-throttle accounting fix (Laibin)

   - Series removing redundant assignments (Michal)

   - Expose bio cache via the bio_set, so that DM can use it (Mike)

   - Finish off the bio allocation interface cleanups by dealing with
     the weirdest member of the family. bio_kmalloc combines a kmalloc
     for the bio and bio_vecs with a hidden bio_init call and magic
     cleanup semantics (Christoph)

   - Clean up the block layer API so that APIs consumed by file systems
     are (almost) only struct block_device based, so that file systems
     don't have to poke into block layer internals like the
     request_queue (Christoph)

   - Clean up the blk_execute_rq* API (Christoph)

   - Clean up various lose end in the blk-cgroup code to make it easier
     to follow in preparation of reworking the blkcg assignment for bios
     (Christoph)

   - Fix use-after-free issues in BFQ when processes with merged queues
     get moved to different cgroups (Jan)

   - BFQ fixes (Jan)

   - Various fixes and cleanups (Bart, Chengming, Fanjun, Julia, Ming,
     Wolfgang, me)"

* tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block: (83 commits)
  blk-mq: fix typo in comment
  bfq: Remove bfq_requeue_request_body()
  bfq: Remove superfluous conversion from RQ_BIC()
  bfq: Allow current waker to defend against a tentative one
  bfq: Relax waker detection for shared queues
  blk-cgroup: delete rcu_read_lock_held() WARN_ON_ONCE()
  blk-throttle: Set BIO_THROTTLED when bio has been throttled
  blk-cgroup: Remove unnecessary rcu_read_lock/unlock()
  blk-cgroup: always terminate io.stat lines
  block, bfq: make bfq_has_work() more accurate
  block, bfq: protect 'bfqd->queued' by 'bfqd->lock'
  block: cleanup the VM accounting in submit_bio
  block: Fix the bio.bi_opf comment
  block: reorder the REQ_ flags
  blk-iocost: combine local_stat and desc_stat to stat
  block: improve the error message from bio_check_eod
  block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone
  block: remove superfluous calls to blkcg_bio_issue_init
  kthread: unexport kthread_blkcg
  blk-cgroup: cleanup blkcg_maybe_throttle_current
  ...
2022-05-23 13:56:39 -07:00
Linus Torvalds
f6792c877a for-5.19/cdrom-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKrNUQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpjZREACGej0LutZ5t3ar0WxFT9N8AhRR8y7v0Ard
 U66a7gVlObfnq/P2S+dIEQ6nKda3q5MzRLr2uoMI1xDpTlngwafkGpln9g32QwU2
 yqpQRM1uAtGUuNCtloYmX54nWYmHG69QAxcrxeYZK4Z4NB3A2TIQSGHLVg6CfUq7
 HxwgCBj0GI/VaxunnuTpMXgjGDVe1yq5iwCSr4VlG9xxHd3ySvoonZ/JnAaWwZqR
 G8/fHPL8A235QVtPnqxKZ0HoGYLRxY8pO3O9sWaNXQ8tVGerw0lShlvuyhig8DCn
 CXFQ011PuR4dxrDdz2GCIl/s8bVUofr0zsonJunHPgawJvCKGsN/C8rTSqc2lWa3
 C0JeBy+kapRHygGZQgC0W/ApPPsE7KrqLMUtcz60sLR1ziYmRXfFWAxQNsQ35xUR
 cdzFGFEqmBMjGA5NI8bcVB8SfueKBG5Q3w0sBcF+L7EaHKkBN82fK3pBbBA6SXGT
 36JjA5M3b2Wv7ZYRyqPLOtMO6vbehQ/AJekZzY4Uc4eUFeervv0EHjXr7bzH2h3N
 kuge1MfcrLmEGPUX0erkrV3I7cLWSc+vnkDb767RItEBb2HWfRU/4z1wzjppVN3/
 Y2f6N+oF/JVnRztYO8DMwwZlTPa4ojyXDkv0uDk0fO0ZEeGUByTLn6FR6S1Mkh0z
 B7DwhDQk0Q==
 =f7O3
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/cdrom-2022-05-22' of git://git.kernel.dk/linux-block

Pull cdrom updates from Jens Axboe:
 "Removal of unused code and documentation updates"

* tag 'for-5.19/cdrom-2022-05-22' of git://git.kernel.dk/linux-block:
  cdrom: remove obsolete TODO list
  block: remove last remaining traces of IDE documentation
  cdrom: mark CDROMGETSPINDOWN/CDROMSETSPINDOWN obsolete
  cdrom: remove the unused driver specific disc change ioctl
  cdrom: make EXPORT_SYMBOL follow exported function
2022-05-23 13:52:14 -07:00
Linus Torvalds
9836e93c0a for-5.19/io_uring-passthrough-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKovAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpv9oD/4qCs7k3bPZZWZ6xoWb4EObyyWOUifi26lp
 vpsJHFUbA67S/i4++LV9H18SazWJ7h08ac4bjgZ+NQz40/1WkTN8/Fa76jo+BnNK
 7T10Wp4Ak6uwWVrKaA81pnT+G9+xmHlJ3X27aKxzLuT7BEPpShZ6ouFVjTkx9CzN
 LrLjuCDTOBBN+ZoaroWYfdLwTQX2VCAl9B15lOtQIlFvuuU8VlrvLboY+80K8TvY
 1wvTA2HTjnXoYx+/cTTMIFZIwQH3r1hsbwEDD8/YJj1+ouhSRQ1b0p/nk2pA+3ws
 HF5r/YS/rLBjlPF094IzeOBaUyA433AN1VhZqnII8ek7ViT3W3x+BRrgE9O6ZkWT
 0AjX1BXReI5rdFmxBmwsSdBnrSoGaJOf2GdsCCdubXBIi+F/RvyajrPf7PTB5zbW
 9WEK/uy3xvZsRVkUGAzOb9QGdvjcllgMzwPJsDegDCw5PdcPdT3mzy6KGIWipFLp
 j8R+br7hRMpOJv/YpihJDMzSDkQ/r1/SCwR4fpLid/QdSHG/eRTQK6c4Su5bNYEy
 QDy2F6kQdBVtEJCQHcEOsbhXzSTNBcdB+ujUUM5653FkaHe6y4JbomLrsNx407Id
 i/4ROwA5K1dioJx503Eap+OhbI5rV+PFytJTwxvLrNyVGccwbH2YOVq80fsVBP2e
 cZbn6EX4Vg==
 =/peE
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/io_uring-passthrough-2022-05-22' of git://git.kernel.dk/linux-block

Pull io_uring NVMe command passthrough from Jens Axboe:
 "On top of everything else, this adds support for passthrough for
  io_uring.

  The initial feature for this is NVMe passthrough support, which allows
  non-filesystem based IO commands and admin commands.

  To support this, io_uring grows support for SQE and CQE members that
  are twice as big, allowing to pass in a full NVMe command without
  having to copy data around. And to complete with more than just a
  single 32-bit value as the output"

* tag 'for-5.19/io_uring-passthrough-2022-05-22' of git://git.kernel.dk/linux-block: (22 commits)
  io_uring: cleanup handling of the two task_work lists
  nvme: enable uring-passthrough for admin commands
  nvme: helper for uring-passthrough checks
  blk-mq: fix passthrough plugging
  nvme: add vectored-io support for uring-cmd
  nvme: wire-up uring-cmd support for io-passthru on char-device.
  nvme: refactor nvme_submit_user_cmd()
  block: wire-up support for passthrough plugging
  fs,io_uring: add infrastructure for uring-cmd
  io_uring: support CQE32 for nop operation
  io_uring: enable CQE32
  io_uring: support CQE32 in /proc info
  io_uring: add tracing for additional CQE32 fields
  io_uring: overflow processing for CQE32
  io_uring: flush completions for CQE32
  io_uring: modify io_get_cqe for CQE32
  io_uring: add CQE32 completion processing
  io_uring: add CQE32 setup processing
  io_uring: change ring size calculation for CQE32
  io_uring: store add. return values for CQE32
  ...
2022-05-23 13:06:15 -07:00
Linus Torvalds
e1a8fde720 for-5.19/io_uring-net-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKotMQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmVwEACo7qBTjrrneZEwlYUWrSr45QtDNsQHPWjv
 aoK1dBLVH4ZjoZoOTI/aYcRgd5IJYo1P6I9tUrolM/+N3adM4UTEVC7i2PYDOaL3
 WUm/YT2aSLiyHaHQON7SMyGSVU8kfM9YvJAGbj7ohalO9A2VVtHfUAmcAtBdgWqv
 Dl/Uu6vbogOl19xztAwN4nvwqljA+GUMnbHJ/oeASzrMzYMOdQ0q3UsQbEt+pTXt
 rBzv8fCsrKsT2uBc59Bi3eFKeBMM6ERzux/40TlqcOnXf3KUCK7nM4VaRgPbvXdt
 GOOYfYs+j9L8SSEedvdKyYNq4vVwWgYfTRAKMNB0FPiOaTGZuUthqkgRZGYY8AA9
 +lJWxa+mzPmWEOmL+E44kt0OwtKDHX72ccEJUD7PHhTp0g87yKZfS6mXRNYLSxm7
 IYt7N1x3cOp0lrwUTvLDnSPOTuYOSEiB2JZtfkf+y3SuI5SWowIcudKOuO5p7G1r
 IpAROsZrpHzMf/eniINoX3IrqBSqr254jzwq+9IgUaw/ky76oPYqM1dWP9BnVxCg
 PXgvfT5zj6xrU43TxTeIPU92JoAqhMeXi6dcyoiAAf9+8Vih+sbmLzAdJbYb5F2v
 G0ISy31+x/Goi43fQS59HzS/MNXJplcmy2mxKUYBT7/ZoJ2A26Q8SukTWD+U8sDn
 XIrV4HEOUQ==
 =PUw1
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/io_uring-net-2022-05-22' of git://git.kernel.dk/linux-block

Pull io_uring 'more data in socket' support from Jens Axboe:
 "To be able to fully utilize the 'poll first' support in the core
  io_uring branch, it's advantageous knowing if the socket was empty
  after a receive. This adds support for that"

* tag 'for-5.19/io_uring-net-2022-05-22' of git://git.kernel.dk/linux-block:
  io_uring: return hint on whether more data is available after receive
  tcp: pass back data left in socket after receive
2022-05-23 12:51:04 -07:00
Björn Ardö
bca1a10046 mailbox: forward the hrtimer if not queued and under a lock
This reverts commit c7dacf5b0f,
"mailbox: avoid timer start from callback"

The previous commit was reverted since it lead to a race that
caused the hrtimer to not be started at all. The check for
hrtimer_active() in msg_submit() will return true if the
callback function txdone_hrtimer() is currently running. This
function could return HRTIMER_NORESTART and then the timer
will not be restarted, and also msg_submit() will not start
the timer. This will lead to a message actually being submitted
but no timer will start to check for its compleation.

The original fix that added checking hrtimer_active() was added to
avoid a warning with hrtimer_forward. Looking in the kernel
another solution to avoid this warning is to check hrtimer_is_queued()
before calling hrtimer_forward_now() instead. This however requires a
lock so the timer is not started by msg_submit() inbetween this check
and the hrtimer_forward() call.

Fixes: c7dacf5b0f ("mailbox: avoid timer start from callback")
Signed-off-by: Björn Ardö <bjorn.ardo@axis.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-05-23 14:45:24 -05:00
Linus Torvalds
368da430d0 for-5.19/io_uring-socket-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKorgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpm0eEACdTzhm7h5cXn9KjIvWLkdocAb/NOL8GYPn
 Q1mY1SqKQFZvs/fyKHkkZEiIBPxhvN6snVFXMpb4LDmPYeeH4GTUlNomrGTIjvf/
 j6SnZN4lCs9A2NlE+iDVWnFQOPQFALza2Y9BhC5xzay326qnKlO+0fQv3C1vXXrc
 /PNLqxQr7+GmO0a0PJnS6mGWGj6qF7nLqilB9apnKsTK6BKbJEec6ciKreqxU6ME
 WHaux11uIAbcf8rc6C/2myEK0k6jCOAue3vZ0lizygf+8klUCl2vMqV5BLwCBlXG
 /e7hBsUUrGr0CG0fryqhQQTUxsZLshioBbQH1vttSeZCli46mmWWAhPNy3/jb1ZU
 72bazA84Fe9ney9uVZvZoMoBsG+6t6UOatqND13MeRFAXnkRr0jZRuau2iBxgqAr
 OINJW+IVPU7IrCD+S4lV1/LCdhLhYcob8/zfKmIrdHMQnWG/gLonVpYJIBCyLDAv
 2jvHFIPJuSMUSGVjRKCb16LLNV6u7YG6VOWbKuippxfJxDdwA3TOtOhvTJIpYq0u
 TotPgpZ7bfcr4xDsGgD9mZS8E7jwsL/G0/MwsnixELykEXuhd++sgoTbr+RyUYdV
 45Hm6DsxlytjzOb/5uQrqhwrso05eVt14K74XApPa3fWKL8aWCh1jGSdo3CSbIyW
 iHwss919Ag==
 =nb5i
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/io_uring-socket-2022-05-22' of git://git.kernel.dk/linux-block

Pull io_uring socket() support from Jens Axboe:
 "This adds support for socket(2) for io_uring. This is handy when using
  direct / registered file descriptors with io_uring.

  Outside of those two patches, a small series from Dylan on top that
  improves the tracing by providing a text representation of the opcode
  rather than needing to decode this by reading the header file every
  time.

  That sits in this branch as it was the last opcode added (until it
  wasn't...)"

* tag 'for-5.19/io_uring-socket-2022-05-22' of git://git.kernel.dk/linux-block:
  io_uring: use the text representation of ops in trace
  io_uring: rename op -> opcode
  io_uring: add io_uring_get_opcode
  io_uring: add type to op enum
  io_uring: add socket(2) support
  net: add __sys_socket_file()
2022-05-23 12:42:33 -07:00
Linus Torvalds
3a166bdbf3 for-5.19/io_uring-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKol0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpn+sEACbdEQqG6OoCOhJ0ZuxTdQqNMGxCImKBxjP
 8Bqf+0hYNgwfG+80/UQvmc7olb+KxvZ6KtrgViC/ujhvMQmX0Xf/881kiiKG/iHJ
 XKoL9PdqIkenIGnlyEp1uRmnUbooYF+s4iT6Gj/pjnn29GbcKjsPzKV1CUNkt3GC
 R+wpdKczHQDaSwzDY5Ntyjf68QUQOyUznkHW+6JOcBeih3ET7NfapR/zsFS93RlL
 B9pQ9NiBBQfzCAUycVyQMC+p/rJbKWgidAiFk4fXKRm8/7iNwT4dB0+oUymlECxt
 xvalRVK6ER1s4RSdQcUTZoQA+SrzzOnK1DYja9cvcLT3wH+aojana6S0rOMDi8wp
 hoWT5jdMaZN09Vcm7J4sBN15i50m9aDITp21PKOVDZXSMVsebltCL9phaN5+9x/j
 AfF6Vki1WTB4gYaDHR8v6UkW+HcF1WOmMdq8GB9UMfnTya6EJqAooYT9lhQBP/rv
 jxkdj9Fu98O87dOfy1Av9AxH1UB8d7ypCJKkSEMAUPoWf0rC9HjYr0cRq/yppAj8
 pI/0PwXaXRfQuoHPqZyETrPel77VQdBw+Hg+6TS0KlTd3WlVEJMZJPtXK466IFLp
 pYSRVnSI9PuhiClOpxriTCw0cppfRIv11IerCxRziqH9S1zijk0VBCN40//XDs1o
 JfvoA6htKQ==
 =S+Uf
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/io_uring-2022-05-22' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:
 "Here are the main io_uring changes for 5.19. This contains:

   - Fixes for sparse type warnings (Christoph, Vasily)

   - Support for multi-shot accept (Hao)

   - Support for io_uring managed fixed files, rather than always
     needing the applicationt o manage the indices (me)

   - Fix for a spurious poll wakeup (Dylan)

   - CQE overflow fixes (Dylan)

   - Support more types of cancelations (me)

   - Support for co-operative task_work signaling, rather than always
     forcing an IPI (me)

   - Support for doing poll first when appropriate, rather than always
     attempting a transfer first (me)

   - Provided buffer cleanups and support for mapped buffers (me)

   - Improve how io_uring handles inflight SCM files (Pavel)

   - Speedups for registered files (Pavel, me)

   - Organize the completion data in a struct in io_kiocb rather than
     keep it in separate spots (Pavel)

   - task_work improvements (Pavel)

   - Cleanup and optimize the submission path, in general and for
     handling links (Pavel)

   - Speedups for registered resource handling (Pavel)

   - Support sparse buffers and file maps (Pavel, me)

   - Various fixes and cleanups (Almog, Pavel, me)"

* tag 'for-5.19/io_uring-2022-05-22' of git://git.kernel.dk/linux-block: (111 commits)
  io_uring: fix incorrect __kernel_rwf_t cast
  io_uring: disallow mixed provided buffer group registrations
  io_uring: initialize io_buffer_list head when shared ring is unregistered
  io_uring: add fully sparse buffer registration
  io_uring: use rcu_dereference in io_close
  io_uring: consistently use the EPOLL* defines
  io_uring: make apoll_events a __poll_t
  io_uring: drop a spurious inline on a forward declaration
  io_uring: don't use ERR_PTR for user pointers
  io_uring: use a rwf_t for io_rw.flags
  io_uring: add support for ring mapped supplied buffers
  io_uring: add io_pin_pages() helper
  io_uring: add buffer selection support to IORING_OP_NOP
  io_uring: fix locking state for empty buffer group
  io_uring: implement multishot mode for accept
  io_uring: let fast poll support multishot
  io_uring: add REQ_F_APOLL_MULTISHOT for requests
  io_uring: add IORING_ACCEPT_MULTISHOT for accept
  io_uring: only wake when the correct events are set
  io_uring: avoid io-wq -EAGAIN looping for !IOPOLL
  ...
2022-05-23 12:22:49 -07:00
Linus Torvalds
1e57930e9f RCU pull request for v5.19
This pull request contains the following branches:
 
 docs.2022.04.20a: Documentation updates.
 
 fixes.2022.04.20a: Miscellaneous fixes.
 
 nocb.2022.04.11b: Callback-offloading updates, mainly simplifications.
 
 rcu-tasks.2022.04.11b: RCU-tasks updates, including some -rt fixups,
 	handling of systems with sparse CPU numbering, and a fix for a
 	boot-time race-condition failure.
 
 srcu.2022.05.03a: Put SRCU on a memory diet in order to reduce the size
 	of the srcu_struct structure.
 
 torture.2022.04.11b: Torture-test updates fixing some bugs in tests and
 	closing some testing holes.
 
 torture-tasks.2022.04.20a: Torture-test updates for the RCU tasks flavors,
 	most notably ensuring that building rcutorture and friends does
 	not change the RCU-tasks-related Kconfig options.
 
 torturescript.2022.04.20a: Torture-test scripting updates.
 
 exp.2022.05.11a: Expedited grace-period updates, most notably providing
 	milliseconds-scale (not all that) soft real-time response from
 	synchronize_rcu_expedited().  This is also the first time in
 	almost 30 years of RCU that someone other than me has pushed
 	for a reduction in the RCU CPU stall-warning timeout, in this
 	case by more than three orders of magnitude from 21 seconds to
 	20 milliseconds.  This tighter timeout applies only to expedited
 	grace periods.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmKG2zcTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jGXgD/90xtRtZyN0umlN/IOBBn8fIOM+BAMu
 5k3ef6wLsXKXlLO13WTjSitypX9LEFwytTeVhEyN4ODeX0cI9mUmts6Z8/6sV92D
 fN8vqTavveE7m5YfFfLRvDRfVHpB0LpLMM+V0qWPu/F8dWPDKA0225rX9IC7iICP
 LkxCuNVNzJ0cLaVTvsUWlxMdHcogydXZb1gPDVRhnR6iVFWCBtL4RRpU41CoSNh4
 fWRSLQak6OhZRFE7hVoLQhZyLE0GIw1fuUJgj2fCllhgGogDx78FQ8jHdDzMEhVk
 cD4Yel5vUPiy2AKphGfi28bKFYcyhVBnD/Jq733VJV0/szyddxNbz0xKpEA0/8qh
 w1T7IjBN6MAKHSh0uUitm6U24VN13m4r30HrUQSpp71VFZkUD4QS6TismKsaRNjR
 lK4q2QKBprBb3Hv7KPAGYT1Us3aS7qLPrgPf3gzSxL1aY5QV0A5UpPP6RKTLbWPl
 CEQxEno6g5LTHwKd5QD74dG8ccphg9377lDMJpeesYShBqlLNrNWCxqJoZk2HnSf
 f2dTQeQWrtRJjeTGy/4cfONCGZTghE0Pch43XMzLLt3ZTuDc8FVM0t3Xs9J5Kg22
 zmThQh6LRXTGjrb1vLiOrjPf5JaTnX2Sz8OUJTo/ZxwcixxP/mj8Ja+W81NjfqnK
 LLZ1D6UN4a8n9A==
 =4spH
 -----END PGP SIGNATURE-----

Merge tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU update from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes

 - Callback-offloading updates, mainly simplifications

 - RCU-tasks updates, including some -rt fixups, handling of systems
   with sparse CPU numbering, and a fix for a boot-time race-condition
   failure

 - Put SRCU on a memory diet in order to reduce the size of the
   srcu_struct structure

 - Torture-test updates fixing some bugs in tests and closing some
   testing holes

 - Torture-test updates for the RCU tasks flavors, most notably ensuring
   that building rcutorture and friends does not change the
   RCU-tasks-related Kconfig options

 - Torture-test scripting updates

 - Expedited grace-period updates, most notably providing
   milliseconds-scale (not all that) soft real-time response from
   synchronize_rcu_expedited().

   This is also the first time in almost 30 years of RCU that someone
   other than me has pushed for a reduction in the RCU CPU stall-warning
   timeout, in this case by more than three orders of magnitude from 21
   seconds to 20 milliseconds. This tighter timeout applies only to
   expedited grace periods

* tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
  rcu: Move expedited grace period (GP) work to RT kthread_worker
  rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
  srcu: Drop needless initialization of sdp in srcu_gp_start()
  srcu: Prevent expedited GPs and blocking readers from consuming CPU
  srcu: Add contention check to call_srcu() srcu_data ->lock acquisition
  srcu: Automatically determine size-transition strategy at boot
  rcutorture: Make torture.sh allow for --kasan
  rcutorture: Make torture.sh refscale and rcuscale specify Tasks Trace RCU
  rcutorture: Make kvm.sh allow more memory for --kasan runs
  torture: Save "make allmodconfig" .config file
  scftorture: Remove extraneous "scf" from per_version_boot_params
  rcutorture: Adjust scenarios' Kconfig options for CONFIG_PREEMPT_DYNAMIC
  torture: Enable CSD-lock stall reports for scftorture
  torture: Skip vmlinux check for kvm-again.sh runs
  scftorture: Adjust for TASKS_RCU Kconfig option being selected
  rcuscale: Allow rcuscale without RCU Tasks Rude/Trace
  rcuscale: Allow rcuscale without RCU Tasks
  refscale: Allow refscale without RCU Tasks Rude/Trace
  refscale: Allow refscale without RCU Tasks
  rcutorture: Allow specifying per-scenario stat_interval
  ...
2022-05-23 11:46:51 -07:00
Rafael J. Wysocki
388292df27 Merge back earlier thermal control updates for 5.19-rc1. 2022-05-23 20:31:57 +02:00
Linus Torvalds
bf2431021c EFI updates for v5.19
- Allow runtime services to be re-enabled at boot on RT kernels.
 - Provide access to secrets injected into the boot image by CoCo
   hypervisors (COnfidential COmputing)
 - Use DXE services on x86 to make the boot image executable after
   relocation, if needed.
 - Prefer mirrored memory for randomized allocations.
 - Only randomize the placement of the kernel image on arm64 if the
   loader has not already done so.
 - Add support for obtaining the boot hartid from EFI on RISC-V.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmKHRF4ACgkQw08iOZLZ
 jyTAlQv9GSctgp3ItPEG7/dF90f2u/ezaqiyLt1ug3cnOrzZL6cbaQPJt/XtxeMY
 XA4eO8aNrMyioClKu2+KEqQgIiNc30HgwOWMxfZpWBWLVlrx5PhvTbwJB6Wfb8r3
 WFze5lc6X2Yttp3jxUU9jLUTPVTJx8SjyhGwBXbzN63aiGv8+bGjD5e4pPg1axP/
 HvUwVpRzK5uU0ju1IM7BPvIjjAOiciwC+KbLjj8Hm++LIbwju7QHlJWy9oMKD1X5
 yuZsIan2dTM+4OclTji7HlSg6c4IFlhMj7GHGJD62aWNyM0/tZokOCIVY1wITXyS
 KRsxag4gjtkVBRNvAHsRsYe3aZ+jQ5DzhGEGTipNGnj3b8FOecuWFSn5a/aMdNkV
 kMSOAbdjZu8xGllroFWS199BamCb6SHijnbv8EzeWNgJXofwxn8vumdgxXZuHIe9
 md1gP2QIuo3/R15zcgy54buB11JD4PeDV7NuovuTQUzFuvsIyIKbEkLMBwEl3j4N
 TIlijEyI
 =xqxQ
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:

 - Allow runtime services to be re-enabled at boot on RT kernels.

 - Provide access to secrets injected into the boot image by CoCo
   hypervisors (COnfidential COmputing)

 - Use DXE services on x86 to make the boot image executable after
   relocation, if needed.

 - Prefer mirrored memory for randomized allocations.

 - Only randomize the placement of the kernel image on arm64 if the
   loader has not already done so.

 - Add support for obtaining the boot hartid from EFI on RISC-V.

* tag 'efi-next-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  riscv/efi_stub: Add support for RISCV_EFI_BOOT_PROTOCOL
  efi: stub: prefer mirrored memory for randomized allocations
  efi/arm64: libstub: run image in place if randomized by the loader
  efi: libstub: pass image handle to handle_kernel_image()
  efi: x86: Set the NX-compatibility flag in the PE header
  efi: libstub: ensure allocated memory to be executable
  efi: libstub: declare DXE services table
  efi: Add missing prototype for efi_capsule_setup_info
  docs: security: Add secrets/coco documentation
  efi: Register efi_secret platform device if EFI secret area is declared
  virt: Add efi_secret module to expose confidential computing secrets
  efi: Save location of EFI confidential computing area
  efi: Allow to enable EFI runtime services by default on RT
2022-05-23 11:27:24 -07:00
Rafael J. Wysocki
cd8198a2c1 Merge branch 'pm-domains'
Merge generlic power domains update for 5.19-rc1:

 - Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).

 - Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
   Hansson).

 - Improve the way genpd deals with its governors (Ulf Hansson).

* pm-domains:
  PM: domains: Trust domain-idle-states from DT to be correct by genpd
  PM: domains: Measure power-on/off latencies in genpd based on a governor
  PM: domains: Allocate governor data dynamically based on a genpd governor
  PM: domains: Clean up some code in pm_genpd_init() and genpd_remove()
  PM: domains: Fix initialization of genpd's next_wakeup
  PM: domains: Fixup QoS latency measurements for IRQ safe devices in genpd
  PM: domains: Measure suspend/resume latencies in genpd based on governor
  PM: domains: Move the next_wakeup variable into the struct gpd_timing_data
  PM: domains: Allocate gpd_timing_data dynamically based on governor
  PM: domains: Skip another warning in irq_safe_dev_in_sleep_domain()
  PM: domains: Rename irq_safe_dev_in_no_sleep_domain() in genpd
  PM: domains: Don't check PM_QOS_FLAG_NO_POWER_OFF in genpd
  PM: domains: Drop redundant code for genpd always-on governor
  PM: domains: Add GENPD_FLAG_RPM_ALWAYS_ON for the always-on governor
  PM: domains: Move genpd's time-accounting to ktime_get_mono_fast_ns()
  PM: domains: Extend dev_pm_domain_detach() doc
2022-05-23 19:51:31 +02:00
Rafael J. Wysocki
d988c91342 Merge branch 'pm-cpufreq'
Merge cpufreq updates for 5.19-rc1:

 - Fix cpufreq governor clean up code to avoid using kfree() directly
   to free kobject-based items (Kevin Hao).

 - Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe Leroy).

 - Make intel_pstate notify frequency invariance code when no_turbo is
   turned on and off (Chen Yu).

 - Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
   Pandruvada).

 - Make cpufreq avoid unnecessary frequency updates due to mismatch
   between hardware and the frequency table (Viresh Kumar).

 - Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
   code (Viresh Kumar).

 - Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
   calling convention for some driver callbacks consistent (Rafael
   Wysocki).

 - Avoid accessing half-initialized cpufreq policies from the show()
   and store() sysfs functions (Schspa Shi).

 - Rearrange cpufreq_offline() to make the calling convention for some
   driver callbacks consistent (Schspa Shi).

 - Update CPPC handling in cpufreq (Pierre Gondois):

   * Add per_cpu efficiency_class to the CPPC driver.
   * Make the CPPC driver Register EM based on efficiency class
     information.
   * Adjust _OSC for flexible address space in the ACPI platform
     initialization code and always set CPPC _OSC bits if CPPC_LIB is
     supported.
   * Assume no transition latency if no PCCT in the CPPC driver.
   * Add fast_switch and dvfs_possible_from_any_cpu support to the CPPC
     driver.

* pm-cpufreq:
  cpufreq: CPPC: Enable dvfs_possible_from_any_cpu
  cpufreq: CPPC: Enable fast_switch
  ACPI: CPPC: Assume no transition latency if no PCCT
  ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
  ACPI: CPPC: Check _OSC for flexible address space
  cpufreq: make interface functions and lock holding state clear
  cpufreq: Abort show()/store() for half-initialized policies
  cpufreq: Rearrange locking in cpufreq_remove_dev()
  cpufreq: Split cpufreq_offline()
  cpufreq: Reorganize checks in cpufreq_offline()
  cpufreq: Clear real_cpus mask from remove_cpu_dev_symlink()
  cpufreq: intel_pstate: Support Sapphire Rapids OOB mode
  Revert "cpufreq: Fix possible race in cpufreq online error path"
  cpufreq: CPPC: Register EM based on efficiency class information
  cpufreq: CPPC: Add per_cpu efficiency_class
  cpufreq: Avoid unnecessary frequency updates due to mismatch
  cpufreq: Fix possible race in cpufreq online error path
  cpufreq: intel_pstate: Handle no_turbo in frequency invariance
  cpufreq: Prepare cleanup of powerpc's asm/prom.h
  cpufreq: governor: Use kobject release() method to free dbs_data
2022-05-23 19:28:41 +02:00
Rafael J. Wysocki
16a23f394d Merge branches 'pm-em' and 'pm-cpuidle'
Marge Energy Model support updates and cpuidle updates for 5.19-rc1:

 - Update the Energy Model support code to allow the Energy Model to be
   artificial, which means that the power values may not be on a uniform
   scale with other devices providing power information, and update the
   cpufreq_cooling and devfreq_cooling thermal drivers to support
   artificial Energy Models (Lukasz Luba).

 - Make DTPM check the Energy Model type (Lukasz Luba).

 - Fix policy counter decrementation in cpufreq if Energy Model is in
   use (Pierre Gondois).

 - Add AlderLake processor support to the intel_idle driver (Zhang Rui).

 - Fix regression leading to no genpd governor in the PSCI cpuidle
   driver and fix the riscv-sbi cpuidle driver to allow a genpd
   governor to be used (Ulf Hansson).

* pm-em:
  PM: EM: Decrement policy counter
  powercap: DTPM: Check for Energy Model type
  thermal: cooling: Check Energy Model type in cpufreq_cooling and devfreq_cooling
  Documentation: EM: Add artificial EM registration description
  PM: EM: Remove old debugfs files and print all 'flags'
  PM: EM: Change the order of arguments in the .active_power() callback
  PM: EM: Use the new .get_cost() callback while registering EM
  PM: EM: Add artificial EM flag
  PM: EM: Add .get_cost() callback

* pm-cpuidle:
  cpuidle: riscv-sbi: Fix code to allow a genpd governor to be used
  cpuidle: psci: Fix regression leading to no genpd governor
  intel_idle: Add AlderLake support
2022-05-23 19:18:51 +02:00
Rafael J. Wysocki
95f2ce548a Merge branches 'pm-core', 'pm-sleep' and 'powercap'
Merge PM core changes, updates related to system sleep and power capping
updates for 5.19-rc1:

 - Export dev_pm_ops instead of suspend() and resume() in the IIO
   chemical scd30 driver (Jonathan Cameron).

 - Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
   PM-runtime counterparts (Jonathan Cameron).

 - Move symbol exports in the IIO chemical scd30 driver into the
   IIO_SCD30 namespace (Jonathan Cameron).

 - Avoid device PM-runtime usage count underflows (Rafael Wysocki).

 - Allow dynamic debug to control printing of PM messages  (David
   Cohen).

 - Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
   Bai).

 - Preserve ACPI-table override during hibernation (Amadeusz Sławiński).

 - Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).

 - Make Intel RAPL power capping driver support the RaptorLake and
   AlderLake N processors (Zhang Rui, Sumeet Pawnikar).

 - Remove redundant store to value after multiply in the RAPL power
   capping driver (Colin Ian King).

* pm-core:
  PM: runtime: Avoid device usage count underflows
  iio: chemical: scd30: Move symbol exports into IIO_SCD30 namespace
  PM: core: Add NS varients of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and runtime pm equiv
  iio: chemical: scd30: Export dev_pm_ops instead of suspend() and resume()

* pm-sleep:
  cpuidle: PSCI: Improve support for suspend-to-RAM for PSCI OSI mode
  PM: runtime: Allow to call __pm_runtime_set_status() from atomic context
  PM: hibernate: Don't mark comment as kernel-doc
  x86/ACPI: Preserve ACPI-table override during hibernation
  PM: hibernate: Fix some kernel-doc comments
  PM: sleep: enable dynamic debug support within pm_pr_dbg()
  PM: sleep: Narrow down -DDEBUG on kernel/power/ files

* powercap:
  powercap: intel_rapl: remove redundant store to value after multiply
  powercap: intel_rapl: add support for ALDERLAKE_N
  powercap: RAPL: Add Power Limit4 support for RaptorLake
  powercap: intel_rapl: add support for RaptorLake
2022-05-23 19:06:33 +02:00
Chuck Lever
fb70bf124b NFSD: Instantiate a struct file when creating a regular NFSv4 file
There have been reports of races that cause NFSv4 OPEN(CREATE) to
return an error even though the requested file was created. NFSv4
does not provide a status code for this case.

To mitigate some of these problems, reorganize the NFSv4
OPEN(CREATE) logic to allocate resources before the file is actually
created, and open the new file while the parent directory is still
locked.

Two new APIs are added:

+ Add an API that works like nfsd_file_acquire() but does not open
the underlying file. The OPEN(CREATE) path can use this API when it
already has an open file.

+ Add an API that is kin to dentry_open(). NFSD needs to create a
file and grab an open "struct file *" atomically. The
alloc_empty_file() has to be done before the inode create. If it
fails (for example, because the NFS server has exceeded its
max_files limit), we avoid creating the file and can still return
an error to the NFS client.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: JianHong Yin <jiyin@redhat.com>
2022-05-23 11:06:29 -04:00
Takashi Iwai
0163717ed5 ASoC: Updates for v5.19
This is quite a big update, partly due to the addition of some larger
 drivers (more of which is to follow since at least the AVS driver is
 still a work in progress) and partly due to Charles' work sorting out
 our handling of endianness.  As has been the case recently it's much
 more about drivers than the core.
 
  - Overhaul of endianness specification for data formats, avoiding
    needless restrictions due to CODECs.
  - Initial stages of Intel AVS driver merge.
  - Introduction of v4 IPC mechanism for SOF.
  - TDM mode support for AK4613.
  - Support for Analog Devices ADAU1361, Cirrus Logic CS35L45, Maxim
    MAX98396, MediaTek MT8186, NXP i.MX8 micfil and SAI interfaces,
    nVidia Tegra186 ASRC, and Texas Instruments TAS2764 and TAS2780
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKLf5EACgkQJNaLcl1U
 h9ASqAf/YnwbFP919ree/DEKUDCNc4klUH5M4JOexXbZlZDqxKYRGZjoLuiwX/PQ
 Au/xOjGEvm3Yg5/g5c8YFVNcIkv1O8VclRkV59oIxlBwKmQeTKvq+lOmlel2l1wZ
 XOmvHjE46wxH1N1cLwL6KkX0YDn59orSZGYZRpfLjL61y6LQWsLNU0tY6AWCRATB
 Llnrbu+DYgCsYNTEOOOY5s4V+4LkQm8TLdft91Va7mBdkPPRFoXRO0HGcVBqbkoN
 7pf2mrjrLAWL9yuA8FlrgJbHq58DF9WGe5uEU7qlVL1zw46ClgIM0ABxPOdNdjV2
 Wzb1jI7GmztgQNxlR9BcJB0kxAj9vA==
 =oD5l
 -----END PGP SIGNATURE-----

Merge tag 'asoc-v5.19' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v5.19

This is quite a big update, partly due to the addition of some larger
drivers (more of which is to follow since at least the AVS driver is
still a work in progress) and partly due to Charles' work sorting out
our handling of endianness.  As has been the case recently it's much
more about drivers than the core.

 - Overhaul of endianness specification for data formats, avoiding
   needless restrictions due to CODECs.
 - Initial stages of Intel AVS driver merge.
 - Introduction of v4 IPC mechanism for SOF.
 - TDM mode support for AK4613.
 - Support for Analog Devices ADAU1361, Cirrus Logic CS35L45, Maxim
   MAX98396, MediaTek MT8186, NXP i.MX8 micfil and SAI interfaces,
   nVidia Tegra186 ASRC, and Texas Instruments TAS2764 and TAS2780
2022-05-23 16:03:04 +02:00
Mickaël Salaün
100f59d964
LSM: Remove double path_rename hook calls for RENAME_EXCHANGE
In order to be able to identify a file exchange with renameat2(2) and
RENAME_EXCHANGE, which will be useful for Landlock [1], propagate the
rename flags to LSMs.  This may also improve performance because of the
switch from two set of LSM hook calls to only one, and because LSMs
using this hook may optimize the double check (e.g. only one lock,
reduce the number of path walks).

AppArmor, Landlock and Tomoyo are updated to leverage this change.  This
should not change the current behavior (same check order), except
(different level of) speed boosts.

[1] https://lore.kernel.org/r/20220221212522.320243-1-mic@digikod.net

Cc: James Morris <jmorris@namei.org>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Serge E. Hallyn <serge@hallyn.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-7-mic@digikod.net
2022-05-23 13:27:58 +02:00
Vlastimil Babka
e001897da6 Merge branches 'slab/for-5.19/stackdepot' and 'slab/for-5.19/refactor' into slab/for-linus 2022-05-23 11:14:32 +02:00
Yuezhang Mo
97d6fb1b48 block: add sync_blockdev_range()
sync_blockdev_range() is to support syncing multiple sectors
with as few block device requests as possible, it is helpful
to make the block device to give full play to its performance.

Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Andy Wu <Andy.Wu@sony.com>
Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-05-23 11:17:30 +09:00
Jakub Kicinski
c304eddcec net: wrap the wireless pointers in struct net_device in an ifdef
Most protocol-specific pointers in struct net_device are under
a respective ifdef. Wireless is the notable exception. Since
there's a sizable number of custom-built kernels for datacenter
workloads which don't build wireless it seems reasonable to
ifdefy those pointers as well.

While at it move IPv4 and IPv6 pointers up, those are special
for obvious reasons.

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> # ieee802154
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-22 21:51:54 +01:00
David Howells
ad25f5cb39 rxrpc: Fix locking issue
There's a locking issue with the per-netns list of calls in rxrpc.  The
pieces of code that add and remove a call from the list use write_lock()
and the calls procfile uses read_lock() to access it.  However, the timer
callback function may trigger a removal by trying to queue a call for
processing and finding that it's already queued - at which point it has a
spare refcount that it has to do something with.  Unfortunately, if it puts
the call and this reduces the refcount to 0, the call will be removed from
the list.  Unfortunately, since the _bh variants of the locking functions
aren't used, this can deadlock.

================================
WARNING: inconsistent lock state
5.18.0-rc3-build4+ #10 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/2/25 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff888107ac4038 (&rxnet->call_lock){+.?.}-{2:2}, at: rxrpc_put_call+0x103/0x14b
{SOFTIRQ-ON-W} state was registered at:
...
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&rxnet->call_lock);
  <Interrupt>
    lock(&rxnet->call_lock);

 *** DEADLOCK ***

1 lock held by ksoftirqd/2/25:
 #0: ffff8881008ffdb0 ((&call->timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x23d

Changes
=======
ver #2)
 - Changed to using list_next_rcu() rather than rcu_dereference() directly.

Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-22 21:03:01 +01:00
Julia Lawall
cc4e7fa549 net: qed: fix typos in comments
Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-22 20:44:29 +01:00
Guenter Roeck
e5d2107205 hwmon: Introduce hwmon_device_register_for_thermal
The thermal subsystem registers a hwmon driver without providing
chip or sysfs group information. This is for legacy reasons and
would be difficult to change. At the same time, we want to enforce
that chip information is provided when registering a hwmon device
using hwmon_device_register_with_info(). To enable this, introduce
a special API for use only by the thermal subsystem.

Acked-by: Rafael J . Wysocki <rafael@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-05-22 11:32:31 -07:00
Michael Walle
cd705ea857 lib: add generic polynomial calculation
Some temperature and voltage sensors use a polynomial to convert between
raw data points and actual temperature or voltage. The polynomial is
usually the result of a curve fitting of the diode characteristic.

The BT1 PVT hwmon driver already uses such a polynonmial calculation
which is rather generic. Move it to lib/ so other drivers can reuse it.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220401214032.3738095-2-michael@walle.cc
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-05-22 11:32:30 -07:00
Krzysztof Kozlowski
bb12dd42d2 powerpc/powermac: constify device_node in of_irq_parse_oldworld()
The of_irq_parse_oldworld() does not modify passed device_node so make
it a pointer to const for safety.  Drop the extern while modifying the
line.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210924105653.46963-2-krzysztof.kozlowski@canonical.com
2022-05-22 15:59:54 +10:00
Pawan Gupta
8d50cdf8b8 x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
Add the sysfs reporting file for Processor MMIO Stale Data
vulnerability. It exposes the vulnerability and mitigation state similar
to the existing files for the other hardware vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:16:04 +02:00
Yuntao Wang
b23316aabf selftests/bpf: Add missing trampoline program type to trampoline_count test
Currently the trampoline_count test doesn't include any fmod_ret bpf
programs, fix it to make the test cover all possible trampoline program
types.

Since fmod_ret bpf programs can't be attached to __set_task_comm function,
as it's neither whitelisted for error injection nor a security hook, change
it to bpf_modify_return_test.

This patch also does some other cleanups such as removing duplicate code,
dropping inconsistent comments, etc.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220519150610.601313-1-ytcoode@gmail.com
2022-05-20 16:12:14 -07:00
Geliang Tang
3bc253c2e6 bpf: Add bpf_skc_to_mptcp_sock_proto
This patch implements a new struct bpf_func_proto, named
bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP,
and a new helper bpf_skc_to_mptcp_sock(), which invokes another new
helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct
mptcp_sock from a given subflow socket.

v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c,
remove build check for CONFIG_BPF_JIT
v5: Drop EXPORT_SYMBOL (Martin)

Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com
2022-05-20 15:29:00 -07:00
Linus Torvalds
b851c1f8e0 A fix for a nasty use-after-free, marked for stable.
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmKHsGETHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi6Q8B/97dkJamfa0rfcenW8qnb6Rx2DI6QmE
 vEV2et8Qvrjxr9s10ylTaiH7veYG5Cgb986ufDN1Af52uDx1VdW7TOz4cD7Umx8G
 QsjzviREL3VfN7Ag3WY0SsI5cjQ/iRJfjMJx/fB4G5bMkor1ouH32sQNtmcVLS6D
 HHQZqVL7xP0ORV0lFvBns5EVUCsLHAKjoPGiLprmm7lwlhOo3e60WHBbBHTD9Isc
 SrO8Gz5QiHYyVS6eksgYOZj0Tg5qLFKtKWXXxb1nyF8fLHcQU0S/zicf4AQKDj7i
 5HOagl3S3Gmu+0g/wnWF9YyG3yoTVgfEfZ38XAh1rlwJJOkb1rbeFKAb
 =tZDh
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.18-rc8' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A fix for a nasty use-after-free, marked for stable"

* tag 'ceph-for-5.18-rc8' of https://github.com/ceph/ceph-client:
  libceph: fix misleading ceph_osdc_cancel_request() comment
  libceph: fix potential use-after-free on linger ping and resends
2022-05-20 08:15:40 -10:00
Catalin Marinas
201729d53a Merge branches 'for-next/sme', 'for-next/stacktrace', 'for-next/fault-in-subpage', 'for-next/misc', 'for-next/ftrace' and 'for-next/crashkernel', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  perf/arm-cmn: Decode CAL devices properly in debugfs
  perf/arm-cmn: Fix filter_sel lookup
  perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
  drivers/perf: hisi: Add Support for CPA PMU
  drivers/perf: hisi: Associate PMUs in SICL with CPUs online
  drivers/perf: arm_spe: Expose saturating counter to 16-bit
  perf/arm-cmn: Add CMN-700 support
  perf/arm-cmn: Refactor occupancy filter selector
  perf/arm-cmn: Add CMN-650 support
  dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
  perf: check return value of armpmu_request_irq()
  perf: RISC-V: Remove non-kernel-doc ** comments

* for-next/sme: (30 commits)
  : Scalable Matrix Extensions support.
  arm64/sve: Move sve_free() into SVE code section
  arm64/sve: Make kernel FPU protection RT friendly
  arm64/sve: Delay freeing memory in fpsimd_flush_thread()
  arm64/sme: More sensibly define the size for the ZA register set
  arm64/sme: Fix NULL check after kzalloc
  arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
  arm64/sme: Provide Kconfig for SME
  KVM: arm64: Handle SME host state when running guests
  KVM: arm64: Trap SME usage in guest
  KVM: arm64: Hide SME system registers from guests
  arm64/sme: Save and restore streaming mode over EFI runtime calls
  arm64/sme: Disable streaming mode and ZA when flushing CPU state
  arm64/sme: Add ptrace support for ZA
  arm64/sme: Implement ptrace support for streaming mode SVE registers
  arm64/sme: Implement ZA signal handling
  arm64/sme: Implement streaming SVE signal handling
  arm64/sme: Disable ZA and streaming mode when handling signals
  arm64/sme: Implement traps and syscall handling for SME
  arm64/sme: Implement ZA context switching
  arm64/sme: Implement streaming SVE context switching
  ...

* for-next/stacktrace:
  : Stacktrace cleanups.
  arm64: stacktrace: align with common naming
  arm64: stacktrace: rename stackframe to unwind_state
  arm64: stacktrace: rename unwinder functions
  arm64: stacktrace: make struct stackframe private to stacktrace.c
  arm64: stacktrace: delete PCS comment
  arm64: stacktrace: remove NULL task check from unwind_frame()

* for-next/fault-in-subpage:
  : btrfs search_ioctl() live-lock fix using fault_in_subpage_writeable().
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
  arm64: Add support for user sub-page fault probing
  mm: Add fault_in_subpage_writeable() to probe at sub-page granularity

* for-next/misc:
  : Miscellaneous patches.
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: mm: Make arch_faults_on_old_pte() check for migratability
  arm64: mte: Clean up user tag accessors
  arm64/hugetlb: Drop TLB flush from get_clear_flush()
  arm64: Declare non global symbols as static
  arm64: mm: Cleanup useless parameters in zone_sizes_init()
  arm64: fix types in copy_highpage()
  arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
  arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
  arm64: document the boot requirements for MTE
  arm64/mm: Compute PTRS_PER_[PMD|PUD] independently of PTRS_PER_PTE

* for-next/ftrace:
  : ftrace cleanups.
  arm64/ftrace: Make function graph use ftrace directly
  ftrace: cleanup ftrace_graph_caller enable and disable

* for-next/crashkernel:
  : Support for crashkernel reservations above ZONE_DMA.
  arm64: kdump: Do not allocate crash low memory if not needed
  docs: kdump: Update the crashkernel description for arm64
  of: Support more than one crash kernel regions for kexec -s
  of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  arm64: kdump: Reimplement crashkernel=X
  arm64: Use insert_resource() to simplify code
  kdump: return -ENOENT if required cmdline option does not exist
2022-05-20 18:50:35 +01:00
Miquel Raynal
2c51d0d880 NAND core:
* Print offset instead of page number for bad blocks
 
 Raw NAND controller drivers:
 * Cadence: Fix possible null-ptr-deref in cadence_nand_dt_probe()
 * CS553X: simplify the return expression of cs553x_write_ctrl_byte()
 * Davinci: Remove redundant unsigned comparison to zero
 * Denali: Use managed device resources
 * GPMI:
   - Add large oob bch setting support
   - Rename the variable ecc_chunk_size
   - Uninline the gpmi_check_ecc function
   - Add strict ecc strength check
   - Refactor BCH geometry settings function
 * Intel: Fix possible null-ptr-deref in ebu_nand_probe()
 * MPC5121: Check before clk_disable_unprepare() not needed
 * Mtk:
   - MTD_NAND_ECC_MEDIATEK should depend on ARCH_MEDIATEK
   - Also parse the default nand-ecc-engine property if available
   - Make mtk_ecc.c a separated module
 * OMAP ELM:
   - Convert the bindings to yaml
   - Describe the bindings for AM64 ELM
   - Add support for its compatible
 * Renesas: Use runtime PM instead of the raw clock API and update the
            bindings accordingly
 * Rockchip: Check before clk_disable_unprepare() not needed
 * TMIO: Check return value after calling platform_get_resource()
 
 Raw NAND chip driver:
 * Kioxia: Add support for TH58NVG3S0HBAI4 and TC58NVG0S3HTA00
 
 SPI-NAND chip drivers:
 * Gigadevice:
   - Add support for:
     - GD5FxGM7xExxG
     - GD5F{2,4}GQ5xExxG
     - GD5F1GQ5RExxG
     - GD5FxGQ4xExxG
   - Fix Quad IO for GD5F1GQ5UExxG
 * XTX: Add support for XT26G0xA
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEE9HuaYnbmDhq/XIDIJWrqGEe9VoQFAmKD51AACgkQJWrqGEe9
 VoQJ8AgAo8qjO5QQmd76v1q1Regxxl7RsH7hr7r9gJfgTwzhQC/xIyTiorcVPG7B
 UDNu4Dwe3OAPbmU54TQNS/AMQkgHGMcIRTVut8+oL+ZnjYl+gv261GxmsfxaK/Hu
 Vvvq9X0iyaKpZfyq3uksbsxiXbwMn4fHT7Reaimc9Thw+XKD7AcYWCuFb9GAWfaf
 XQUVLg6y4Qk4BR9ZpYAx2v5FH4amJV9RKKTIqiymwcjnBZjYOI29wgKwY1hX+3bm
 2Wu4wcccWhDzlV0Casf/hIGBydKx3omV+cJHLtmx7s+dqPvYSUGuvR2nbq5wuMqp
 ZcAYeRhGEAYMYcMB/QuKmh4g/Js26w==
 =QyqB
 -----END PGP SIGNATURE-----

Merge tag 'nand/for-5.19' into mtd/next

NAND core:
* Print offset instead of page number for bad blocks

Raw NAND controller drivers:
* Cadence: Fix possible null-ptr-deref in cadence_nand_dt_probe()
* CS553X: simplify the return expression of cs553x_write_ctrl_byte()
* Davinci: Remove redundant unsigned comparison to zero
* Denali: Use managed device resources
* GPMI:
  - Add large oob bch setting support
  - Rename the variable ecc_chunk_size
  - Uninline the gpmi_check_ecc function
  - Add strict ecc strength check
  - Refactor BCH geometry settings function
* Intel: Fix possible null-ptr-deref in ebu_nand_probe()
* MPC5121: Check before clk_disable_unprepare() not needed
* Mtk:
  - MTD_NAND_ECC_MEDIATEK should depend on ARCH_MEDIATEK
  - Also parse the default nand-ecc-engine property if available
  - Make mtk_ecc.c a separated module
* OMAP ELM:
  - Convert the bindings to yaml
  - Describe the bindings for AM64 ELM
  - Add support for its compatible
* Renesas: Use runtime PM instead of the raw clock API and update the
           bindings accordingly
* Rockchip: Check before clk_disable_unprepare() not needed
* TMIO: Check return value after calling platform_get_resource()

Raw NAND chip driver:
* Kioxia: Add support for TH58NVG3S0HBAI4 and TC58NVG0S3HTA00

SPI-NAND chip drivers:
* Gigadevice:
  - Add support for:
    - GD5FxGM7xExxG
    - GD5F{2,4}GQ5xExxG
    - GD5F1GQ5RExxG
    - GD5FxGQ4xExxG
  - Fix Quad IO for GD5F1GQ5UExxG
* XTX: Add support for XT26G0xA

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2022-05-20 13:59:25 +02:00
Miquel Raynal
e6828be5ed SPI NOR core changes:
- Read back written SR value to make sure the write was done correctly.
 - Introduce a common function for Read ID that manufacturer drivers can
   use to verify the Octal DTR switch worked correctly.
 - Add helpers for read/write any register commands so manufacturer
   drivers don't open code it every time.
 - Clarify rdsr dummy cycles documentation.
 - Add debugfs entry to expose internal flash parameters and state.
 
 SPI NOR manufacturer drivers changes:
 - Add support for Winbond W25Q512NW-IM, and Eon EN25QH256A.
 - Move spi_nor_write_ear() to Winbond module since only Winbond flashes
   use it.
 - Rework Micron and Cypress Octal DTR enable methods to improve
   readability.
 - Use the common Read ID function to verify switch to Octal DTR mode for
   Micron and Cypress flashes.
 - Skip polling status on volatile register writes for Micron and Cypress
   flashes since the operation is instant.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQTlUWNzXGEo3bFmyIR4drqP028CQUCYoH1/QAKCRAR4drqP028
 CXQeAP9hAfNGV55cOsd+z+3+xOxKeAYN/PC7UmXSvFgJZKYEqgEA/UF+SwsrOTyd
 x+5UwNSOvrUX404j6ROP0kH4/kzARAY=
 =Krne
 -----END PGP SIGNATURE-----

Merge tag 'spi-nor/for-5.19' into mtd/next

SPI NOR core changes:
- Read back written SR value to make sure the write was done correctly.
- Introduce a common function for Read ID that manufacturer drivers can
  use to verify the Octal DTR switch worked correctly.
- Add helpers for read/write any register commands so manufacturer
  drivers don't open code it every time.
- Clarify rdsr dummy cycles documentation.
- Add debugfs entry to expose internal flash parameters and state.

SPI NOR manufacturer drivers changes:
- Add support for Winbond W25Q512NW-IM, and Eon EN25QH256A.
- Move spi_nor_write_ear() to Winbond module since only Winbond flashes
  use it.
- Rework Micron and Cypress Octal DTR enable methods to improve
  readability.
- Use the common Read ID function to verify switch to Octal DTR mode for
  Micron and Cypress flashes.
- Skip polling status on volatile register writes for Micron and Cypress
  flashes since the operation is instant.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2022-05-20 13:58:54 +02:00
Joerg Roedel
b0dacee202 Merge branches 'apple/dart', 'arm/mediatek', 'arm/msm', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'vfio-notifier-fix' into next 2022-05-20 12:27:17 +02:00
Al Viro
70f8d9c575 move mount-related externs from fs.h to mount.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 23:25:48 -04:00
Al Viro
59df85d5fb linux/mount.h: trim includes
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 23:25:18 -04:00
Muneendra Kumar
827fc630e4 scsi: nvme-fc: Add new routine nvme_fc_io_getuuid()
Add nvme_fc_io_getuuid() to the nvme-fc transport. The routine is invoked
by the FC LLDD on a per-I/O request basis.  The routine translates from the
FC-specific request structure to the bio and the cgroup structure in order
to obtain the FC appid stored in the cgroup structure. If a value is not
set or a bio is not found, a NULL appid (aka uuid) will be returned to the
LLDD.

Link: https://lore.kernel.org/r/20220519123110.17361-2-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19 20:24:56 -04:00
Harini Katakam
5cebb40bc9 net: macb: Fix PTP one step sync support
PTP one step sync packets cannot have CSUM padding and insertion in
SW since time stamp is inserted on the fly by HW.
In addition, ptp4l version 3.0 and above report an error when skb
timestamps are reported for packets that not processed for TX TS
after transmission.
Add a helper to identify PTP one step sync and fix the above two
errors. Add a common mask for PTP header flag field "twoStepflag".
Also reset ptp OSS bit when one step is not selected.

Fixes: ab91f0a9b5 ("net: macb: Add hardware PTP support")
Fixes: 653e92a917 ("net: macb: add support for padding and fcs computation")
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220518170756.7752-1-harini.katakam@xilinx.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19 16:58:16 -07:00
Palmer Dabbelt
83a7a614ce
riscv: kexec: add kexec_file_load() support
This patch set implements kexec_file_load() for RISC-V, which is
currently only allowed on rv64 due to some minor build issues on 32-bit
platforms in the generic code.  This allows users to kexec() using an FD
as opposed to a buffer.

Link: https://lore.kernel.org/all/20220408100914.150110-1-lizhengyu3@huawei.com/

* palmer/riscv-kexec_file:
  RISC-V: Load purgatory in kexec_file
  RISC-V: Add purgatory
  RISC-V: Support for kexec_file on panic
  RISC-V: Add kexec_file support
  RISC-V: use memcpy for kexec_file mode
  kexec_file: Fix kexec_file.c build error for riscv platform
2022-05-19 16:26:50 -07:00
Dietmar Eggemann
991d8d8142 topology: Remove unused cpu_cluster_mask()
default_topology[] uses cpu_clustergroup_mask() for the CLS level
(guarded by CONFIG_SCHED_CLUSTER) which is currently provided by x86
(arch/x86/kernel/smpboot.c) and arm64 (drivers/base/arch_topology.c).

Fixes: 778c558f49 ("sched: Add cluster scheduler level in core and
related Kconfig for ARM64")

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Barry Song <baohua@kernel.org>
Link: https://lore.kernel.org/r/20220513093433.425163-1-dietmar.eggemann@arm.com
2022-05-19 23:46:13 +02:00
Christophe de Dinechin
37462a9203 nodemask.h: fix compilation error with GCC12
With gcc version 12.0.1 20220401 (Red Hat 12.0.1-0), building with
defconfig results in the following compilation error:

|   CC      mm/swapfile.o
| mm/swapfile.c: In function `setup_swap_info':
| mm/swapfile.c:2291:47: error: array subscript -1 is below array bounds
|  of `struct plist_node[]' [-Werror=array-bounds]
|  2291 |                                 p->avail_lists[i].prio = 1;
|       |                                 ~~~~~~~~~~~~~~^~~
| In file included from mm/swapfile.c:16:
| ./include/linux/swap.h:292:27: note: while referencing `avail_lists'
|   292 |         struct plist_node avail_lists[]; /*
|       |                           ^~~~~~~~~~~

This is due to the compiler detecting that the mask in
node_states[__state] could theoretically be zero, which would lead to
first_node() returning -1 through find_first_bit.

I believe that the warning/error is legitimate.  I first tried adding a
test to check that the node mask is not emtpy, since a similar test exists
in the case where MAX_NUMNODES == 1.

However, adding the if statement causes other warnings to appear in
for_each_cpu_node_but, because it introduces a dangling else ambiguity. 
And unfortunately, GCC is not smart enough to detect that the added test
makes the case where (node) == -1 impossible, so it still complains with
the same message.

This is why I settled on replacing that with a harmless, but relatively
useless (node) >= 0 test.  Based on the warning for the dangling else, I
also decided to fix the case where MAX_NUMNODES == 1 by moving the
condition inside the for loop.  It will still only be tested once.  This
ensures that the meaning of an else following for_each_node_mask or
derivatives would not silently have a different meaning depending on the
configuration.

Link: https://lkml.kernel.org/r/20220414150855.2407137-3-dinechin@redhat.com
Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Segall <bsegall@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:55 -07:00
Qi Zheng
3f913fc5f9 mm: fix missing handler for __GFP_NOWARN
We expect no warnings to be issued when we specify __GFP_NOWARN, but
currently in paths like alloc_pages() and kmalloc(), there are still some
warnings printed, fix it.

But for some warnings that report usage problems, we don't deal with them.
If such warnings are printed, then we should fix the usage problems. 
Such as the following case:

	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));

[zhengqi.arch@bytedance.com: v2]
 Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:55 -07:00
Minchan Kim
6d4675e601 mm: don't be stuck to rmap lock on reclaim path
The rmap locks(i_mmap_rwsem and anon_vma->root->rwsem) could be contended
under memory pressure if processes keep working on their vmas(e.g., fork,
mmap, munmap).  It makes reclaim path stuck.  In our real workload traces,
we see kswapd is waiting the lock for 300ms+(worst case, a sec) and it
makes other processes entering direct reclaim, which were also stuck on
the lock.

This patch makes lru aging path try_lock mode like shink_page_list so the
reclaim context will keep working with next lru pages without being stuck.
if it found the rmap lock contended, it rotates the page back to head of
lru in both active/inactive lrus to make them consistent behavior, which
is basic starting point rather than adding more heristic.

Since this patch introduces a new "contended" field as out-param along
with try_lock in-param in rmap_walk_control, it's not immutable any longer
if the try_lock is set so remove const keywords on rmap related functions.
Since rmap walking is already expensive operation, I doubt the const
would help sizable benefit( And we didn't have it until 5.17).

In a heavy app workload in Android, trace shows following statistics.  It
almost removes rmap lock contention from reclaim path.

Martin Liu reported:

Before:

   max_dur(ms)  min_dur(ms)  max-min(dur)ms  avg_dur(ms)  sum_dur(ms)  count blocked_function
         1632            0            1631   151.542173        31672    209  page_lock_anon_vma_read
          601            0             601   145.544681        28817    198  rmap_walk_file

After:

   max_dur(ms)  min_dur(ms)  max-min(dur)ms  avg_dur(ms)  sum_dur(ms)  count blocked_function
          NaN          NaN              NaN          NaN          NaN    0.0             NaN
            0            0                0     0.127645            1     12  rmap_walk_file

[minchan@kernel.org: add comment, per Matthew]
  Link: https://lkml.kernel.org/r/YnNqeB5tUf6LZ57b@google.com
Link: https://lkml.kernel.org/r/20220510215423.164547-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: John Dias <joaodias@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Martin Liu <liumartin@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:54 -07:00
Johannes Weiner
f4840ccfca zswap: memcg accounting
Applications can currently escape their cgroup memory containment when
zswap is enabled.  This patch adds per-cgroup tracking and limiting of
zswap backend memory to rectify this.

The existing cgroup2 memory.stat file is extended to show zswap statistics
analogous to what's in meminfo and vmstat.  Furthermore, two new control
files, memory.zswap.current and memory.zswap.max, are added to allow
tuning zswap usage on a per-workload basis.  This is important since not
all workloads benefit from zswap equally; some even suffer compared to
disk swap when memory contents don't compress well.  The optimal size of
the zswap pool, and the threshold for writeback, also depends on the size
of the workload's warm set.

The implementation doesn't use a traditional page_counter transaction. 
zswap is unconventional as a memory consumer in that we only know the
amount of memory to charge once expensive compression has occurred.  If
zwap is disabled or the limit is already exceeded we obviously don't want
to compress page upon page only to reject them all.  Instead, the limit is
checked against current usage, then we compress and charge.  This allows
some limit overrun, but not enough to matter in practice.

[hannes@cmpxchg.org: fix for CONFIG_SLOB builds]
  Link: https://lkml.kernel.org/r/YnwD14zxYjUJPc2w@cmpxchg.org
[hannes@cmpxchg.org: opt out of cgroups v1]
  Link: https://lkml.kernel.org/r/Yn6it9mBYFA+/lTb@cmpxchg.org
Link: https://lkml.kernel.org/r/20220510152847.230957-7-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:53 -07:00
Johannes Weiner
f6498b776d mm: zswap: add basic meminfo and vmstat coverage
Currently it requires poking at debugfs to figure out the size and
population of the zswap cache on a host.  There are no counters for reads
and writes against the cache.  As a result, it's difficult to understand
zswap behavior on production systems.

Print zswap memory consumption and how many pages are zswapped out in
/proc/meminfo.  Count zswapouts and zswapins in /proc/vmstat.

Link: https://lkml.kernel.org/r/20220510152847.230957-6-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:53 -07:00
Miaohe Lin
ff351f4bb9 mm/swap: fix comment about swap extent
Since commit 4efaceb1c5 ("mm, swap: use rbtree for swap_extent"), rbtree
is used for swap extent.  Also curr_swap_extent is removed at that time. 
Update the corresponding comment.

Link: https://lkml.kernel.org/r/20220509131416.17553-16-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:52 -07:00
Miaohe Lin
a930c210c4 mm/swap: fix the obsolete comment for SWP_TYPE_SHIFT
Since commit 3159f943aa ("xarray: Replace exceptional entries"), there
is only one bit of 'type' can be shifted up.  Update the corresponding
comment.

Link: https://lkml.kernel.org/r/20220509131416.17553-13-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:52 -07:00
Miaohe Lin
3db3264d8a mm/swap: make page_swapcount and __lru_add_drain_all static
Make page_swapcount and __lru_add_drain_all static.  They are only used
within the file now.

Link: https://lkml.kernel.org/r/20220509131416.17553-9-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:51 -07:00
Miaohe Lin
bc4a68adb1 mm/swap: remove unneeded return value of free_swap_slot
The return value of free_swap_slot is always 0 and also ignored now. 
Remove it to clean up the code.

Link: https://lkml.kernel.org/r/20220509131416.17553-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 14:08:50 -07:00