Commit graph

1059881 commits

Author SHA1 Message Date
Krzysztof Hałasa
901a52c433 media: Add ADV7610 support for adv7604 driver - DT docs.
ADV7610 is another HDMI receiver chip, very similar to
the ADV7611. Tested on TinyRex BaseBoard Lite.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Krzysztof Hałasa
d64a7709a8 media: TDA1997x: replace video detection routine
The TDA1997x (HDMI receiver) driver currently uses a specific video
format detection scheme. The frame (or field in interlaced mode), line
and HSync pulse durations are compared to those of known, standard video
modes. If a match is found, the mode is assumed to be detected,
otherwise -ERANGE is returned (then possibly ignored). This means that:
- another mode with similar timings will be detected incorrectly
  (e.g. 2x faster clock and lines twice as long)
- non-standard modes will not work.

This patch replaces this scheme with a direct read of geometry
registers. This way all modes recognized by the chip are supported.

In interlaced modes, the code assumes the V sync signal has the same
duration for both fields. While this may be not necessarily true,
I can't see any way to get the "other" V sync width. This is most
probably harmless.

All tests have been performed on Gateworks' Ventana GW54xx board, with
a TDA19971 chip.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Tested-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Hans Verkuil
3ae5c3bc07 media: gspca/gl860-mi1320/ov9655: avoid -Wstring-concatenation warning
Newer clang versions are suspicious of definitions that mix concatenated
strings with comma-separated arrays of strings, this has found real bugs
elsewhere, but this seems to be a false positive:

drivers/media/usb/gspca/gl860/gl860-mi1320.c:62:37: error: suspicious concatenation of string literals in an array initialization; did you
mean to separate the elements with a comma? [-Werror,-Wstring-concatenation]
        "\xd3\x02\xd4\x28\xd5\x01\xd0\x02" "\xd1\x18\xd2\xc1"
                                           ^
                                          ,
drivers/media/usb/gspca/gl860/gl860-mi1320.c:62:2: note: place parentheses around the string literal to silence warning
        "\xd3\x02\xd4\x28\xd5\x01\xd0\x02" "\xd1\x18\xd2\xc1"

Replace the string literals by compound initializers, using normal hex numbers.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Scott K Logan
1cab969d55 media: saa7134: Add support for Leadtek WinFast HDTV200 H
Similar configuration to Kworld PC150-U.

Tested: Composite, S-Video, NTSC, ATSC
Unsupported: IR remote

Signed-off-by: Scott K Logan <logans@cottsay.net>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Jammy Huang
52fed10ad7 media: aspeed: add debugfs
To show video real-time information as below:

Capture:
  Signal              : Unlock
  Width               : 1920
  Height              : 1080
  FRC                 : 30

Performance:
  Frame#              : 0
  Frame Duration(ms)  :
    Now               : 0
    Min               : 0
    Max               : 0
  FPS                 : 0

[hverkuil: make aspeed_video_proc_open() static, fixes sparse warning]

Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Sergey Senozhatsky
67f85135c5 media: videobuf2: always set buffer vb2 pointer
We need to always link allocated vb2_dc_buf back to vb2_buffer because
we dereference vb2 in prepare() and finish() callbacks.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Acked-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Vladimir Barinov
cd0e5e8c42 media: rcar-vin: add G/S_PARM ioctls
This adds g/s_parm ioctls for parallel interface.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Arnd Bergmann
570a82b9c3 media: i2c: select V4L2_ASYNC where needed
I came across a link failure from randconfig builds:

x86_64-linux-ld: drivers/media/i2c/ths8200.o: in function `ths8200_remove':
ths8200.c:(.text+0x491): undefined reference to `v4l2_async_unregister_subdev'
x86_64-linux-ld: drivers/media/i2c/ths8200.o: in function `ths8200_probe':
ths8200.c:(.text+0xe49): undefined reference to `v4l2_async_register_subdev'
x86_64-linux-ld: drivers/media/i2c/tw9910.o: in function `tw9910_remove':
tw9910.c:(.text+0x467): undefined reference to `v4l2_async_unregister_subdev'
x86_64-linux-ld: drivers/media/i2c/tw9910.o: in function `tw9910_probe':
tw9910.c:(.text+0x1123): undefined reference to `v4l2_async_register_subdev'

These clearly lack a 'select' statement, but I don't know why
this started happening only now. I had a bit of a look around to find
other configs that have the same problem, but could not come up with
a reliable way and found nothing else through experimentation.
It is likely that other symbols like these exist that need an extra
select.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:38 +01:00
Hans Verkuil
112024a3b6 media: vidtv: move kfree(dvb) to vidtv_bridge_dev_release()
Adding kfree(dvb) to vidtv_bridge_remove() will remove the memory
too soon: if an application still has an open filehandle to the device
when the driver is unloaded, then when that filehandle is closed, a
use-after-free access takes place to the freed memory.

Move the kfree(dvb) to vidtv_bridge_dev_release() instead.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 76e21bb8be ("media: vidtv: Fix memory leak in remove")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-19 08:08:19 +01:00
Greg Kroah-Hartman
2b74240be3 First set of counter subsystem new feature support for the 5.16 cycle
Most interesting element this time is the new chrdev based interface
 for the counter subsystem.  Affects all drivers. Some minor precursor
 patches.
 
 Major parts:
 * Bring all the sysfs attribute setup into the counter core rather than
   leaving it to individual drivers.  Docs updates accompany these changes.
 * Move various definitions to a uapi header as now needed from userspace.
 * Add the chardev interface + extensive documentation and example tool
 * Add new ABI needed to identify indexes needed for chrdev interface
 * Implement new interface for the 104-quad-8
 * Follow up deals with wrong path for documentation build
 * Various trivial cleanups and missing feature additions related to this
   series
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmFtvWMRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0FohYnQ//U58yVdWHoav81bm08xl38Hv3F1O9Gk1E
 1pWVb2FNn065q9ES9n0YB1UjECRU/4zvEw+5S+vRqfXlbQiUMVvgUSDWWokzK4Zf
 rMxRFeYjzS89A8b+VoTLSxOn5wAWUc67v29HMDl1MiQHG6gP9UW7mxBbCIeIU9Et
 BhNpo32IS9Qu9LeoaPsRz8yEfAyBheLGMfrlJAYy6ENTh4EMdyhzLvVlJXiE+rm7
 8KwFU7XVTdIG8JQxJGSs9/CMd6CGhKaBr+B6LG4rnSZTy5k41a1DXpIBQU24nLns
 ofX+KllTCnuHW5g+rqi05FkNApCN06RCV7FgUKmqOCa4vbXVEF4mFXejkQo+mgIS
 zHy7lLfkiYVNprpKRKZv5lzj1go8cl4OB0F5odyaiRVh8KeCS9UbYe5fV3vLTiY9
 2/yhmW/dbaogM0bFsx8pbWdn84MiBNXT/PvyqJ+I0EfndDZcmrl0mPWQlAS31QPC
 3aS2wKyBt2pjTUQIaPq80xPeWAAVCVQ4IxrV5SLT8BUxJc6YwZUDEMAM+/3oEhFC
 f1/fxqpcctoFEauJXrggiQWpcVbYkbA9RzF+ecUT+3ngABlbf0KOI0+RZnn6GVvJ
 UVM99Dy0q5utkzFXsHZ4FjXhxiW9LcbvU27KbyrZjKJjOoQAs9Ky5uhjt5DKXh4B
 KSofs8zHEs8=
 =AaeD
 -----END PGP SIGNATURE-----

Merge tag 'counter-for-5.16a-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next

Jonathan writes:

First set of counter subsystem new feature support for the 5.16 cycle

Most interesting element this time is the new chrdev based interface
for the counter subsystem.  Affects all drivers. Some minor precursor
patches.

Major parts:
* Bring all the sysfs attribute setup into the counter core rather than
  leaving it to individual drivers.  Docs updates accompany these changes.
* Move various definitions to a uapi header as now needed from userspace.
* Add the chardev interface + extensive documentation and example tool
* Add new ABI needed to identify indexes needed for chrdev interface
* Implement new interface for the 104-quad-8
* Follow up deals with wrong path for documentation build
* Various trivial cleanups and missing feature additions related to this
  series

* tag 'counter-for-5.16a-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
  docs: counter: Include counter-chrdev kernel-doc to generic-counter.rst
  counter: fix docum. build problems after filename change
  counter: microchip-tcb-capture: Tidy up a false kernel-doc /** marking.
  counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8
  counter: 104-quad-8: Replace mutex with spinlock
  counter: Implement events_queue_size sysfs attribute
  counter: Implement *_component_id sysfs attributes
  counter: Implement signalZ_action_component_id sysfs attribute
  tools/counter: Create Counter tools
  docs: counter: Document character device interface
  counter: Add character device interface
  counter: Move counter enums to uapi header
  docs: counter: Update to reflect sysfs internalization
  counter: Update counter.h comments to reflect sysfs internalization
  counter: Internalize sysfs interface code
  counter: stm32-timer-cnt: Provide defines for slave mode selection
  counter: stm32-lptimer-cnt: Provide defines for clock polarities
2021-10-19 09:08:16 +02:00
Andrej Shadura
362d5dfc48 mailmap: add Andrej Shadura
Add a mapping for my old work email for BelDisplayTech to the personal
email, and make sure the Collabora email has the correct spelling of the
first name.

Link: https://lkml.kernel.org/r/20210917091016.30232-1-andrew.shadura@collabora.co.uk
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Marek Szyprowski
1ca7554d05 mm/thp: decrease nr_thps in file's mapping on THP split
Decrease nr_thps counter in file's mapping to ensure that the page cache
won't be dropped excessively on file write access if page has been
already split.

I've tried a test scenario running a big binary, kernel remaps it with
THPs, then force a THP split with /sys/kernel/debug/split_huge_pages.
During any further open of that binary with O_RDWR or O_WRITEONLY kernel
drops page cache for it, because of non-zero thps counter.

Link: https://lkml.kernel.org/r/20211012120237.2600-1-m.szyprowski@samsung.com
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 09d91cda0e ("mm,thp: avoid writes to file with THP in pagecache")
Fixes: 06d3eff62d ("mm/thp: fix node page state in split_huge_page_to_list()")
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: <sfoon.kim@samsung.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Sean Christopherson
79f9bc5843 mm/secretmem: fix NULL page->mapping dereference in page_is_secretmem()
Check for a NULL page->mapping before dereferencing the mapping in
page_is_secretmem(), as the page's mapping can be nullified while gup()
is running, e.g.  by reclaim or truncation.

  BUG: kernel NULL pointer dereference, address: 0000000000000068
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 6 PID: 4173897 Comm: CPU 3/KVM Tainted: G        W
  RIP: 0010:internal_get_user_pages_fast+0x621/0x9d0
  Code: <48> 81 7a 68 80 08 04 bc 0f 85 21 ff ff 8 89 c7 be
  RSP: 0018:ffffaa90087679b0 EFLAGS: 00010046
  RAX: ffffe3f37905b900 RBX: 00007f2dd561e000 RCX: ffffe3f37905b934
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffe3f37905b900
  ...
  CR2: 0000000000000068 CR3: 00000004c5898003 CR4: 00000000001726e0
  Call Trace:
   get_user_pages_fast_only+0x13/0x20
   hva_to_pfn+0xa9/0x3e0
   try_async_pf+0xa1/0x270
   direct_page_fault+0x113/0xad0
   kvm_mmu_page_fault+0x69/0x680
   vmx_handle_exit+0xe1/0x5d0
   kvm_arch_vcpu_ioctl_run+0xd81/0x1c70
   kvm_vcpu_ioctl+0x267/0x670
   __x64_sys_ioctl+0x83/0xa0
   do_syscall_64+0x56/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20211007231502.3552715-1-seanjc@google.com
Fixes: 1507f51255 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: Stephen <stephenackerman16@gmail.com>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Matthew Wilcox (Oracle)
032146cda8 vfs: check fd has read access in kernel_read_file_from_fd()
If we open a file without read access and then pass the fd to a syscall
whose implementation calls kernel_read_file_from_fd(), we get a warning
from __kernel_read():

        if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))

This currently affects both finit_module() and kexec_file_load(), but it
could affect other syscalls in the future.

Link: https://lkml.kernel.org/r/20211007220110.600005-1-willy@infradead.org
Fixes: b844f0ecbc ("vfs: define kernel_copy_file_from_fd()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Lukas Bulwahn
b0e901280d elfcore: correct reference to CONFIG_UML
Commit 6e7b64b9dd ("elfcore: fix building with clang") introduces
special handling for two architectures, ia64 and User Mode Linux.
However, the wrong name, i.e., CONFIG_UM, for the intended Kconfig
symbol for User-Mode Linux was used.

Although the directory for User Mode Linux is ./arch/um; the Kconfig
symbol for this architecture is called CONFIG_UML.

Luckily, ./scripts/checkkconfigsymbols.py warns on non-existing configs:

  UM
  Referencing files: include/linux/elfcore.h
  Similar symbols: UML, NUMA

Correct the name of the config to the intended one.

[akpm@linux-foundation.org: fix um/x86_64, per Catalin]
  Link: https://lkml.kernel.org/r/20211006181119.2851441-1-catalin.marinas@arm.com
  Link: https://lkml.kernel.org/r/YV6pejGzLy5ppEpt@arm.com

Link: https://lkml.kernel.org/r/20211006082209.417-1-lukas.bulwahn@gmail.com
Fixes: 6e7b64b9dd ("elfcore: fix building with clang")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Barret Rhoden <brho@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
3ddd60268c mm, slub: fix incorrect memcg slab count for bulk free
kmem_cache_free_bulk() will call memcg_slab_free_hook() for all objects
when doing bulk free.  So we shouldn't call memcg_slab_free_hook() again
for bulk free to avoid incorrect memcg slab count.

Link: https://lkml.kernel.org/r/20210916123920.48704-6-linmiaohe@huawei.com
Fixes: d1b2cf6cb8 ("mm: memcg/slab: uncharge during kmem_cache_free_bulk()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
67823a5444 mm, slub: fix potential use-after-free in slab_debugfs_fops
When sysfs_slab_add failed, we shouldn't call debugfs_slab_add() for s
because s will be freed soon.  And slab_debugfs_fops will use s later
leading to a use-after-free.

Link: https://lkml.kernel.org/r/20210916123920.48704-5-linmiaohe@huawei.com
Fixes: 64dd68497b ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
9037c57681 mm, slub: fix potential memoryleak in kmem_cache_open()
In error path, the random_seq of slub cache might be leaked.  Fix this
by using __kmem_cache_release() to release all the relevant resources.

Link: https://lkml.kernel.org/r/20210916123920.48704-4-linmiaohe@huawei.com
Fixes: 210e7a43fa ("mm: SLUB freelist randomization")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
899447f669 mm, slub: fix mismatch between reconstructed freelist depth and cnt
If object's reuse is delayed, it will be excluded from the reconstructed
freelist.  But we forgot to adjust the cnt accordingly.  So there will
be a mismatch between reconstructed freelist depth and cnt.  This will
lead to free_debug_processing() complaining about freelist count or a
incorrect slub inuse count.

Link: https://lkml.kernel.org/r/20210916123920.48704-3-linmiaohe@huawei.com
Fixes: c3895391df ("kasan, slub: fix handling of kasan_slab_free hook")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
2127d22509 mm, slub: fix two bugs in slab_debug_trace_open()
Patch series "Fixups for slub".

This series contains various bug fixes for slub.  We fix memoryleak,
use-afer-free, NULL pointer dereferencing and so on in slub.  More
details can be found in the respective changelogs.

This patch (of 5):

It's possible that __seq_open_private() will return NULL.  So we should
check it before using lest dereferencing NULL pointer.  And in error
paths, we forgot to release private buffer via seq_release_private().
Memory will leak in these paths.

Link: https://lkml.kernel.org/r/20210916123920.48704-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210916123920.48704-2-linmiaohe@huawei.com
Fixes: 64dd68497b ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Eric Dumazet
6d2aec9e12 mm/mempolicy: do not allow illegal MPOL_F_NUMA_BALANCING | MPOL_LOCAL in mbind()
syzbot reported access to unitialized memory in mbind() [1]

Issue came with commit bda420b985 ("numa balancing: migrate on fault
among multiple bound nodes")

This commit added a new bit in MPOL_MODE_FLAGS, but only checked valid
combination (MPOL_F_NUMA_BALANCING can only be used with MPOL_BIND) in
do_set_mempolicy()

This patch moves the check in sanitize_mpol_flags() so that it is also
used by mbind()

  [1]
  BUG: KMSAN: uninit-value in __mpol_equal+0x567/0x590 mm/mempolicy.c:2260
   __mpol_equal+0x567/0x590 mm/mempolicy.c:2260
   mpol_equal include/linux/mempolicy.h:105 [inline]
   vma_merge+0x4a1/0x1e60 mm/mmap.c:1190
   mbind_range+0xcc8/0x1e80 mm/mempolicy.c:811
   do_mbind+0xf42/0x15f0 mm/mempolicy.c:1333
   kernel_mbind mm/mempolicy.c:1483 [inline]
   __do_sys_mbind mm/mempolicy.c:1490 [inline]
   __se_sys_mbind+0x437/0xb80 mm/mempolicy.c:1486
   __x64_sys_mbind+0x19d/0x200 mm/mempolicy.c:1486
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x44/0xae

  Uninit was created at:
   slab_alloc_node mm/slub.c:3221 [inline]
   slab_alloc mm/slub.c:3230 [inline]
   kmem_cache_alloc+0x751/0xff0 mm/slub.c:3235
   mpol_new mm/mempolicy.c:293 [inline]
   do_mbind+0x912/0x15f0 mm/mempolicy.c:1289
   kernel_mbind mm/mempolicy.c:1483 [inline]
   __do_sys_mbind mm/mempolicy.c:1490 [inline]
   __se_sys_mbind+0x437/0xb80 mm/mempolicy.c:1486
   __x64_sys_mbind+0x19d/0x200 mm/mempolicy.c:1486
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  =====================================================
  Kernel panic - not syncing: panic_on_kmsan set ...
  CPU: 0 PID: 15049 Comm: syz-executor.0 Tainted: G    B             5.15.0-rc2-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:88 [inline]
   dump_stack_lvl+0x1ff/0x28e lib/dump_stack.c:106
   dump_stack+0x25/0x28 lib/dump_stack.c:113
   panic+0x44f/0xdeb kernel/panic.c:232
   kmsan_report+0x2ee/0x300 mm/kmsan/report.c:186
   __msan_warning+0xd7/0x150 mm/kmsan/instrumentation.c:208
   __mpol_equal+0x567/0x590 mm/mempolicy.c:2260
   mpol_equal include/linux/mempolicy.h:105 [inline]
   vma_merge+0x4a1/0x1e60 mm/mmap.c:1190
   mbind_range+0xcc8/0x1e80 mm/mempolicy.c:811
   do_mbind+0xf42/0x15f0 mm/mempolicy.c:1333
   kernel_mbind mm/mempolicy.c:1483 [inline]
   __do_sys_mbind mm/mempolicy.c:1490 [inline]
   __se_sys_mbind+0x437/0xb80 mm/mempolicy.c:1486
   __x64_sys_mbind+0x19d/0x200 mm/mempolicy.c:1486
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20211001215630.810592-1-eric.dumazet@gmail.com
Fixes: bda420b985 ("numa balancing: migrate on fault among multiple bound nodes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Peng Fan
5173ed72bc memblock: check memory total_size
mem=[X][G|M] is broken on ARM64 platform, there are cases that even
type.cnt is 1, but total_size is not 0 because regions are merged into
1.  So only check 'cnt' is not enough, total_size should be used,
othersize bootargs 'mem=[X][G|B]' not work anymore.

Link: https://lkml.kernel.org/r/20210930024437.32598-1-peng.fan@oss.nxp.com
Fixes: e888fa7bb8 ("memblock: Check memory add/cap ordering")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Valentin Vidic
b15fa9224e ocfs2: mount fails with buffer overflow in strlen
Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE mouting an
ocfs2 filesystem with either o2cb or pcmk cluster stack fails with the
trace below.  Problem seems to be that strings for cluster stack and
cluster name are not guaranteed to be null terminated in the disk
representation, while strlcpy assumes that the source string is always
null terminated.  This causes a read outside of the source string
triggering the buffer overflow detection.

  detected buffer overflow in strlen
  ------------[ cut here ]------------
  kernel BUG at lib/string.c:1149!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1
    Debian 5.14.6-2
  RIP: 0010:fortify_panic+0xf/0x11
  ...
  Call Trace:
   ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
   ocfs2_fill_super+0x359/0x19b0 [ocfs2]
   mount_bdev+0x185/0x1b0
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   path_mount+0x454/0xa20
   __x64_sys_mount+0x103/0x140
   do_syscall_64+0x3b/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Jan Kara
5314454ea3 ocfs2: fix data corruption after conversion from inline format
Commit 6dbf7bb555 ("fs: Don't invalidate page buffers in
block_write_full_page()") uncovered a latent bug in ocfs2 conversion
from inline inode format to a normal inode format.

The code in ocfs2_convert_inline_data_to_extents() attempts to zero out
the whole cluster allocated for file data by grabbing, zeroing, and
dirtying all pages covering this cluster.  However these pages are
beyond i_size, thus writeback code generally ignores these dirty pages
and no blocks were ever actually zeroed on the disk.

This oversight was fixed by commit 693c241a5f ("ocfs2: No need to zero
pages past i_size.") for standard ocfs2 write path, inline conversion
path was apparently forgotten; the commit log also has a reasoning why
the zeroing actually is not needed.

After commit 6dbf7bb555, things became worse as writeback code stopped
invalidating buffers on pages beyond i_size and thus these pages end up
with clean PageDirty bit but with buffers attached to these pages being
still dirty.  So when a file is converted from inline format, then
writeback triggers, and then the file is grown so that these pages
become valid, the invalid dirtiness state is preserved,
mark_buffer_dirty() does nothing on these pages (buffers are already
dirty) but page is never written back because it is clean.  So data
written to these pages is lost once pages are reclaimed.

Simple reproducer for the problem is:

  xfs_io -f -c "pwrite 0 2000" -c "pwrite 2000 2000" -c "fsync" \
    -c "pwrite 4000 2000" ocfs2_file

After unmounting and mounting the fs again, you can observe that end of
'ocfs2_file' has lost its contents.

Fix the problem by not doing the pointless zeroing during conversion
from inline format similarly as in the standard write path.

[akpm@linux-foundation.org: fix whitespace, per Joseph]

Link: https://lkml.kernel.org/r/20210930095405.21433-1-jack@suse.cz
Fixes: 6dbf7bb555 ("fs: Don't invalidate page buffers in block_write_full_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: "Markov, Andrey" <Markov.Andrey@Dell.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Huang Ying
a6a0251c6f mm/migrate: fix CPUHP state to update node demotion order
The node demotion order needs to be updated during CPU hotplug.  Because
whether a NUMA node has CPU may influence the demotion order.  The
update function should be called during CPU online/offline after the
node_states[N_CPU] has been updated.  That is done in
CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD during
CPU offline.  But in commit 884a6e5d1f ("mm/migrate: update node
demotion order on hotplug events"), the function to update node demotion
order is called in CPUHP_AP_ONLINE_DYN during CPU online/offline.  This
doesn't satisfy the order requirement.

For example, there are 4 CPUs (P0, P1, P2, P3) in 2 sockets (P0, P1 in S0
and P2, P3 in S1), the demotion order is

 - S0 -> NUMA_NO_NODE
 - S1 -> NUMA_NO_NODE

After P2 and P3 is offlined, because S1 has no CPU now, the demotion
order should have been changed to

 - S0 -> S1
 - S1 -> NO_NODE

but it isn't changed, because the order updating callback for CPU
hotplug doesn't see the new nodemask.  After that, if P1 is offlined,
the demotion order is changed to the expected order as above.

So in this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and
CPUHP_MM_DEMOTION_DEAD to be called after CPUHP_AP_ONLINE_DYN and
CPUHP_MM_VMSTAT_DEAD during CPU online and offline, and register the
update function on them.

Link: https://lkml.kernel.org/r/20210929060351.7293-1-ying.huang@intel.com
Fixes: 884a6e5d1f ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Dave Hansen
76af6a054d mm/migrate: add CPU hotplug to demotion #ifdef
Once upon a time, the node demotion updates were driven solely by memory
hotplug events.  But now, there are handlers for both CPU and memory
hotplug.

However, the #ifdef around the code checks only memory hotplug.  A
system that has HOTPLUG_CPU=y but MEMORY_HOTPLUG=n would miss CPU
hotplug events.

Update the #ifdef around the common code.  Add memory and CPU-specific
#ifdefs for their handlers.  These memory/CPU #ifdefs avoid unused
function warnings when their Kconfig option is off.

[arnd@arndb.de: rework hotplug_memory_notifier() stub]
  Link: https://lkml.kernel.org/r/20211013144029.2154629-1-arnd@kernel.org

Link: https://lkml.kernel.org/r/20210924161255.E5FE8F7E@davehans-spike.ostc.intel.com
Fixes: 884a6e5d1f ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Dave Hansen
295be91f7e mm/migrate: optimize hotplug-time demotion order updates
Patch series "mm/migrate: 5.15 fixes for automatic demotion", v2.

This contains two fixes for the "automatic demotion" code which was
merged into 5.15:

 * Fix memory hotplug performance regression by watching
   suppressing any real action on irrelevant hotplug events.

 * Ensure CPU hotplug handler is registered when memory hotplug
   is disabled.

This patch (of 2):

== tl;dr ==

Automatic demotion opted for a simple, lazy approach to handling hotplug
events.  This noticeably slows down memory hotplug[1].  Optimize away
updates to the demotion order when memory hotplug events should have no
effect.

This has no effect on CPU hotplug.  There is no known problem on the CPU
side and any work there will be in a separate series.

== Background ==

Automatic demotion is a memory migration strategy to ensure that new
allocations have room in faster memory tiers on tiered memory systems.
The kernel maintains an array (node_demotion[]) to drive these
migrations.

The node_demotion[] path is calculated by starting at nodes with CPUs
and then "walking" to nodes with memory.  Only hotplug events which
online or offline a node with memory (N_ONLINE) or CPUs (N_CPU) will
actually affect the migration order.

== Problem ==

However, the current code is lazy.  It completely regenerates the
migration order on *any* CPU or memory hotplug event.  The logic was
that these events are extremely rare and that the overhead from
indiscriminate order regeneration is minimal.

Part of the update logic involves a synchronize_rcu(), which is a pretty
big hammer.  Its overhead was large enough to be detected by some 0day
tests that watch memory hotplug performance[1].

== Solution ==

Add a new helper (node_demotion_topo_changed()) which can differentiate
between superfluous and impactful hotplug events.  Skip the expensive
update operation for superfluous events.

== Aside: Locking ==

It took me a few moments to declare the locking to be safe enough for
node_demotion_topo_changed() to work.  It all hinges on the memory
hotplug lock:

During memory hotplug events, 'mem_hotplug_lock' is held for write.
This ensures that two memory hotplug events can not be called
simultaneously.

CPU hotplug has a similar lock (cpuhp_state_mutex) which also provides
mutual exclusion between CPU hotplug events.  In addition, the demotion
code acquire and hold the mem_hotplug_lock for read during its CPU
hotplug handlers.  This provides mutual exclusion between the demotion
memory hotplug callbacks and the CPU hotplug callbacks.

This effectively allows treating the migration target generation code to
act as if it is single-threaded.

1. https://lore.kernel.org/all/20210905135932.GE15026@xsang-OptiPlex-9020/

Link: https://lkml.kernel.org/r/20210924161251.093CCD06@davehans-spike.ostc.intel.com
Link: https://lkml.kernel.org/r/20210924161253.D7673E31@davehans-spike.ostc.intel.com
Fixes: 884a6e5d1f ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Nadav Amit
cb185d5f1e userfaultfd: fix a race between writeprotect and exit_mmap()
A race is possible when a process exits, its VMAs are removed by
exit_mmap() and at the same time userfaultfd_writeprotect() is called.

The race was detected by KASAN on a development kernel, but it appears
to be possible on vanilla kernels as well.

Use mmget_not_zero() to prevent the race as done in other userfaultfd
operations.

Link: https://lkml.kernel.org/r/20210921200247.25749-1-namit@vmware.com
Fixes: 63b2d4174c ("userfaultfd: wp: add the writeprotect API to userfaultfd ioctl")
Signed-off-by: Nadav Amit <namit@vmware.com>
Tested-by: Li  Wang <liwang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Peter Xu
8913970c19 mm/userfaultfd: selftests: fix memory corruption with thp enabled
In RHEL's gating selftests we've encountered memory corruption in the
uffd event test even with upstream kernel:

        # ./userfaultfd anon 128 4
        nr_pages: 32768, nr_pages_per_cpu: 32768
        bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
        bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
        bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
        bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
        testing uffd-wp with pagemap (pgsize=4096): done
        testing uffd-wp with pagemap (pgsize=2097152): done
        testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
        ERROR: faulting process failed (errno=0, line=1117)

It can be easily reproduced when global thp enabled, which is the
default for RHEL.

It's also known as a side effect of commit 0db282ba2c ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which
is imho right itself on using mmap() to make sure the addresses will be
untagged even on arm.

The problem is, for each test we allocate buffers using two
allocate_area() calls.  We assumed these two buffers won't affect each
other, however they could, because mmap() could have found that the two
buffers are near each other and having the same VMA flags, so they got
merged into one VMA.

It won't be a big problem if thp is not enabled, but when thp is
agressively enabled it means when initializing the src buffer it could
accidentally setup part of the dest buffer too when there's a shared THP
that overlaps the two regions.  Then some of the dest buffer won't be
able to be trapped by userfaultfd missing mode, then it'll cause memory
corruption as described.

To fix it, do release_pages() after initializing the src buffer.

Since the previous two release_pages() calls are after
uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
stronger than release pages; as unmap() also tear town pgtables), drop
them as they shouldn't really be anything useful.

We can mark the Fixes tag upon 0db282ba2c as it's reported to only
happen there, however the real "Fixes" IMHO should be 8ba6e86408, as
before that commit we'll always do explicit release_pages() before
registration of uffd, and 8ba6e86408 changed that logic by adding
extra unmap/map and we didn't release the pages at the right place.
Meanwhile I don't have a solid glue anyway on whether posix_memalign()
could always avoid triggering this bug, hence it's safer to attach this
fix to commit 8ba6e86408.

Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e86408 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Takashi Iwai
f917c04fac ALSA: memalloc: Fix a typo in snd_dma_buffer_sync() description
It caused a warning for kernel-doc build.

Fixes: a25684a956 ("ALSA: memalloc: Support for non-contiguous page allocation")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20211019165402.4fa82c38@canb.auug.org.au
Link: https://lore.kernel.org/r/20211019060536.26089-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-19 08:07:41 +02:00
Takashi Iwai
7d2a0df242 ALSA: memalloc: Drop superfluous snd_dma_buffer_sync() declaration
snd_dma_buffer_sync() is declared twice, and the one outside the ifdef
CONFIG_HAS_DMA could lead to a build error when CONFIG_HAS_DMA=n.
As it's an overlooked leftover after rebase, drop this line.

Fixes: a25684a956 ("ALSA: memalloc: Support for non-contiguous page allocation")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20211019165402.4fa82c38@canb.auug.org.au
Link: https://lore.kernel.org/r/20211019060536.26089-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-19 08:07:26 +02:00
Marco Giunta
2966492372 ALSA: usb-audio: Fix microphone sound on Jieli webcam.
When a Jieli Technology USB Webcam is connected, the video part works
well, but the mic sound is speeded up. On dmesg there are messages
about different rates from the runtime rates, warnings about volume
resolution and lastly, the log is filled, every 5 seconds, with
retire_capture_urb error messages.

The mic works only when ep packet size is set to wMaxPacketSize (normal
sound and no more retire_capture_urb error messages). Skipping reading
sample rate, fixes the messages about different rates and forcing a volume
resolution, fixes warnings about volume range. I have arbitrarily choosed
the value (16): I read in a comment that there should be no more than 255
levels, so 4096 (max volume) / 16 = 0-255.

Signed-off-by: Marco Giunta <giun7a@gmail.com>
Link: https://lore.kernel.org/r/20211018162552.12082-1-giun7a@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-19 08:07:01 +02:00
Max Filippov
bd47cdb789 xtensa: move section symbols to asm/sections.h
Introduce asm/sections.h and move section declarations to this header
from setup.c. Assign section symbols char array type uniformly and drop
address operator from section symbol references in code.
Sort headers in setup.c while at it.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:35 -07:00
Max Filippov
431d1a34df xtensa: remove unused variable wmask
After a cleanup the function show_regs doesn't use variable wmask but
still computes it. Drop it.

Fixes: 8d7e8240e6 ("[XTENSA] Clean up elf-gregset.")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:35 -07:00
Max Filippov
da0a4e5c8f xtensa: only build windowed register support code when needed
There's no need in window overflow/underflow/alloca exception handlers
or window spill code when neither kernel nor userspace support windowed
registers. Don't build or link it.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:35 -07:00
Max Filippov
09af39f649 xtensa: use register window specific opcodes only when present
xtensa core may be configured without register windows support, don't
use register window specific opcodes in that case. Use window register
specific opcodes to initialize hardware or reset core to a known state
regardless of the chosen ABI.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:35 -07:00
Max Filippov
0b5372570b xtensa: implement call0 ABI support in assembly
Replace hardcoded register and opcode names with ABI-agnostic macros.
Add register save/restore code where necessary. Conditionalize windowed
only or call0 only code. Add stack initialization matching _switch_to
epilogue to copy_thread.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:35 -07:00
Max Filippov
5cce39b6aa xtensa: definitions for call0 ABI
Add assembly macros for calls, call arguments, preserved registers,
function entry and return for windowed and call0 ABIs.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:34 -07:00
Max Filippov
61a6b91283 xtensa: don't use a12 in __xtensa_copy_user in call0 ABI
a12 is callee-saved register in xtensa call0 ABI, so a function must not
change it. The main unaligned copy loop of __xtensa_copy_user uses all
low-numbered registers, so a register must be spilled to avoid using a12
as a loop counter.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:34 -07:00
Max Filippov
d191323bc0 xtensa: don't use a12 in strncpy_user
a12 is callee-saved register in xtensa call0 ABI, so a function must not
change it. a10 is not used in this function at all, use it instead of
a12 to avoid saving/restoring it.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:34 -07:00
Max Filippov
eda8dd1224 xtensa: use a14 instead of a15 in inline assembly
a15 is a frame pointer in the call0 xtensa ABI, don't use it explicitly
in the inline assembly. Use a14 instead, as it has the same properties
as a15 w.r.t. window overflow.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:34 -07:00
Max Filippov
e369953a5b xtensa: move _SimulateUserKernelVectorException out of WindowVectors
In configurations without window registers support the section
.WindowVectors.text may never be linked.
_SimulateUserKernelVectorException is a common handler for high priority
interrupts, it does not belong in that section anyway. Move it out of
that section and mark it as __XTENSA_HANDLER so it gets bundled with
other vector helpers.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2021-10-18 22:19:34 -07:00
Adrian Hunter
4e5483b844 scsi: ufs: ufs-pci: Force a full restore after suspend-to-disk
Implement the ->restore() PM operation and set the link to off, which will
force a full reset and restore.  This ensures that Host Performance Booster
is reset after suspend-to-disk.

The Host Performance Booster feature caches logical-to-physical mapping
information in the host memory.  After suspend-to-disk, such information is
not valid, so a full reset and restore is needed.

A full reset and restore is done if the SPM level is 5 or 6, but not for
other SPM levels, so this change fixes those cases.

A full reset and restore also restores base address registers, so that code
is removed.

Link: https://lore.kernel.org/r/20211018151004.284200-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:19:44 -04:00
Dmitry Bogdanov
4a8f71014b scsi: qla2xxx: Fix unmap of already freed sgl
The sgl is freed in the target stack in target_release_cmd_kref() before
calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that
causes a panic if sgl is not yet DMA unmapped:

NIP dma_direct_unmap_sg+0xdc/0x180
LR  dma_direct_unmap_sg+0xc8/0x180
Call Trace:
 ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable)
 dma_unmap_sg_attrs+0x54/0xf0
 qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx]
 qlt_free_cmd+0x124/0x1d0 [qla2xxx]
 tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx]
 target_put_sess_cmd+0x198/0x370 [target_core_mod]
 transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
 tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx]

The sgl may be left unmapped in error cases of response sending.  For
instance, qlt_rdy_to_xfer() maps sgl and exits when session is being
deleted keeping the sgl mapped.

This patch removes use-after-free of the sgl and ensures that the sgl is
unmapped for any command that was not sent to firmware.

Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:19:44 -04:00
Maor Dickman
d40bfeddac net/mlx5: E-Switch, Increase supported number of forward destinations to 32
Increase supported number of forward destinations in the same rule, local
and remote, from 2 to 32.

Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:10 -07:00
Maor Dickman
408881627f net/mlx5: E-Switch, Use dynamic alloc for dest array
Use dynamic allocation for the dest array in preparation for
the next patch which increase MLX5_MAX_FLOW_FWD_VPORTS and
will cause stack allocation to be bigger than 1024 bytes.

Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:10 -07:00
Maor Gottlieb
da6b0bb0fc net/mlx5: Lag, use steering to select the affinity port in LAG
Use the steering based solution for select the affinity port
when the LAG mode is based on hash policy and the device support
in port selection flow table.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:10 -07:00
Maor Gottlieb
b7267869e9 net/mlx5: Lag, add support to create/destroy/modify port selection
Add create function, build the steering tables, TTC and definers
according to the LAG hash type.

The destroy function, destroys all the steering components.

The modify functions is used when the bond mapping changes and it
iterates over all the rules in the definers and modifies them to steer
the packet to the relevant active ports.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:09 -07:00
Maor Gottlieb
8e25a2bc66 net/mlx5: Lag, add support to create TTC tables for LAG port selection
Add support to create inner and outer TTC tables for LAG port
selection. These tables are used to classify the packets in
order to select the related definer.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:09 -07:00
Maor Gottlieb
dc48516ec7 net/mlx5: Lag, add support to create definers for LAG
Every definer will consist of a flow table with a single hash group
with exactly two flow table entries, one for each device port.
The destination of these entries is the uplink vport according to the
port state and hash policy.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:09 -07:00