LLVM's integrated assembler does not like comments within macros:
<instantiation>:3:19: error: too many positional arguments
GR_NUM b2, 1 /* Base register */
^
Remove them, since they are obvious anyway.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Use local labels in .set directives to avoid potential compile errors
with LTO + clang. See commit 334865b291 ("x86/extable: Prefer local
labels in .set directives") for further details.
Since s390 doesn't support LTO currently this doesn't fix a real bug
for now, but helps to avoid problems as soon as required pieces have
been added to llvm.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Use local labels in .set directives to avoid potential compile errors
with LTO + clang. See commit 334865b291 ("x86/extable: Prefer local
labels in .set directives") for further details.
Since s390 doesn't support LTO currently this doesn't fix a real bug
for now, but helps to avoid problems as soon as required pieces have
been added to llvm.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Currently many console drivers for s390 rely on panic/reboot notifiers
to invoke callbacks on these events. The panic() function disables local
IRQs, secondary CPUs and preemption, so callbacks invoked on panic are
effectively running in atomic context.
Happens that most of these console callbacks from s390 doesn't take the
proper care with regards to atomic context, like taking spinlocks that
might be taken in other function/CPU and hence will cause a lockup
situation.
The goal for this patch is to improve the notifiers reliability, acting
on 4 console drivers, as detailed below:
(1) con3215: changed a regular spinlock to the trylock alternative.
(2) con3270: also changed a regular spinlock to its trylock counterpart,
but here we also have another problem: raw3270_activate_view() takes a
different spinlock. So, we worked a helper to validate if this other lock
is safe to acquire, and if so, raw3270_activate_view() should be safe.
Notice though that there is a functional change here: it's now possible
to continue the notifier code [reaching con3270_wait_write() and
con3270_rebuild_update()] without executing raw3270_activate_view().
(3) sclp: a global lock is used heavily in the functions called from
the notifier, so we added a check here - if the lock is taken already,
we just bail-out, preventing the lockup.
(4) sclp_vt220: same as (3), a lock validation was added to prevent the
potential lockup problem.
Besides (1)-(4), we also removed useless void functions, adding the
code called from the notifier inside its own body, and changed the
priority of such notifiers to execute late, since they are "heavyweight"
for the panic environment, so we aim to reduce risks here.
Changed return values to NOTIFY_DONE as well, the standard one.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://lore.kernel.org/r/20220427224924.592546-14-gpiccoli@igalia.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* Account for family 17h event renumberings in AMD PMU emulation
* Remove CPUID leaf 0xA on AMD processors
* Fix lockdep issue with locking all vCPUs
* Fix loss of A/D bits in SPTEs
* Fix syzkaller issue with invalid guest state
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ1Vf4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNaUQgAgygZ2KsejlJCYGtEkAsjcpdzmPVL
8j42nWB673/PLZ6GrDXcFnRwQaBIT+0YrES5VHTkTI996d2T/yHII2L4G3DQtUGm
6L3qYqrjJlX2WjbYGvYzkJ6m4EzcstUfPYNO2Qzfvbl2y/wz64HlAhNdymwMX2UU
GPUVoo3EHeobJdZVKFMe7eI6r/uY1/uPdsKqNjnlWI73op+tc7mMRN5+SlQDgQvR
kmzw+Nk0J+PERQO+D+fm1vUdXDQ8hiI7LtTBIUX7rf47IqVlHNHC8frC94PX3W3E
l2sVS+LzRQRqCgFgQ2ay2gYkl078VL8z4A6vWpcWSmaToEYE7VcAnHqb0Q==
=6gt2
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"x86:
- Account for family 17h event renumberings in AMD PMU emulation
- Remove CPUID leaf 0xA on AMD processors
- Fix lockdep issue with locking all vCPUs
- Fix loss of A/D bits in SPTEs
- Fix syzkaller issue with invalid guest state"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: VMX: Exit to userspace if vCPU has injected exception and invalid state
KVM: SEV: Mark nested locking of vcpu->lock
kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU
KVM: x86/svm: Account for family 17h event renumberings in amd_pmc_perf_hw_id
KVM: x86/mmu: Use atomic XCHG to write TDP MMU SPTEs with volatile bits
KVM: x86/mmu: Move shadow-present check out of spte_has_volatile_bits()
KVM: x86/mmu: Don't treat fully writable SPTEs as volatile (modulo A/D)
When the battery is neither charging or discharging and is not full,
"not-charging" is a useful status description for the case in general.
Currently this state is set as "unknown" by default, expect when this is
explicitly replaced with "not-charging" on a per device or per vendor
basis.
A lot of devices have this state without a BIOS specification available
explicitly describing it. e.g. some current Clevo barebones have a BIOS
setting to stop charging at a user defined battery level.
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When building with W=1, we get the following warning:
drivers/acpi/arm64/agdi.c:88:13: warning: no previous prototype for ‘acpi_agdi_init’ [-Wmissing-prototypes]
void __init acpi_agdi_init(void)
Include AGDI driver's header file to pull in the prototype definition
for acpi_agdi_init() to get rid of the compiler warning
Fixes: a2a591fb76 ("ACPI: AGDI: Add driver for Arm Generic Diagnostic Dump and Reset device")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the #define IB_SRQ_INIT_MASK is used to distinguish the
rxe_create_srq verb from the rxe_modify_srq verb so that some code can be
shared between these two subroutines.
This commit splits rxe_srq_chk_attr into two subroutines: rxe_srq_chk_init
and rxe_srq_chk_attr which handle the create_srq and modify_srq verbs
separately.
Link: https://lore.kernel.org/r/20220421014042.26985-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* A fix to relocate the DTB early in boot, in cases where the bootloader
doesn't put the DTB in a region that will end up mapped by the kernel.
This manifests as a crash early in boot on a handful of
configurations.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmJ1TjQTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQQLmD/4477/Ax+QlHXJ4stM1m4VaXybzJ1qb
gvjsV/xnOwwLcQU6683B6mm0LQ+LGXfJHwveYwMQs1dsoWEeOjmwpadO5qjfNv55
dbJh/9QtS4BYWHBySvumU2hv3Wyn4kO+7f/bLoxgIR0CupMBslTdP1oo8jFQeHBt
bL/wWUU/MVEg9FAB0t2cSkaJVcxvCkyRk5MYJUok2eHedTpwOwtIGTnSpv00EA8z
V1aDN3Y8MJuqRVSsbN/hGCKB1WrKkP5K7qxnvp5tH+68G3t30zc5NvVtrH+VZNiT
YTod4kf5Y3JlekboNz17O/WDP3BjpP5QBga9K7m64de9vSZjiEtM4Ze4i3A/C1xe
Z/LaizIy1H+1ehA3tWPQH6MwLFVUx4XNVWPQfDhAGtAyXTA5Epl028V9mvvUClfg
l63f9cWEEGsy2DVg9kU9MfzNdro5iJARL/pYUFQCyoEiUQIkl7E1Eh9XXqQwIozK
3rhvRL9DbELYTK65xKUXwBZcnuCBgUPNQvf7/AZ02qPMQzmy+NlWQLKYQvEYCH5U
IbYV0LCwpYAi2tdxgkzv7wkI86m5NLtRRKeD/eZLMEozlkkxVtq1fPeVXTDA2d8N
RdsF16H9CW9Bdf+COyhR4xn2IpxqSV/aVOsjYA0F87D5GUJk+AQp4w7Pm2NdH2fb
yw8HwqrnD8+66A==
=JBfr
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Palmer Dabbelt:
- A fix to relocate the DTB early in boot, in cases where the
bootloader doesn't put the DTB in a region that will end up
mapped by the kernel.
This manifests as a crash early in boot on a handful of
configurations.
* tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: relocate DTB if it's outside memory region
As reported in:
https://lore.kernel.org/linux-perf-users/20220428122821.3652015-1-tmricht@linux.ibm.com/
the 'instructions:u' event may not be supported. Add a skip using 'perf
record'.
Switch some code away from pipe to make the failures clearer.
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220505182505.3313191-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since the vector length configuration mechanism is identical between SVE
and SME we share large elements of the code including the definition for
the maximum vector length. Unfortunately when we were defining the ABI
for SVE we included not only the actual maximum vector length of 2048
bits but also the value possible if all the bits reserved in the
architecture for expansion of the LEN field were used, 16384 bits.
This starts creating problems if we try to allocate anything for the ZA
matrix based on the maximum possible vector length, as we do for the
regset used with ptrace during the process of generating a core dump.
While the maximum potential size for ZA with the current architecture is
a reasonably managable 64K with the higher reserved limit ZA would be
64M which leads to entirely reasonable complaints from the memory
management code when we try to allocate a buffer of that size. Avoid
these issues by defining the actual maximum vector length for the
architecture and using it for the SME regsets.
Also use the full ZA_PT_SIZE() with the header rather than just the
actual register payload when specifying the size, fixing support for the
largest vector lengths now that we have this new, lower define. With the
SVE maximum this did not cause problems due to the extra headroom we
had.
While we're at it add a comment clarifying why even though ZA is a
single register we tell the regset code that it is a multi-register
regset.
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/r/20220505221517.1642014-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As Yanming reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215914
The root cause is: in a very small sized image, it's very easy to
exceed threshold of foreground GC, if we calculate free space and
dirty data based on section granularity, in corner case,
has_not_enough_free_secs() will always return true, result in
deadloop in f2fs_gc().
So this patch refactors has_not_enough_free_secs() as below to fix
this issue:
1. calculate needed space based on block granularity, and separate
all blocks to two parts, section part, and block part, comparing
section part to free section, and comparing block part to free space
in openned log.
2. account F2FS_DIRTY_NODES, F2FS_DIRTY_IMETA and F2FS_DIRTY_DENTS
as node block consumer;
3. account F2FS_DIRTY_DENTS as data block consumer;
Cc: stable@vger.kernel.org
Reported-by: Ming Yan <yanming@tju.edu.cn>
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
As Yanming reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215894
I have encountered a bug in F2FS file system in kernel v5.17.
I have uploaded the system call sequence as case.c, and a fuzzed image can
be found in google net disk
The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
reproduce the bug by running the following commands:
kernel BUG at fs/f2fs/segment.c:2291!
Call Trace:
f2fs_invalidate_blocks+0x193/0x2d0
f2fs_fallocate+0x2593/0x4a70
vfs_fallocate+0x2a5/0xac0
ksys_fallocate+0x35/0x70
__x64_sys_fallocate+0x8e/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is, after image was fuzzed, block mapping info in inode
will be inconsistent with SIT table, so in f2fs_fallocate(), it will cause
panic when updating SIT with invalid blkaddr.
Let's fix the issue by adding sanity check on block address before updating
SIT table with it.
Cc: stable@vger.kernel.org
Reported-by: Ming Yan <yanming@tju.edu.cn>
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
As Yanming reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215897
I have encountered a bug in F2FS file system in kernel v5.17.
The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
reproduce the bug by running the following commands:
The kernel message is shown below:
kernel BUG at fs/f2fs/f2fs.h:2511!
Call Trace:
f2fs_remove_inode_page+0x2a2/0x830
f2fs_evict_inode+0x9b7/0x1510
evict+0x282/0x4e0
do_unlinkat+0x33a/0x540
__x64_sys_unlinkat+0x8e/0xd0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is: .total_valid_block_count or .total_valid_node_count
could fuzzed to zero, then once dec_valid_node_count() was called, it
will cause BUG_ON(), this patch fixes to print warning info and set
SBI_NEED_FSCK into CP instead of panic.
Cc: stable@vger.kernel.org
Reported-by: Ming Yan <yanming@tju.edu.cn>
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If there's not enough free sections each of which consistis of large segments,
we can hit no free section for upcoming section allocation. Let's reclaim some
prefree segments by writing checkpoints.
Signed-off-by: Byungki Lee <dominicus79@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
As Yanming reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215904
The kernel message is shown below:
kernel BUG at fs/f2fs/inode.c:825!
Call Trace:
evict+0x282/0x4e0
__dentry_kill+0x2b2/0x4d0
shrink_dentry_list+0x17c/0x4f0
shrink_dcache_parent+0x143/0x1e0
do_one_tree+0x9/0x30
shrink_dcache_for_umount+0x51/0x120
generic_shutdown_super+0x5c/0x3a0
kill_block_super+0x90/0xd0
kill_f2fs_super+0x225/0x310
deactivate_locked_super+0x78/0xc0
cleanup_mnt+0x2b7/0x480
task_work_run+0xc8/0x150
exit_to_user_mode_prepare+0x14a/0x150
syscall_exit_to_user_mode+0x1d/0x40
do_syscall_64+0x48/0x90
The root cause is: inode node and dnode node share the same nid,
so during f2fs_evict_inode(), dnode node truncation will invalidate
its NAT entry, so when truncating inode node, it fails due to
invalid NAT entry, result in inode is still marked as dirty, fix
this issue by clearing dirty for inode and setting SBI_NEED_FSCK
flag in filesystem.
output from dump.f2fs:
[print_node_info: 354] Node ID [0xf:15] is inode
i_nid[0] [0x f : 15]
Cc: stable@vger.kernel.org
Reported-by: Ming Yan <yanming@tju.edu.cn>
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
F2FS zoned support has power of 2 zone size assumption in many places
such as in __f2fs_issue_discard_zone, init_blkz_info. As the power of 2
requirement has been removed from the block layer, explicitly add a
condition in f2fs to allow only power of 2 zone size devices.
This condition will be relaxed once those calculation based on power of
2 is made generic.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Instead of calling bdev_zone_sectors() multiple times, call
it once and cache the value locally. This will make the
subsequent change easier to read.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
There are multiple calculations and reads of fields of sbi that should
be protected by stat_lock. As stat_lock is not used to read these
values in statfs, this can lead to inconsistent results.
Extend the locking to prevent this issue.
Commit c9c8ed50d9 ("f2fs: fix to avoid potential race on
sbi->unusable_block_count access/update")
already added the use of sbi->stat_lock in statfs in
order to make the calculation of multiple, different fields atomic so
that results are consistent. This is similar to that patch regarding the
change in statfs.
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
command to keep the write order.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Syzbot triggers two WARNs in f2fs_is_valid_blkaddr and
__is_bitmap_valid. For example, in f2fs_is_valid_blkaddr,
if type is DATA_GENERIC_ENHANCE or DATA_GENERIC_ENHANCE_READ,
it invokes WARN_ON if blkaddr is not in the right range.
The call trace is as follows:
f2fs_get_node_info+0x45f/0x1070
read_node_page+0x577/0x1190
__get_node_page.part.0+0x9e/0x10e0
__get_node_page
f2fs_get_node_page+0x109/0x180
do_read_inode
f2fs_iget+0x2a5/0x58b0
f2fs_fill_super+0x3b39/0x7ca0
Fix these two WARNs by replacing WARN_ON with dump_stack.
Reported-by: syzbot+763ae12a2ede1d99d4dc@syzkaller.appspotmail.com
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Read stale PTP Tx timestamps from PHY on cleanup.
After running out of Tx timestamps request handlers, hardware (HW) stops
reporting finished requests. Function ice_ptp_tx_tstamp_cleanup() used
to only clean up stale handlers in driver and was leaving the hardware
registers not read. Not reading stale PTP Tx timestamps prevents next
interrupts from arriving and makes timestamping unusable.
Fixes: ea9b847cda ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Michal Michalik <michal.michalik@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The iAVF driver uses 3 virtchnl op codes to communicate with the PF
regarding the VF Tx queues:
* VIRTCHNL_OP_CONFIG_VSI_QUEUES configures the hardware and firmware
logic for the Tx queues
* VIRTCHNL_OP_ENABLE_QUEUES configures the queue interrupts
* VIRTCHNL_OP_DISABLE_QUEUES disables the queue interrupts and Tx rings.
There is a bug in the iAVF driver due to the race condition between VF
reset request and shutdown being executed in parallel. This leads to a
break in logic and VIRTCHNL_OP_DISABLE_QUEUES is not being sent.
If this occurs, the PF driver never cleans up the Tx queues. This results
in leaving behind stale Tx queue settings in the hardware and firmware.
The most obvious outcome is that upon the next
VIRTCHNL_OP_CONFIG_VSI_QUEUES, the PF will fail to program the Tx
scheduler node due to a lack of space.
We need to protect ICE driver against such situation.
To fix this, make sure we clear existing stale settings out when
handling VIRTCHNL_OP_CONFIG_VSI_QUEUES. This ensures we remove the
previous settings.
Calling ice_vf_vsi_dis_single_txq should be safe as it will do nothing if
the queue is not configured. The function already handles the case when the
Tx queue is not currently configured and exits with a 0 return in that
case.
Fixes: 7ad15440ac ("ice: Refactor VIRTCHNL_OP_CONFIG_VSI_QUEUES handling")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Function ice_plug_aux_dev() assigns pf->adev field too early prior
aux device initialization and on other side ice_unplug_aux_dev()
starts aux device deinit and at the end assigns NULL to pf->adev.
This is wrong because pf->adev should always be non-NULL only when
aux device is fully initialized and ready. This wrong order causes
a crash when ice_send_event_to_aux() call occurs because that function
depends on non-NULL value of pf->adev and does not assume that
aux device is half-initialized or half-destroyed.
After order correction the race window is tiny but it is still there,
as Leon mentioned and manipulation with pf->adev needs to be protected
by mutex.
Fix (un-)plugging functions so pf->adev field is set after aux device
init and prior aux device destroy and protect pf->adev assignment by
new mutex. This mutex is also held during ice_send_event_to_aux()
call to ensure that aux device is valid during that call.
Note that device lock used ice_send_event_to_aux() needs to be kept
to avoid race with aux drv unload.
Reproducer:
cycle=1
while :;do
echo "#### Cycle: $cycle"
ip link set ens7f0 mtu 9000
ip link add bond0 type bond mode 1 miimon 100
ip link set bond0 up
ifenslave bond0 ens7f0
ip link set bond0 mtu 9000
ethtool -L ens7f0 combined 1
ip link del bond0
ip link set ens7f0 mtu 1500
sleep 1
let cycle++
done
In short when the device is added/removed to/from bond the aux device
is unplugged/plugged. When MTU of the device is changed an event is
sent to aux device asynchronously. This can race with (un)plugging
operation and because pf->adev is set too early (plug) or too late
(unplug) the function ice_send_event_to_aux() can touch uninitialized
or destroyed fields. In the case of crash below pf->adev->dev.mutex.
Crash:
[ 53.372066] bond0: (slave ens7f0): making interface the new active one
[ 53.378622] bond0: (slave ens7f0): Enslaving as an active interface with an u
p link
[ 53.386294] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 53.549104] bond0: (slave ens7f1): Enslaving as a backup interface with an up
link
[ 54.118906] ice 0000:ca:00.0 ens7f0: Number of in use tx queues changed inval
idating tc mappings. Priority traffic classification disabled!
[ 54.233374] ice 0000:ca:00.1 ens7f1: Number of in use tx queues changed inval
idating tc mappings. Priority traffic classification disabled!
[ 54.248204] bond0: (slave ens7f0): Releasing backup interface
[ 54.253955] bond0: (slave ens7f1): making interface the new active one
[ 54.274875] bond0: (slave ens7f1): Releasing backup interface
[ 54.289153] bond0 (unregistering): Released all slaves
[ 55.383179] MII link monitoring set to 100 ms
[ 55.398696] bond0: (slave ens7f0): making interface the new active one
[ 55.405241] BUG: kernel NULL pointer dereference, address: 0000000000000080
[ 55.405289] bond0: (slave ens7f0): Enslaving as an active interface with an u
p link
[ 55.412198] #PF: supervisor write access in kernel mode
[ 55.412200] #PF: error_code(0x0002) - not-present page
[ 55.412201] PGD 25d2ad067 P4D 0
[ 55.412204] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 55.412207] CPU: 0 PID: 403 Comm: kworker/0:2 Kdump: loaded Tainted: G S
5.17.0-13579-g57f2d6540f03 #1
[ 55.429094] bond0: (slave ens7f1): Enslaving as a backup interface with an up
link
[ 55.430224] Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.4.4 10/07/
2021
[ 55.430226] Workqueue: ice ice_service_task [ice]
[ 55.468169] RIP: 0010:mutex_unlock+0x10/0x20
[ 55.472439] Code: 0f b1 13 74 96 eb e0 4c 89 ee eb d8 e8 79 54 ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 04 25 40 ef 01 00 31 d2 <f0> 48 0f b1 17 75 01 c3 e9 e3 fe ff ff 0f 1f 00 0f 1f 44 00 00 48
[ 55.491186] RSP: 0018:ff4454230d7d7e28 EFLAGS: 00010246
[ 55.496413] RAX: ff1a79b208b08000 RBX: ff1a79b2182e8880 RCX: 0000000000000001
[ 55.503545] RDX: 0000000000000000 RSI: ff4454230d7d7db0 RDI: 0000000000000080
[ 55.510678] RBP: ff1a79d1c7e48b68 R08: ff4454230d7d7db0 R09: 0000000000000041
[ 55.517812] R10: 00000000000000a5 R11: 00000000000006e6 R12: ff1a79d1c7e48bc0
[ 55.524945] R13: 0000000000000000 R14: ff1a79d0ffc305c0 R15: 0000000000000000
[ 55.532076] FS: 0000000000000000(0000) GS:ff1a79d0ffc00000(0000) knlGS:0000000000000000
[ 55.540163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 55.545908] CR2: 0000000000000080 CR3: 00000003487ae003 CR4: 0000000000771ef0
[ 55.553041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 55.560173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 55.567305] PKRU: 55555554
[ 55.570018] Call Trace:
[ 55.572474] <TASK>
[ 55.574579] ice_service_task+0xaab/0xef0 [ice]
[ 55.579130] process_one_work+0x1c5/0x390
[ 55.583141] ? process_one_work+0x390/0x390
[ 55.587326] worker_thread+0x30/0x360
[ 55.590994] ? process_one_work+0x390/0x390
[ 55.595180] kthread+0xe6/0x110
[ 55.598325] ? kthread_complete_and_exit+0x20/0x20
[ 55.603116] ret_from_fork+0x1f/0x30
[ 55.606698] </TASK>
Fixes: f9f5301e7e ("ice: Register auxiliary device to provide RDMA")
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Exit to userspace with an emulation error if KVM encounters an injected
exception with invalid guest state, in addition to the existing check of
bailing if there's a pending exception (KVM doesn't support emulating
exceptions except when emulating real mode via vm86).
In theory, KVM should never get to such a situation as KVM is supposed to
exit to userspace before injecting an exception with invalid guest state.
But in practice, userspace can intervene and manually inject an exception
and/or stuff registers to force invalid guest state while a previously
injected exception is awaiting reinjection.
Fixes: fc4fad79fc ("KVM: VMX: Reject KVM_RUN if emulation is required with pending exception")
Reported-by: syzbot+cfafed3bb76d3e37581b@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220502221850.131873-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
svm_vm_migrate_from() uses sev_lock_vcpus_for_migration() to lock all
source and target vcpu->locks. Unfortunately there is an 8 subclass
limit, so a new subclass cannot be used for each vCPU. Instead maintain
ownership of the first vcpu's mutex.dep_map using a role specific
subclass: source vs target. Release the other vcpu's mutex.dep_maps.
Fixes: b56639318b ("KVM: SEV: Add support for SEV intra host migration")
Reported-by: John Sperbeck<jsperbeck@google.com>
Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20220502165807.529624-1-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit bfaaba99e6.
Commit bfaaba99e6 ("ice: Hide bus-info in ethtool for PRs in switchdev
mode") was a workaround for lshw tool displaying incorrect
descriptions for port representors and PF in switchdev mode. Now the issue
has been fixed in the lshw tool itself [1].
Removing the workaround can be considered a regression, as the user might
be running older, unpatched lshw version. However, another important change
(ice: link representors to PCI device, which improves port representor
netdev naming with SET_NETDEV_DEV) also causes the same "regression" as
removing the workaround, i.e. unpatched lshw is able to access bus-info
information (this time not via ethtool) and the bug can occur. Therefore,
the workaround no longer prevents the bug and can be removed.
[1] 9bf4e4c9c1
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
A few recent regressions in rxe's multicast code, and some old driver
bugs:
- Error case unwind bug in rxe for rkeys
- Dot not call netdev functions under a spinlock in rxe multicast code
- Use the proper BH lock type in rxe multicast code
- Fix idrma deadlock and crash
- Add a missing flush to drain irdma QPs when in error
- Fix high userspace latency in irdma during destroy due to
synchronize_rcu()
- Rare race in siw MPA processing
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCYnUqeQAKCRCFwuHvBreF
Yf58AQCNUQZlmEiuBid6WxggXPW/MM5sxJdOqZeX+Ddbmm7swAEAidtoVBILozLC
ltd8+P8qNdccqOZDatgqYYSpXUfHIA4=
=Idcg
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"A few recent regressions in rxe's multicast code, and some old driver
bugs:
- Error case unwind bug in rxe for rkeys
- Dot not call netdev functions under a spinlock in rxe multicast
code
- Use the proper BH lock type in rxe multicast code
- Fix idrma deadlock and crash
- Add a missing flush to drain irdma QPs when in error
- Fix high userspace latency in irdma during destroy due to
synchronize_rcu()
- Rare race in siw MPA processing"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/rxe: Change mcg_lock to a _bh lock
RDMA/rxe: Do not call dev_mc_add/del() under a spinlock
RDMA/siw: Fix a condition race issue in MPA request processing
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
RDMA/irdma: Reduce iWARP QP destroy time
RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
RDMA/rxe: Recheck the MR in when generating a READ reply
RDMA/irdma: Fix deadlock in irdma_cleanup_cm_core()
RDMA/rxe: Fix "Replace mr by rkey in responder resources"
Link port representors to parent PCI device to benefit from
systemd defined naming scheme.
Example from ip tool:
- without linking:
eth0 ...
- with linking:
eth0 ...
altname enp24s0f0npf0vf0
The port representor name is being shown in altname, because the name is
longer than IFNAMSIZ (16) limit. Altname can be used in ip tool.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
fbdev:
- hotunplugging fix
amdgpu:
- Fix a xen dom0 regression on APUs
- Fix a potential array overflow if a receiver were to
send an erroneous audio channel count
msm:
- lockdep fix.
it6505:
- kconfig fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmJ0nIoACgkQDHTzWXnE
hr5BVxAAlTPqazcayV1T2cTEWeTjPD39D3t+WMP+hNy7aEgRa0LBHYdciRWx4gLx
4ddu0VutlCtlUC2bnhQNPfJ7hbcStCvcsu5tRaIk9JwhcngwjAWFCc0sWDQwU2OK
7OILBAFMd5t4fA+YxfySykI1pIQUGFB9QMGWakPV1xPKo2oyyJbPxpGcQcZXjp5Z
myKnfM2LaG+7gyCrgizh9218vei7XSVbwcKHW2rHA5002kthb0K1rH+Q7sSuRbiW
ozZgC7RrdGwpu/iz3oGDef8b9AJv3JQgL87FE9LqetAV9F7t3tc9y8WzM50PJh4u
Q/AKj9xRo/NxDwBKvn/YeL/7UgQyA+Ob0sQr/ObqMXcowvDWP0gCTxySkLEzS9tv
WuE6w3qLJ9GHHiMI8BUYtKGuFC5/E4sLqXhL2hUiFq7yQXWokL/ocAp7RYBh6ls9
2uBWWCEIv9MJ/vV23kMAg8sdqk4nvAVUtucp0z3Ee7w0Vt3v0YMtOhxXoD42Qg/w
eg/gMo39BgEKJsvFv5OQcRzGi5Sm9VKm9uAQN97dMEoe3tJPs7udCMehKwVebGBa
z6M1ch+qoIydFdVU1mPBqsmtHSPPlpfmRYVx6TY656RE/eanlVB76qUDBr5J2Ulw
yMjLPdyqRfTUwXatAqF9eBv9RRmb3xdX9iKkhEsAFf8jXKgJFbs=
=Ak9r
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2022-05-06' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"A pretty quiet week, one fbdev, msm, kconfig, and two amdgpu fixes,
about what I'd expect for rc6.
fbdev:
- hotunplugging fix
amdgpu:
- Fix a xen dom0 regression on APUs
- Fix a potential array overflow if a receiver were to send an
erroneous audio channel count
msm:
- lockdep fix.
it6505:
- kconfig fix"
* tag 'drm-fixes-2022-05-06' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT
drm/amdgpu: do not use passthrough mode in Xen dom0
drm/bridge: ite-it6505: add missing Kconfig option select
fbdev: Make fb_release() return -ENODEV if fbdev was unregistered
drm/msm/dp: remove fail safe mode related code
For write request the remote addr will be sent only with first packet so
we don't have to adjust wqe->iova in retry operation.
Link: https://lore.kernel.org/r/20220502053907.6388-1-cgxu519@mykernel.net
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The RTC section of the H616 manual mentions in a half-sentence the
existence of a clock "32K divided by PLL_PERI(2X)". This is used as
one of the possible inputs for the mux that selects the clock for the
32 KHz fanout pad. On the H616 this is routed to pin PG10, and some
boards use that clock output to compensate for a missing 32KHz crystal.
On the OrangePi Zero2 this is for instance connected to the LPO pin of
the WiFi/BT chip.
The new RTC clock binding requires this clock to be named as one input
clock, so we need to expose this to the DT. In contrast to the D1 SoC
there does not seem to be a gate for this clock, so just use a fixed
divider clock, using a newly assigned clock number.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220428230933.15262-3-andre.przywara@arm.com
The H6 and H616 feature an (undocumented) bus clock gate for accessing
the RTC registers. This seems to be enabled at reset (or by the BootROM),
so we got away without it so far, but exists regardless.
Since the new RTC clock binding for the H616 requires this "bus" clock
to be specified in the DT, add this to R_CCU clock driver and expose it
on the DT side with a new number.
We do this for both the H6 and H616, but mark it as IGNORE_UNUSED, as we
cannot reference it in any H6 DTs.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220428230933.15262-2-andre.przywara@arm.com
Allow the NVIDIA-specific ARM SMMU implementation to bind to the SMMU
instances found on Tegra234.
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20220429082243.496000-4-thierry.reding@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
The NVIDIA Tegra234 SoC comes with one single-instance ARM SMMU used by
isochronous memory clients and two dual-instance ARM SMMUs used by non-
isochronous memory clients.
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20220429082243.496000-3-thierry.reding@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
On NVIDIA SoC's the ARM SMMU needs to interact with the memory
controller in order to map memory clients to the corresponding stream
IDs. Document how the nvidia,memory-controller property can be used to
achieve this.
Note that this is a backwards-incompatible change that is, however,
necessary to ensure correctness. Without the new property, most of the
devices would still work but it is not guaranteed that all will.
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20220429082243.496000-2-thierry.reding@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Add the Qualcomm SC8280XP platform to the list of compatible for which
the Qualcomm-impl of the ARM SMMU should apply.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220503163429.960998-3-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Add compatible for the Qualcomm SC8280XP platform to the ARM SMMU
DeviceTree binding.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220503163429.960998-2-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
We currently call arm64_mm_context_put() without holding a reference to
the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
ensure the mm only gets freed after we unpinned the ASID.
Fixes: 32784a9562 ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/r/20220426130444.300556-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
It will cause null-ptr-deref when using 'res', if platform_get_resource()
returns NULL, so move using 'res' after devm_ioremap_resource() that
will check it to avoid null-ptr-deref.
And use devm_platform_get_and_ioremap_resource() to simplify code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220425114136.2649310-1-yangyingliang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When one port's input state get inverted (eg. from low to hight) after
pca953x_irq_setup but before setting irq_mask (by some other driver such as
"gpio-keys"), the next inversion of this port (eg. from hight to low) will not
be triggered any more (because irq_stat is not updated at the first time). Issue
should be fixed after this commit.
Fixes: 89ea8bbe9c ("gpio: pca953x.c: add interrupt handling capability")
Signed-off-by: Puyou Lu <puyou.lu@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
[Why]
Z10 and S0i3 have some shared path. Previous code clean up ,
incorrectly removed these pointers, which breaks s0i3 restore
[How]
Do not clear the function pointers based on Z10 disable.
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Active State Power Management (ASPM) feature is enabled since kernel 5.14.
There are some AMD Volcanic Islands (VI) GFX cards, such as the WX3200 and
RX640, that do not work with ASPM-enabled Intel Alder Lake based systems.
Using these GFX cards as video/display output, Intel Alder Lake based
systems will freeze after suspend/resume.
The issue was originally reported on one system (Dell Precision 3660 with
BIOS version 0.14.81), but was later confirmed to affect at least 4
pre-production Alder Lake based systems.
Add an extra check to disable ASPM on Intel Alder Lake based systems with
the problematic AMD Volcanic Islands GFX cards.
Fixes: 0064b0ce85 ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Signed-off-by: Richard Gong <richard.gong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org