When the RED Qdisc is currently configured to enable ECN, the RED algorithm
is used to decide whether a certain SKB should be marked. If that SKB is
not ECN-capable, it is early-dropped.
It is also possible to keep all traffic in the queue, and just mark the
ECN-capable subset of it, as appropriate under the RED algorithm. Some
switches support this mode, and some installations make use of it.
To that end, add a new RED flag, TC_RED_NODROP. When the Qdisc is
configured with this flag, non-ECT traffic is enqueued instead of being
early-dropped.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The qdiscs RED, GRED, SFQ and CHOKE use different subsets of the same pool
of global RED flags. These are passed in tc_red_qopt.flags. However none of
these qdiscs validate the flag field, and just copy it over wholesale to
internal structures, and later dump it back. (An exception is GRED, which
does validate for VQs -- however not for the main setup.)
A broken userspace can therefore configure a qdisc with arbitrary
unsupported flags, and later expect to see the flags on qdisc dump. The
current ABI therefore allows storage of several bits of custom data to
qdisc instances of the types mentioned above. How many bits, depends on
which flags are meaningful for the qdisc in question. E.g. SFQ recognizes
flags ECN and HARDDROP, and the rest is not interpreted.
If SFQ ever needs to support ADAPTATIVE, it needs another way of doing it,
and at the same time it needs to retain the possibility to store 6 bits of
uninterpreted data. Likewise RED, which adds a new flag later in this
patchset.
To that end, this patch adds a new function, red_get_flags(), to split the
passed flags of RED-like qdiscs to flags and user bits, and
red_validate_flags() to validate the resulting configuration. It further
adds a new attribute, TCA_RED_FLAGS, to pass arbitrary flags.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a handful of tests for creating RED with different flags.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not every stmmac based platform makes use of the eth_wake_irq or eth_lpi
interrupts. Use the platform_get_irq_byname_optional variant for these
interrupts, so no error message is displayed, if they can't be found.
Rather print an information to hint something might be wrong to assist
debugging on platforms which use these interrupts.
Signed-off-by: Markus Fuchs <mklntf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bpfilter UMH code was recently changed to log its informative messages to
/dev/kmsg, however this interface doesn't support SEEK_CUR yet, used by
dprintf(). As result dprintf() returns -EINVAL and doesn't log anything.
However there already had some discussions about supporting SEEK_CUR into
/dev/kmsg interface in the past it wasn't concluded. Since the only user of
that from userspace perspective inside the kernel is the bpfilter UMH
(userspace) module it's better to correct it here instead waiting a conclusion
on the interface.
Fixes: 36c4357c63 ("net: bpfilter: print umh messages to /dev/kmsg")
Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Corresponds to the MAC_SPOOFING_TX privilege in the hardware.
Some firmware versions on some cards don't support the feature, so check
the TX_MAC_SECURITY capability and fail EOPNOTSUPP if trying to enable
spoofchk on a NIC that doesn't support it.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jose Abreu says:
====================
net: phy: XLGMII define and usage in PHYLINK
Adds XLGMII defines and usage in PHYLINK.
Patch 1/2, adds the define for it, whilst 2/2 adds the usage of it in
PHYLINK.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add XLGMII interface and the list of XLGMII speeds to PHYLINK.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a statement that is indented incorrectly, remove a space.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switches supported so far by the driver only have non-SerDes ports,
so they should be configured in the PHYLINK callback that provides the
resolved PHY link parameters.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add more T5/T6 registers to be collected in register dump:
1. MPS register range 0x9810 to 0x9864 and 0xd000 to 0xd004.
2. NCSI register range 0x1a114 to 0x1a130 and 0x1a138 to 0x1a1c4.
Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 599be01ee5 ("net_sched: fix an OOB access in cls_tcindex")
I moved cp->hash calculation before the first
tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched.
This difference could lead to another out of bound access.
cp->alloc_hash should always be the size allocated, we should
update it after this tcindex_alloc_perfect_hash().
Reported-and-tested-by: syzbot+dcc34d54d68ef7d2d53d@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+c72da7b9ed57cde6fca2@syzkaller.appspotmail.com
Fixes: 599be01ee5 ("net_sched: fix an OOB access in cls_tcindex")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported a use-after-free in tcindex_dump(). This is due to
the lack of RTNL in the deferred rcu work. We queue this work with
RTNL in tcindex_change(), later, tcindex_dump() is called:
fh = tp->ops->get(tp, t->tcm_handle);
...
err = tp->ops->change(..., &fh, ...);
tfilter_notify(..., fh, ...);
but there is nothing to serialize the pending
tcindex_partial_destroy_work() with tcindex_dump().
Fix this by simply holding RTNL in tcindex_partial_destroy_work(),
so that it won't be called until RTNL is released after
tc_new_tfilter() is completed.
Reported-and-tested-by: syzbot+653090db2562495901dc@syzkaller.appspotmail.com
Fixes: 3d210534cc ("net_sched: fix a race condition in tcindex_destroy()")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable io-wq hashing stuff for dependent works simply by re-enqueueing
such requests.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It's a preparation patch removing io_wq_enqueue_hashed(), which
now should be done by io_wq_hash_work() + io_wq_enqueue().
Also, set hash value for dependant works, and do it as late as possible,
because req->file can be unavailable before. This hash will be ignored
by io-wq.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This little tweak restores the behaviour that was before the recent
io_worker_handle_work() optimisation patches. It makes the function do
cond_resched() and flush_signals() only if there is an actual work to
execute.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Processing links, io_submit_sqe() prepares requests, drops sqes, and
passes them with sqe=NULL to io_queue_sqe(). There IOSQE_DRAIN and/or
IOSQE_ASYNC requests will go through the same prep, which doesn't expect
sqe=NULL and fail with NULL pointer deference.
Always do full prepare including io_alloc_async_ctx() for linked
requests, and then it can skip the second preparation.
Cc: stable@vger.kernel.org # 5.5
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull i2c fixes from Wolfram Sang:
"I2C has quite some regression fixes this time.
One is also related to watchdogs, we have proper acks from Guenter for
them"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: acpi: put device when verifying client fails
misc: eeprom: at24: fix regulator underflow
i2c: gpio: suppress error on probe defer
macintosh: windfarm: fix MODINFO regression
i2c: designware-pci: Fix BUG_ON during device removal
i2c: i801: Do not add ICH_RES_IO_SMI for the iTCO_wdt device
watchdog: iTCO_wdt: Make ICH_RES_IO_SMI optional
watchdog: iTCO_wdt: Export vendorsupport
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJebMXnAAoJEL/70l94x66D3fYIAJ1r+o2qgzadwEqoXTvlihjB
ujX1jOs20EJJ56VhTtXF/wZQc+7VeKCjpIqNv4WaeSYPUhzFGyL9t5tw1YdRDCwY
u6gklxruIzZodgp+vCoTkPyyUylVmY50sY/yBIJ4F8qOaMxhTEE1aXzGuaOrYqVO
MmIlAltEKQzdXPO1SVPD7triGPgUTj+DRxrlyRrGt2ItiMUincCz9K6TDyXFib0r
SSCVFNYtYmzu/bV/E4/Sphi2BxCQEem5DIFWLcngzN8Wy5oCoRVzPGugT4Q9eXWt
ZtWIDh473JGiXBLYmDq4REJsRSca+7s/YiiLSiQwYfByhIPJpVEoy54fcdaZflo=
=T4AD
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Bugfixes for x86 and s390"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs
KVM: x86: Initializing all kvm_lapic_irq fields in ioapic_write_indirect
KVM: VMX: Condition ENCLS-exiting enabling on CPU support for SGX1
KVM: s390: Also reset registers in sync regs for initial cpu reset
KVM: fix Kconfig menu text for -Werror
KVM: x86: remove stale comment from struct x86_emulate_ctxt
KVM: x86: clear stale x86_emulate_ctxt->intercept value
KVM: SVM: Fix the svm vmexit code for WRMSR
KVM: X86: Fix dereference null cpufreq policy
Currently, the intel iommu debugfs directory(/sys/kernel/debug/iommu/intel)
gets populated only when DMA remapping is enabled (dmar_disabled = 0)
irrespective of whether interrupt remapping is enabled or not.
Instead, populate the intel iommu debugfs directory if any IOMMUs are
detected.
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ee2636b867 ("iommu/vt-d: Enable base Intel IOMMU debugfs support")
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
net/bluetooth/l2cap_core.c: In function l2cap_ecred_conn_req:
net/bluetooth/l2cap_core.c:5848:6: warning: variable credits set but not used [-Wunused-but-set-variable]
commit 15f02b9105 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
involved this unused variable, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We can use the variable allocated_clusters rather than map_from_clusters
to control reserved block/cluster accounting in ext4_ext_map_blocks.
This eliminates a variable and associated code and improves readability
a little.
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20200311205125.25061-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now that the eofblocks code has been removed, we don't need to assign
0 to err before calling ext4_ext_insert_extent() since it will assign
a return value to ret anyway. The variable free_on_err can be
eliminated and replaced by a reference to allocated_clusters which
clearly conveys the idea that newly allocated blocks should be freed
when recovering from an extent insertion failure. The error handling
code itself should be restructured so that it errors out immediately on
an insertion failure in the case where no new blocks have been allocated
(bigalloc) rather than proceeding further into the mapping code. The
initializer for fb_flags can also be rearranged for improved
readability. Finally, insert a missing space in nearby code.
No known bugs are addressed by this patch - it's simply a cleanup.
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20200311205033.25013-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200309180813.GA3347@embeddedor
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200309154838.GA31559@embeddedor
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch moves ext4_fiemap to use iomap framework.
For xattr a new 'ext4_iomap_xattr_ops' is added.
Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Link: https://lore.kernel.org/r/b9f45c885814fcdd0631747ff0fe08886270828c.1582880246.git.riteshh@linux.ibm.com
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
For indirect block mapping if the i_block > max supported block in inode
then ext4_ind_map_blocks() returns a -EIO error. But in case of fiemap
this could be a valid query to ->iomap_begin call.
So check if the offset >= s_bitmap_maxbytes in ext4_iomap_begin_report(),
then simply skip calling ext4_map_blocks().
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/87fa0ddc5967fa707656212a3b66a7233425325c.1582880246.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4_iomap_begin is already implemented which provides ext4_map_blocks,
so just move the API from generic_block_bmap to iomap_bmap for iomap
conversion.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/8bbd53bd719d5ccfecafcce93f2bf1d7955a44af.1582880246.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch avoids the memory alloc & free path when depth is 0,
since anyway there is no extra caching done in that case.
So on checking depth 0, simply return early.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/93da0d0f073c73358e85bb9849d8a5378d1da539.1582880246.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
IOMAP_F_MERGED needs to be set in case of non-extent based mapping.
This is needed in later patches for conversion of ext4_fiemap to use iomap.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/a4764c91c08c16d4d4a4b36defb2a08625b0e9b3.1582880246.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Create a quirk that allows special processing and/or
skipping the call to snd_card_register.
For HyperX AMP, which uses two interfaces, but only has
a capture stream in the second, this allows the capture
stream to merge with the first PCM.
Signed-off-by: Chris Wulff <crwulff@gmail.com>
Link: https://lore.kernel.org/r/20200314165449.4086-3-crwulff@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Use the USB interface of the mixer that the control
was created on instead of the default control interface.
This fixes the Kingston HyperX AMP (0951:16d8) which has
controls on two interfaces.
Signed-off-by: Chris Wulff <crwulff@gmail.com>
Link: https://lore.kernel.org/r/20200314165449.4086-2-crwulff@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
more fixes for this -rc series.
- Mark device node const in of_clk_get_parent APIs to ease landing
changes in users later
- Fix flag for Qualcomm SC7180 video clocks where we thought it would
never turn off but actually hardware takes care of it
- Remove disp_cc_mdss_rscc_ahb_clk on Qualcomm SC7180 SoCs because this
clk is always on anyway
- Correct some bad dt-binding numbers for i.MX8MN SoCs
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAl5sMV0RHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSXzBhAAh4bih0khHYOh7FWJVuIuoUqJrikiSc4L
rxD1oi3ZyUyauvFl0QOcxe6YB8qwjA6G14FWjluj6LzhXGCnl2I58j0eWV/eFiCD
3GXMhbUkEHjOLa1BC52LgJ7/eifQHECCFJkzi7HXhUDaUDpC4zP+ysBBde5A1ECX
IYCXmUlv0TkxRe6pEgtylrU+XFC9IPuq8FOHHYEcDKi8XwLhw8PS2klBkE2hsHH6
i/ZFIx+2VFFcCtPfvOtOl1L26pHrvehorjp9JaajyKInDpfLYAZPoxCY4k7agxix
uGQq8B+2Wl9W5yT4vGhuujSV2IVZHlC56VjBbwEUTGwCIKJqiexTqaL8Ls083nY+
te/wi21pFBr1oH77ZlP6gbUlHGaH2wONJVim8DbLrZ5t7fcGf0pMWkofBgEyj2rh
WX8Kbhjp/GN4Q3qUMQbF1Gej3RMg9e6/LirlaFkvi0clrXFOzlM1tCuDdrfgA5i6
NnTL1MDFPu57Vc5Srkk7+/jabMSFCX1fxX3GP/y7ZS3Fnxxb6ZY6iMpUdP+XnJyz
G0PKPFRg26k5YC2wy5V8hBpvt/9IXkv1FTnDa6FLSbuMAFVdL3mb+FDxlVZBZAGJ
KL0HMGfhOieZDB0K+6KNQndbcJzWCPZYcChqZaLaM8/ZFFFMWZ8Sh6B5BR4Keal7
wUYounnQlic=
=m90j
-----END PGP SIGNATURE-----
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A small collection of fixes. I'll make another sweep soon to look for
more fixes for this -rc series.
- Mark device node const in of_clk_get_parent APIs to ease landing
changes in users later
- Fix flag for Qualcomm SC7180 video clocks where we thought it would
never turn off but actually hardware takes care of it
- Remove disp_cc_mdss_rscc_ahb_clk on Qualcomm SC7180 SoCs because
this clk is always on anyway
- Correct some bad dt-binding numbers for i.MX8MN SoCs"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: imx8mn: Fix incorrect clock defines
clk: qcom: dispcc: Remove support of disp_cc_mdss_rscc_ahb_clk
clk: qcom: videocc: Update the clock flag for video_cc_vcodec0_core_clk
of: clk: Make of_clk_get_parent_{count,name}() parameter const
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
device_driver name is const char pointer, so it not useful to cast
driver_name (which is already const char).
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
When an EVMCS enabled L1 guest on KVM will tries doing enlightened VMEnter
with EVMCS GPA = 0 the host crashes because the
evmcs_gpa != vmx->nested.hv_evmcs_vmptr
condition in nested_vmx_handle_enlightened_vmptrld() will evaluate to
false (as nested.hv_evmcs_vmptr is zeroed after init). The crash will
happen on vmx->nested.hv_evmcs pointer dereference.
Another problematic EVMCS ptr value is '-1' but it only causes host crash
after nested_release_evmcs() invocation. The problem is exactly the same as
with '0', we mistakenly think that the EVMCS pointer hasn't changed and
thus nested.hv_evmcs_vmptr is valid.
Resolve the issue by adding an additional !vmx->nested.hv_evmcs
check to nested_vmx_handle_enlightened_vmptrld(), this way we will
always be trying kvm_vcpu_map() when nested.hv_evmcs is NULL
and this is supposed to catch all invalid EVMCS GPAs.
Also, initialize hv_evmcs_vmptr to '0' in nested_release_evmcs()
to be consistent with initialization where we don't currently
set hv_evmcs_vmptr to '-1'.
Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
To make the code a bit more readable, let's move the OSI specific
initialization out of the psci_dt_cpu_init_idle() and into a separate
function.
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 2c36168480 ("PM / Domains: Don't treat zero found compatible idle
states as an error"), moved of_genpd_parse_idle_states() towards allowing
none compatible idle state to be found for the device node, rather than
returning an error code.
However, it didn't consider that the "domain-idle-states" DT property may
be missing as it's optional, which makes of_count_phandle_with_args() to
return -ENOENT. Let's fix this to make the behaviour consistent.
Fixes: 2c36168480 ("PM / Domains: Don't treat zero found compatible idle states as an error")
Reported-by: Benjamin Gaignard <benjamin.gaignard@st.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With 7de3f1423f ("KVM: s390: Add new reset vcpu API") we clarified
the meaning of the reset ioctl to fully reset the CPU and not only the
parts that can not be handled by userspace. Turns out that we missed
some parts.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJeakS4AAoJEBF7vIC1phx8CrUP/R2tybl1+fUG7Jm6VRvu6idU
CxU34zV+goFbeznyln9WdQtW4D+IkbIH1wvHNMDFqZvktA8St3wzNkZ6W5uCebJg
cmAQiOoz4VADh0EM1sSC4/45PfjADh3xHZxA+5X76bh7ji9kp7lkqUwXiclysge5
rurB2r1PFZoaMv1sQbAUIlyb8BTfe4zK8w0+zEjIeN1Mh+mjs1wAhyo1qOmvS24J
lrv3vrAdJDp1OVebCfrKF6NzgLrQBSK8ETRFAoRPSZPCkSMF7dCUfgvRWUw7zs5A
wyDHqtMUU5MQ0AKjd4cXH6Un4vzfYSQtoGQJAe3UdnnWNOpxP5/wOLt1xQFb6nun
K2wDLx9hxu6f4vT9zntMBZ2zCsBGfWTtwa+DRN58HI4cSFowo8PB7jMuauHBeJ7B
teNwAnDsjhOLH2fRFuh7eM0f5tOJNACvVxS6fXAChu4fXM1rtG1WDnn5V3y5tYbw
UBe7NV657vEKFzp63C3vB7EK/hkCo8cc/c9JKi1kMoR9q3bUfMSN+kRh2WLkxni6
Ux4AuAuXGMw/PBrqtt43C4GFrUkyaTIEtl8KHDHWfRcxV0rKlIp2ebJKRLG8QlVY
hTJPCv8DDY1FoyTnOPZWYNdDUY3EWdo/R0LQ2L9ywDxbtR30Z6mcqH7FhlKvPKRj
C4/RRmpBco4vnizfD62r
=bmtm
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
KVM: s390: Fully do the CPU resets as intended
With 7de3f1423f ("KVM: s390: Add new reset vcpu API") we clarified
the meaning of the reset ioctl to fully reset the CPU and not only the
parts that can not be handled by userspace. Turns out that we missed
some parts.
Since the SNAPSHOT_GET_IMAGE_SIZE, SNAPSHOT_AVAIL_SWAP_SIZE, and
SNAPSHOT_ALLOC_SWAP_PAGE ioctls produce an loff_t result, and loff_t is
always 64-bit even in the compat case, there's no reason to have the
special compat handling for these ioctls. Just remove this unneeded
code so that these ioctls call into snapshot_ioctl() directly, doing
just the compat_ptr() conversion on the argument.
(This unnecessary code was also causing a KMSAN false positive.)
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Correctly document how the SNAPSHOT_GET_IMAGE_SIZE and
SNAPSHOT_AVAIL_SWAP_SIZE ioctls return their result.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
URLs for presentation and Intel Software Developer’s Manual are updated
as they were using "http" which are gradually replaced by "https".
Signed-off-by: Alex Hung <alex.hung@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is still some code duplication between intel_pstate_verify_policy()
and intel_cpufreq_verify_policy(), so avoid it by moving the common
code into a separate function and calling it from both these places.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Before commit 1328edca4a ("cpuidle-haltpoll: Enable kvm guest polling
when dedicated physical CPUs are available") the cpuidle-haltpoll driver
could also be used in scenarios when the host does not advertise the
KVM_HINTS_REALTIME hint.
While the behavior introduced by the aforementioned commit makes sense as
the default there are cases where the old behavior is desired, for example,
when other kernel changes triggered by presence by this hint are unwanted,
for some workloads where the latency benefit from polling overweights the
loss from idle CPU capacity that otherwise would be available, or just when
running under older Qemu versions that lack this hint.
Let's provide a typical "force" module parameter that allows restoring the
old behavior.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>