A protected VM accessing ID_AA64ISAR2_EL1 gets punished with an UNDEF,
while it really should only get a zero back if the register is not
handled by the hypervisor emulation (as mandated by the architecture).
Introduce all the missing ID registers (including the unallocated ones),
and have them to return 0.
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-3-will@kernel.org
If we fail to allocate the 'supported_cpus' cpumask in kvm_arch_init_vm()
then be sure to return -ENOMEM instead of success (0) on the failure
path.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-2-will@kernel.org
Add runtime PM support for atmel-quadspi which will disable/enable
QSPI clocks on proper runtime_suspend/runtime_resume ops.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220609084246.1795419-2-claudiu.beznea@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Return boolean values ("true" or "false") instead of 1 or 0 from bool
function.
As reported by coccicheck:
./drivers/spi/spi-s3c64xx.c:385:9-10: WARNING: return of 0/1 in function
's3c64xx_spi_can_dma' with return type bool
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220609071250.59509-1-yang.lee@linux.alibaba.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Remove .owner field if calls are used which set it automatically.
Eliminate the following coccicheck warning:
./drivers/spi/spi-microchip-core.c:624:3-8: No need to set .owner here.
The core will do it.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220609055533.95866-1-yang.lee@linux.alibaba.com
Signed-off-by: Mark Brown <broonie@kernel.org>
* Convert IRQ chips in Diolan and Intel GPIO drivers to be immutable
The following is an automated git shortlog grouped by driver:
crystalcove:
- Join function declarations and long lines
- Use specific type and API for IRQ number
- make irq_chip immutable
dln2:
- make irq_chip immutable
merrifield:
- make irq_chip immutable
sch:
- make irq_chip immutable
wcove:
- make irq_chip immutable
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSu93Raj3rZDNXzGZv7cr9lmVa5zAUCYqCTOQAKCRD7cr9lmVa5
zFpMAQC//bjzu/ZKCSBTnfLjXsNjje1G8yOuKxkljsUR8rMc5AD/VXJGijZCkfA4
QVALyw+7yisUPCzPzj+HFvCHTZNltQQ=
=tclG
-----END PGP SIGNATURE-----
Merge tag 'intel-gpio-v5.19-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio for v5.19-2
* Convert IRQ chips in Diolan and Intel GPIO drivers to be immutable
Systems with AST graphics can have multiple output; typically VGA
plus some other port. Record detected output chips in a bitmask and
initialize each output on its own.
Assume a VGA output by default and use SIL164 and DP501 if available.
For ASTDP assume that it can run in parallel with VGA.
Tested on AST2100.
v3:
* define a macro for each BIT(ast_tx_chip) (Patrik)
v2:
* make VGA/SIL164/DP501 mutually exclusive
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Fixes: a59b026419 ("drm/ast: Initialize encoder and connector for VGA in helper function")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220607092008.22123-2-tzimmermann@suse.de
(cherry picked from commit 7f35680ada)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
The revision of the imx-sdma IP that is in the i.MX8M series is the
same is that as that in the i.MX7 series but the imx7d MODULE_FIRMWARE
directive is wrapped in a condiditional which means it's not defined
when built for aarch64 SOC_IMX8M platforms and hence you get the
following errors when the driver loads on imx8m devices:
imx-sdma 302c0000.dma-controller: Direct firmware load for imx/sdma/sdma-imx7d.bin failed with error -2
imx-sdma 302c0000.dma-controller: external firmware not found, using ROM firmware
Add the SOC_IMX8M into the check so the firmware can load on i.MX8.
Fixes: 1474d48bd6 ("arm64: dts: imx8mq: Add SDMA nodes")
Fixes: 941acd566b ("dmaengine: imx-sdma: Only check ratio on parts that support 1:1")
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Cc: stable@vger.kernel.org # v5.2+
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20220606161034.3544803-1-pbrobinson@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This reverts commit a8facc7b98 ("dmaengine: add verification of
DMA_INTERRUPT capability for dmatest") as it causes regression due to
the fact that DMA_INTERRUPT in linked to dma_prep_interrupt() so
checking that is incorrect here
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20220606174906.3979283-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
of_find_device_by_node() takes reference, we should use put_device()
to release it when not need anymore.
Fixes: a074ae38f8 ("dmaengine: Add driver for TI DMA crossbar on DRA7x")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Link: https://lore.kernel.org/r/20220605042723.17668-1-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not needed anymore.
Add missing of_node_put() in to fix this.
Fixes: ec9bfa1e1a ("dmaengine: ti-dma-crossbar: dra7: Use bitops instead of idr")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220605042723.17668-2-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This patch makes get_vq_group and set_group_asid optional. This is
needed to unbreak the vDPA parent that doesn't support multiple
address spaces.
Cc: Gautam Dawar <gautam.dawar@xilinx.com>
Fixes: aaca8373c4 ("vhost-vdpa: support ASID based IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220609041901.2029-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are double "the" in message in file virtio_mmio.c
and virtio_pci_modern_dev.c, fix it.
Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20220609031106.2161-1-liubo03@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
see warning:
| drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
| type 'unsigned short' but the argument has type 'int' [-Wformat]
| netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
| ~~~~~~ ^~~~~~~~~~~~~~~~~~~
Variadic functions (printf-like) undergo default argument promotion.
Documentation/core-api/printk-formats.rst specifically recommends
using the promoted-to-type's format flag.
Also, as per C11 6.3.1.1:
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
`If an int can represent all values of the original type ..., the
value is converted to an int; otherwise, it is converted to an
unsigned int. These are called the integer promotions.`
Since the argument is a u16 it will get promoted to an int and thus it is
most accurate to use the %x format specifier here. It should be noted that the
`#06` formatting sugar does not alter the promotion rules.
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <jstitt007@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220607191119.20686-1-jstitt007@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In our server, there may be no high order (>= 6) memory since we reserve
lots of HugeTLB pages when booting. Then the system panic. So use
alloc_large_system_hash() to allocate table_perturb.
Fixes: e926147618 ("tcp: dynamically allocate the perturb table used by source ports")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220607070214.94443-1-songmuchun@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since commit a18e6521a7 ("net: phylink: handle NA interface mode in
phylink_fwnode_phy_connect()"), phylib defaults to GMII when no phy-mode
or phy-connection-type property is specified in a DSA port node of the
device tree. The same commit caused a regression in rtl8365mb whereby
phylink would fail to connect, because the driver did not advertise
support for GMII for ports with internal PHY.
It should be noted that the aforementioned regression is not because the
blamed commit was incorrect: on the contrary, the blamed commit is
correcting the previous behaviour whereby unspecified phy-mode would
cause the internal interface mode to be PHY_INTERFACE_MODE_NA. The
rtl8365mb driver only worked by accident before because it _did_
advertise support for PHY_INTERFACE_MODE_NA, despite NA being reserved
for internal use by phylink. With one mistake fixed, the other was
exposed.
Commit a5dba0f207 ("net: dsa: rtl8365mb: add GMII as user port mode")
then introduced implicit support for GMII mode on ports with internal
PHY to allow a PHY connection for device trees where the phy-mode is not
explicitly set to "internal". At this point everything was working OK
again.
Subsequently, commit 6ff6064605 ("net: dsa: realtek: convert to
phylink_generic_validate()") broke this behaviour again by discarding
the usage of rtl8365mb_phy_mode_supported() - where this GMII support
was indicated - while switching to the new .phylink_get_caps API.
With the new API, rtl8365mb_phy_mode_supported() is no longer needed.
Remove it altogether and add back the GMII capability - this time to
rtl8365mb_phylink_get_caps() - so that the above default behaviour works
for ports with internal PHY again.
Fixes: 6ff6064605 ("net: dsa: realtek: convert to phylink_generic_validate()")
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220607184624.417641-1-alvin@pqrs.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-06-07
This series contains updates to ixgbe driver only.
Olivier Matz resolves an issue so that broadcast packets can still be
received when VF removes promiscuous settings and removes setting of
VLAN promiscuous, in promiscuous mode, to prevent a loop when VFs are
bridged.
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ixgbe: fix unexpected VLAN Rx in promisc mode on VF
ixgbe: fix bcast packets Rx on VF after promisc removal
====================
Link: https://lore.kernel.org/r/20220607181538.748786-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
mv88e6xxx: fixes for reading serdes state
These are some low-priority fixes to the mv88e6xxx serdes code.
Patch 1 fixes the reporting of an_complete, which is used in the
emulation of a conventional C22 PHY. Patch from Marek.
Patch 2 makes one of the error messages in patch 2 to be consistent
with the other error messages in this function.
Patch 3 ensures that we do not miss a link-failure event.
====================
Link: https://lore.kernel.org/r/Yp82TyoLon9jz6k3@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Phylink wants to know if the link has dropped since the last time state
was retrieved, and the BMSR gives us that. Read the BMSR and use it when
deciding the link state. Fill in the an_complete member as well for the
emulated PHY state.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Other errors accessing the registers in mv88e6352_serdes_pcs_get_state()
print "PHY " before the register name, except for the BMSR. Make this
consistent with the other error messages.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit ede359d884 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN
is bypassed") added the ability to link if AN was bypassed, and added
filling of state->an_complete field, but set it to true if AN was
enabled in BMCR, not when AN was reported complete in BMSR.
This was done because for some reason, when I wanted to use BMSR value
to infer an_complete, I was looking at BMSR_ANEGCAPABLE bit (which was
always 1), instead of BMSR_ANEGCOMPLETE bit.
Use BMSR_ANEGCOMPLETE for filling state->an_complete.
Fixes: ede359d884 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Every iteration of for_each_child_of_node() decrements
the reference count of the previous node.
When break from a for_each_child_of_node() loop,
we need to explicitly call of_node_put() on the child node when
not need anymore.
Add missing of_node_put() to avoid refcount leak.
Fixes: bbd2190ce9 ("Altera TSE: Add main and header file for Altera Ethernet Driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220607041144.7553-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If packet headers changed, the cached nfct is no longer relevant
for the packet and attempt to re-use it leads to the incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
Clearing the cached conntrack entry whenever packet tuple is changed
to avoid the issue.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
Cc: stable@vger.kernel.org
Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl@canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit fed9b26b25 ("MAINTAINERS: Update KVM RISC-V entry to cover
selftests support") optimistically adds a file entry for
tools/testing/selftests/kvm/riscv/, but this directory does not exist.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference. The script is very useful to keep MAINTAINERS up to date
and MAINTAINERS can be kept in a state where the script emits no warning.
So, just drop the non-matching file entry rather than starting to collect
exceptions of entries that may match in some close or distant future.
Fixes: fed9b26b25 ("MAINTAINERS: Update KVM RISC-V entry to cover selftests support")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Various spelling mistakes in comments.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Anup Patel <anup@brainfault.org>
When rx_flag == MTK_RX_FLAGS_HWLRO,
rx_data_len = MTK_MAX_LRO_RX_LENGTH(4096 * 3) > PAGE_SIZE.
netdev_alloc_frag is for alloction of page fragment only.
Reference to other drivers and Documentation/vm/page_frags.rst
Branch to use __get_free_pages when ring->frag_size > PAGE_SIZE.
Signed-off-by: Chen Lin <chen45464546@163.com>
Link: https://lore.kernel.org/r/1654692413-2598-1-git-send-email-chen45464546@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
GRE with TUNNEL_CSUM will apply local checksum offload on
CHECKSUM_PARTIAL packets.
ipgre_xmit must validate csum_start after an optional skb_pull,
else lco_csum may trigger an overflow. The original check was
if (csum && skb_checksum_start(skb) < skb->data)
return -EINVAL;
This had false positives when skb_checksum_start is undefined:
when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement
was straightforward
if (csum && skb->ip_summed == CHECKSUM_PARTIAL &&
skb_checksum_start(skb) < skb->data)
return -EINVAL;
But was eventually revised more thoroughly:
- restrict the check to the only branch where needed, in an
uncommon GRE path that uses header_ops and calls skb_pull.
- test skb_transport_header, which is set along with csum_start
in skb_partial_csum_set in the normal header_ops datapath.
Turns out skbs can arrive in this branch without the transport
header set, e.g., through BPF redirection.
Revise the check back to check csum_start directly, and only if
CHECKSUM_PARTIAL. Do leave the check in the updated location.
Check field regardless of whether TUNNEL_CSUM is configured.
Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
Link: https://lore.kernel.org/all/20210902193447.94039-2-willemdebruijn.kernel@gmail.com/T/#u
Fixes: 8a0ed250f9 ("ip_gre: validate csum_start only on pull")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20220606132107.3582565-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2022-06-09
We've added 6 non-merge commits during the last 2 day(s) which contain
a total of 8 files changed, 49 insertions(+), 15 deletions(-).
The main changes are:
1) Fix an illegal copy_to_user() attempt seen by syzkaller through arm64
BPF JIT compiler, from Eric Dumazet.
2) Fix calling global functions from BPF_PROG_TYPE_EXT programs by using
the correct program context type, from Toke Høiland-Jørgensen.
3) Fix XSK TX batching invalid descriptor handling, from Maciej Fijalkowski.
4) Fix potential integer overflows in multi-kprobe link code by using safer
kvmalloc_array() allocation helpers, from Dan Carpenter.
5) Add Quentin as bpftool maintainer, from Quentin Monnet.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
MAINTAINERS: Add a maintainer for bpftool
xsk: Fix handling of invalid descriptors in XSK TX batching API
selftests/bpf: Add selftest for calling global functions from freplace
bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
bpf: Use safer kvmalloc_array() where possible
bpf, arm64: Clear prog->jited_len along prog->jited
====================
Link: https://lore.kernel.org/r/20220608234133.32265-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the (still missing!) ATA sysfs file documentation to the libata
subsystem entry in the MAINTAINERS file.
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
The {dma|pio}_mode sysfs files are incorrectly documented as having a
list of the supported DMA/PIO transfer modes, while the corresponding
fields of the *struct* ata_device hold the transfer mode IDs, not masks.
To match these docs, the {dma|pio}_mode (and even xfer_mode!) sysfs
files are handled by the ata_bitfield_name_match() macro which leads to
reading such kind of nonsense from them:
$ cat /sys/class/ata_device/dev3.0/pio_mode
XFER_UDMA_7, XFER_UDMA_6, XFER_UDMA_5, XFER_UDMA_4, XFER_MW_DMA_4,
XFER_PIO_6, XFER_PIO_5, XFER_PIO_4, XFER_PIO_3, XFER_PIO_2, XFER_PIO_1,
XFER_PIO_0
Using the correct ata_bitfield_name_search() macro fixes that:
$ cat /sys/class/ata_device/dev3.0/pio_mode
XFER_PIO_4
While fixing the file documentation, somewhat reword the {dma|pio}_mode
file doc and add a note about being mostly useful for PATA devices to
the xfer_mode file doc...
Fixes: d9027470b8 ("[libata] Add ATA transport class")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
OpenSSL 3.0 deprecated the OpenSSL's ENGINE API. That is as may be, but
the kernel build host tools still use it. Disable the warning about
deprecated declarations until somebody who cares fixes it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As platform_driver_register() could fail, it should be better
to deal with the return value in order to maintain the code
consisitency.
Fixes: 56a1485b10 ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Tali Perry <tali.perry1@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Remove vendor checks from prefer_mwait_c1_over_halt function. Restore
the decision tree to support MWAIT C1 as the default idle state based on
CPUID checks as done by Thomas Gleixner in
commit 09fd4b4ef5 ("x86: use cpuid to check MWAIT support for C1")
The decision tree is removed in
commit 69fb3676df ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")
Prefer MWAIT when the following conditions are satisfied:
1. CPUID_Fn00000001_ECX [Monitor] should be set
2. CPUID_Fn00000005 should be supported
3. If CPUID_Fn00000005_ECX [EMX] is set then there should be
at least one C1 substate available, indicated by
CPUID_Fn00000005_EDX [MWaitC1SubStates] bits.
Otherwise use HLT for default_idle function.
HPC customers who want to optimize for lower latency are known to
disable Global C-States in the BIOS. In fact, some vendors allow
choosing a BIOS 'performance' profile which explicitly disables
C-States. In this scenario, the cpuidle driver will not be loaded and
the kernel will continue with the default idle state chosen at boot
time. On AMD systems currently the default idle state is HLT which has
a higher exit latency compared to MWAIT.
The reason for the choice of HLT over MWAIT on AMD systems is:
1. Families prior to 10h didn't support MWAIT
2. Families 10h-15h supported MWAIT, but not MWAIT C1. Hence it was
preferable to use HLT as the default state on these systems.
However, AMD Family 17h onwards supports MWAIT as well as MWAIT C1. And
it is preferable to use MWAIT as the default idle state on these
systems, as it has lower exit latencies.
The below table represents the exit latency for HLT and MWAIT on AMD
Zen 3 system. Exit latency is measured by issuing a wakeup (IPI) to
other CPU and measuring how many clock cycles it took to wakeup. Each
iteration measures 10K wakeups by pinning source and destination.
HLT:
25.0000th percentile : 1900 ns
50.0000th percentile : 2000 ns
75.0000th percentile : 2300 ns
90.0000th percentile : 2500 ns
95.0000th percentile : 2600 ns
99.0000th percentile : 2800 ns
99.5000th percentile : 3000 ns
99.9000th percentile : 3400 ns
99.9500th percentile : 3600 ns
99.9900th percentile : 5900 ns
Min latency : 1700 ns
Max latency : 5900 ns
Total Samples 9999
MWAIT:
25.0000th percentile : 1400 ns
50.0000th percentile : 1500 ns
75.0000th percentile : 1700 ns
90.0000th percentile : 1800 ns
95.0000th percentile : 1900 ns
99.0000th percentile : 2300 ns
99.5000th percentile : 2500 ns
99.9000th percentile : 3200 ns
99.9500th percentile : 3500 ns
99.9900th percentile : 4600 ns
Min latency : 1200 ns
Max latency : 4600 ns
Total Samples 9997
Improvement (99th percentile): 21.74%
Below is another result for context_switch2 micro-benchmark, which
brings out the impact of improved wakeup latency through increased
context-switches per second.
with HLT:
-------------------------------
50.0000th percentile : 190184
75.0000th percentile : 191032
90.0000th percentile : 192314
95.0000th percentile : 192520
99.0000th percentile : 192844
MIN : 190148
MAX : 192852
with MWAIT:
-------------------------------
50.0000th percentile : 277444
75.0000th percentile : 278268
90.0000th percentile : 278888
95.0000th percentile : 279164
99.0000th percentile : 280504
MIN : 273278
MAX : 281410
Improvement(99th percentile): ~ 45.46%
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://ozlabs.org/~anton/junkcode/context_switch2.c
Link: https://lkml.kernel.org/r/0cc675d8fd1f55e41b510e10abf2e21b6e9803d5.1654538381.git-series.wyes.karny@amd.com
When kernel is booted with idle=nomwait do not use MWAIT as the
default idle state.
If the user boots the kernel with idle=nomwait, it is a clear
direction to not use mwait as the default idle state.
However, the current code does not take this into consideration
while selecting the default idle state on x86.
Fix it by checking for the idle=nomwait boot option in
prefer_mwait_c1_over_halt().
Also update the documentation around idle=nomwait appropriately.
[ dhansen: tweak commit message ]
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lkml.kernel.org/r/fdc2dc2d0a1bc21c2f53d989ea2d2ee3ccbc0dbe.1654538381.git-series.wyes.karny@amd.com
The actual status of the code is Supported (from x86 perspective).
Reported-by: dave.hansen@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
[wsa: fixed "DesignWare" spelling]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
When combining two steering rules into one check
not only do they share the same actions but those
actions are also the same. This resolves an issue where
when creating two different rules with the same match
the actions are overwritten and one of the rules is deleted
a FW syndrome can be seen in dmesg.
mlx5_core 0000:03:00.0: mlx5_cmd_check:819:(pid 2105): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444)
Fixes: 0d235c3fab ("net/mlx5: Add hash table to search FTEs in a flow-group")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The current design does not arm the tracer if traces are available before
the tracer string database is fully loaded, leading to an unfunctional tracer.
This fix will rearm the tracer every time the FW triggers tracer event
regardless of the tracer strings database status.
Fixes: c71ad41ccb ("net/mlx5: FW tracer, events handling")
Signed-off-by: Feras Daoud <ferasda@nvidia.com>
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
FW is not ready, fix was sent too soon.
This reverts commit f05ec8d9d0.
Fixes: f05ec8d9d0 ("net/mlx5e: Allow relaxed ordering over VFs")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Commit 40379a0084 ("net/mlx5_fpga: Drop INNOVA TLS support") removes all
files in the directory drivers/net/ethernet/mellanox/mlx5/core/accel/, but
misses to adjust its reference in MAINTAINERS.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference.
Remove the file entry to the removed directory in MELLANOX ETHERNET INNOVA
DRIVERS.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The conversion to the dma-mapping API in linux-2.6.11 was incomplete
and left a virt_to_bus() call around. There have been a number of
fixes for DMA mapping API abuse in this driver, but this one always
slipped through.
Change it to just use the existing dma_addr_t pointer, and make it
use the correct types throughout the driver to make it easier to
understand the virtual vs dma address spaces.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Manuel Lauss <manuel.lauss@gmail.com>
Link: https://lore.kernel.org/r/20220607090206.19830-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Unused now, and the interface never really made a whole lot of sense to
start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>