The get_unit_name() function crafts a string based on the device name
and the device unit number. It then stores this in a static variable.
This has concurrency issues as can be seen with this log:
hfi1 0000:02:00.0: hfi1_1: read_idle_message: read idle message 0x203
hfi1 0000:01:00.0: hfi1_1: read_idle_message: read idle message 0x203
The PCI device ID (0000:02:00.0 vs. 0000:01:00.0) is correct for the
message, but the device string hfi1_1 is incorrect (it should be
hfi1_0 for the second log message).
Remove get_unit_name() function.
Instead, use the rvt accessor rvt_get_ibdev_name() to get the IB name
string.
Clean up any hfi1_early_xx calls that can now use the new path.
QIB has the same (qib_get_unit_name()) issue. Updating as necessary.
Remove qib_get_unit_name() function.
Update log message that has redundant device name.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
CQ allocation does not ensure that completion queue entries
and the completion queue structure are allocated on the correct
numa node.
Fix by allocating the rvt_cq and kernel CQ entries on the device node,
leaving the user CQ entries on the default local node. Also ensure
CQ resizes use the correct allocator when extending a CQ.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When an 8051 command times out, the entire DC block is restarted. During
the restart, the host interface version bit is set, which calls
do_8051_command() recursively. The host version bit needs to be set
before the link moves into polling, so the host version bit can be set
in set_local_link_attributes() instead. Thus, the 8051 command functions
can be simplied as a non-locking version (dd->dc8051_lock) of those
functions are no longer needed.
Fixes: 9be6a5d788 ("IB/hfi1: Prevent LNI out of sync by resetting host interface version")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Normal receive queue allocation ensures that kernel receive queues
are allocated on the local numa node. Shared receive queues
do not behave the same way.
Ensure that kernel shared receive queues are allocated on the device
local node.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
rdmavt has a down call to client drivers to retrieve a crafted card
name.
This name should be the IB defined name.
Rather than craft the name each time it is needed, simply retrieve
the IB allocated name from the IB device.
Update the function name to reflect its application.
Clean up driver code to match this change.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently the HFI and QIB drivers allow the IB core to assign a unit
number to the driver name string.
If multiple devices exist in a system, there is a possibility that the
device unit number and the IB core number will be mismatched.
Fix by using the driver defined unit number to generate the device
name.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When the rdmavt's RNRNAK timer is fired, it tries to cancel the timer by
calling hrtimer_try_to_cancel(), which always returns -1 because the timer
is currently running. This patch removes this useless call.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add support of DVP parallel mode in addition of
existing MIPI CSI mode. The choice between two modes
and configuration is made through device tree.
Signed-off-by: Hugues Fruchet <hugues.fruchet@st.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Quite a big update here, mostly in new device support and some big
updates for older drivers too. The main core work continues to be
Morimoto-san's efforts on modernising drivers to use the component
layer.
- Lots more updates from Morimoto-san to move more things into the
component level.
- Large cleanups of some of the TI CODEC drivers from Andrew F. Davis.
- Even more quirks and cleanups of quirks for x86 systems.
- Refactoring of the Freescale SSI driver from Nicolin Chen in
preparation for some more substantive improvements which are
currently in review.
- New drivers for Allwinner A83T, Maxim MAX89373, SocioNext UiniPhier
EVEA Tempo Semiconductor TSCS42xx and TI PCM816x, TAS5722 and TAS6424
devices.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlpPs5ETHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0ML7B/oDInL0FboyUorNxANa9SPWVby29OIh
umWCG0Yq6683aOpjHJaCQeQ8VePlV+ABN/cP979PFxjEaJSZ+uaTOQs6oPgDwiHM
Piaw5+6S6hz2W/E1smpW5rReBfw3MBLSl938eoauQOAU10tDNBqK7z6H41/cBWsu
o1VAebcuTcIJyHhBYch17IxXX/H+NzumX/WK8YZH+fYBjnJfRbaYyXlFTl2jmdrO
LOq6JAl8pTtJL0foZZSCeFHoZnVw47y6zkZQaMViaW70RIVMVBVtO+onRSI8bIpj
vGtwJWvOg+Mrjk2eizQjZsWqsPYsClv+eOoY4Qtwg3mIFs7GN6WqUWPP
=aUw7
-----END PGP SIGNATURE-----
Merge tag 'asoc-v4.16' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next
ASoC: Updates for v4.16
Quite a big update here, mostly in new device support and some big
updates for older drivers too. The main core work continues to be
Morimoto-san's efforts on modernising drivers to use the component
layer.
- Lots more updates from Morimoto-san to move more things into the
component level.
- Large cleanups of some of the TI CODEC drivers from Andrew F. Davis.
- Even more quirks and cleanups of quirks for x86 systems.
- Refactoring of the Freescale SSI driver from Nicolin Chen in
preparation for some more substantive improvements which are
currently in review.
- New drivers for Allwinner A83T, Maxim MAX89373, SocioNext UiniPhier
EVEA Tempo Semiconductor TSCS42xx and TI PCM816x, TAS5722 and TAS6424
devices.
The mt9m111 has the test pattern generator features. This makes use of
it through V4L2_CID_TEST_PATTERN control.
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The mt9m111 driver requires clocks property for the master clock to the
sensor, but there is no description for that. This adds it.
Cc: Rob Herring <robh@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Create a source pad and set the media controller type to the sensor.
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Set the V4L2_SUBDEV_FL_HAS_DEVNODE flag for the subdevice so that the
subdevice device node is created.
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The ov7740 (color) image sensor is a high performance VGA CMOS
image snesor, which supports for output formats: RAW RGB and YUV
and image sizes: VGA, and QVGA, CIF and any size smaller.
Signed-off-by: Songjun Wu <songjun.wu@microchip.com>
Signed-off-by: Wenyou Yang <wenyou.yang@microchip.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Add the device tree binding documentation for the ov7740 sensor driver.
Signed-off-by: Wenyou Yang <wenyou.yang@microchip.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Smatch complains that "err" can be uninitialized if we have a zero size
write. The flow analysis is a little complicated so I'm not sure if
that's possible or not, but it's harmless to set this to zero and it
makes the code easier to read.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
cio2 driver should release buffer with QUEUED state
when start_stream op failed, wrong buffer state will
cause vb2 core throw a warning.
Signed-off-by: Yong Zhi <yong.zhi@intel.com>
Signed-off-by: Cao Bing Bu <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
When dmabuf is used for BLOB type frame, the frame
buffers allocated by gralloc will hold more pages
than the valid frame data due to height alignment.
In this case, the page numbers in sg list could exceed the
FBPT upper limit value - max_lops(8)*1024 to cause crash.
Limit the LOP access to the valid data length
to avoid FBPT sub-entries overflow.
Signed-off-by: Yong Zhi <yong.zhi@intel.com>
Signed-off-by: Cao Bing Bu <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Geminilake requires the 3D driver to select whether barriers are
intended for compute shaders, or tessellation control shaders, by
whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
switching pipelines. Failure to do this properly can result in GPU
hangs.
Unfortunately, this means it needs to switch mid-batch, so only
userspace can properly set it. To facilitate this, the kernel needs
to whitelist the register.
The workarounds page currently tags this as applying to Broxton only,
but that doesn't make sense. The documentation for the register it
references says the bit userspace is supposed to toggle only exists on
Geminilake. Empirically, the Mesa patch to toggle this bit appears to
fix intermittent GPU hangs in tessellation control shader barrier tests
on Geminilake; we haven't seen those hangs on Broxton.
v2: Mention WA #0862 in the comment (it doesn't have a name).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180105085905.9298-1-kenneth@whitecape.org
The driver apparently assumes that the device uses the same page size
as the CPU, but also assumes that this is 4096 bytes. On architectures
with a larger page size like 65536 bytes, we get a warning about an
integer overflow:
drivers/media/pci/intel/ipu3/ipu3-cio2.c: In function 'cio2_fbpt_entry_init_dummy':
arch/arm64/include/asm/page-def.h:28:20: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
^
drivers/media/pci/intel/ipu3/ipu3-cio2.h:404:26: note: in expansion of macro 'PAGE_SIZE'
#define CIO2_PAGE_SIZE PAGE_SIZE
^~~~~~~~~
drivers/media/pci/intel/ipu3/ipu3-cio2.c:172:3: note: in expansion of macro 'CIO2_PAGE_SIZE'
CIO2_PAGE_SIZE / sizeof(u32) * CIO2_MAX_LOPS;
Obviously this won't work, but the driver is also unlikely to ever be
used on such an architecture, so the easiest workaround is to define
the CIO2_PAGE_SIZE macro to the size that the hardware actually uses.
Fixes: c2a6a07afe ("media: intel-ipu3: cio2: add new MIPI-CSI2 driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
When CONFIG_PM is disabled, we get harmless warnings about the
suspend/resume callbacks being unused:
drivers/media/pci/intel/ipu3/ipu3-cio2.c:1993:12: error: 'cio2_resume' defined but not used [-Werror=unused-function]
drivers/media/pci/intel/ipu3/ipu3-cio2.c:1967:12: error: 'cio2_suspend' defined but not used [-Werror=unused-function]
This marks them as __maybe_unused to shut up the warnings.
Fixes: c2a6a07afe ("media: intel-ipu3: cio2: add new MIPI-CSI2 driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The arr_size() macro which is used to calculate the size of the chunk in the
array to be arranged resembles ARRAY_SIZE naming-wise. Avoid confusion by
renaming it to CHUNK_SIZE instead.
Also use min() macro to calculate the minimum of two numbers.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Since commit 31847b67be ("kconfig: allow use of relations other than
(in)equality") it is possible to use relational operators in Kconfig
statements. However, those operators give unexpected results when
applied to bool/tristate values:
(n < y) = y (correct)
(m < y) = y (correct)
(n < m) = n (wrong)
This happens because relational operators process bool and tristate
symbols as strings and m sorts before n. It makes little sense to do a
lexicographical compare on bool and tristate values though.
Documentation/kbuild/kconfig-language.txt states that expression can have
a value of 'n', 'm' or 'y' (or 0, 1, 2 respectively for calculations).
Let's make it so for relational comparisons with bool/tristate
expressions as well and document them. If at least one symbol is an
actual string then the lexicographical compare works just as before.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This duplicate include has been found with scripts/checkincludes.pl but
it has been removed manually to avoid removing false positives.
Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Patchwork-Id: 10092051
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Add hardware version to the firmware file name to handle scenarios where
single system image supports variety of devices.
Signed-off-by: Jeffrey Lin <jeffrey.lin@rad-ic.com>
Patchwork-Id: 10127677
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
data structures, removal of unnecessary newlines from gpio labels and code
simplification.
Also a defconfig update for DaVinci, enabling support for USB network adaptors.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaT4oeAAoJEGFBu2jqvgRNEaMP+gKZlNTd5Xg+ZprJLdWf6eQw
0+9DZWJtTwXIJV44YkUDbRG+bo2xagrbMthwh1Orhhxqp5TQk474XuSY6gtmb6wL
wNadBo0VVqU/aQD0oSEDdVxFMyb5CInLsLZJAY/3zn4/h7OaP7P1jQ1lV0TslVb1
MXv4llU0NRskhfO6lpysvT0NcRVBtPDXItgKiGuzFnF5IgWGzWzoEFrVu5kWbYQ+
H77AuSChtXfs4SzgSdx7SN13IeRS0z/hg0MfGGlVUrHNkx2iorGu2cwJKMO/386e
mawJxkCQZ43c172e8IarbZRBWfU2x6mREI7OxDSrYF1lPSTLl+yDT+t7UGVi78aa
BF6POwrbgY3kFvfCZhqYlMCZ3Qg8MRxlJ4omr/YChtUdpsKVXf5cRNAfM4SfH1eW
rVZh3gDq/YBlFkNmRaWjaHHU6XvQlbrdODxMDAIafDiVF4SjJZtkOawhsGk3zuVM
6mE+F0pJTegRoc4Nv85NSMpqmRVG1X+Ht5Dj2llodCDc1OUnVoFRbRXbqDvKCuCA
pxK6XaiNpIpEXNP+IwzZLrYJn7f+vbpD9eyWyyWLrY2RDLX5QbV/c0hsr0dnFg/l
gL1nRmn+O4VqRufzyh4MgvbrcZJMoiWtSudNAcppe3ogh52oK1TkTqhSdcRcvwTJ
PN4m7htGARI2zlVa9xcU
=oZrM
-----END PGP SIGNATURE-----
Merge tag 'davinci-for-v4.16/soc-v2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into next/soc
Pull "TI DaVinci SoC support updates for v4.16" from Sekhar Nori:
DaVinci SoC updates consisting of non-critical bug fixes including constifying
data structures, removal of unnecessary newlines from gpio labels and code
simplification.
Also a defconfig update for DaVinci, enabling support for USB network adaptors.
* tag 'davinci-for-v4.16/soc-v2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: davinci: constify gpio_led
ARM: davinci: drop unneeded newline
ARM: davinci: Use PTR_ERR_OR_ZERO()
ARM: davinci: make davinci_soc_info structures const
ARM: davinci: make argument to davinci_common_init() as const
ARM: davinci_all_defconfig: enable support for USB network adaptors
- Update i.MX GPC driver to support PCI power domain of i.MX6SX SoC.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaTCqGAAoJEFBXWFqHsHzOavcH/1g0k3tN4MaOvf44lSDm5uCZ
WqCKLN01f8T02YegdcgVVc59wirJB6xKIYD86dQIiVC8gEfMOhvfEuv6qkPRHQOM
WVYJTIFFisEeypnXEHjD2SwN6/SskSGPMUDpfj0hveb0srLnCQou+5Ty8zpvmAdl
XEAP9mjK7s6ovfomfuxEY4iWxGRKbDXtwXqtvYTI3tpXwn/c6LXgdrg02xC/KAXp
sKXbS0NbYjBDHBd8hPcVVwujugw3UZHr/saYYf0rmC7WKr7074WylSWW08aKQ89u
Elv5fNeoAYrlnwpeoqzLqcgrvhCJcpx0HTVnYjmNIVLYRLAktYu6glR7icAdpb8=
=kB/f
-----END PGP SIGNATURE-----
Merge tag 'imx-drivers-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/drivers
Pull "i.MX drivers update for 4.16" from Shawn Guo:
- Update i.MX GPC driver to support PCI power domain of i.MX6SX SoC.
* tag 'imx-drivers-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: gpc: Add i.MX6SX PCI power domain
Instead of doing that upon each ipcns creation, we do that the first
time mq_open(2) or mqueue mount is done in an ipcns. What's more,
doing that allows to get rid of mount_ns() use - we can go with
considerably cheaper mount_nodev(), avoiding the loop over all
mqueue superblock instances; ipcns->mq_mnt is used to locate preexisting
instance in O(1) time instead of O(instances) mount_ns() would've
cost us.
Based upon the version by Giuseppe Scrivano <gscrivan@redhat.com>; I've
added handling of userland mqueue mounts (original had been broken in
that area) and added a switch to mount_nodev().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Similar to vfs_create(), but with caller-supplied callback (and
argument for it) to be used instead of ->create().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
- Drop power saving status checking from MMDC driver probe function,
since there is nothing really depending on power saving being
enabled.
- Clean up unused imx3 pm definitions.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaTCoAAAoJEFBXWFqHsHzOhO8H/2dA59AXGNc0HQGChD46R0sY
mogHzAPhmziEE3/t/CL1PLeGtp8Z5enVQa/ckd0AByIdMbN4gTXiGrOLMwXF3fpP
jpkhWw6ZPb/jz53zDLWcEPhwnFu2ANUF7dAYgA4rI7742idqy21zbD9tXpJFU+ZM
f7ieXqjVAoQ4vKG8pPCrTyEn6urEAZP2cOmW6WvYcLpyMIFRFoLHCthA3DtHAobG
910VtDyMOg4Bc4uvn26SWZCwC5+hXxda75Uc9lgdSGG3nO7LIFi4shcwh87mTFtI
eOJ23UTtB6C4lTIPZwUDeA2YIsWdQRfWIdtGwGrPztB+R+HYblwCmNk4zAZOIXM=
=l5gq
-----END PGP SIGNATURE-----
Merge tag 'imx-soc-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/soc
Pull "i.MX SoC updates for 4.16" from Shawn Guo:
- Drop power saving status checking from MMDC driver probe function,
since there is nothing really depending on power saving being
enabled.
- Clean up unused imx3 pm definitions.
* tag 'imx-soc-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx: remove unused imx3 pm definitions
ARM: imx: don't abort MMDC probe if power saving status doesn't match
This is the end. Update port status. Change contact address. Add Mans.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Descriptor table is a shared object; it's not a place where you can
stick temporary references to files, especially when we don't need
an opened file at all.
Cc: stable@vger.kernel.org # v4.14
Fixes: 98589a0998 ("netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit a33801e8b4 ("block, bfq: move debug blkio stats behind
CONFIG_DEBUG_BLK_CGROUP") introduced two batches of confusing ifdefs:
one reported in [1], plus a similar one in another function. This
commit removes both batches, in the way suggested in [1].
[1] https://www.spinics.net/lists/linux-block/msg20043.html
Fixes: a33801e8b4 ("block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
BFQ privileges the I/O of soft real-time applications, such as video
players, to guarantee to these application a high bandwidth and a low
latency. In this respect, it is not easy to correctly detect when an
application is soft real-time. A particularly nasty false positive is
that of an I/O-bound application that occasionally happens to meet all
requirements to be deemed as soft real-time. After being detected as
soft real-time, such an application monopolizes the device. Fortunately,
BFQ will realize soon that the application is actually not soft
real-time and suspend every privilege. Yet, the application may happen
again to be wrongly detected as soft real-time, and so on.
As highlighted by our tests, this problem causes BFQ to occasionally
fail to guarantee a high responsiveness, in the presence of heavy
background I/O workloads. The reason is that the background workload
happens to be detected as soft real-time, more or less frequently,
during the execution of the interactive task under test. To give an
idea, because of this problem, Libreoffice Writer occasionally takes 8
seconds, instead of 3, to start up, if there are sequential reads and
writes in the background, on a Kingston SSDNow V300.
This commit addresses this issue by leveraging the following facts.
The reason why some applications are detected as soft real-time despite
all BFQ checks to avoid false positives, is simply that, during high
CPU or storage-device load, I/O-bound applications may happen to do
I/O slowly enough to meet all soft real-time requirements, and pass
all BFQ extra checks. Yet, this happens only for limited time periods:
slow-speed time intervals are usually interspersed between other time
intervals during which these applications do I/O at a very high speed.
To exploit these facts, this commit introduces a little change, in the
detection of soft real-time behavior, to systematically consider also
the recent past: the higher the speed was in the recent past, the
later next I/O should arrive for the application to be considered as
soft real-time. At the beginning of a slow-speed interval, the minimum
arrival time allowed for the next I/O usually happens to still be so
high, to fall *after* the end of the slow-speed period itself. As a
consequence, the application does not risk to be deemed as soft
real-time during the slow-speed interval. Then, during the next
high-speed interval, the application cannot, evidently, be deemed as
soft real-time (exactly because of its speed), and so on.
This extra filtering proved to be rather effective: in the above test,
the frequency of false positives became so low that the start-up time
was 3 seconds in all iterations (apart from occasional outliers,
caused by page-cache-management issues, which are out of the scope of
this commit, and cannot be solved by an I/O scheduler).
Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When two or more processes do I/O in a way that the their requests are
sequential in respect to one another, BFQ merges the bfq_queues associated
with the processes. This way the overall I/O pattern becomes sequential,
and thus there is a boost in througput.
These cooperating processes usually start or restart to do I/O shortly
after each other. So, in order to avoid merging non-cooperating processes,
BFQ ensures that none of these queues has been in weight raising for too
long.
In this respect, from commit "block, bfq-sq, bfq-mq: let a queue be merged
only shortly after being created", BFQ checks whether any queue (and not
only weight-raised ones) is doing I/O continuously from too long to be
merged.
This new additional check makes the first one useless: a queue doing
I/O from long enough, if being weight-raised, is also a queue in
weight raising for too long to be merged. Accordingly, this commit
removes the first check.
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In BFQ and CFQ, two processes are said to be cooperating if they do
I/O in such a way that the union of their I/O requests yields a
sequential I/O pattern. To get such a sequential I/O pattern out of
the non-sequential pattern of each cooperating process, BFQ and CFQ
merge the queues associated with these processes. In more detail,
cooperating processes, and thus their associated queues, usually
start, or restart, to do I/O shortly after each other. This is the
case, e.g., for the I/O threads of KVM/QEMU and of the dump
utility. Basing on this assumption, this commit allows a bfq_queue to
be merged only during a short time interval (100ms) after it starts,
or re-starts, to do I/O. This filtering provides two important
benefits.
First, it greatly reduces the probability that two non-cooperating
processes have their queues merged by mistake, if they just happen to
do I/O close to each other for a short time interval. These spurious
merges cause loss of service guarantees. A low-weight bfq_queue may
unjustly get more than its expected share of the throughput: if such a
low-weight queue is merged with a high-weight queue, then the I/O for
the low-weight queue is served as if the queue had a high weight. This
may damage other high-weight queues unexpectedly. For instance,
because of this issue, lxterminal occasionally took 7.5 seconds to
start, instead of 6.5 seconds, when some sequential readers and
writers did I/O in the background on a FUJITSU MHX2300BT HDD. The
reason is that the bfq_queues associated with some of the readers or
the writers were merged with the high-weight queues of some processes
that had to do some urgent but little I/O. The readers then exploited
the inherited high weight for all or most of their I/O, during the
start-up of terminal. The filtering introduced by this commit
eliminated any outlier caused by spurious queue merges in our start-up
time tests.
This filtering also provides a little boost of the throughput
sustainable by BFQ: 3-4%, depending on the CPU. The reason is that,
once a bfq_queue cannot be merged any longer, this commit makes BFQ
stop updating the data needed to handle merging for the queue.
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
A just-created bfq_queue will certainly be deemed as interactive on
the arrival of its first I/O request, if the low_latency flag is
set. Yet, if the queue is merged with another queue on the arrival of
its first I/O request, it will not have the chance to be flagged as
interactive. Nevertheless, if the queue is then split soon enough, it
has to be flagged as interactive after the split.
To handle this early-merge scenario correctly, BFQ saves the state of
the queue, on the merge, as if the latter had already been deemed
interactive. So, if the queue is split soon, it will get
weight-raised, because the previous state of the queue is resumed on
the split.
Unfortunately, in the act of saving the state of the newly-created
queue, BFQ doesn't check whether the low_latency flag is set, and this
causes early-merged queues to be then weight-raised, on queue splits,
even if low_latency is off. This commit addresses this problem by
adding the missing check.
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If two processes do I/O close to each other, then BFQ merges the
bfq_queues associated with these processes, to get a more sequential
I/O, and thus a higher throughput. In this respect, to detect whether
two processes are doing I/O close to each other, BFQ keeps a list of
the head-of-line I/O requests of all active bfq_queues. The list is
ordered by initial sectors, and implemented through a red-black tree
(rq_pos_tree).
Unfortunately, the update of the rq_pos_tree was incomplete, because
the tree was not updated on the removal of the head-of-line I/O
request of a bfq_queue, in case the queue did not remain empty. This
commit adds the missing update.
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>