Commit graph

1015345 commits

Author SHA1 Message Date
Ronnie Sahlberg
6ef4e9cbe1 cifs: add a function to get a cached dir based on its dentry
Needed for subsequent patches in the directory caching
series.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Ronnie Sahlberg
5e9c89d43f cifs: Grab a reference for the dentry of the cached directory during the lifetime of the cache
We need to hold both a reference for the root/superblock as well as the directory that we
are caching. We need to drop these references before we call kill_anon_sb().

At this point, the root and the cached dentries are always the same but this will change
once we start caching other directories as well.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Ronnie Sahlberg
269f67e1ff cifs: store a pointer to the root dentry in cifs_sb_info once we have completed mounting the share
And use this to only allow to take out a shared handle once the mount has completed and the
sb becomes available.
This will become important in follow up patches where we will start holding a reference to the
directory dentry for the shared handle during the lifetime of the handle.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Ronnie Sahlberg
45c0f1aabe cifs: rename the *_shroot* functions to *_cached_dir*
These functions will eventually be used to cache any directory, not just the root
so change the names.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Ronnie Sahlberg
e6eb19504e cifs: pass a path to open_shroot and check if it is the root or not
Move the check for the directory path into the open_shroot() function
but still fail for any non-root directories.
This is preparation for later when we will start using the cache also
for other directories than the root.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Ronnie Sahlberg
4df3d976dd cifs: move the check for nohandlecache into open_shroot
instead of doing it in the callsites for open_shroot.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
991e72eb0e cifs: switch build_path_from_dentry() to using dentry_path_raw()
The cost is that we might need to flip '/' to '\\' in more than
just the prefix.  Needs profiling, but I suspect that we won't
get slowdown on that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
f6a9bc336b cifs: allocate buffer in the caller of build_path_from_dentry()
build_path_from_dentry() open-codes dentry_path_raw().  The reason
we can't use dentry_path_raw() in there (and postprocess the
result as needed) is that the callers of build_path_from_dentry()
expect that the object to be freed on cleanup and the string to
be used are at the same address.  That's painful, since the path
is naturally built end-to-beginning - we start at the leaf and
go through the ancestors, accumulating the pathname.

Life would be easier if we left the buffer allocation to callers.
It wouldn't be exact-sized buffer, but none of the callers keep
the result for long - it's always freed before the caller returns.
So there's no need to do exact-sized allocation; better use
__getname()/__putname(), same as we do for pathname arguments
of syscalls.  What's more, there's no need to do allocation under
spinlocks, so GFP_ATOMIC is not needed.

Next patch will replace the open-coded dentry_path_raw() (in
build_path_from_dentry_optional_prefix()) with calling the real
thing.  This patch only introduces wrappers for allocating/freeing
the buffers and switches to new calling conventions:
	build_path_from_dentry(dentry, buf)
expects buf to be address of a page-sized object or NULL,
return value is a pathname built inside that buffer on success,
ERR_PTR(-ENOMEM) if buf is NULL and ERR_PTR(-ENAMETOOLONG) if
the pathname won't fit into page.  Note that we don't need to
check for failure when allocating the buffer in the caller -
build_path_from_dentry() will do the right thing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
8e33cf20ce cifs: make build_path_from_dentry() return const char *
... and adjust the callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
f6f1f17907 cifs: constify pathname arguments in a bunch of helpers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
558691393a cifs: constify path argument of ->make_node()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
9cfdb1c12b cifs: constify get_normalized_path() properly
As it is, it takes const char * and, in some cases, stores it in
caller's variable that is plain char *.  Fortunately, none of the
callers actually proceeded to modify the string via now-non-const
alias, but that's trouble waiting to happen.

It's easy to do properly, anyway...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Al Viro
8d76722355 cifs: don't cargo-cult strndup()
strndup(s, strlen(s)) is a highly unidiomatic way to spell strdup(s);
it's *NOT* safer in any way, since strlen() is just as sensitive to
NUL-termination as strdup() is.

strndup() is for situations when you need a copy of a known-sized
substring, not a magic security juju to drive the bad spirits away.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Steve French
b9335f6210 SMB3: update structures for new compression protocol definitions
Protocol has been extended for additional compression headers.
See MS-SMB2 section 2.2.42

Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:23 -05:00
Aurelien Aptel
ec4e4862a9 cifs: remove old dead code
While reviewing a patch clarifying locks and locking hierarchy I
realized some locks were unused.

This commit removes old data and code that isn't actually used
anywhere, or hidden in ifdefs which cannot be enabled from the kernel
config.

* The uid/gid trees and associated locks are left-overs from when
  uid/sid mapping had an extra caching layer on top of the keyring and
  are now unused.
  See commit faa65f07d2 ("cifs: simplify id_to_sid and sid_to_id mapping code")
  from 2012.

* cifs_oplock_break_ops is a left-over from when slow_work was remplaced
  by regular workqueue and is now unused.
  See commit 9b64697246 ("cifs: use workqueue instead of slow-work")
  from 2010.

* CIFSSMBSetAttrLegacy is SMB1 cruft dealing with some legacy
  NT4/Win9x behaviour.

* Remove CONFIG_CIFS_DNOTIFY_EXPERIMENTAL left-overs. This was already
  partially removed in 392e1c5dc9 ("cifs: rename and clarify CIFS_ASYNC_OP and CIFS_NO_RESP")
  from 2019. Kill it completely.

* Another candidate that was considered but spared is
  CONFIG_CIFS_NFSD_EXPORT which has an empty implementation and cannot
  be enabled by a config option (although it is listed but disabled with
  "BROKEN" as a dep). It's unclear whether this could even function
  today in its current form but it has it's own .c file and Kconfig
  entry which is a bit more involved to remove and might make a come
  back?

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Gustavo A. R. Silva
9f4c6eed26 cifs: cifspdu.h: Replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warning:

  CC [M]  fs/cifs/cifssmb.o
fs/cifs/cifssmb.c: In function ‘CIFSFindNext’:
fs/cifs/cifssmb.c:4636:23: warning: array subscript 1 is above array bounds of ‘char[1]’ [-Warray-bounds]
 4636 |   pSMB->ResumeFileName[name_len+1] = 0;
      |   ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Wan Jiabing
5e14c7240a fs: cifs: Remove repeated struct declaration
struct cifs_writedata is declared twice.
One is declared at 209th line.
And struct cifs_writedata is defined blew.
The declaration hear is not needed. Remove the duplicate.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Aurelien Aptel
443dd65d48 Documentation/admin-guide/cifs: document open_files and dfscache
Add missing documentation for open_files and dfscache /proc files.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Aurelien Aptel
b7fd0fa0ea cifs: simplify SWN code with dummy funcs instead of ifdefs
This commit doesn't change the logic of SWN.

Add dummy implementation of SWN functions when SWN is disabled instead
of using ifdef sections.

The dummy functions get optimized out, this leads to clearer code and
compile time type-checking regardless of config options with no
runtime penalty.

Leave the simple ifdefs section as-is.

A single bitfield (bool foo:1) on its own will use up one int. Move
tcon->use_witness out of ifdefs with the other tcon bitfields.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Samuel Cabrero <scabrero@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Steve French
bb9cad1b49 smb3: update protocol header definitions based to include new flags
[MS-SMB2] protocol specification was recently updated to include
new flags, new negotiate context and some minor changes to fields.
Update smb2pdu.h structure definitions to match the newest version
of the protocol specification.  Updates to the compression context
values will be in a followon patch.

Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Steve French
edc9dd1e3c cifs: correct comments explaining internal semaphore usage in the module
A few of the semaphores had been removed, and one additional one
needed to be noted in the comments.

Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Jiapeng Chong
83cd9ed7ae cifs: Remove useless variable
Fix the following gcc warning:

fs/cifs/cifsacl.c:1097:8: warning: variable ‘nmode’ set but not used
[-Wunused-but-set-variable].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
jack1.li_cp
c45adff786 cifs: Fix spelling of 'security'
secuirty -> security

Signed-off-by: jack1.li_cp <liliu1@yulong.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Geert Uytterhoeven
1cfa807b06 leds: LEDS_BLINK_LGM should depend on X86
The Intel Lightning Mountain (LGM) Serial Shift Output controller (SSO)
is only present on Intel Lightning Mountain SoCs.  Hence add a
dependency on X86, to prevent asking the user about this driver when
configuring a kernel without Intel Lightning Mountain platform support.

While at it, merge the other dependencies into a single statement.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
2021-04-25 22:56:42 +02:00
Linus Torvalds
9f4ad9e425 Linux 5.12 2021-04-25 13:49:08 -07:00
Colin Ian King
ec50536b78 leds: lgm: Fix spelling mistake "prepate" -> "prepare"
There is a spelling mistake in a dev_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
2021-04-25 22:37:57 +02:00
Pavel Machek
5222fa9121 MAINTAINERS: Remove Dan Murphy's bouncing email
Signed-off-by: Pavel Machek <pavel@ucw.cz>
2021-04-25 22:23:47 +02:00
Zheng Yongjun
fcc96cef8a leds-lm3642: convert comma to semicolon
Replace a comma between expression statements by a semicolon.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
2021-04-25 22:21:32 +02:00
Erik Flodin
e6b031d3c3 can: proc: fix rcvlist_* header alignment on 64-bit system
Before this fix, the function and userdata columns weren't aligned:
  device   can_id   can_mask  function  userdata   matches  ident
   vcan0  92345678  9fffffff  0000000000000000  0000000000000000         0  raw
   vcan0     123    00000123  0000000000000000  0000000000000000         0  raw

After the fix they are:
  device   can_id   can_mask      function          userdata       matches  ident
   vcan0  92345678  9fffffff  0000000000000000  0000000000000000         0  raw
   vcan0     123    00000123  0000000000000000  0000000000000000         0  raw

Link: Link: https://lore.kernel.org/r/20210425141440.229653-1-erik@flodin.me
Signed-off-by: Erik Flodin <erik@flodin.me>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-25 19:43:00 +02:00
Masahiro Yamada
8ac27f2c6e kconfig: refactor .gitignore
Add '/' prefix to clarify that the generated files exist right under
scripts/kconfig/, but not in any sub-directory.

Replace '*conf-cfg' with '[gmnq]conf-cfg' to make it explicit, and
still short enough.

Use '[gmnq]conf' to combine gconf, mconf, nconf, and qconf.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-04-26 02:17:39 +09:00
Linus Torvalds
d2d09fbe33 perf tools fixes for v5.12: 4th batch
- Fix potential NULL pointer dereference in the auxtrace option parser.
 
 - Fix access to PID in an array when setting a PID filter in 'perf ftrace'.
 
 - Fix error return code in the 'perf data' tool and in maps__clone(),
   found using a static analysis tool from Huawei.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYIV2gQAKCRCyPKLppCJ+
 JxlaAP9OUoT+/2lsgnMcU5b+m18TNR4RSTZwfmPszpeyOlfaEgD/YDB8OErUA5VT
 VxtLeyOisker3EwZFHzYhN7hxqh9sgU=
 =wvGY
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix potential NULL pointer dereference in the auxtrace option parser

 - Fix access to PID in an array when setting a PID filter in 'perf ftrace'

 - Fix error return code in the 'perf data' tool and in maps__clone(),
   found using a static analysis tool from Huawei

* tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf map: Fix error return code in maps__clone()
  perf ftrace: Fix access to pid in array when setting a pid filter
  perf auxtrace: Fix potential NULL pointer dereference
  perf data: Fix error return code in perf_data__create_dir()
2021-04-25 09:48:46 -07:00
Linus Torvalds
24dfc39007 - Fix BDW Xeon's stepping in the PEBS isolation table of CPUs
- Fix a panic when initializing perf uncore machinery on HSW and BDW servers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCFOhsACgkQEsHwGGHe
 VUppNA/+OqvGd0hye+LXlRYULqojOMmqDublMswx9KfUCpwTy3bysECt+Z9MIdZD
 6GHZ/6xd1/O5LG9EDJV7Mr66EIor2aDKnbMB8+VZhG4rF8+hk/03CKiqN+Xr6gTR
 cQn30RUS1E9e4z5sswa49LZJnFRuKxhcCMjv9lVvsiPeGhEkbECZqCkwFbWv9cwE
 /AqM4bmiRhSFWPHox6Iy9ixPYbcRf1muwqZF2Nwl129F4gxfWio3bNrupAkHGDG/
 KEbIDPaPxJ56eyLC1DfxIcfB/7FIwGHFZ5iduIqZ9nVReuSFgHo5OyPKP5a3OPFA
 yygdnC3woDfLw9KbBO3R7GhN8OXwT+y6qPV3YpHnze63GZ4acAVcaE3ZiOL/IDQk
 XY1owlNNlJFg7ibtbXNOYA9B1iLS4uG9yd5h3lzb2R2FYxUNy4towE/+d4cu6pt/
 FP5JCyTDSMUHs4t33E4wV19ytUl58dKkuZTCAAn9E0GLQVeIQkw/QARSkClATUie
 GKQqxfZt8BbLr/PPk++aFeNXDPnp0sPuxBIDzx/bmoDsPTJmsc7GFGm/DpZS6PQD
 m9qxrUProT0ITKhc3BeEunW6tjaycwt6BwXCfJuBtgLTR7UsccQaid8AEQ9hDFKz
 ihKgQsBoTvNT6EP6v0IIC2bfp2U5GNWVAx6PjFHaCuLZQ7h/Rsk=
 =H1RU
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Borislav Petkov:

 - Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs

 - Fix a panic when initializing perf uncore machinery on Haswell and
   Broadwell servers

* tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
  perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3
2021-04-25 09:42:06 -07:00
Stefan Metzmacher
a2a7cc32a5 io_uring: io_sq_thread() no longer needs to reset current->pf_io_worker
This is done by create_io_thread() now.

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:29:05 -06:00
Stefan Metzmacher
ff24430330 kernel: always initialize task->pf_io_worker to NULL
Otherwise io_wq_worker_{running,sleeping}() may dereference an
invalid pointer (in future). Currently all users of create_io_thread()
are fine and get task->pf_io_worker = NULL implicitly from the
wq_manager, which got it either from the userspace thread
of the sq_thread, which explicitly reset it to NULL.

I think it's safer to always reset it in order to avoid future
problems.

Fixes: 3bfe610669 ("io-wq: fork worker threads from original task")
cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:29:03 -06:00
Tom Zanussi
0bde4444ec dmaengine: idxd: Enable IDXD performance monitor support
Add the code needed in the main IDXD driver to interface with the IDXD
perfmon implementation.

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/a5564a5583911565d31c2af9234218c5166c4b2c.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-25 21:46:12 +05:30
Tom Zanussi
81dd4d4d61 dmaengine: idxd: Add IDXD performance monitor support
Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.

The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework.  It has several features in
common with the existing uncore pmu support:

  - it does not support sampling
  - does not support per-thread counting

However it also has some unique features not present in the core and
uncore support:

  - all general-purpose counters are identical, thus no event constraints
  - operation is always system-wide

While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket.  IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).

More specifically, the perf userspace tool by default opens a counter
for each cpu for an event.  However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it.  When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask.  This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events.  In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.

The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function.  During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from idxd device, and registers itself under the Linux perf
framework.

The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver.  The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).

With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s).  The event(s) are opened and
started, and following termination of the perf command, they're
stopped.  At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle.  An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.

Below are a couple of examples of perf usage for monitoring DSA events.

The following monitors all events in the 'engine' category.  Becuuse
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.

Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:

  event 0x1:  total input data processed, in 32-byte units
  event 0x2:  total data written, in 32-byte units
  event 0x4:  number of work descriptors that read the source
  event 0x8:  number of work descriptors that write the destination
  event 0x10: number of work descriptors dispatched from batch descriptors
  event 0x20: number of work descriptors dispatched from work queues

 # perf stat -e dsa0/event=0x1,event_category=0x1/,
                dsa0/event=0x2,event_category=0x1/,
		dsa0/event=0x4,event_category=0x1/,
		dsa0/event=0x8,event_category=0x1/,
		dsa0/event=0x10,event_category=0x1/,
		dsa0/event=0x20,event_category=0x1/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

                 5,332      dsa0/event=0x1,event_category=0x1/
                 5,327      dsa0/event=0x2,event_category=0x1/
                    19      dsa0/event=0x4,event_category=0x1/
                    19      dsa0/event=0x8,event_category=0x1/
                     0      dsa0/event=0x10,event_category=0x1/
                    19      dsa0/event=0x20,event_category=0x1/

          21.977436186 seconds time elapsed

The command below illustrates filter usage with a simple example.  It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]).  In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria.  In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page size of between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count.  Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.

 # perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
                filter_eng=0x1,event=0x8,event_category=0x3/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

       19      dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
               filter_eng=0x1,event=0x8,event_category=0x3/

          21.865914091 seconds time elapsed

The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-25 21:46:12 +05:30
Hao Xu
2b4ae19c6d io_uring: update sq_thread_idle after ctx deleted
we shall update sq_thread_idle anytime we do ctx deletion from ctx_list

Fixes:734551df6f9b ("io_uring: fix shared sqpoll cancellation hangs")

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/1619256380-236460-1-git-send-email-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:25 -06:00
Pavel Begunkov
634d00df5e io_uring: add full-fledged dynamic buffers support
Hook buffers into all rsrc infrastructure, including tagging and
updates.

Suggested-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/119ed51d68a491dae87eb55fb467a47870c86aad.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Bijan Mottahedeh
bd54b6fe33 io_uring: implement fixed buffers registration similar to fixed files
Apply fixed_rsrc functionality for fixed buffers support.

Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
[rebase, remove multi-level tables, fix unregister on exit]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/17035f4f75319dc92962fce4fc04bc0afb5a68dc.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
eae071c9b4 io_uring: prepare fixed rw for dynanic buffers
With dynamic buffer updates, registered buffers in the table may change
at any moment. First of all we want to prevent future races between
updating and importing (i.e. io_import_fixed()), where the latter one
may happen without uring_lock held, e.g. from io-wq.

Save the first loaded io_mapped_ubuf buffer and reuse.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/21a2302d07766ae956640b6f753292c45200fe8f.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
41edf1a5ec io_uring: keep table of pointers to ubufs
Instead of keeping a table of ubufs convert them into pointers to ubuf,
so we can atomically read one pointer and be sure that the content of
ubuf won't change.

Because it was already dynamically allocating imu->bvec, throw both
imu and bvec into a single structure so they can be allocated together.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b96efa4c5febadeccf41d0e849ac099f4c83b0d3.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
c3bdad0271 io_uring: add generic rsrc update with tags
Add IORING_REGISTER_RSRC_UPDATE, which also supports passing in rsrc
tags. Implement it for registered files.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d4dc66df204212f64835ffca2c4eb5e8363f2f05.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
792e35824b io_uring: add IORING_REGISTER_RSRC
Add a new io_uring_register() opcode for rsrc registeration. Instead of
accepting a pointer to resources, fds or iovecs, it @arg is now pointing
to a struct io_uring_rsrc_register, and the second argument tells how
large that struct is to make it easily extendible by adding new fields.

All that is done mainly to be able to pass in a pointer with tags. Pass
it in and enable CQE posting for file resources. Doesn't support setting
tags on update yet.

A design choice made here is to not post CQEs on rsrc de-registration,
but only when we updated-removed it by rsrc dynamic update.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c498aaec32a4bb277b2406b9069662c02cdda98c.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
fdecb66281 io_uring: enumerate dynamic resources
As resources are getting more support and common parts, it'll be more
convenient to index resources and use it for indexing.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f0be63e9310212d5601d36277c2946ff7a040485.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
98f0b3b4f1 io_uring: add generic path for rsrc update
Extract some common parts for rsrc update, will be used reg buffers
support dynamic (i.e. quiesce-lee) managing.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b49c3ff6b9ff0e530295767604fe4de64d349e04.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
b60c8dce33 io_uring: preparation for rsrc tagging
We need a way to notify userspace when a lazily removed resource
actually died out. This will be done by associating a tag, which is u64
exactly like req->user_data, with each rsrc (e.g. buffer of file). A CQE
will be posted once a resource is actually put down.

Tag 0 is a special value set by default, for whcih it don't generate an
CQE, so providing the old behaviour.

Don't expose it to the userspace yet, but prepare internally, allocate
buffers, add all posting hooks, etc.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2e6beec5eabe7216bb61fb93cdf5aaf65812a9b0.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
d4d19c19d6 io_uring: decouple CQE filling from requests
Make __io_cqring_fill_event() agnostic of struct io_kiocb, pass all the
data needed directly into it. Will be used to post rsrc removal
completions, which don't have an associated request.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c9b8da9e42772db2033547dfebe479dc972a0f2c.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
44b31f2fa2 io_uring: return back rsrc data free helper
Add io_rsrc_data_free() helper for destroying rsrc_data, easier for
search and the function will get more stuff to destroy shortly.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/562d1d53b5ff184f15b8949a63d76ef19c4ba9ec.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Pavel Begunkov
fff4db76be io_uring: move __io_sqe_files_unregister
A preparation patch moving __io_sqe_files_unregister() definition closer
to other "files" functions without any modification.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/95caf17fe837e67bd1f878395f07049062a010d4.1619356238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-25 10:14:04 -06:00
Linus Torvalds
0146da0d4c - Fix ordering in the queued writer lock's slowpath.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCFOD4ACgkQEsHwGGHe
 VUrwChAAr3r3Qr60WDUVTtbx0TS+d7Rk+aq63DhN6PvFADMlZe3PfjRFnw9bamIO
 hgf/65+BsYPTbO8w9CdSdgrv00oVw0v1pbYVdIAJDBzzwrG489gjJxozrIgQHU+X
 ixpxnG7R8e4dZrA4Q/ip2EZYRwGfWyO1H2Z1K6/W/sSk3HGi2EKa3+4BGBIhVtnI
 K81ckR59H+/+Fi8D3PZozK0HmC1XXKuoA5I0Yx0mgivy6i1XQ6UPohgWLXW76uae
 JtvbC2E9APhlYaRlIzdAAA7bx2REvyic6JjfTR515llIo3X2npVWlbusipjQdnOG
 enuFmHCCu49z9IAgGzI/ez+jEC41/SwJtma6lyDlcrW8j3Ovw5j6eMju14rrZD72
 8wqFXll7eAoFNbln0H9xUwvOVVWeIvMbF0+WluFWMsX5nDY4XRnEtYa24XPdeOFA
 aDhtuSmyQzcLi/mzgOU9q3dg/VKPb4i9XROwaXtJXXD8GVzbN2i5yOMleLmfHJ0K
 e1o/qc4VFItF+nCyp0v+yqEB8sQ23zRWiuw7zE1PvolBsULp+gO1lmJIJvSwaGaB
 nKJ04k7AvdkK82926cjfnhC6TvsOdHMWsdFBq+J/01bSnFGXwtzSJ6qoFTCts3TX
 zi7E5u+TEDAa4EZ4eO4Ds0yaKdJ0tBncoKUDLkTg7M98dLqv+dI=
 =GDTe
 -----END PGP SIGNATURE-----

Merge tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:
 "Fix ordering in the queued writer lock's slowpath"

* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
2021-04-25 09:10:10 -07:00