Linux mainline fork with MSM8998 patches | https://mainline.space | Currently supported devices: OnePlus 5/5T, Xiaomi Mi 6, F(x)tec Pro¹ (2019 QX1000 model) & Sony Xperia XZ Premium (UNTESTED!)
Find a file
Filipe Manana 4d6221d7d8 btrfs: fix race that causes unnecessary logging of ancestor inodes
When logging an inode and we are checking if we need to log ancestors that
are new, if the previous transaction is still committing we have a time
window where we can unnecessarily log ancestor inodes that were created in
the previous transaction.

The race is described by the following steps:

1) We are at transaction 1000;

2) Directory inode X is created, its generation is set to 1000;

3) The commit for transaction 1000 is started by task A;

4) The task committing transaction 1000 sets the transaction state to
   unblocked, writes the dirty extent buffers and the super blocks, then
   unlocks tree_log_mutex;

5) Inode Y, a regular file, is created under directory inode X, this
   results in starting a new transaction with a generation of 1001;

6) The transaction 1000 commit is unpinning extents. At this point
   fs_info->last_trans_committed still has a value of 999;

7) Task B calls fsync on inode Y and gets a handle for transaction 1001;

8) Task B ends up at log_all_new_ancestors() and then because inode Y has
   only one hard link, ends up at log_new_ancestors_fast(). There it reads
   a value of 999 from fs_info->last_trans_committed, and sees that the
   parent inode X has a generation of 1000, so we end up logging inode X:

     if (inode->generation > fs_info->last_trans_committed) {
         ret = btrfs_log_inode(trans, root, inode,
                               LOG_INODE_EXISTS, ctx);
         (...)

   which is not necessary since it was created in the past transaction,
   with a generation of 1000, and that transaction has already committed
   its super blocks - it's still unpinning extents so it has not yet
   updated fs_info->last_trans_committed from 999 to 1000.

   So this just causes us to spend more time logging and allocating and
   writing more tree blocks for the log tree.

So fix this by comparing an inode's generation with the generation of the
transaction our transaction handle refers to - if the inode's generation
matches the generation of the current transaction than we know it is a
new inode we need to log, otherwise don't log it.

This case is often hit when running dbench for a long enough duration.

This patch belongs to a patch set that is comprised of the following
patches:

  btrfs: fix race causing unnecessary inode logging during link and rename
  btrfs: fix race that results in logging old extents during a fast fsync
  btrfs: fix race that causes unnecessary logging of ancestor inodes
  btrfs: fix race that makes inode logging fallback to transaction commit
  btrfs: fix race leading to unnecessary transaction commit when logging inode
  btrfs: do not block inode logging for so long during transaction commit

Performance results are mentioned in the change log of the last patch.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09 19:16:06 +01:00
arch USB fixes for 5.10-rc7 2020-12-06 11:38:36 -08:00
block block-5.10-2020-12-05 2020-12-05 14:45:30 -08:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto drivers-5.10-2020-10-12 2020-10-13 13:04:41 -07:00
Documentation Networking fixes for 5.10-rc7, including fixes from bpf, netfilter, 2020-12-03 13:10:11 -08:00
drivers Char/Misc driver fixes for 5.10-rc7 2020-12-06 11:48:17 -08:00
fs btrfs: fix race that causes unnecessary logging of ancestor inodes 2020-12-09 19:16:06 +01:00
include btrfs: introduce ZONED feature flag 2020-12-08 15:54:16 +01:00
init Kbuild fixes for v5.10 (2nd) 2020-12-06 10:31:39 -08:00
ipc ipc: adjust proc_ipc_sem_dointvec definition to match prototype 2020-09-05 12:14:29 -07:00
kernel A set of updates for the interrupt subsystem: 2020-12-06 11:15:55 -08:00
lib zlib: export S390 symbols for zlib modules 2020-12-06 10:19:07 -08:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm/mmap.c: fix mmap return value when vma is merged after call_mmap() 2020-12-06 10:19:07 -08:00
net net/sched: act_mpls: ensure LSE is pullable before reading it 2020-12-03 11:13:37 -08:00
samples Tracing fixes for 5.10-rc6 2020-12-01 15:30:18 -08:00
scripts Kbuild fixes for v5.10 (2nd) 2020-12-06 10:31:39 -08:00
security selinux/stable-5.10 PR 20201113 2020-11-14 12:04:02 -08:00
sound ALSA: hda/realtek: Add mute LED quirk to yet another HP x360 model 2020-11-28 10:00:51 +01:00
tools A set of fixes for x86: 2020-12-06 11:22:39 -08:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt kvm: x86/mmu: Support dirty logging for the TDP MMU 2020-10-23 03:42:13 -04:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS A set of updates for the interrupt subsystem: 2020-12-06 11:15:55 -08:00
Makefile Linux 5.10-rc7 2020-12-06 14:25:12 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.