Linux mainline fork with MSM8998 patches | https://mainline.space | Currently supported devices: OnePlus 5/5T, Xiaomi Mi 6, F(x)tec Pro¹ (2019 QX1000 model) & Sony Xperia XZ Premium (UNTESTED!)
Find a file
Kairui Song c1b8fdae62 mm: memcontrol: make cgroup_memory_noswap a static key
cgroup_memory_noswap is used in many hot path, so make it a static key
to lower the kernel overhead.

Using 8G of ZRAM as SWAP, benchmark using `perf stat -d -d -d --repeat 100`
with the following code snip in a non-root cgroup:

   #include <stdio.h>
   #include <string.h>
   #include <linux/mman.h>
   #include <sys/mman.h>
   #define MB 1024UL * 1024UL
   int main(int argc, char **argv){
      void *p = mmap(NULL, 8000 * MB, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      memset(p, 0xff, 8000 * MB);
      madvise(p, 8000 * MB, MADV_PAGEOUT);
      memset(p, 0xff, 8000 * MB);
      return 0;
   }

Before:
          7,021.43 msec task-clock                #    0.967 CPUs utilized            ( +-  0.03% )
             4,010      context-switches          #  573.853 /sec                     ( +-  0.01% )
                 0      cpu-migrations            #    0.000 /sec
         2,052,057      page-faults               #  293.661 K/sec                    ( +-  0.00% )
    12,616,546,027      cycles                    #    1.805 GHz                      ( +-  0.06% )  (39.92%)
       156,823,666      stalled-cycles-frontend   #    1.25% frontend cycles idle     ( +-  0.10% )  (40.25%)
       310,130,812      stalled-cycles-backend    #    2.47% backend cycles idle      ( +-  4.39% )  (40.73%)
    18,692,516,591      instructions              #    1.49  insn per cycle
                                                  #    0.01  stalled cycles per insn  ( +-  0.04% )  (40.75%)
     4,907,447,976      branches                  #  702.283 M/sec                    ( +-  0.05% )  (40.30%)
        13,002,578      branch-misses             #    0.26% of all branches          ( +-  0.08% )  (40.48%)
     7,069,786,296      L1-dcache-loads           #    1.012 G/sec                    ( +-  0.03% )  (40.32%)
       649,385,847      L1-dcache-load-misses     #    9.13% of all L1-dcache accesses  ( +-  0.07% )  (40.10%)
     1,485,448,688      L1-icache-loads           #  212.576 M/sec                    ( +-  0.15% )  (39.49%)
        31,628,457      L1-icache-load-misses     #    2.13% of all L1-icache accesses  ( +-  0.40% )  (39.57%)
         6,667,311      dTLB-loads                #  954.129 K/sec                    ( +-  0.21% )  (39.50%)
         5,668,555      dTLB-load-misses          #   86.40% of all dTLB cache accesses  ( +-  0.12% )  (39.03%)
               765      iTLB-loads                #  109.476 /sec                     ( +- 21.81% )  (39.44%)
         4,370,351      iTLB-load-misses          # 214320.09% of all iTLB cache accesses  ( +-  1.44% )  (39.86%)
       149,207,254      L1-dcache-prefetches      #   21.352 M/sec                    ( +-  0.13% )  (40.27%)

           7.25869 +- 0.00203 seconds time elapsed  ( +-  0.03% )

After:
          6,576.16 msec task-clock                #    0.953 CPUs utilized            ( +-  0.10% )
             4,020      context-switches          #  605.595 /sec                     ( +-  0.01% )
                 0      cpu-migrations            #    0.000 /sec
         2,052,056      page-faults               #  309.133 K/sec                    ( +-  0.00% )
    11,967,619,180      cycles                    #    1.803 GHz                      ( +-  0.36% )  (38.76%)
       161,259,240      stalled-cycles-frontend   #    1.38% frontend cycles idle     ( +-  0.27% )  (36.58%)
       253,605,302      stalled-cycles-backend    #    2.16% backend cycles idle      ( +-  4.45% )  (34.78%)
    19,328,171,892      instructions              #    1.65  insn per cycle
                                                  #    0.01  stalled cycles per insn  ( +-  0.10% )  (31.46%)
     5,213,967,902      branches                  #  785.461 M/sec                    ( +-  0.18% )  (30.68%)
        12,385,170      branch-misses             #    0.24% of all branches          ( +-  0.26% )  (34.13%)
     7,271,687,822      L1-dcache-loads           #    1.095 G/sec                    ( +-  0.12% )  (35.29%)
       649,873,045      L1-dcache-load-misses     #    8.93% of all L1-dcache accesses  ( +-  0.11% )  (41.41%)
     1,950,037,608      L1-icache-loads           #  293.764 M/sec                    ( +-  0.33% )  (43.11%)
        31,365,566      L1-icache-load-misses     #    1.62% of all L1-icache accesses  ( +-  0.39% )  (45.89%)
         6,767,809      dTLB-loads                #    1.020 M/sec                    ( +-  0.47% )  (48.42%)
         6,339,590      dTLB-load-misses          #   95.43% of all dTLB cache accesses  ( +-  0.50% )  (46.60%)
               736      iTLB-loads                #  110.875 /sec                     ( +-  1.79% )  (48.60%)
         4,314,836      iTLB-load-misses          # 518653.73% of all iTLB cache accesses  ( +-  0.63% )  (42.91%)
       144,950,156      L1-dcache-prefetches      #   21.836 M/sec                    ( +-  0.37% )  (41.39%)

           6.89935 +- 0.00703 seconds time elapsed  ( +-  0.10% )

The performance is clearly better. There is no significant hotspot
improvement according to perf report, as there are quite a few
callers of memcg_swap_enabled and do_memsw_account (which calls
memcg_swap_enabled). Many pieces of minor optimizations resulted
in lower overhead for the branch predictor, and bettter performance.

Link: https://lkml.kernel.org/r/20220919180634.45958-3-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:32 -07:00
arch x86: kmsan: handle CPU entry area 2022-10-03 14:03:26 -07:00
block block: kmsan: skip bio block merging logic for KMSAN 2022-10-03 14:03:23 -07:00
certs Kbuild updates for v5.20 2022-08-10 10:40:41 -07:00
crypto crypto: kmsan: disable accelerated configs under KMSAN 2022-10-03 14:03:22 -07:00
Documentation mm/page_alloc: remove obsolete gfpflags_normal_context() 2022-10-03 14:03:30 -07:00
drivers crypto: kmsan: disable accelerated configs under KMSAN 2022-10-03 14:03:22 -07:00
fs mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-10-03 14:03:25 -07:00
include mm: memcontrol: use memcg_kmem_enabled in count_objcg_event 2022-10-03 14:03:32 -07:00
init init: kmsan: call KMSAN initialization routines 2022-10-03 14:03:21 -07:00
io_uring io_uring/net: save address for sendzc async execution 2022-08-25 07:52:30 -06:00
ipc ipc/shm: use VMA iterator instead of linked list 2022-09-26 19:46:21 -07:00
kernel bpf: kmsan: initialize BPF registers with zeroes 2022-10-03 14:03:25 -07:00
lib kmsan: disable strscpy() optimization under KMSAN 2022-10-03 14:03:22 -07:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: memcontrol: make cgroup_memory_noswap a static key 2022-10-03 14:03:32 -07:00
net Including fixes from ipsec and netfilter (with one broken Fixes tag). 2022-08-25 14:03:58 -07:00
samples Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
scripts kmsan: add KMSAN runtime core 2022-10-03 14:03:19 -07:00
security security: kmsan: fix interoperability with auto-initialization 2022-10-03 14:03:23 -07:00
sound sound fixes for 6.0-rc2 2022-08-19 09:46:11 -07:00
tools objtool: kmsan: list KMSAN API functions as uaccess-safe 2022-10-03 14:03:23 -07:00
usr Not a lot of material this cycle. Many singleton patches against various 2022-05-27 11:22:03 -07:00
virt KVM: Drop unnecessary initialization of "ops" in kvm_ioctl_create_device() 2022-08-19 04:05:43 -04:00
.clang-format PCI/DOE: Add DOE mailbox support functions 2022-07-19 15:38:04 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: split the second line of *.mod into *.usyms 2022-05-08 03:16:59 +09:00
.mailmap .mailmap: update Luca Ceresoli's e-mail address 2022-08-28 14:02:46 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS drm for 5.20/6.0 2022-08-03 19:52:08 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS x86: kmsan: handle CPU entry area 2022-10-03 14:03:26 -07:00
Makefile kmsan: add KMSAN runtime core 2022-10-03 14:03:19 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.