Commit graph

51587 commits

Author SHA1 Message Date
Trond Myklebust
1425075e72 NFS: NFSoRDMA Client Side Changes
These patches include several bugfixes and cleanups for the NFSoRDMA client.
 This includes bugfixes for NFS v4.1, proper RDMA_ERROR handling, and fixes
 from the recent workqueue swicchover.  These patches also switch xprtrdma to
 use the new CQ API
 
 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJW5wl+AAoJENfLVL+wpUDrH7sP/i3lZ7n6Pr+Flrb/Z+ywjmEA
 mvZ0u3O3eviFxDqr82J/WL7fgDUGbBd9urtUu5xZGscc0HNs3LQ8izm6Dy3gmrVF
 MFh69jNGL5djmfymYMWRbdoKLuOw4V70EBtGnYqH7Bh7gwpiIl3EiVcBBJup/vKV
 rgvx6NnSGYpYPYpBFC4Ql4qZx7m5j4cxsThRScbu7wMMjDknls+7ZDM1B5mGO00+
 1vYObpYGXOXovGZyHAHspQityWp6jvUcEMnJzWbMUFDqxkOmmGK3t54MfvRXZiFE
 vUkgg5nGxhMejEIfMywuf6czKGfXc4HZT2yF9eSZeA4IW+7QkeeXeCcIANBDY6Ga
 oKXu4fgM6T7SnCpefwkXRhinwtEM04tGAlxo1X3UcKrMz7Q3di3/NtgaWijfL8gy
 9Nd1lt2kqI375h7+OZccURl33lnQBSjO4F2pt/pFk6wYwGh0F8co9bIp6QIEVQ0f
 N2l8KU9vgLofhrD9JzZeu1l3+TCdDU9YaJLSjbkZ71BTjNtkNdUcd9Tk+bMSxept
 mq7mNKe1oBHAGgR8+7zYBUKEt85rdpNovPoU+Hz/QKbV3zikGSPmL3e9gpEvhmH4
 MKD7Vs61Fi8volfv7wHmzZF8Tk68qc1MAZXSbgzVQxwF/uBjqEuSO+n4Kii/gfkG
 MJjlzqmqlO01O1fn2XHu
 =aHja
 -----END PGP SIGNATURE-----

Merge tag 'nfs-rdma-4.6-1' of git://git.linux-nfs.org/projects/anna/nfs-rdma

NFS: NFSoRDMA Client Side Changes

These patches include several bugfixes and cleanups for the NFSoRDMA client.
This includes bugfixes for NFS v4.1, proper RDMA_ERROR handling, and fixes
from the recent workqueue swicchover.  These patches also switch xprtrdma to
use the new CQ API

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

* tag 'nfs-rdma-4.6-1' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (787 commits)
  xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
  xprtrdma: Use an anonymous union in struct rpcrdma_mw
  xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
  xprtrdma: Serialize credit accounting again
  xprtrdma: Properly handle RDMA_ERROR replies
  rpcrdma: Add RPCRDMA_HDRLEN_ERR
  xprtrdma: Do not wait if ib_post_send() fails
  xprtrdma: Segment head and tail XDR buffers on page boundaries
  xprtrdma: Clean up dprintk format string containing a newline
  xprtrdma: Clean up physical_op_map()
  xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro
2016-03-16 16:25:09 -04:00
Guenter Roeck
15013ad813 watchdog: Add support for minimum time between heartbeats
Some watchdogs require a minimum time between heartbeats.
Examples are the watchdogs in DA9062 and AT91SAM9x.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2016-03-16 21:11:19 +01:00
Guenter Roeck
ee142889e3 watchdog: Introduce WDOG_HW_RUNNING flag
The WDOG_HW_RUNNING flag is expected to be set by watchdog drivers if
the hardware watchdog is running. If the flag is set, the watchdog
subsystem will ping the watchdog even if the watchdog device is closed.

The watchdog driver stop function is now optional and may be omitted
if the watchdog can not be stopped. If stopping the watchdog is not
possible but the driver implements a stop function, it is responsible
to set the WDOG_HW_RUNNING flag in its stop function.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2016-03-16 21:11:15 +01:00
Guenter Roeck
664a39236e watchdog: Introduce hardware maximum heartbeat in watchdog core
Introduce an optional hardware maximum heartbeat in the watchdog core.
The hardware maximum heartbeat can be lower than the maximum timeout.

Drivers can set the maximum hardware heartbeat value in the watchdog data
structure. If the configured timeout exceeds the maximum hardware heartbeat,
the watchdog core enables a timer function to assist sending keepalive
requests to the watchdog driver.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2016-03-16 21:11:14 +01:00
Linus Torvalds
271ecc5253 Merge branch 'akpm' (patches from Andrew)
Merge first patch-bomb from Andrew Morton:

 - some misc things

 - ofs2 updates

 - about half of MM

 - checkpatch updates

 - autofs4 update

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits)
  autofs4: fix string.h include in auto_dev-ioctl.h
  autofs4: use pr_xxx() macros directly for logging
  autofs4: change log print macros to not insert newline
  autofs4: make autofs log prints consistent
  autofs4: fix some white space errors
  autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked()
  autofs4: fix coding style line length in autofs4_wait()
  autofs4: fix coding style problem in autofs4_get_set_timeout()
  autofs4: coding style fixes
  autofs: show pipe inode in mount options
  kallsyms: add support for relative offsets in kallsyms address table
  kallsyms: don't overload absolute symbol type for percpu symbols
  x86: kallsyms: disable absolute percpu symbols on !SMP
  checkpatch: fix another left brace warning
  checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses
  checkpatch: warn on bare unsigned or signed declarations without int
  checkpatch: exclude asm volatile from complex macro check
  mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate()
  mm: migrate: consolidate mem_cgroup_migrate() calls
  mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  ...
2016-03-16 11:51:08 -07:00
Doug Ledford
d2ad9cc759 Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6 2016-03-16 13:38:28 -04:00
Christoph Fritz
2609e4daaa mfd: imx6sx: Add PCIe register definitions for iomuxc gpr
This patch adds macros to define masks and bits for imx6sx
PCIe registers. This is based on a patch by Richard Zhu.

Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:41 +00:00
Maciej S. Szmigiero
5c1488906f mfd: tps65090: Set regmap config reg counts properly
Regmap config max_register field should contain number of
device last register, however num_reg_defaults_raw field
should be set to register count instead
(usually one register more than max_register).

tps65090 driver had both of these fields set to the same value,
fix this by introducing separate defines for max register
number and total count of registers.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:36 +00:00
Philipp Zabel
8c037e0c8e mfd: syscon: Return ENOTSUPP instead of ENOSYS when disabled
When CONFIG_MFD_SYSCON is disabled, have the function stubs return
ENOTSUPP to indicate the syscon functionality is not available.
There are currently no callers that depend on the ENOSYS return value.

This patchfixes a checkpatch warning:
    WARNING: ENOSYS means 'invalid syscall nr' and nothing else

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:35 +00:00
Maciej S. Szmigiero
e9b7ba7954 mfd: as3711: Set regmap config reg counts properly
Regmap config max_register field should contain number of
device last register, however num_reg_defaults_raw field
should be set to register count instead
(usually one register more than max_register).

as3711 driver had both of these fields set to the same value,
fix this by introducing separate defines for max register
number and total count of registers.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:34 +00:00
Maciej S. Szmigiero
a862dc3ea7 mfd: rc5t583: Set regmap config reg counts properly
Regmap config max_register field should contain number of
device last register, however num_reg_defaults_raw field
should be set to register count instead
(usually one register more than max_register).

rc5t583 driver had both of these fields set to the same value,
fix this by introducing separate defines for max register
number and total count of registers.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:34 +00:00
John Crispin
44760cf3bf mfd: mt6397: Add MT6323 support to MT6397 driver
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:30 +00:00
John Crispin
feec4799ac mfd: mt6397: int_con and int_status may vary in location
MT6323 has the INT_CON and INT_STATUS located at a different position.
Make the registers locations configurable.

Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:28 +00:00
Tomeu Vizoso
f4bcf5a29d mfd: cros_ec: Small kerneldoc fix
s/cros_ec_register/cros_ec_query_all

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:24 +00:00
Andrew F. Davis
b45b719ee0 mfd: tps65086: Add driver for the TPS65086 PMIC
Add support for the TPS65912 device. It provides communication
through I2C and contains the following components:

 - Regulators
 - Load switches
 - GPO controller

Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-03-16 08:50:15 +00:00
Linus Torvalds
9256d5a308 LED core improvements:
- Fix misleading comment after workqueue removal from drivers.
 
 - Avoid error message when a USB LED device is unplugged.
 
 - Add helpers for calling brightness_set(_blocking)
 
 LED triggers:
 
 - Simplify led_trigger_store by using sysfs_streq().
 
 LED class drivers improvements:
 
 - Improve wording and formatting in a comment: lp3944.
 
 - Fix return value check in create_gpio_led(): leds-gpio.
 
 - Use GPIOF_OUT_INIT_LOW instead of hardcoded zero: leds-gpio.
 
 - Use devm_led_classdev_register(): leds-lm3533, leds-lm3533, leds-lp8788,
   leds-wm831x-status, leds-s3c24xx, leds-s3c24xx, leds: max8997.
 
 New LED class driver:
 
 - Add driver for the ISSI IS31FL32xx family of LED controllers.
 
 Device Tree documentation:
 
 - of: Add vendor prefixes for Integrated Silicon Solutions Inc. (issi)
   and Si-En Technology (si-en).
 
 - DT: Add common bindings for Si-En Technology SN3216/18 and IS31FL32xx
   family of LED controllers, since they seem to be the same hardware,
   just rebranded.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.17 (GNU/Linux)
 
 iQIcBAABAgAGBQJW5oghAAoJEL1qUBy3i3wm4sMP/2pEXQVZ5sYbjT6i3xnkajPh
 mgfCNof4uNm/PZnAqGLF7BCoTRQNsG89MG0wTx240c2wTNhWd3Iqo8QabpVu9UbE
 kc+B/j+GTKvXmpmcEae5uLdGOhaqK9K/dBLW/FFTHyxJcwao26x1pj+rvi4CEEDl
 z+u1uM+2Yh9QB8Q/+JgWRDSY0uRT5qKimN2OU+jwSO3iuSc49i39OzoBxpegP21l
 rpO1X2NeziOXyQl/qa2DYSQ2xqjnHkTRSjpQ98LeUaEauavJQDlynDeugbaWX6t7
 KCignPeeQMBvYySyDGLMtUYjUO5RW1AoikZ9eR4imdL+Oz9sw29xnp+FMr0nSE/U
 D/T2Pt2nRAVVvkJEQ0K2B4PbFAI2KmlDLTMS3vb0uJoRTaTOZBdMKFpa4sju8THQ
 bUeixSUi6rD6VoAVqO8VftaOFfmR4EjmZR1v8KSh5YHWLhAwq+GSTjcUHxpU6Dxx
 l3oQnMBVNsrfQQhxqr889HB5lQu1+X663Snap7eTzMpUpxXoTgCOGGMt2alRNDUO
 uYf4cue/m/aPvOZl25IRoHlujnxuCK7gS2stwKgoHqRqLKLzfABQ4NV5lnw1V4oI
 4834uE3+IHDJIvWj3Y9gCUhKVazbIK3n3rDJcwxPaSGk7MCdFWqkZ+joIgtgnEEC
 O1GD9sGRU93yBm8a/EoT
 =+CW7
 -----END PGP SIGNATURE-----

Merge tag 'leds_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds

Pull LED updates from Jacek Anaszewski:
 "LED core improvements:
   - Fix misleading comment after workqueue removal from drivers
   - Avoid error message when a USB LED device is unplugged
   - Add helpers for calling brightness_set(_blocking)

  LED triggers:
   - Simplify led_trigger_store by using sysfs_streq()

  LED class drivers improvements:
   - Improve wording and formatting in a comment: lp3944
   - Fix return value check in create_gpio_led(): leds-gpio
   - Use GPIOF_OUT_INIT_LOW instead of hardcoded zero: leds-gpio
   - Use devm_led_classdev_register(): leds-lm3533, leds-lm3533,
     leds-lp8788, leds-wm831x-status, leds-s3c24xx, leds-s3c24xx,
     leds-max8997.

  New LED class driver:
   - Add driver for the ISSI IS31FL32xx family of LED controllers.

  Device Tree documentation:
   - of: Add vendor prefixes for Integrated Silicon Solutions Inc.
     (issi) and Si-En Technology (si-en).
   - DT: Add common bindings for Si-En Technology SN3216/18 and
     IS31FL32xx family of LED controllers, since they seem to be the
     same hardware, just rebranded"

* tag 'leds_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: triggers: simplify led_trigger_store
  leds: max8997: Use devm_led_classdev_register
  leds: da903x: Use devm_led_classdev_register
  leds: s3c24xx: Use devm_led_classdev_register
  leds: wm831x-status: Use devm_led_classdev_register
  leds: lp8788: Use devm_led_classdev_register
  leds: 88pm860x: Use devm_led_classdev_register
  leds: Add SN3218 and SN3216 support to the IS31FL32XX driver
  of: Add vendor prefix for Si-En Technology
  leds: Add driver for the ISSI IS31FL32xx family of LED controllers
  DT: leds: Add binding for the ISSI IS31FL32xx family of LED controllers
  DT: Add vendor prefix for Integrated Silicon Solutions Inc.
  leds: lm3533: Use devm_led_classdev_register
  leds: gpio: Use GPIOF_OUT_INIT_LOW instead of hardcoded zero
  leds: core: add helpers for calling brightness_set(_blocking)
  leds: leds-gpio: Fix return value check in create_gpio_led()
  leds: lp3944: improve wording and formatting in a comment
  leds: core: avoid error message when a USB LED device is unplugged
  leds: core: fix misleading comment after workqueue removal from drivers
2016-03-15 22:04:53 -07:00
Linus Torvalds
13f6f62f61 RTC for 4.6
Core:
  - New sysfs interface to set and read clock offset
  - Drivers can now be both I2C and SPI (see pcf2127 and ds3232)
 
 New drivers:
  - Alphascale ASM9260
  - Epson RX6110SA
  - Maxim max20024 and max77620 (in max77686)
  - Microchip PIC32
  - NXP pcf2129 (in pcf2127)
 
 Subsystem wide cleanups:
  - remove IRQF_EARLY_RESUME when unecessary
 
 Drivers:
  - ds1307: clock output, temperature sensor and wakeup-source support
  - ds1685: actually spin forever in poweroff error path
  - ds3232: many cleanups
  - ds3234: merged in ds3232
  - hym8563: fix invalid year calculation
  - max77686: many cleanups
  - max77802 merged in max77686
  - pcf2123: cleanups and offset support
  - pcf85063: cleanups
  - pcf8523: propely handle oscillator stop bit
  - rv3029: many cleanups, trickle charger and temperature sensor support
  - rv8803: convert spin_lock to mutex_lock
  - rx8025: many fixes
  - vr41xx: restore alarm_irq_enable
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJW5uWTAAoJEKbNnwlvZCyzUmAP/3/bxOtQ9QUq4nLIVb/dR9G0
 yNY5L/RmgQ/x3CJqMkGQImyj4rSEmwdHJSWakv/hGh5iTINtZrw4sl79YKNoTQlB
 LaXAse1sXzhyfLzH8jYrrzpq14z1kv/LlUPtgFf1arcaRMdjw1kabdIqHnsv0nH2
 fvyj3O0qW/KWWYongViiXW94dr8uQXoMVR4M/M8OmG6j6Lw1Kd7gKDZx02dqbo8+
 NNQQYAswSxzHCZirIv5Xyjdrz1gIuaWghEdbaVY6BW+CuT8NcC2if1BoYo6xR/Qe
 KC6M5MGY4xogOxP8+EkH1/GycXx1mwrisJSTxJh9YKMDnLgi9YnTjXnO4t9kQyoK
 HOspgaHhGHRONNc2dADJqm55SbLUKBEXVTk/Tzq6BJvTKmZmrisAUhWxISLscHwP
 QPU//fLxyerC/tMKoFyuG7yDtd5VofWThLQV8sF0wcX8AKfLaUIEnN69euCk9mVj
 +eCwkYyOly0UeYdeXkrnHczIFffzMTENpz9Q8TkrLF1UOMI7nCOb/jCo6hKidEEB
 ymgHHzhhGeppfEQgU+MsXOxG0E9Ku4gXGOaVk/q7zxyiBhXOmMpoHk6nshKq0A0S
 eBKmd0BWS0GwXJ+dABfHuI6MSay3A7Bn+fLLoD8LTPMJEONyvOGxDNSH3DymD9JR
 LpDO4HnalG1ao+eXcD+U
 =+x1Q
 -----END PGP SIGNATURE-----

Merge tag 'rtc-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "Core:
   - New sysfs interface to set and read clock offset
   - Drivers can now be both I2C and SPI (see pcf2127 and ds3232)

  New drivers:
   - Alphascale ASM9260
   - Epson RX6110SA
   - Maxim max20024 and max77620 (in max77686)
   - Microchip PIC32
   - NXP pcf2129 (in pcf2127)

  Subsystem wide cleanups:
   - remove IRQF_EARLY_RESUME when unecessary

  Drivers:
   - ds1307: clock output, temperature sensor and wakeup-source support
   - ds1685: actually spin forever in poweroff error path
   - ds3232: many cleanups
   - ds3234: merged in ds3232
   - hym8563: fix invalid year calculation
   - max77686: many cleanups
   - max77802 merged in max77686
   - pcf2123: cleanups and offset support
   - pcf85063: cleanups
   - pcf8523: propely handle oscillator stop bit
   - rv3029: many cleanups, trickle charger and temperature sensor support
   - rv8803: convert spin_lock to mutex_lock
   - rx8025: many fixes
   - vr41xx: restore alarm_irq_enable"

* tag 'rtc-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (86 commits)
  rtc: pcf2127: add pcf2129 device id
  rtc: pcf2127: add support for spi interface
  rtc: pcf2127: convert to use regmap
  rtc: rv3029: Add thermometer hwmon support
  rtc: rv3029: Add update_bits helper for eeprom access
  rtc: ds1685: actually spin forever in poweroff error path
  rtc: hym8563: fix invalid year calculation
  rtc: ds3232: use rtc->ops_lock to protect alarm operations
  rtc: ds3232: fix issue when irq is shared several devices
  rtc: ds3232: remove unused UIE code
  rtc: ds3232: add register access error checks
  rtc: ds3232: fix read on /dev/rtc after RTC_AIE_ON
  rtc: merge ds3232 and ds3234
  rtc: ds3232: convert to use regmap
  rtc: pxa: fix Kconfig indentation
  rtc: rv3029: Add device tree property for trickle charger
  rtc: rv3029: Add functions for EEPROM access
  rtc: rv3029: Add i2c register update-bits helper
  rtc: rv3029: Add missing register definitions
  rtc: rv3029: Add "rv3029" I2C device id
  ...
2016-03-15 21:58:58 -07:00
Linus Torvalds
f0718cea47 hwmon updates for v4.5:
- New drivers for NSA320 and LTC2990
 - Added support for ADM1278 to adm1275 driver
 - Added support for ncpXXxh103 to ntc_thermistor driver
 - Renamed vexpress hwmon implementation
 - Minor cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW5sANAAoJEMsfJm/On5mBKTEP/2SIXOFN+ew3Lhmz36Z8gDJY
 LpVcFjb8S/nRDibvaYjaO5r7p6x2d3jW6wkdpt1V44l5khvODjnaZhH7qZ1xaVuI
 qt9xRX0PvHAlwIpXHRNgueNAkAwTuCJqmbbYGrPQJCe4zu0sxiDrpRa117Ym1AJY
 NxOts2n/VAjVn0O+F4QiJELDmSpajsVt7QPjUGYfiEHPaGDgPVq+5gqRPeL9hfOu
 tjFVWn3YVJTI/Q3N/aB0+XQpwoZdu74RFnr/x7rkjzVcqiNdq0+WtFAIAS9Aw3s1
 gZbwRPX7Pu/sNmcVRaj+9rSpzl9uzHbqD4+/+hCuv3LjnyAbLm3NmV2EB1u42SwE
 bmqjnCTtC5InlkcKGi85Sv72YcGj5MgddSTQ9TDBKQIW8Z70rnFK/7xPbq776W/L
 vFXQ62YZ14lX3sVTMVfntf6BrYeteVbJmP7d9SXeTHfHVSSaxwGrq6kim8m2zi/J
 R8YMsVOok0P3ldyua73TIdujYtpz2AZ4vdddDFSlu/m1o/bp642le5HLVuKBpuYy
 Szjc26JGL5ymGLAUmeEUt8jOWUqQo9MLw9Gc8kgc/Zui3q+lMJKXZ6dvkgJpnFCL
 xw/fB/E6x4pDEhJPe3LeFcKBk/vF6bktEWrrA0rGHrVGJIaOhgPkxyeGV7BdNqFS
 DtMy4EcjmlAKOn5oxOL5
 =n07O
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 - New drivers for NSA320 and LTC2990
 - Added support for ADM1278 to adm1275 driver
 - Added support for ncpXXxh103 to ntc_thermistor driver
 - Renamed vexpress hwmon implementation
 - Minor cleanups and improvements

* tag 'hwmon-for-linus-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: Create an NSA320 hardware monitoring driver
  hwmon: Define binding for the nsa320-hwmon driver
  hwmon: (adm1275) Add support for ADM1278
  hwmon: (ntc_thermistor) Add support for ncpXXxh103
  Doc: hwmon: Fix typo "montoring" in hwmon
  ARM: dts: vfxxx: Add iio_hwmon node for ADC temperature channel
  ARM: dts: Change iio_hwmon nodes to use hypen in node names
  hwmon: (iio_hwmon) Allow the driver to accept hypen in device tree node names
  hwmon: Add LTC2990 sensor driver
  hwmon: (vexpress) rename vexpress hwmon implementation
2016-03-15 21:44:47 -07:00
Linus Torvalds
555f8160b2 regulator: Updates for v4.6
This has been an extremely quiet release for the regulator API, aside
 from bugfixes and small enhancements the only thing that really stands
 out are the new drivers for Action Semiconductors ACT8945A, HiSilicon
 HI665x, and the Maxim MAX20024 and MAX77620.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW58EWAAoJECTWi3JdVIfQMGMH/1v4ojMy0X/yPlFk/523u7uq
 6ZJQ4zSTPyv0ccR+hsDjZ81BmM/vATrnQOllvTSwfnGMcs5amE3/9EhzdF5EciCB
 8gUff/GfugXNqpVgpfYYfPhbQVTGllOFnZPmXsLx2dGlu094i69f4fZtoJCFlkmo
 5c0L5Fnkt4w5XtFAA6ILuRtpwZ4jfJZ3VFKrtYwkgLcKw9GgSEqhVKvTPfkNqtjg
 hzEl2P2K/Ug2uTqeZusKH30Sa1J7MW4B+C9dUVyTRB29+uA60y0foGTQjW9uCzEi
 l1sqY9VDSKp9WwnshVrgVFViAhkrA7fWl4qaC/EDe7avvbh5SBpR/+gjZbmdRJU=
 =jEl0
 -----END PGP SIGNATURE-----

Merge tag 'regulator-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "This has been an extremely quiet release for the regulator API, aside
  from bugfixes and small enhancements the only thing that really stands
  out are the new drivers for Action Semiconductors ACT8945A, HiSilicon
  HI665x, and the Maxim MAX20024 and MAX77620"

* tag 'regulator-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (46 commits)
  regulator: pwm: Add support to have multiple instance of pwm regulator
  regulator: pwm: Fix calculation of voltage-to-duty cycle
  regulator: of: Use of_property_read_u32() for reading min/max
  regulator: pv88060: fix incorrect clear of event register
  regulator: pv88090: fix incorrect clear of event register
  regulator: max77620: Add support to configure active-discharge
  regulator: core: Add support for active-discharge configuration
  regulator: helper: Add helper to configure active-discharge using regmap
  regulator: core: Add support for active-discharge configuration
  regulator: DT: Add DT property for active-discharge configuration
  regulator: act8865: Specify fixed voltage of 3.3V for ACT8600's REG9
  regulator: act8865: Rename platform_data field to init_data
  regulator: act8865: Remove "static" from local variable
  ASoC: cs4271: add regulator consumer support
  regulator: max77620: Remove duplicate module alias
  regulator: max77620: Eliminate duplicate code
  regulator: max77620: Remove unused fields
  regulator: core: fix crash in error path of regulator_register
  regulator: core: Request GPIO before creating sysfs entries
  regulator: gpio: don't print error on EPROBE_DEFER
  ...
2016-03-15 21:34:35 -07:00
Linus Torvalds
b7aae4a9d0 regmap: Updates for v4.6
This has been a very busy release for regmap, not just in cleaning up
 the mess we got ourselves into with the endianness handling but also in
 other areas too:
 
  - Fixes for the endianness handling so that we now explicltly default
    to little endian (the code used to do this by accident).  This
    fixes handling of explictly specified endianness on big endian
    systems.
  - Optimisation of the implementation of register striding.
  - A refectoring of the _update_bits() code to reduce duplication.
  - Fixes and enhancements for the interrupt implementation which
    make it easier to use in a wider range of systems.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW5v4RAAoJECTWi3JdVIfQcP0H/R+22ZgPNo76YUHtnMi5zCiW
 /TtUzcKg3QXMXpCxQx2p8shRg1eckjb2zeImDJKPteIsO9y04lrlO2ljVnio/Ut9
 6uBGwmmcCa9/haL7waDrw5kQKmCjNAcXhAn7etsRcvpMaPdEwL71ZWfjCY3HtwyJ
 zGoArFNveHzTeZKRNzGZemSA7r1TOLIkHNvS4yRD4H9K7bGfn1HWwumCgurTvpiB
 nxuhpwO7GQy28fPQRfBlfg2TGI+B8GD+VV4qwWnGYR2pijyIE8Kxqawizxueq70f
 YsjiRNepxgPSdWHGMdjcUwQVB2iOSh8LvObxuyW8oGZEFeUxC3JHUiJ79zkKGsg=
 =PBtQ
 -----END PGP SIGNATURE-----

Merge tag 'regmap-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
 "This has been a very busy release for regmap, not just in cleaning up
  the mess we got ourselves into with the endianness handling but also
  in other areas too:

   - Fixes for the endianness handling so that we now explicitly default
     to little endian (the code used to do this by accident).  This
     fixes handling of explictly specified endianness on big endian
     systems.

   - Optimisation of the implementation of register striding.

   - A refectoring of the _update_bits() code to reduce duplication.

   - Fixes and enhancements for the interrupt implementation which make
     it easier to use in a wider range of systems"

* tag 'regmap-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (28 commits)
  regmap: irq: add devm apis for regmap_{add,del}_irq_chip
  regmap: replace regmap_write_bits()
  regmap: irq: Enable irq retriggering for nested irqs
  regmap: add regmap_fields_force_xxx() macros
  regmap: add regmap_field_force_xxx() macros
  regmap: merge regmap_fields_update_bits() into macro
  regmap: merge regmap_fields_write() into macro
  regmap: add regmap_fields_update_bits_base()
  regmap: merge regmap_field_update_bits() into macro
  regmap: merge regmap_field_write() into macro
  regmap: add regmap_field_update_bits_base()
  regmap: merge regmap_update_bits_check_async() into macro
  regmap: merge regmap_update_bits_check() into macro
  regmap: merge regmap_update_bits_async() into macro
  regmap: merge regmap_update_bits() into macro
  regmap: add regmap_update_bits_base()
  regcache: flat: Introduce register strider order
  regcache: Introduce the index parsing API by stride order
  regmap: core: Introduce register stride order
  regmap: irq: add devm apis for regmap_{add,del}_irq_chip
  ...
2016-03-15 21:22:26 -07:00
Linus Torvalds
ff280e3639 spi: Updates for v4.6
Not the biggest set of changes for SPI but a bit of a pickup in activity
 on the core:
 
  - Support for memory mapped read from flash devices via a SPI
    controller.
  - The beginnings of a message rewriting framework in the core which
    should in time allow us to support transforming messages to work
    around the limits of controllers or optimise the performance for
    controllers transparently to calling drivers.
  - Updates to the PXA2xx, the main functional change being to improve
    the ACPI support.
  - A new driver for the Analog Devices AXI SPI engine.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW5vmhAAoJECTWi3JdVIfQbs8H/25sgjyLHNFUy97An1CbfF29
 YI3KCTTi/RlvtQSOl2WvfPu/FML3vJYClkCIG5/oUE1y5FfX+8vdh0oObtaG9ewn
 IwiuhHXH68GcAKTFNFzXrT52aLWo3V67jINwbDyleN2gwk4QchTqUsWvHpcqibCt
 KIOKOqmIhxLXeat/014KAMzsB1snaSVDBv2ao+1DstRBn/2Tb6uFKzHL4ehBAPDy
 mQ/f7ogBlTG3aNrldf+841NbgMl6ekMGapLIllec/j54C6H2Yig8hn0zoWmL8/BG
 eHumwBjdqQMJmLFAS9+cz8Q8YZy2EVG8gCimzQD6uQOMG6CrTGnfgBDmrFhqvRI=
 =WSqI
 -----END PGP SIGNATURE-----

Merge tag 'spi-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "Not the biggest set of changes for SPI but a bit of a pickup in
  activity on the core:

   - Support for memory mapped read from flash devices via a SPI
     controller.

   - The beginnings of a message rewriting framework in the core which
     should in time allow us to support transforming messages to work
     around the limits of controllers or optimise the performance for
     controllers transparently to calling drivers.

   - Updates to the PXA2xx, the main functional change being to improve
     the ACPI support.

   - A new driver for the Analog Devices AXI SPI engine"

* tag 'spi-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (66 commits)
  spi: Add gfp parameter to kernel-doc to fix build warning
  spi: Fix htmldocs build error due struct spi_replaced_transfers
  spi: rockchip: covert rsd_nsecs to u32 type
  spi: rockchip: header file cleanup
  spi: xilinx: Add devicetree binding for spi-xilinx
  spi: respect the maximum segment size of DMA device
  spi: rockchip: check requesting dma channel with EPROBE_DEFER
  spi: rockchip: migrate to dmaengine_terminate_async
  spi: rockchip: check return value of dmaengine_prep_slave_sg
  spi: core: Fix deadlock when sending messages
  spi/rockchip: fix endian mode for 16-bit transfers
  spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs
  spi: pxa2xx: Use newer more explicit DMAengine terminate API
  spi: pxa2xx: Add support for Intel Broxton B-Step
  spi: lp-8841: return correct error code from probe
  spi: imx: drop bogus tests for rx/tx bufs in DMA transfer
  spi: imx: set MX51_ECSPI_CTRL_SMC bit in setup function
  spi: imx: make some register defines simpler
  spi: imx: remove unnecessary bit clearing in mx51_ecspi_config
  spi: imx: add support for all SPI word width for DMA
  ...
2016-03-15 21:07:33 -07:00
Simon Horman
09c32427c9 clk: renesas: Rename header file renesas.h
This is part of an ongoing process to migrate from ARCH_SHMOBILE to
ARCH_RENESAS the motivation for which being that RENESAS seems to be a more
appropriate name than SHMOBILE for the majority of Renesas ARM based SoCs.

Along with the above mentioned Kconfig changes it seems appropriate
to also rename files.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-03-15 18:12:14 -07:00
Ian Kent
63c06227a2 autofs4: fix string.h include in auto_dev-ioctl.h
Since including linux/string.h will now do the right thing remove the
conditional check.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Ian Kent
e9a7c2f1a5 autofs4: coding style fixes
Try and make the coding style completely consistent throughtout the
autofs module and inline with kernel coding style recommendations.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Joonsoo Kim
7cf91a98e6 mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
There is a performance drop report due to hugepage allocation and in
there half of cpu time are spent on pageblock_pfn_to_page() in
compaction [1].

In that workload, compaction is triggered to make hugepage but most of
pageblocks are un-available for compaction due to pageblock type and
skip bit so compaction usually fails.  Most costly operations in this
case is to find valid pageblock while scanning whole zone range.  To
check if pageblock is valid to compact, valid pfn within pageblock is
required and we can obtain it by calling pageblock_pfn_to_page().  This
function checks whether pageblock is in a single zone and return valid
pfn if possible.  Problem is that we need to check it every time before
scanning pageblock even if we re-visit it and this turns out to be very
expensive in this workload.

Although we have no way to skip this pageblock check in the system where
hole exists at arbitrary position, we can use cached value for zone
continuity and just do pfn_to_page() in the system where hole doesn't
exist.  This optimization considerably speeds up in above workload.

Before vs After
  Max: 1096 MB/s vs 1325 MB/s
  Min: 635 MB/s 1015 MB/s
  Avg: 899 MB/s 1194 MB/s

Avg is improved by roughly 30% [2].

[1]: http://www.spinics.net/lists/linux-mm/msg97378.html
[2]: https://lkml.org/lkml/2015/12/9/23

[akpm@linux-foundation.org: don't forget to restore zone->contiguous on error path, per Vlastimil]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reported-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Aaron Lu <aaron.lu@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Johannes Weiner
fdf1cdb91b mm: remove unnecessary uses of lock_page_memcg()
There are several users that nest lock_page_memcg() inside lock_page()
to prevent page->mem_cgroup from changing.  But the page lock prevents
pages from moving between cgroups, so that is unnecessary overhead.

Remove lock_page_memcg() in contexts with locked contexts and fix the
debug code in the page stat functions to be okay with the page lock.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Johannes Weiner
62cccb8c8e mm: simplify lock_page_memcg()
Now that migration doesn't clear page->mem_cgroup of live pages anymore,
it's safe to make lock_page_memcg() and the memcg stat functions take
pages, and spare the callers from memcg objects.

[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Johannes Weiner
6a93ca8fde mm: migrate: do not touch page->mem_cgroup of live pages
Changing a page's memcg association complicates dealing with the page,
so we want to limit this as much as possible.  Page migration e.g.  does
not have to do that.  Just like page cache replacement, it can forcibly
charge a replacement page, and then uncharge the old page when it gets
freed.  Temporarily overcharging the cgroup by a single page is not an
issue in practice, and charging is so cheap nowadays that this is much
preferrable to the headache of messing with live pages.

The only place that still changes the page->mem_cgroup binding of live
pages is when pages move along with a task to another cgroup.  But that
path isolates the page from the LRU, takes the page lock, and the move
lock (lock_page_memcg()).  That means page->mem_cgroup is always stable
in callers that have the page isolated from the LRU or locked.  Lighter
unlocked paths, like writeback accounting, can use lock_page_memcg().

[akpm@linux-foundation.org: fix build]
[vdavydov@virtuozzo.com: fix lockdep splat]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Johannes Weiner
23047a96d7 mm: workingset: per-cgroup cache thrash detection
Cache thrash detection (see a528910e12 "mm: thrash detection-based
file cache sizing" for details) currently only works on the system
level, not inside cgroups.  Worse, as the refaults are compared to the
global number of active cache, cgroups might wrongfully get all their
refaults activated when their pages are hotter than those of others.

Move the refault machinery from the zone to the lruvec, and then tag
eviction entries with the memcg ID.  This makes the thrash detection
work correctly inside cgroups.

[sergey.senozhatsky@gmail.com: do not return from workingset_activation() with locked rcu and page]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Johannes Weiner
81f8c3a461 mm: memcontrol: generalize locking for the page->mem_cgroup binding
These patches tag the page cache radix tree eviction entries with the
memcg an evicted page belonged to, thus making per-cgroup LRU reclaim
work properly and be as adaptive to new cache workingsets as global
reclaim already is.

This should have been part of the original thrash detection patch
series, but was deferred due to the complexity of those patches.

This patch (of 5):

So far the only sites that needed to exclude charge migration to
stabilize page->mem_cgroup have been per-cgroup page statistics, hence
the name mem_cgroup_begin_page_stat().  But per-cgroup thrash detection
will add another site that needs to ensure page->mem_cgroup lifetime.

Rename these locking functions to the more generic lock_page_memcg() and
unlock_page_memcg().  Since charge migration is a cgroup1 feature only,
we might be able to delete it at some point, and these now easy to
identify locking sites along with it.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vitaly Kuznetsov
31bc3858ea memory-hotplug: add automatic onlining policy for the newly added memory
Currently, all newly added memory blocks remain in 'offline' state
unless someone onlines them, some linux distributions carry special udev
rules like:

  SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically.  This is not a great solution for
virtual machines where memory hotplug is being used to address high
memory pressure situations as such onlining is slow and a userspace
process doing this (udev) has a chance of being killed by the OOM killer
as it will probably require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online"
which causes all newly added memory blocks to go online as soon as
they're added.  The default is "offline".

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Laura Abbott
1414c7f4f7 mm/page_poisoning.c: allow for zero poisoning
By default, page poisoning uses a poison value (0xaa) on free.  If this
is changed to 0, the page is not only sanitized but zeroing on alloc
with __GFP_ZERO can be skipped as well.  The tradeoff is that detecting
corruption from the poisoning is harder to detect.  This feature also
cannot be used with hibernation since pages are not guaranteed to be
zeroed after hibernation.

Credit to Grsecurity/PaX team for inspiring this work

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jianyu Zhan <nasa4836@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Laura Abbott
8823b1dbc0 mm/page_poison.c: enable PAGE_POISONING as a separate option
Page poisoning is currently set up as a feature if architectures don't
have architecture debug page_alloc to allow unmapping of pages.  It has
uses apart from that though.  Clearing of the pages on free provides an
increase in security as it helps to limit the risk of information leaks.
Allow page poisoning to be enabled as a separate option independent of
kernel_map pages since the two features do separate work.  Because of
how hiberanation is implemented, the checks on alloc cannot occur if
hibernation is enabled.  The runtime alloc checks can also be enabled
with an option when !HIBERNATION.

Credit to Grsecurity/PaX team for inspiring this work

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jianyu Zhan <nasa4836@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
ff8e811638 mm, debug: move bad flags printing to bad_page()
Since bad_page() is the only user of the badflags parameter of
dump_page_badflags(), we can move the code to bad_page() and simplify a
bit.

The dump_page_badflags() function is renamed to __dump_page() and can
still be called separately from dump_page() for temporary debug prints
where page_owner info is not desired.

The only user-visible change is that page->mem_cgroup is printed before
the bad flags.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
4e462112e9 mm, page_owner: dump page owner info from dump_page()
The page_owner mechanism is useful for dealing with memory leaks.  By
reading /sys/kernel/debug/page_owner one can determine the stack traces
leading to allocations of all pages, and find e.g.  a buggy driver.

This information might be also potentially useful for debugging, such as
the VM_BUG_ON_PAGE() calls to dump_page().  So let's print the stored
info from dump_page().

Example output:

  page:ffffea000292f1c0 count:1 mapcount:0 mapping:ffff8800b2f6cc18 index:0x91d
  flags: 0x1fffff8001002c(referenced|uptodate|lru|mappedtodisk)
  page dumped because: VM_BUG_ON_PAGE(1)
  page->mem_cgroup:ffff8801392c5000
  page allocated via order 0, migratetype Movable, gfp_mask 0x24213ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD|__GFP_NOWARN|__GFP_NORETRY)
   [<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
   [<ffffffff811b40c8>] alloc_pages_current+0x88/0x120
   [<ffffffff8115e386>] __page_cache_alloc+0xe6/0x120
   [<ffffffff8116ba6c>] __do_page_cache_readahead+0xdc/0x240
   [<ffffffff8116bd05>] ondemand_readahead+0x135/0x260
   [<ffffffff8116be9c>] page_cache_async_readahead+0x6c/0x70
   [<ffffffff811604c2>] generic_file_read_iter+0x3f2/0x760
   [<ffffffff811e0dc7>] __vfs_read+0xa7/0xd0
  page has been migrated, last migrate reason: compaction

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
7cd12b4abf mm, page_owner: track and print last migrate reason
During migration, page_owner info is now copied with the rest of the
page, so the stacktrace leading to free page allocation during migration
is overwritten.  For debugging purposes, it might be however useful to
know that the page has been migrated since its initial allocation.  This
might happen many times during the lifetime for different reasons and
fully tracking this, especially with stacktraces would incur extra
memory costs.  As a compromise, store and print the migrate_reason of
the last migration that occurred to the page.  This is enough to
distinguish compaction, numa balancing etc.

Example page_owner entry after the patch:

  Page allocated via order 0, mask 0x24200ca(GFP_HIGHUSER_MOVABLE)
  PFN 628753 type Movable Block 1228 type Movable Flags 0x1fffff80040030(dirty|lru|swapbacked)
   [<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
   [<ffffffff811b6325>] alloc_pages_vma+0xb5/0x250
   [<ffffffff81177491>] shmem_alloc_page+0x61/0x90
   [<ffffffff8117a438>] shmem_getpage_gfp+0x678/0x960
   [<ffffffff8117c2b9>] shmem_fallocate+0x329/0x440
   [<ffffffff811de600>] vfs_fallocate+0x140/0x230
   [<ffffffff811df434>] SyS_fallocate+0x44/0x70
   [<ffffffff8158cc2e>] entry_SYSCALL_64_fastpath+0x12/0x71
  Page has been migrated, last migrate reason: compaction

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
d435edca92 mm, page_owner: copy page owner info during migration
The page_owner mechanism stores gfp_flags of an allocation and stack
trace that lead to it.  During page migration, the original information
is practically replaced by the allocation of free page as the migration
target.  Arguably this is less useful and might lead to all the
page_owner info for migratable pages gradually converge towards
compaction or numa balancing migrations.  It has also lead to
inaccuracies such as one fixed by commit e2cfc91120 ("mm/page_owner:
set correct gfp_mask on page_owner").

This patch thus introduces copying the page_owner info during migration.
However, since the fact that the page has been migrated from its
original place might be useful for debugging, the next patch will
introduce a way to track that information as well.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
7dd80b8af0 mm, page_owner: convert page_owner_inited to static key
CONFIG_PAGE_OWNER attempts to impose negligible runtime overhead when
enabled during compilation, but not actually enabled during runtime by
boot param page_owner=on.  This overhead can be further reduced using
the static key mechanism, which this patch does.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
60f30350fd mm, page_owner: print migratetype of page and pageblock, symbolic flags
The information in /sys/kernel/debug/page_owner includes the migratetype
of the pageblock the page belongs to.  This is also checked against the
page's migratetype (as declared by gfp_flags during its allocation), and
the page is reported as Fallback if its migratetype differs from the
pageblock's one.  t This is somewhat misleading because in fact fallback
allocation is not the only reason why these two can differ.  It also
doesn't direcly provide the page's migratetype, although it's possible
to derive that from the gfp_flags.

It's arguably better to print both page and pageblock's migratetype and
leave the interpretation to the consumer than to suggest fallback
allocation as the only possible reason.  While at it, we can print the
migratetypes as string the same way as /proc/pagetypeinfo does, as some
of the numeric values depend on kernel configuration.  For that, this
patch moves the migratetype_names array from #ifdef CONFIG_PROC_FS part
of mm/vmstat.c to mm/page_alloc.c and exports it.

With the new format strings for flags, we can now also provide symbolic
page and gfp flags in the /sys/kernel/debug/page_owner file.  This
replaces the positional printing of page flags as single letters, which
might have looked nicer, but was limited to a subset of flags, and
required the user to remember the letters.

Example page_owner entry after the patch:

  Page allocated via order 0, mask 0x24213ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD|__GFP_NOWARN|__GFP_NORETRY)
  PFN 520 type Movable Block 1 type Movable Flags 0xfffff8001006c(referenced|uptodate|lru|active|mappedtodisk)
   [<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
   [<ffffffff811b4058>] alloc_pages_current+0x88/0x120
   [<ffffffff8115e386>] __page_cache_alloc+0xe6/0x120
   [<ffffffff8116ba6c>] __do_page_cache_readahead+0xdc/0x240
   [<ffffffff8116bd05>] ondemand_readahead+0x135/0x260
   [<ffffffff8116bfb1>] page_cache_sync_readahead+0x31/0x50
   [<ffffffff81160523>] generic_file_read_iter+0x453/0x760
   [<ffffffff811e0d57>] __vfs_read+0xa7/0xd0

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
420adbe9fc mm, tracing: unify mm flags handling in tracepoints and printk
In tracepoints, it's possible to print gfp flags in a human-friendly
format through a macro show_gfp_flags(), which defines a translation
array and passes is to __print_flags().  Since the following patch will
introduce support for gfp flags printing in printk(), it would be nice
to reuse the array.  This is not straightforward, since __print_flags()
can't simply reference an array defined in a .c file such as mm/debug.c
- it has to be a macro to allow the macro magic to communicate the
format to userspace tools such as trace-cmd.

The solution is to create a macro __def_gfpflag_names which is used both
in show_gfp_flags(), and to define the gfpflag_names[] array in
mm/debug.c.

On the other hand, mm/debug.c also defines translation tables for page
flags and vma flags, and desire was expressed (but not implemented in
this series) to use these also from tracepoints.  Thus, this patch also
renames the events/gfpflags.h file to events/mmflags.h and moves the
table definitions there, using the same macro approach as for gfpflags.
This allows translating all three kinds of mm-specific flags both in
tracepoints and printk.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
14e0a214d6 tools, perf: make gfp_compact_table up to date
When updating tracing's show_gfp_flags() I have noticed that perf's
gfp_compact_table is also outdated.  Fill in the missing flags and place
a note in gfp.h to increase chance that future updates are synced.
Convert the __GFP_X flags from "GFP_X" to "__GFP_X" strings in line with
the previous patch.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
1f7866b4ae mm, tracing: make show_gfp_flags() up to date
The show_gfp_flags() macro provides human-friendly printing of gfp flags
in tracepoints.  However, it is somewhat out of date and missing several
flags.  This patches fills in the missing flags, and distinguishes
properly between GFP_ATOMIC and __GFP_ATOMIC which were both translated
to "GFP_ATOMIC".  More generally, all __GFP_X flags which were
previously printed as GFP_X, are now printed as __GFP_X, since ommiting
the underscores results in output that doesn't actually match the source
code, and can only lead to confusion.  Where both variants are defined
equal (e.g.  _DMA and _DMA32), the variant without underscores are
preferred.

Also add a note in gfp.h so hopefully future changes will be synced
better.

__GFP_MOVABLE is defined twice in include/linux/gfp.h with different
comments.  Leave just the newer one, which was intended to replace the
old one.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
20f6e03a40 tracepoints: move trace_print_flags definitions to tracepoint-defs.h
The following patch will need to declare array of struct
trace_print_flags in a header.  To prevent this header from pulling in
all of RCU through trace_events.h, move the struct
trace_print_flags{_64} definitions to the new lightweight
tracepoint-defs.h header.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Joonsoo Kim
d86bd1bece mm/slub: support left redzone
SLUB already has a redzone debugging feature.  But it is only positioned
at the end of object (aka right redzone) so it cannot catch left oob.
Although current object's right redzone acts as left redzone of next
object, first object in a slab cannot take advantage of this effect.
This patch explicitly adds a left red zone to each object to detect left
oob more precisely.

Background:

Someone complained to me that left OOB doesn't catch even if KASAN is
enabled which does page allocation debugging.  That page is out of our
control so it would be allocated when left OOB happens and, in this
case, we can't find OOB.  Moreover, SLUB debugging feature can be
enabled without page allocator debugging and, in this case, we will miss
that OOB.

Before trying to implement, I expected that changes would be too
complex, but, it doesn't look that complex to me now.  Almost changes
are applied to debug specific functions so I feel okay.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Laura Abbott
becfda68ab slub: convert SLAB_DEBUG_FREE to SLAB_CONSISTENCY_CHECKS
SLAB_DEBUG_FREE allows expensive consistency checks at free to be turned
on or off.  Expand its use to be able to turn off all consistency
checks.  This gives a nice speed up if you only want features such as
poisoning or tracing.

Credit to Mathias Krause for the original work which inspired this
series

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Joonsoo Kim
d31676dfde mm/slab: alternative implementation for DEBUG_SLAB_LEAK
DEBUG_SLAB_LEAK is a debug option.  It's current implementation requires
status buffer so we need more memory to use it.  And, it cause
kmem_cache initialization step more complex.

To remove this extra memory usage and to simplify initialization step,
this patch implement this feature with another way.

When user requests to get slab object owner information, it marks that
getting information is started.  And then, all free objects in caches
are flushed to corresponding slab page.  Now, we can distinguish all
freed object so we can know all allocated objects, too.  After
collecting slab object owner information on allocated objects, mark is
checked that there is no free during the processing.  If true, we can be
sure that our information is correct so information is returned to user.

Although this way is rather complex, it has two important benefits
mentioned above.  So, I think it is worth changing.

There is one drawback that it takes more time to get slab object owner
information but it is just a debug option so it doesn't matter at all.

To help review, this patch implements new way only.  Following patch
will remove useless code.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Joonsoo Kim
40b4413797 mm/slab: clean up DEBUG_PAGEALLOC processing code
Currently, open code for checking DEBUG_PAGEALLOC cache is spread to
some sites.  It makes code unreadable and hard to change.

This patch cleans up this code.  The following patch will change the
criteria for DEBUG_PAGEALLOC cache so this clean-up will help it, too.

[akpm@linux-foundation.org: fix build with CONFIG_DEBUG_PAGEALLOC=n]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Jesper Dangaard Brouer
9f706d6820 mm: fix some spelling
Fix up trivial spelling errors, noticed while reading the code.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Jesper Dangaard Brouer
ca25719551 mm: new API kfree_bulk() for SLAB+SLUB allocators
This patch introduce a new API call kfree_bulk() for bulk freeing memory
objects not bound to a single kmem_cache.

Christoph pointed out that it is possible to implement freeing of
objects, without knowing the kmem_cache pointer as that information is
available from the object's page->slab_cache.  Proposing to remove the
kmem_cache argument from the bulk free API.

Jesper demonstrated that these extra steps per object comes at a
performance cost.  It is only in the case CONFIG_MEMCG_KMEM is compiled
in and activated runtime that these steps are done anyhow.  The extra
cost is most visible for SLAB allocator, because the SLUB allocator does
the page lookup (virt_to_head_page()) anyhow.

Thus, the conclusion was to keep the kmem_cache free bulk API with a
kmem_cache pointer, but we can still implement a kfree_bulk() API fairly
easily.  Simply by handling if kmem_cache_free_bulk() gets called with a
kmem_cache NULL pointer.

This does increase the code size a bit, but implementing a separate
kfree_bulk() call would likely increase code size even more.

Below benchmarks cost of alloc+free (obj size 256 bytes) on CPU i7-4790K
@ 4.00GHz, no PREEMPT and CONFIG_MEMCG_KMEM=y.

Code size increase for SLAB:

 add/remove: 0/0 grow/shrink: 1/0 up/down: 74/0 (74)
 function                                     old     new   delta
 kmem_cache_free_bulk                         660     734     +74

SLAB fastpath: 87 cycles(tsc) 21.814
  sz - fallback             - kmem_cache_free_bulk - kfree_bulk
   1 - 103 cycles 25.878 ns -  41 cycles 10.498 ns - 81 cycles 20.312 ns
   2 -  94 cycles 23.673 ns -  26 cycles  6.682 ns - 42 cycles 10.649 ns
   3 -  92 cycles 23.181 ns -  21 cycles  5.325 ns - 39 cycles 9.950 ns
   4 -  90 cycles 22.727 ns -  18 cycles  4.673 ns - 26 cycles 6.693 ns
   8 -  89 cycles 22.270 ns -  14 cycles  3.664 ns - 23 cycles 5.835 ns
  16 -  88 cycles 22.038 ns -  14 cycles  3.503 ns - 22 cycles 5.543 ns
  30 -  89 cycles 22.284 ns -  13 cycles  3.310 ns - 20 cycles 5.197 ns
  32 -  88 cycles 22.249 ns -  13 cycles  3.420 ns - 20 cycles 5.166 ns
  34 -  88 cycles 22.224 ns -  14 cycles  3.643 ns - 20 cycles 5.170 ns
  48 -  88 cycles 22.088 ns -  14 cycles  3.507 ns - 20 cycles 5.203 ns
  64 -  88 cycles 22.063 ns -  13 cycles  3.428 ns - 20 cycles 5.152 ns
 128 -  89 cycles 22.483 ns -  15 cycles  3.891 ns - 23 cycles 5.885 ns
 158 -  89 cycles 22.381 ns -  15 cycles  3.779 ns - 22 cycles 5.548 ns
 250 -  91 cycles 22.798 ns -  16 cycles  4.152 ns - 23 cycles 5.967 ns

SLAB when enabling MEMCG_KMEM runtime:
 - kmemcg fastpath: 130 cycles(tsc) 32.684 ns (step:0)
 1 - 148 cycles 37.220 ns -  66 cycles 16.622 ns - 66 cycles 16.583 ns
 2 - 141 cycles 35.510 ns -  51 cycles 12.820 ns - 58 cycles 14.625 ns
 3 - 140 cycles 35.017 ns -  37 cycles 9.326 ns - 33 cycles 8.474 ns
 4 - 137 cycles 34.507 ns -  31 cycles 7.888 ns - 33 cycles 8.300 ns
 8 - 140 cycles 35.069 ns -  25 cycles 6.461 ns - 25 cycles 6.436 ns
 16 - 138 cycles 34.542 ns -  23 cycles 5.945 ns - 22 cycles 5.670 ns
 30 - 136 cycles 34.227 ns -  22 cycles 5.502 ns - 22 cycles 5.587 ns
 32 - 136 cycles 34.253 ns -  21 cycles 5.475 ns - 21 cycles 5.324 ns
 34 - 136 cycles 34.254 ns -  21 cycles 5.448 ns - 20 cycles 5.194 ns
 48 - 136 cycles 34.075 ns -  21 cycles 5.458 ns - 21 cycles 5.367 ns
 64 - 135 cycles 33.994 ns -  21 cycles 5.350 ns - 21 cycles 5.259 ns
 128 - 137 cycles 34.446 ns -  23 cycles 5.816 ns - 22 cycles 5.688 ns
 158 - 137 cycles 34.379 ns -  22 cycles 5.727 ns - 22 cycles 5.602 ns
 250 - 138 cycles 34.755 ns -  24 cycles 6.093 ns - 23 cycles 5.986 ns

Code size increase for SLUB:
 function                                     old     new   delta
 kmem_cache_free_bulk                         717     799     +82

SLUB benchmark:
 SLUB fastpath: 46 cycles(tsc) 11.691 ns (step:0)
  sz - fallback             - kmem_cache_free_bulk - kfree_bulk
   1 -  61 cycles 15.486 ns -  53 cycles 13.364 ns - 57 cycles 14.464 ns
   2 -  54 cycles 13.703 ns -  32 cycles  8.110 ns - 33 cycles 8.482 ns
   3 -  53 cycles 13.272 ns -  25 cycles  6.362 ns - 27 cycles 6.947 ns
   4 -  51 cycles 12.994 ns -  24 cycles  6.087 ns - 24 cycles 6.078 ns
   8 -  50 cycles 12.576 ns -  21 cycles  5.354 ns - 22 cycles 5.513 ns
  16 -  49 cycles 12.368 ns -  20 cycles  5.054 ns - 20 cycles 5.042 ns
  30 -  49 cycles 12.273 ns -  18 cycles  4.748 ns - 19 cycles 4.758 ns
  32 -  49 cycles 12.401 ns -  19 cycles  4.821 ns - 19 cycles 4.810 ns
  34 -  98 cycles 24.519 ns -  24 cycles  6.154 ns - 24 cycles 6.157 ns
  48 -  83 cycles 20.833 ns -  21 cycles  5.446 ns - 21 cycles 5.429 ns
  64 -  75 cycles 18.891 ns -  20 cycles  5.247 ns - 20 cycles 5.238 ns
 128 -  93 cycles 23.271 ns -  27 cycles  6.856 ns - 27 cycles 6.823 ns
 158 - 102 cycles 25.581 ns -  30 cycles  7.714 ns - 30 cycles 7.695 ns
 250 - 107 cycles 26.917 ns -  38 cycles  9.514 ns - 38 cycles 9.506 ns

SLUB when enabling MEMCG_KMEM runtime:
 - kmemcg fastpath: 71 cycles(tsc) 17.897 ns (step:0)
 1 - 85 cycles 21.484 ns -  78 cycles 19.569 ns - 75 cycles 18.938 ns
 2 - 81 cycles 20.363 ns -  45 cycles 11.258 ns - 44 cycles 11.076 ns
 3 - 78 cycles 19.709 ns -  33 cycles 8.354 ns - 32 cycles 8.044 ns
 4 - 77 cycles 19.430 ns -  28 cycles 7.216 ns - 28 cycles 7.003 ns
 8 - 101 cycles 25.288 ns -  23 cycles 5.849 ns - 23 cycles 5.787 ns
 16 - 76 cycles 19.148 ns -  20 cycles 5.162 ns - 20 cycles 5.081 ns
 30 - 76 cycles 19.067 ns -  19 cycles 4.868 ns - 19 cycles 4.821 ns
 32 - 76 cycles 19.052 ns -  19 cycles 4.857 ns - 19 cycles 4.815 ns
 34 - 121 cycles 30.291 ns -  25 cycles 6.333 ns - 25 cycles 6.268 ns
 48 - 108 cycles 27.111 ns -  21 cycles 5.498 ns - 21 cycles 5.458 ns
 64 - 100 cycles 25.164 ns -  20 cycles 5.242 ns - 20 cycles 5.229 ns
 128 - 155 cycles 38.976 ns -  27 cycles 6.886 ns - 27 cycles 6.892 ns
 158 - 132 cycles 33.034 ns -  30 cycles 7.711 ns - 30 cycles 7.728 ns
 250 - 130 cycles 32.612 ns -  38 cycles 9.560 ns - 38 cycles 9.549 ns

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Jesper Dangaard Brouer
fab9963a69 mm: fault-inject take over bootstrap kmem_cache check
Remove the SLAB specific function slab_should_failslab(), by moving the
check against fault-injection for the bootstrap slab, into the shared
function should_failslab() (used by both SLAB and SLUB).

This is a step towards sharing alloc_hook's between SLUB and SLAB.

This bootstrap slab "kmem_cache" is used for allocating struct
kmem_cache objects to the allocator itself.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00