Commit graph

46737 commits

Author SHA1 Message Date
Majd Dibbiny
a05bdefa40 net/mlx5_core: Use port number when querying port ptys
Until now, mlx5_query_port_ptys always queried port number one.

Added new argument in the function's prototype so we can also query
the second port. This will be needed  when thr helper will be invoked
from the IB driver on non FPP (Function-Per-Port) devices.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
e760152d08 net/mlx5_core: Use port number in the query port mtu helpers
Extend the function prototypes for max and operational mtu to take the
local port number. In the Ethernet driver is this hard coded to one,
since ConnectX4 Ethernet devices are always function-per-port.
The IB driver also serves older devices (ConnectIB) which isn't such,
and hence the part can vary.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
211e6c80e5 net/mlx5_core: Get vendor-id using the query adapter command
Add two wrapper functions to the query adapter command:

1. mlx5_query_board_id -- replaces the old mlx5_cmd_query_adapter.

2. mlx5_core_query_vendor_id -- retrieves the vendor_id from the
   query_adapter command.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
707c4602cd net/mlx5_core: Add new query HCA vport commands
Added the implementation for the following commands:

1. QUERY_HCA_VPORT_GID
2. QUERY_HCA_VPORT_PKEY
3. QUERY_HCA_VPORT_CONTEXT

They will be needed when we move to work with ISSI > 0 in the IB driver too.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
d18a9470f8 net/mlx5_core: Make the vport helpers available for the IB driver too
Move the vport header file to be under include/linux/mlx5, such that
the mlx5 IB can use it as well.

Also add nic_ prefix to the vport NIC commands to differeniate between
HCA vport commands and NIC vport commands.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Haggai Abramonvsky
01949d0109 net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0
When working in ISSI > 0 mode, the model exposed by the device for
XRCs and SRQs is different. XRCs use XRC SRQs and plain SRQs are based
on RPM (Receive Memory Pool).

Add helper functions to create, modify, query, and arm XRC SRQs and RMPs.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Tom Herbert
42aecaa9bb net: Get skb hash over flow_keys structure
This patch changes flow hashing to use jhash2 over the flow_keys
structure instead just doing jhash_3words over src, dst, and ports.
This method will allow us take more input into the hashing function
so that we can include full IPv6 addresses, VLAN, flow labels etc.
without needing to resort to xor'ing which makes for a poor hash.

Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:30 -07:00
Chuck Lever
0380a3f375 svcrdma: Add a separate "max data segs macro for svcrdma
The server and client maximum are architecturally independent.
Allow changing one without affecting the other.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-04 16:56:01 -04:00
Chuck Lever
b7e0b9a965 svcrdma: Replace GFP_KERNEL in a loop with GFP_NOFAIL
At the 2015 LSF/MM, it was requested that memory allocation
call sites that request GFP_KERNEL allocations in a loop should be
annotated with __GFP_NOFAIL.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-04 16:56:00 -04:00
Chuck Lever
30b7e246a6 svcrdma: Keep rpcrdma_msg fields in network byte-order
Fields in struct rpcrdma_msg are __be32. Don't byte-swap these
fields when decoding RPC calls and then swap them back for the
reply. For the most part, they can be left alone.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-04 16:55:59 -04:00
Oleg Nesterov
9e7c8f8c62 signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
selinux_bprm_committed_creds()->__flush_signals() is not right, we
shouldn't clear TIF_SIGPENDING unconditionally. There can be other
reasons for signal_pending(): freezing(), JOBCTL_PENDING_MASK, and
potentially more.

Also change this code to check fatal_signal_pending() rather than
SIGNAL_GROUP_EXIT, it looks a bit better.

Now we can kill __flush_signals() before it finds another buggy user.

Note: this code looks racy, we can flush a signal which was sent after
the task SID has been updated.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:16 -04:00
Paolo Bonzini
64d6067057 KVM: x86: stubs for SMM support
This patch adds the interface between x86.c and the emulator: the
SMBASE register, a new emulator flag, the RSM instruction.  It also
adds a new request bit that will be used by the KVM_SMI ioctl.

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04 16:01:45 +02:00
Masanari Iida
12f7c14aa6 crypto: doc - Fix typo in crypto-API.xml
This patch fix some typos found in crypto-API.xml.
It is because the file is generated from comments in sources,
so I had to fix typo in sources.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-04 15:05:08 +08:00
David S. Miller
9d1dabfbd0 new driver mt7601u for MediaTek Wi-Fi devices MT7601U
ath10k:
 
 * qca6174 power consumption improvements, enable ASPM etc (Michal)
 
 wil6210:
 
 * support Wi-Fi Simple Configuration in STA mode
 
 iwlwifi:
 
 * a few fixes (re-enablement of interrupts for certain new
   platforms that have special power states)
 * Rework completely the RBD allocation model towards new
   multi RX hardware.
 * cleanups
 * scan reworks continuation (Luca)
 
 mwifiex:
 
 * improve firmware debug functionality
 
 rtlwifi:
 
 * update regulatory database
 
 brcmfmac:
 
 * cleanup and new feature support in PCIe code
 * alternative nvram loading for router support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJVb1cPAAoJEG4XJFUm622bP0oIAKhUBlC3rtrOJd+9kREAGUJQ
 Dk2xZr/p6hdb4dSHHKKroBr5mfryHknSs+AI5akJMph36DoBMD+Mwb4HlcL9cI5J
 RXIjIvQEADsK+6ME7cqnw2htWlYsX8aJI96/2Eusveo/zHyAG3+eBC3wkyqWBlBK
 EGV5ziClSe5pE5yGWj5tyr9me+qRQiO+dFJK1AoRE3Zq4pjj+5VDZoVQN0GNZGP7
 lgeNOzvPxWt+ZseslP8IeCedN5c+NpacD889NnQJyMXaouSp7LmMod000bjnKK8o
 9sRHsKxI5qHgC4mUa3Tk3cEnFqVYAo8KKOVaBVtKsMc4XoO/Qov6Z0AtXig5Xnk=
 =CM/T
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2015-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
new driver mt7601u for MediaTek Wi-Fi devices MT7601U

ath10k:

* qca6174 power consumption improvements, enable ASPM etc (Michal)

wil6210:

* support Wi-Fi Simple Configuration in STA mode

iwlwifi:

* a few fixes (re-enablement of interrupts for certain new
  platforms that have special power states)
* Rework completely the RBD allocation model towards new
  multi RX hardware.
* cleanups
* scan reworks continuation (Luca)

mwifiex:

* improve firmware debug functionality

rtlwifi:

* update regulatory database

brcmfmac:

* cleanup and new feature support in PCIe code
* alternative nvram loading for router support
====================

Conflicts:
	drivers/net/wireless/iwlwifi/Kconfig

Trivial conflict in iwlwifi Kconfig, two commits adding
the same two chip numbers to the help text, but order
transposed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 23:44:57 -07:00
Konstantin Khlebnikov
c8fff7bc5b of: return NUMA_NO_NODE from fallback of_node_to_nid()
Node 0 might be offline as well as any other numa node,
in this case kernel cannot handle memory allocation and crashes.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 0c3f061c19 ("of: implement of_node_to_nid as a weak function")
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2015-06-04 14:39:52 +09:00
Dave Chinner
66e8ac7bfa Merge branch 'xfs-dax-support' into for-next 2015-06-04 13:01:49 +10:00
Linus Torvalds
8a7deb362b Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "Sending this off now, as I'm not aware of other current bugs, nor do I
  expect further fixes before 4.1 final.  This contains two fixes:

   - a fix for a bdi unregister warning that gets spewed on md, due to a
     regression introduced earlier in this cycle.  From Neil Brown.

   - a fix for a compile warning for NVMe on 32-bit platforms, also a
     regression introduced in this cycle.  From Arnd Bergmann"

* 'for-linus' of git://git.kernel.dk/linux-block:
  NVMe: fix type warning on 32-bit
  block: discard bdi_unregister() in favour of bdi_destroy()
2015-06-03 16:35:00 -07:00
Dave Chinner
ce5c5d554d dax: expose __dax_fault for filesystems with locking constraints
Some filesystems cannot call dax_fault() directly because they have
different locking and/or allocation constraints in the page fault IO
path. To handle this, we need to follow the same model as the
generic block_page_mkwrite code, where the internals are exposed via
__block_page_mkwrite() so that filesystems can wrap the correct
locking and operations around the outside. 

This is loosely based on a patch originally from Matthew Willcox.
Unlike the original patch, it does not change ext4 code, error
returns or unwritten extent conversion handling.  It also adds a
__dax_mkwrite() wrapper for .page_mkwrite implementations to do the
right thing, too.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-04 09:18:18 +10:00
Dave Chinner
e842f29039 dax: don't abuse get_block mapping for endio callbacks
dax_fault() currently relies on the get_block callback to attach an
io completion callback to the mapping buffer head so that it can
run unwritten extent conversion after zeroing allocated blocks.

Instead of this hack, pass the conversion callback directly into
dax_fault() similar to the get_block callback. When the filesystem
allocates unwritten extents, it will set the buffer_unwritten()
flag, and hence the dax_fault code can call the completion function
in the contexts where it is necessary without overloading the
mapping buffer head.

Note: The changes to ext4 to use this interface are suspect at best.
In fact, the way ext4 did this end_io assignment in the first place
looks suspect because it only set a completion callback when there
wasn't already some other write() call taking place on the same
inode. The ext4 end_io code looks rather intricate and fragile with
all it's reference counting and passing to different contexts for
modification via inode private pointers that aren't protected by
locks...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-04 09:18:18 +10:00
Chuck Lever
da7049f834 svcrdma: Remove svc_rdma_xdr_decode_deferred_req()
svc_rdma_xdr_decode_deferred_req() indexes an array with an
un-byte-swapped value off the wire. Fortunately this function
isn't used anywhere, so simply remove it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-03 15:15:23 -04:00
Stephen Rothwell
d6472302f2 x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h>
Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so
remove it from there and fix up the resulting build problems
triggered on x86 {64|32}-bit {def|allmod|allno}configs.

The breakages were triggering in places where x86 builds relied
on vmalloc() facilities but did not include <linux/vmalloc.h>
explicitly and relied on the implicit inclusion via <asm/io.h>.

Also add:

  - <linux/init.h> to <linux/io.h>
  - <asm/pgtable_types> to <asm/io.h>

... which were two other implicit header file dependencies.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
[ Tidied up the changelog. ]
Acked-by: David Miller <davem@davemloft.net>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Colin Cross <ccross@android.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: James E.J. Bottomley <JBottomley@odin.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Suma Ramars <sramars@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-03 12:02:00 +02:00
Ingo Molnar
71966f3a0b Merge branch 'locking/core' into x86/core, to prepare for dependent patch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-03 10:07:35 +02:00
Ingo Molnar
34e7724c07 Merge branches 'x86/mm', 'x86/build', 'x86/apic' and 'x86/platform' into x86/core, to apply dependent patch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-03 10:05:18 +02:00
Greg Kroah-Hartman
b3d424e3dc phy: for 4.2 merge window
*) new Broadcom SATA3 PHY driver for Broadcom STB SoCs
 *) new phy API to get PHY by index which is used in EHCI and
    OHCI controller drivers
 *) support specifying supply at port level used for multi-port PHYs
 *) sparse warning fixes in miphy PHYs
 *) fix pm_runtime issues in twl4030 driver
 
 Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJVbbvfAAoJEA5ceFyATYLZdAIP/2mFENq9DIRDqU7LV4aQRDlC
 TlL+DLGwElAGKkq7bL3UaWy+10kooTmVaYVB1MACAPXZgKguTkjsSdWXDSxFHDyG
 tq32CqkeMx1uhsZBMhdxsIsTnAPfEo1s2lO7O/HaJTq38aqKixC5+A3IYXttSvET
 jF4XCx7IGNK7YIMylTzAW1FN9rHvsp4RRKIkPjS8EAiAbQIcu7rxKdSbfwJ69Zxt
 EUep3u7jvlQK6SmOig3nkoQFf70G6d7p0VsBPrU6l6Dg9Ciu/5ZvzCkg5ukGotOl
 1GIpowjW3abIaX6yGOD95phU8+JPZbncG/vrQ5Hf/FDZtvCgs6tYqY+Zqmxf7SlK
 4Jj43+sjvDt05Ty9K1sP2FnQqBq9XKEmX5K1pYCrHDbDiH0L9gKeX9/46dJhDrMS
 F3rfrcgBxqBqT96BsqPHmuN0uUvXaF7pGKyFWoGodblBsYee6z4XAl8Up62Lqq3U
 QglegTz9ICKDU1j8xmjLHN9l21BfXANdirFTyjtV3XIGR6mcmzd3R3qk7x2ulaQ8
 lA4gfUhm9X8f55FTqLIw+7Ld+rgEW9NYHSX4fFvtzB/T5SPdOrslgjby4i0a81Pv
 IfQcLaQZ75QIj+22sy27cXrQsmUCe6sNXrms190uknh3hEHDeNOtPupYIMJcXiO+
 n2YJ6QCOnly3tSMFNWfM
 =vPk6
 -----END PGP SIGNATURE-----

Merge tag 'phy-for-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-next

Kishon writes:

phy: for 4.2 merge window

*) new Broadcom SATA3 PHY driver for Broadcom STB SoCs
*) new phy API to get PHY by index which is used in EHCI and
   OHCI controller drivers
*) support specifying supply at port level used for multi-port PHYs
*) sparse warning fixes in miphy PHYs
*) fix pm_runtime issues in twl4030 driver

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-06-03 14:13:41 +09:00
Greg Kroah-Hartman
00465f4c84 Update extcon for v4.2
This patchset include the huge update of extcon core and add the new one extcon
 driver and fix minor isseu of extcon drivers.
 
 Detailed description for patchset:
 1. Update the extcon core.
 - Modify the extcon device name on sysfs from device name name to 'extcon[X]'
 as following because if same extcon device are included in H/W development board,
 the one of the two device driver might be failed on the probe().
 : /sys/class/extcon/[device name] -> /sys/class/extcon/extcon[X]
 
 - Use the unique id for external connectors instead of legacy string name.
 Previously, extcon used the string name to identify the type of external
 connectors. This way have the many potential issues. So, extcon core define the
 unique id for each external connectors as following:
 
 	enum extcon {
 		EXTCON_NONE             = 0x0,
 
 		/* USB external connector */
 		EXTCON_USB              = 0x1,
 		EXTCON_USB_HOST		= 0x2,
 
 		/* Charger external connector */
 		EXTCON_TA		= 0x10,
 		EXTCON_FAST_CHARGER	= 0x11,
 		EXTCON_SLOW_CHARGER	= 0x12,
 		EXTCON_CHARGE_DOWNSTREAM = 0x13,
 
 		/* Audio and video external connector */
 		EXTCON_LINE_IN		= 0x20,
 		EXTCON_LINE_OUT		= 0x21,
 		EXTCON_MICROPHONE	= 0x22,
 		EXTCON_HEADPHONE	= 0x23,
 		...
 	};
 
 - Update tye prototype of extcon_register_notifier() by using the unique id
 (enum extcon) of external connectors.
 
 - Add extcon_get_edev_name() API to get the name of extcon device on extcon
 client driver because the name is included in 'struct extcon_dev' and 'struct
 extcon_dev' should be handled in only drivers/extcon directory. So. if extcon
 client need the name of extcon device, they could use this function.
 
 - Unify the jig/dock and MHL-TA cable name on extcon driver.
 : JIG-{USB-ON|USB-OFF|UART-ON|UART-OFF} -> JIG
 : Dock-{Smart|Desk|Audio|Card} -> DOCK
 : MHL-TA -> TA
 
 - Use the capital letter for the name of all external connectors.
 - Remove the optional print_name() function pointer from struct extcon_dev to
 maintain the consistent name of extcon device.
 
 2. Add the new extcon-axp288.c extcon driver.
 - The extcon-axp288.c driver support for AXP288 PMIC which has the BC1.2
 charger detection capability. So this extcon driver can detect the
 EXTCON_SLOW_CHARGER, EXTCON_CHARGE_DOWNSTREAM and EXTCON_FAST_CHARGER.
 
 3. Update the extcon-arizona.c driver.
 - Add support for selective detection mode when headphone detection.
 - Apply HP clamps for WM8280
 
 4. Clean-up the extcon core and drivers.
 - Add manufactor information of each extcon device.
 - Fix checkpatch warning and minor coding style on extcon.c.c
 - Fix build break if GPIOLIB is not enabled on extcon-usb-gpiio.c.
 - Set the direction of gpio when calling devm_gpiod_get() on extcon-usb-gpio.c
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJVblY/AAoJEJzN3yze689TqagP/jUlSyoHGF2oe8hk+3Th6CN/
 sr70kJtuV7IrJ0+IPeHQIj6lg88/pUN8ydRSoY7fZY65It89nEqAaGpIT47Oj3Dy
 LhWLPOgCXpuN96sMBfs7HY9hgvNX2QMzmHMiIJKRdIFm4+0b7KhSa5xu9M2XuIHL
 iWndc76EHv+S5v/rjcjjMH7m0gSPhingBT7DY6qw3kBUhDuwmIRNlUd+DjHhtxyD
 nP87qi5tjGxgqjjmfqcAgHHyBCA2aS4SwC9vym3knwRhzSJwVPaj4EB+wECRrD7R
 0u4KhhirXQDzJrf3soDMqBfWL7/6yi5jSvEOOS9pxzqtUtoOkJuawaVKq5Kd/gqW
 McuS79IheiaQ0XMwUPlqU4AWrMyFFvBo+28ptUDlyJv2P2+K6cuysqMhHoni696z
 r18kkiql8KhM9u96Cgc4M6hRl8SqADiCxnCjpsRos82+HmoMcN/iyXz0tKsJxLBH
 Lds6XGWcbrjwE4VKojCuqWgJc/4q/pFOns6nM9Dz6koneN3c2DNAYrPz+BF7KJVf
 3fqPI7iCAocl1Jkp2kbQQpyjJ5wW2V51uqET8d52ArRsYVEfcETIBnngwiXPrkti
 EK7jlcpHBYZFcAqwaROrtR+VrLH5U4/i8JGppvpekjrRbMqNHYWp4eQ8lykKqubg
 6f/P7Qgkf9sqUoJH8mF/
 =Pc8L
 -----END PGP SIGNATURE-----

Merge tag 'extcon-next-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next

Chanwoo writes:

Update extcon for v4.2

This patchset include the huge update of extcon core and add the new one extcon
driver and fix minor isseu of extcon drivers.

Detailed description for patchset:
1. Update the extcon core.
- Modify the extcon device name on sysfs from device name name to 'extcon[X]'
as following because if same extcon device are included in H/W development board,
the one of the two device driver might be failed on the probe().
: /sys/class/extcon/[device name] -> /sys/class/extcon/extcon[X]

- Use the unique id for external connectors instead of legacy string name.
Previously, extcon used the string name to identify the type of external
connectors. This way have the many potential issues. So, extcon core define the
unique id for each external connectors as following:

	enum extcon {
		EXTCON_NONE             = 0x0,

		/* USB external connector */
		EXTCON_USB              = 0x1,
		EXTCON_USB_HOST		= 0x2,

		/* Charger external connector */
		EXTCON_TA		= 0x10,
		EXTCON_FAST_CHARGER	= 0x11,
		EXTCON_SLOW_CHARGER	= 0x12,
		EXTCON_CHARGE_DOWNSTREAM = 0x13,

		/* Audio and video external connector */
		EXTCON_LINE_IN		= 0x20,
		EXTCON_LINE_OUT		= 0x21,
		EXTCON_MICROPHONE	= 0x22,
		EXTCON_HEADPHONE	= 0x23,
		...
	};

- Update tye prototype of extcon_register_notifier() by using the unique id
(enum extcon) of external connectors.

- Add extcon_get_edev_name() API to get the name of extcon device on extcon
client driver because the name is included in 'struct extcon_dev' and 'struct
extcon_dev' should be handled in only drivers/extcon directory. So. if extcon
client need the name of extcon device, they could use this function.

- Unify the jig/dock and MHL-TA cable name on extcon driver.
: JIG-{USB-ON|USB-OFF|UART-ON|UART-OFF} -> JIG
: Dock-{Smart|Desk|Audio|Card} -> DOCK
: MHL-TA -> TA

- Use the capital letter for the name of all external connectors.
- Remove the optional print_name() function pointer from struct extcon_dev to
maintain the consistent name of extcon device.

2. Add the new extcon-axp288.c extcon driver.
- The extcon-axp288.c driver support for AXP288 PMIC which has the BC1.2
charger detection capability. So this extcon driver can detect the
EXTCON_SLOW_CHARGER, EXTCON_CHARGE_DOWNSTREAM and EXTCON_FAST_CHARGER.

3. Update the extcon-arizona.c driver.
- Add support for selective detection mode when headphone detection.
- Apply HP clamps for WM8280

4. Clean-up the extcon core and drivers.
- Add manufactor information of each extcon device.
- Fix checkpatch warning and minor coding style on extcon.c.c
- Fix build break if GPIOLIB is not enabled on extcon-usb-gpiio.c.
- Set the direction of gpio when calling devm_gpiod_get() on extcon-usb-gpio.c
2015-06-03 14:09:12 +09:00
Tom Lendacky
cfaed10d1f scatterlist: introduce sg_nents_for_len
When performing a dma_map_sg() call, the number of sg entries to map is
required. Using sg_nents to retrieve the number of sg entries will
return the total number of entries in the sg list up to the entry marked
as the end. If there happen to be unused entries in the list, these will
still be counted. Some dma_map_sg() implementations will not handle the
unused entries correctly (lib/swiotlb.c) and execute a BUG_ON.

The sg_nents_for_len() function will traverse the sg list and return the
number of entries required to satisfy the supplied length argument. This
can then be supplied to the dma_map_sg() call to successfully map the
sg.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-03 10:51:27 +08:00
Srinivas Pandruvada
1e25aa9641 hid-sensor: Fix suspend/resume delay
By default all the sensors are runtime suspended state (lowest power
state). During Linux suspend process, all the run time suspended
devices are resumed and then suspended. This caused all sensors to
power up and introduced delay in suspend time, when we introduced
runtime PM for HID sensors. The opposite process happens during resume
process.

To fix this, we do powerup process of the sensors only when the request
is issued from user (raw or tiggerred). In this way when runtime,
resume calls for powerup it will simply return as this will not match
user requested state.

Note this is a regression fix as the increase in suspend / resume
times can be substantial (report of 8 seconds on Len's laptop!)

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Len Brown <len.brown@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2015-06-02 22:15:26 +01:00
Chuck Lever
632dda833e SUNRPC: Clean up bc_send()
Clean up: Merge bc_send() into bc_svc_process().

Note: even thought this touches svc.c, it is a client-side change.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-02 13:30:35 -04:00
Ard Biesheuvel
24bbd929e6 of/fdt: split off FDT self reservation from memreserve processing
This splits off the reservation of the memory occupied by the FDT
binary itself from the processing of the memory reservations it
contains. This is necessary because the physical address of the FDT,
which is needed to perform the reservation, may not be known to the
FDT driver core, i.e., it may be mapped outside the linear direct
mapping, in which case __pa() returns a bogus value.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-06-02 16:31:25 +01:00
Thomas Gleixner
be3ef76e9d clockevents: Rename state to state_use_accessors
The only sensible way to make abuse of core internal fields obvious
and easy to grep for.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
2015-06-02 16:56:42 +02:00
Tejun Heo
e8a7abf5a5 writeback: disassociate inodes from dying bdi_writebacks
For the purpose of foreign inode detection, wb's (bdi_writeback's) are
identified by the associated memcg ID.  As we create a separate wb for
each memcg, this is enough to identify the active wb's; however, when
blkcg is enabled or disabled higher up in the hierarchy, the mapping
between memcg and blkcg changes which in turn creates a new wb to
service the new mapping.  The old wb is unlinked from index and
released after all references are drained.  The foreign inode
detection logic can't detect this condition because both the old and
new wb's point to the same memcg and thus never decides to move inodes
attached to the old wb to the new one.

This patch adds logic to initiate switching immediately in
wbc_attach_and_unlock_inode() if the associated wb is dying.  We can
make the usual foreign detection logic to distinguish the different
wb's mapped to the memcg but the dying wb is never gonna be in active
service again and there's no point in tracking the usage history and
reaching the switch verdict after enough data points are collected.
It's already known that the wb has to be switched.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:40:20 -06:00
Tejun Heo
aaa2cacf81 writeback: add lockdep annotation to inode_to_wb()
With the previous three patches, all operations which acquire wb from
inode are either under one of inode->i_lock, mapping->tree_lock or
wb->list_lock or protected by unlocked_inode_to_wb transaction.  This
will be depended upon by foreign inode wb switching.

This patch adds lockdep assertion to inode_to_wb() so that usages
outside the above list locks can be caught easily.  There are three
exceptions.

* locked_inode_to_wb_and_lock_list() is holding wb->list_lock but the
  wb may not be the inode's.  Ensuring that is the function's role
  after all.  Updated to deref inode->i_wb directly.

* inode_wb_stat_unlocked_begin() is usually protected by combination
  of !I_WB_SWITCH and rcu_read_lock().  Updated to deref inode->i_wb
  directly.

* inode_congested() wants to test whether inode->i_wb is set before
  starting the transaction.  Added inode_to_wb_is_valid() which tests
  inode->i_wb directly.

v5: might_lock() removed.  It annotates that the lock is grabbed w/
    irq enabled which isn't the case and triggering lockdep warning
    spuriously.

v4: might_lock() added to unlocked_inode_to_wb_begin().

v3: inode_congested() conversion added.

v2: locked_inode_to_wb_and_lock_list() was missing in the first
    version.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:40:20 -06:00
Tejun Heo
682aa8e1a6 writeback: implement unlocked_inode_to_wb transaction and use it for stat updates
The mechanism for detecting whether an inode should switch its wb
(bdi_writeback) association is now in place.  This patch build the
framework for the actual switching.

This patch adds a new inode flag I_WB_SWITCHING, which has two
functions.  First, the easy one, it ensures that there's only one
switching in progress for a give inode.  Second, it's used as a
mechanism to synchronize wb stat updates.

The two stats, WB_RECLAIMABLE and WB_WRITEBACK, aren't event counters
but track the current number of dirty pages and pages under writeback
respectively.  As such, when an inode is moved from one wb to another,
the inode's portion of those stats have to be transferred together;
unfortunately, this is a bit tricky as those stat updates are percpu
operations which are performed without holding any lock in some
places.

This patch solves the problem in a similar way as memcg.  Each such
lockless stat updates are wrapped in transaction surrounded by
unlocked_inode_to_wb_begin/end().  During normal operation, they map
to rcu_read_lock/unlock(); however, if I_WB_SWITCHING is asserted,
mapping->tree_lock is grabbed across the transaction.

In turn, the switching path sets I_WB_SWITCHING and waits for a RCU
grace period to pass before actually starting to switch, which
guarantees that all stat update paths are synchronizing against
mapping->tree_lock.

This patch still doesn't implement the actual switching.

v3: Updated on top of the recent cancel_dirty_page() updates.
    unlocked_inode_to_wb_begin() now nests inside
    mem_cgroup_begin_page_stat() to match the locking order.

v2: The i_wb access transaction will be used for !stat accesses too.
    Function names and comments updated accordingly.

    s/inode_wb_stat_unlocked_{begin|end}/unlocked_inode_to_wb_{begin|end}/
    s/switch_wb/switch_wbs/

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:40:20 -06:00
Tejun Heo
2a81490811 writeback: implement foreign cgroup inode detection
As concurrent write sharing of an inode is expected to be very rare
and memcg only tracks page ownership on first-use basis severely
confining the usefulness of such sharing, cgroup writeback tracks
ownership per-inode.  While the support for concurrent write sharing
of an inode is deemed unnecessary, an inode being written to by
different cgroups at different points in time is a lot more common,
and, more importantly, charging only by first-use can too readily lead
to grossly incorrect behaviors (single foreign page can lead to
gigabytes of writeback to be incorrectly attributed).

To resolve this issue, cgroup writeback detects the majority dirtier
of an inode and will transfer the ownership to it.  To avoid
unnnecessary oscillation, the detection mechanism keeps track of
history and gives out the switch verdict only if the foreign usage
pattern is stable over a certain amount of time and/or writeback
attempts.

The detection mechanism has fairly low space and computation overhead.
It adds 8 bytes to struct inode (one int and two u16's) and minimal
amount of calculation per IO.  The detection mechanism converges to
the correct answer usually in several seconds of IO time when there's
a clear majority dirtier.  Even when there isn't, it can reach an
acceptable answer fairly quickly under most circumstances.

Please see wb_detach_inode() for more details.

This patch only implements detection.  Following patches will
implement actual switching.

v2: wbc_account_io() now checks whether the wbc is associated with a
    wb before dereferencing it.  This can happen when pageout() is
    writing pages directly without going through the usual writeback
    path.  As pageout() path is single-threaded, we don't want it to
    be blocked behind a slow cgroup and ultimately want it to delegate
    actual writing to the usual writeback path.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:40:20 -06:00
Tejun Heo
b16b1deb55 writeback: make writeback_control track the inode being written back
Currently, for cgroup writeback, the IO submission paths directly
associate the bio's with the blkcg from inode_to_wb_blkcg_css();
however, it'd be necessary to keep more writeback context to implement
foreign inode writeback detection.  wbc (writeback_control) is the
natural fit for the extra context - it persists throughout the
writeback of each inode and is passed all the way down to IO
submission paths.

This patch adds wbc_attach_and_unlock_inode(), wbc_detach_inode(), and
wbc_attach_fdatawrite_inode() which are used to associate wbc with the
inode being written back.  IO submission paths now use wbc_init_bio()
instead of directly associating bio's with blkcg themselves.  This
leaves inode_to_wb_blkcg_css() w/o any user.  The function is removed.

wbc currently only tracks the associated wb (bdi_writeback).  Future
patches will add more for foreign inode detection.  The association is
established under i_lock which will be depended upon when migrating
foreign inodes to other wb's.

As currently, once established, inode to wb association never changes,
going through wbc when initializing bio's doesn't cause any behavior
changes.

v2: submit_blk_blkcg() now checks whether the wbc is associated with a
    wb before dereferencing it.  This can happen when pageout() is
    writing pages directly without going through the usual writeback
    path.  As pageout() path is single-threaded, we don't want it to
    be blocked behind a slow cgroup and ultimately want it to delegate
    actual writing to the usual writeback path.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:39:48 -06:00
Tejun Heo
21c6321fbb writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb()
Currently, majority of cgroup writeback support including all the
above functions are implemented in include/linux/backing-dev.h and
mm/backing-dev.c; however, the portion closely related to writeback
logic implemented in include/linux/writeback.h and mm/page-writeback.c
will expand to support foreign writeback detection and correction.

This patch moves wb[_try]_get() and wb_put() to
include/linux/backing-dev-defs.h so that they can be used from
writeback.h and inode_{attach|detach}_wb() to writeback.h and
page-writeback.c.

This is pure reorganization and doesn't introduce any functional
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:39:08 -06:00
Tejun Heo
c2aa723a60 writeback: implement memcg writeback domain based throttling
While cgroup writeback support now connects memcg and blkcg so that
writeback IOs are properly attributed and controlled, the IO back
pressure propagation mechanism implemented in balance_dirty_pages()
and its subroutines wasn't aware of cgroup writeback.

Processes belonging to a memcg may have access to only subset of total
memory available in the system and not factoring this into dirty
throttling rendered it completely ineffective for processes under
memcg limits and memcg ended up building a separate ad-hoc degenerate
mechanism directly into vmscan code to limit page dirtying.

The previous patches updated balance_dirty_pages() and its subroutines
so that they can deal with multiple wb_domain's (writeback domains)
and defined per-memcg wb_domain.  Processes belonging to a non-root
memcg are bound to two wb_domains, global wb_domain and memcg
wb_domain, and should be throttled according to IO pressures from both
domains.  This patch updates dirty throttling code so that it repeats
similar calculations for the two domains - the differences between the
two are few and minor - and applies the lower of the two sets of
resulting constraints.

wb_over_bg_thresh(), which controls when background writeback
terminates, is also updated to consider both global and memcg
wb_domains.  It returns true if dirty is over bg_thresh for either
domain.

This makes the dirty throttling mechanism operational for memcg
domains including writeback-bandwidth-proportional dirty page
distribution inside them but the ad-hoc memcg throttling mechanism in
vmscan is still in place.  The next patch will rip it out.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:13 -06:00
Tejun Heo
2529bb3aad writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes
The amount of available memory to a memcg wb_domain can change as
memcg configuration changes.  A domain's ->dirty_limit exists to
smooth out sudden drops in dirty threshold; however, when a domain's
size actually drops significantly, it hinders the dirty throttling
from adjusting to the new configuration leading to unexpected
behaviors including unnecessary OOM kills.

This patch resolves the issue by adding wb_domain_size_changed() which
resets ->dirty_limit[_tstmp] and making memcg call it on configuration
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:13 -06:00
Tejun Heo
841710aa6e writeback: implement memcg wb_domain
Dirtyable memory is distributed to a wb (bdi_writeback) according to
the relative bandwidth the wb is writing out in the whole system.
This distribution is global - each wb is measured against all other
wb's and gets the proportinately sized portion of the memory in the
whole system.

For cgroup writeback, the amount of dirtyable memory is scoped by
memcg and thus each wb would need to be measured and controlled in its
memcg.  IOW, a wb will belong to two writeback domains - the global
and memcg domains.

The previous patches laid the groundwork to support the two wb_domains
and this patch implements memcg wb_domain.  memcg->cgwb_domain is
initialized on css online and destroyed on css release,
wb->memcg_completions is added, and __wb_writeout_inc() is updated to
increment completions against both global and memcg wb_domains.

The following patches will update balance_dirty_pages() and its
subroutines to actually consider memcg wb_domain for throttling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:13 -06:00
Tejun Heo
aa661bbe1e writeback: move over_bground_thresh() to mm/page-writeback.c
and rename it to wb_over_bg_thresh().  The function is closely tied to
the dirty throttling mechanism implemented in page-writeback.c.  This
relocation will allow future updates necessary for cgroup writeback
support.

While at it, add function comment.

This is pure reorganization and doesn't introduce any behavioral
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:13 -06:00
Tejun Heo
dcc25ae76e writeback: move global_dirty_limit into wb_domain
This patch is a part of the series to define wb_domain which
represents a domain that wb's (bdi_writeback's) belong to and are
measured against each other in.  This will enable IO backpressure
propagation for cgroup writeback.

global_dirty_limit exists to regulate the global dirty threshold which
is a property of the wb_domain.  This patch moves hard_dirty_limit,
dirty_lock, and update_time into wb_domain.

This is pure reorganization and doesn't introduce any behavioral
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:12 -06:00
Tejun Heo
380c27ca33 writeback: implement wb_domain
Dirtyable memory is distributed to a wb (bdi_writeback) according to
the relative bandwidth the wb is writing out in the whole system.
This distribution is global - each wb is measured against all other
wb's and gets the proportinately sized portion of the memory in the
whole system.

For cgroup writeback, the amount of dirtyable memory is scoped by
memcg and thus each wb would need to be measured and controlled in its
memcg.  IOW, a wb will belong to two writeback domains - the global
and memcg domains.

Currently, what constitutes the global writeback domain are scattered
across a number of global states.  This patch starts collecting them
into struct wb_domain.

* fprop_global which serves as the basis for proportional bandwidth
  measurement and its period timer are moved into struct wb_domain.

* global_wb_domain hosts the states for the global domain.

* While at it, flatten wb_writeout_fraction() into its callers.  This
  thin wrapper doesn't provide any actual benefits while getting in
  the way.

This is pure reorganization and doesn't introduce any behavioral
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:12 -06:00
Tejun Heo
8a73179956 writeback: reorganize [__]wb_update_bandwidth()
__wb_update_bandwidth() is called from two places -
fs/fs-writeback.c::balance_dirty_pages() and
mm/page-writeback.c::wb_writeback().  The latter updates only the
write bandwidth while the former also deals with the dirty ratelimit.
The two callsites are distinguished by whether @thresh parameter is
zero or not, which is cryptic.  In addition, the two files define
their own different versions of wb_update_bandwidth() on top of
__wb_update_bandwidth(), which is confusing to say the least.  This
patch cleans up [__]wb_update_bandwidth() in the following ways.

* __wb_update_bandwidth() now takes explicit @update_ratelimit
  parameter to gate dirty ratelimit handling.

* mm/page-writeback.c::wb_update_bandwidth() is flattened into its
  caller - balance_dirty_pages().

* fs/fs-writeback.c::wb_update_bandwidth() is moved to
  mm/page-writeback.c and __wb_update_bandwidth() is made static.

* While at it, add a lockdep assertion to __wb_update_bandwidth().

Except for the lockdep addition, this is pure reorganization and
doesn't introduce any behavioral changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:12 -06:00
Tejun Heo
0d960a383a writeback: clean up wb_dirty_limit()
The function name wb_dirty_limit(), its argument @dirty and the local
variable @wb_dirty are mortally confusing given that the function
calculates per-wb threshold value not dirty pages, especially given
that @dirty and @wb_dirty are used elsewhere for dirty pages.

Let's rename the function to wb_calc_thresh() and wb_dirty to
wb_thresh.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:12 -06:00
Tejun Heo
bafc0dba1e buffer, writeback: make __block_write_full_page() honor cgroup writeback
[__]block_write_full_page() is used to implement ->writepage in
various filesystems.  All writeback logic is now updated to handle
cgroup writeback and the block cgroup to issue IOs for is encoded in
writeback_control and can be retrieved from the inode; however,
[__]block_write_full_page() currently ignores the blkcg indicated by
inode and issues all bio's without explicit blkcg association.

This patch adds submit_bh_blkcg() which associates the bio with the
specified blkio cgroup before issuing and uses it in
__block_write_full_page() so that the issued bio's are associated with
inode_to_wb_blkcg_css(inode).

v2: Updated for per-inode wb association.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:37:23 -06:00
Tejun Heo
f30a7d0cc8 writeback: restructure try_writeback_inodes_sb[_nr]()
try_writeback_inodes_sb_nr() wraps writeback_inodes_sb_nr() so that it
handles s_umount locking and skips if writeback is already in
progress.  The in progress test is performed on the root wb
(bdi_writeback) which isn't sufficient for cgroup writeback support.
The test must be done per-wb.

To prepare for the change, this patch factors out
__writeback_inodes_sb_nr() from writeback_inodes_sb_nr() and adds
@skip_if_busy and moves the in progress test right before queueing the
wb_writeback_work.  try_writeback_inodes_sb_nr() now just grabs
s_umount and invokes __writeback_inodes_sb_nr() with asserted
@skip_if_busy.  This way, later addition of multiple wb handling can
skip only the wb's which already have writeback in progress.

This swaps the order between in progress test and s_umount test which
can flip the return value when writeback is in progress and s_umount
is being held by someone else but this shouldn't cause any meaningful
difference.  It's a fringe condition and the return value is an
unsynchronized hint anyway.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:33:36 -06:00
Tejun Heo
cc395d7f1f writeback: implement bdi_wait_for_completion()
If the completion of a wb_writeback_work can be waited upon by setting
its ->done to a struct completion and waiting on it; however, for
cgroup writeback support, it's necessary to issue multiple work items
to multiple bdi_writebacks and wait for the completion of all.

This patch implements wb_completion which can wait for multiple work
items and replaces the struct completion with it.  It can be defined
using DEFINE_WB_COMPLETION_ONSTACK(), used for multiple work items and
waited for by wb_wait_for_completion().

Nobody currently issues multiple work items and this patch doesn't
introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:33:36 -06:00
Tejun Heo
9ecf4866c0 writeback: make bdi_start_background_writeback() take bdi_writeback instead of backing_dev_info
bdi_start_background_writeback() currently takes @bdi and kicks the
root wb (bdi_writeback).  In preparation for cgroup writeback support,
make it take wb instead.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:33:36 -06:00
Tejun Heo
bc05873dcc writeback: make writeback_in_progress() take bdi_writeback instead of backing_dev_info
writeback_in_progress() currently takes @bdi and returns whether
writeback is in progress on its root wb (bdi_writeback).  In
preparation for cgroup writeback support, make it take wb instead.
While at it, make it an inline function.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:33:36 -06:00
Tejun Heo
c00ddad39f writeback: remove bdi_start_writeback()
bdi_start_writeback() is a thin wrapper on top of
__wb_start_writeback() which is used only by laptop_mode_timer_fn().
This patches removes bdi_start_writeback(), renames
__wb_start_writeback() to wb_start_writeback() and makes
laptop_mode_timer_fn() use it instead.

This doesn't cause any functional difference and will ease making
laptop_mode_timer_fn() cgroup writeback aware.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:33:36 -06:00