linux-xiaomi-chiron

Author	SHA1	Message	Date
Daniel Borkmann	35b2108550	ktime: Add __must_check prefix to ktime_to_timespec_cond The function is currently mainly used in the networking code and if others start using it, they must check the result, otherwise it cannot be determined if the timespec conversion suceeded. Currently no user lacks this check, but make future users aware of a possible misusage. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2013-05-28 14:00:58 -07:00
Liu Ying	a44b8bd607	ktime: Use macro NSEC_PER_USEC where appropriate We've got the macro NSEC_PER_USEC defined in header file include/linux/time.h. To make the code decent, this patch replaces the immediate number 1000 to convert bewteen a time value in microseconds and one in nanoseconds with the macro NSEC_PER_USEC. Signed-off-by: Liu Ying <Ying.Liu@freescale.com> Cc: David S. Miller <davem@davemloft.net> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org>	2013-05-28 14:00:57 -07:00
Jiri Pirko	be9efd3653	net: pass changed flags along with NETDEV_CHANGE event Use new netdevice notifier infrastructure to pass along changed flags. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: shortened notifier_info struct name Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:11:01 -07:00
Jiri Pirko	351638e7de	net: pass info struct via netdevice notifier So far, only net_device * could be passed along with netdevice notifier event. This patch provides a possibility to pass custom structure able to provide info that event listener needs to know. Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: fix typo on simeth shortened dev_getter shortened notifier_info struct name v1->v2: fix notifier_call parameter in call_netdevice_notifier() Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:11:01 -07:00
Alexander Gordeev	65f6ae66a6	PCI: Allocate only as many MSI vectors as requested by driver Because of the encoding of the "Multiple Message Capable" and "Multiple Message Enable" fields, a device can only advertise that it's capable of a power-of-two number of vectors, and the OS can only enable a power-of-two number. For example, a device that's limited internally to using 18 vectors would have to advertise that it's capable of 32. The 14 extra vectors consume vector numbers and IRQ descriptors even though the device can't actually use them. This fix introduces a 'msi_desc::nvec_used' field to address this issue. When non-zero, it is the actual number of MSIs the device will send, as requested by the device driver. This value should be used by architectures to set up and tear down only as many interrupt resources as the device will actually use. Note, although the existing 'msi_desc::multiple' field might seem redundant, in fact it is not. The number of MSIs advertised need not be the smallest power-of-two larger than the number of MSIs the device will send. Thus, it is not always possible to derive the former from the latter, so we need to keep them both to handle this case. [bhelgaas: changelog, rename to "nvec_used"] Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2013-05-28 11:31:16 -06:00
Robert P. J. Day	611a75e187	include/linux/cpu.h: Update comments to reflect reality Two minor changes to comments: * Remove reference to drivers/base/sys.c, removed in `0a962657`. * CPUs are now exported by sysfs via devices/system/cpu. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-05-28 12:02:11 +02:00
Steven Rostedt	bcf312cf76	tracing: Put trace_puts() comment above trace_puts() macro for kernel doc Kernel-doc gives the following warning: DOCPROC Documentation/DocBook/kernel-api.xml Warning(/include/linux/kernel.h:590): No description found for parameter 'ip' Warning(/include/linux/kernel.h:590): No description found for parameter 'ip' Due to the externs between the the comment and the trace_puts() macro. This is fixed by moving the externs below the macro and keeping the macro and comment directly together. Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-05-28 12:02:11 +02:00
Peter Zijlstra	26cb63ad11	perf: Fix perf mmap bugs Vince reported a problem found by his perf specific trinity fuzzer. Al noticed 2 problems with perf's mmap(): - it has issues against fork() since we use vma->vm_mm for accounting. - it has an rb refcount leak on double mmap(). We fix the issues against fork() by using VM_DONTCOPY; I don't think there's code out there that uses this; we didn't hear about weird accounting problems/crashes. If we do need this to work, the previously proposed VM_PINNED could make this work. Aside from the rb reference leak spotted by Al, Vince's example prog was indeed doing a double mmap() through the use of perf_event_set_output(). This exposes another problem, since we now have 2 events with one buffer, the accounting gets screwy because we account per event. Fix this by making the buffer responsible for its own accounting. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20130528085548.GA12193@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-05-28 11:05:08 +02:00
Michael S. Tsirkin	662bbcb274	mm, sched: Allow uaccess in atomic with pagefault_disable() This changes might_fault() so that it does not trigger a false positive diagnostic for e.g. the following sequence: spin_lock_irqsave() pagefault_disable() copy_to_user() pagefault_enable() spin_unlock_irqrestore() In particular vhost wants to do this, to call socket ops from under a lock. There are 3 cases to consider: - CONFIG_PROVE_LOCKING - might_fault is non-inline so it's easy to move the in_atomic test to fix up the false positive warning. - CONFIG_DEBUG_ATOMIC_SLEEP - might_fault is currently inline, but we are calling a non-inline __might_sleep anyway, so let's use the non-line version of might_fault that does the right thing. - !CONFIG_DEBUG_ATOMIC_SLEEP && !CONFIG_PROVE_LOCKING __might_sleep is a nop so might_fault is a nop. Make this explicit. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1369577426-26721-11-git-send-email-mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-05-28 09:41:11 +02:00
Michael S. Tsirkin	114276ac0a	mm, sched: Drop voluntary schedule from might_fault() might_fault() is called from functions like copy_to_user() which most callers expect to be very fast, like a couple of instructions. So functions like memcpy_toiovec() call them many times in a loop. But might_fault() calls might_sleep() and with CONFIG_PREEMPT_VOLUNTARY this results in a function call. Let's not do this - just call __might_sleep() that produces a diagnostic for sleep within atomic, but drop might_preempt(). Here's a test sending traffic between the VM and the host, host is built with CONFIG_PREEMPT_VOLUNTARY: before: incoming: 7122.77 Mb/s outgoing: 8480.37 Mb/s after: incoming: 8619.24 Mb/s outgoing: 9455.42 Mb/s As a side effect, this fixes an issue pointed out by Ingo: might_fault might schedule differently depending on PROVE_LOCKING. Now there's no preemption point in both cases, so it's consistent. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1369577426-26721-10-git-send-email-mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-05-28 09:41:11 +02:00
Stephane Eranian	62b8563979	perf: Add sysfs entry to adjust multiplexing interval per PMU This patch adds /sys/device/xxx/perf_event_mux_interval_ms to ajust the multiplexing interval per PMU. The unit is milliseconds. Value has to be >= 1. In the 4th version, we renamed the sysfs file to be more consistent with the other /proc/sys/kernel entries for perf_events. In the 5th version, we handle the reprogramming of the hrtimer using hrtimer_forward_now(). That way, we sync up to new timer value quickly (suggested by Jiri Olsa). Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1364991694-5876-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-05-28 09:13:51 +02:00
Stephane Eranian	9e6302056f	perf: Use hrtimers for event multiplexing The current scheme of using the timer tick was fine for per-thread events. However, it was causing bias issues in system-wide mode (including for uncore PMUs). Event groups would not get their fair share of runtime on the PMU. With tickless kernels, if a core is idle there is no timer tick, and thus no event rotation (multiplexing). However, there are events (especially uncore events) which do count even though cores are asleep. This patch changes the timer source for multiplexing. It introduces a per-PMU per-cpu hrtimer. The advantage is that even when a core goes idle, it will come back to service the hrtimer, thus multiplexing on system-wide events works much better. The per-PMU implementation (suggested by PeterZ) enables adjusting the multiplexing interval per PMU. The preferred interval is stashed into the struct pmu. If not set, it will be forced to the default interval value. In order to minimize the impact of the hrtimer, it is turned on and off on demand. When the PMU on a CPU is overcommited, the hrtimer is activated. It is stopped when the PMU is not overcommitted. In order for this to work properly, we had to change the order of initialization in start_kernel() such that hrtimer_init() is run before perf_event_init(). The default interval in milliseconds is set to a timer tick just like with the old code. We will provide a sysctl to tune this in another patch. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1364991694-5876-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-05-28 09:07:10 +02:00
Jiri Olsa	ab573844e3	perf: Fix hw breakpoints overflow period sampling The hw breakpoint pmu 'add' function is missing the period_left update needed for SW events. The perf HW breakpoint events use the SW events framework to process the overflow, so it needs to be properly initialized in the PMU 'add' method. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-5-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-05-28 08:59:54 +02:00
dingtianhong	da6e378ba9	netpoll: remove return value from netpoll_rx_disable() The netpoll_rx_disable() will always return 0, it is no use and looks wordy, so remove the unnecessary code and get rid of it in _dev_open and _dev_close. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 23:18:50 -07:00
Jaegeuk Kim	a9841c4dbb	f2fs: align data types between on-disk and in-memory block addresses The on-disk block address is defined as __le32, but in-memory block address, block_t, does as u64. Let's synchronize them to 32 bits. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:04 +09:00
Simon Horman	0d89d2035f	MPLS: Add limited GSO support In the case where a non-MPLS packet is received and an MPLS stack is added it may well be the case that the original skb is GSO but the NIC used for transmit does not support GSO of MPLS packets. The aim of this code is to provide GSO in software for MPLS packets whose skbs are GSO. SKB Usage: When an implementation adds an MPLS stack to a non-MPLS packet it should do the following to skb metadata: * Set skb->inner_protocol to the old non-MPLS ethertype of the packet. skb->inner_protocol is added by this patch. * Set skb->protocol to the new MPLS ethertype of the packet. * Set skb->network_header to correspond to the end of the L3 header, including the MPLS label stack. I have posted a patch, "[PATCH v3.29] datapath: Add basic MPLS support to kernel" which adds MPLS support to the kernel datapath of Open vSwtich. That patch sets the above requirements in datapath/actions.c:push_mpls() and was used to exercise this code. The datapath patch is against the Open vSwtich tree but it is intended that it be added to the Open vSwtich code present in the mainline Linux kernel at some point. Features: I believe that the approach that I have taken is at least partially consistent with the handling of other protocols. Jesse, I understand that you have some ideas here. I am more than happy to change my implementation. This patch adds dev->mpls_features which may be used by devices to advertise features supported for MPLS packets. A new NETIF_F_MPLS_GSO feature is added for devices which support hardware MPLS GSO offload. Currently no devices support this and MPLS GSO always falls back to software. Alternate Implementation: One possible alternate implementation is to teach netif_skb_features() and skb_network_protocol() about MPLS, in a similar way to their understanding of VLANs. I believe this would avoid the need for net/mpls/mpls_gso.c and in particular the calls to __skb_push() and __skb_push() in mpls_gso_segment(). I have decided on the implementation in this patch as it should not introduce any overhead in the case where mpls_gso is not compiled into the kernel or inserted as a module. MPLS GSO suggested by Jesse Gross. Based in part on "v4 GRE: Add TCP segmentation offload for GRE" by Pravin B Shelar. Cc: Jesse Gross <jesse@nicira.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 22:50:59 -07:00
Simon Horman	1a37e412a0	net: Use 16bits for _headers fields of struct skbuff In order to mitigate ongoing incresase in the size of struct skbuff use 16 bit integer offsets rather than pointers for inner__headers. This appears to reduce the size of struct skbuff from 0xd0 to 0xc0 bytes on x86_64 with the following all unset. CONFIG_XFRM CONFIG_NF_CONNTRACK CONFIG_NF_CONNTRACK_MODULE NET_SKBUFF_NF_DEFRAG_NEEDED CONFIG_BRIDGE_NETFILTER CONFIG_NET_SCHED CONFIG_IPV6_NDISC_NODETYPE CONFIG_NET_DMA CONFIG_NETWORK_SECMARK Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 22:50:59 -07:00
Florian Fainelli	4284b6a535	phy: allow drivers to flag a PHY device as internal libphy currently always reports a PHY as an external transceiver from the ethtool output. This is inaccurate, because some drivers should be able to tell that a PHY device is an internal transceiver of an Ethernet MAC. Add a new flag (PHY_IS_INTERNAL) which can be set by PHY drivers just like other flags, and a corresponding helper: phy_is_internal() which can be used by networking drivers to query if a given PHY device is internal. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 22:42:50 -07:00
Olof Johansson	6f39ef575d	This is a set of patches from Lee Jones to start converting the ux500 to fetch DMA channels from the device tree: - Full DT support and channel mapping in the DMA40 driver - Dropping of platform data for migrated devices on the DT boot path. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJRnms9AAoJEEEQszewGV1zvAMP/R96pExEBjlf+o61FvPafFpw myZtprUXJvkHphOY+UVHk/gv+1EUFbNwcQzu5k3Ah14DOmv7qtjusylMDbJoFPar dDVU8PtsT+4qmCzgLRd+ATkZEBMOAaJFscD7UV1qtY0AIpz8jS1gJpOO08qqQBg1 unEj7szVEfs1eCrcDnvfMmeBM7gqZMTFkWmGkpuNqH5MdN3HLMKgAFJEc+GafP4d saaX3fCdiif5OYkD9DyQO7I4zsSaAs8OcpYbQKyRmWhm/wuLGPDoFsfBjOXGgFRr V/XwJD74lNgGduC8bCRsgj4LDEdRDO0s1NlqRWpiX/W1GX2lo4wraLdPmd16AaTQ LUb34o10sN4/0mJnbweWxmFPyzGNfw/Kl+ea4gA0vOQ9QDtTIrtnjHOtowGAdhBJ lDob3vAd5huCU287p8me+nc6Y4eJaJS0LjcsU1yy7U4O9mLsZnubPaVPiQu2gUGe GUKYd5J40aslxRuyP/MCA0bnCBhjyH62YfN9omw0Cmt/VQvyKXFF4JWdQuS/Dn3s 29JQfOWCf8xGYwZaQYSzluczP+wokfWnx8wrQ7c1Kb9G2NcDlv6f2/mbDRG9pmXV 2xs8v/fxONDfSCrPVCiBiHH4sYtcbD6UMjiTB76etygOmb3oeSK/Aqx0Sm90MG6Q xL9gkP14kNTaZBGCDNLN =RTiI -----END PGP SIGNATURE----- Merge tag 'ux500-dma40-for-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into next/drivers From Linus Walleij: This is a set of patches from Lee Jones to start converting the ux500 to fetch DMA channels from the device tree: - Full DT support and channel mapping in the DMA40 driver - Dropping of platform data for migrated devices on the DT boot path. * tag 'ux500-dma40-for-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson: (36 commits) ARM: ux500: Register Cryp and Hash platform drivers on Snowball crypto: ux500/[cryp\|hash] - Show successful start-up in the bootlog ARM: ux500: Stop passing Cryp DMA channel config information though pdata crypto: ux500/cryp - Set DMA configuration though dma_slave_config() crypto: ux500/cryp - Prepare clock before enabling it ARM: ux500: Stop passing Hash DMA channel config information though pdata crypto: ux500/hash - Set DMA configuration though dma_slave_config() crypto: ux500/hash - Prepare clock before enabling it ARM: ux500: Remove unnecessary attributes from DMA channel request pdata dmaengine: ste_dma40: Correct copy/paste error ARM: ux500: Remove DMA address look-up table dmaengine: ste_dma40: Remove redundant address fetching function dmaengine: ste_dma40: Only use addresses passed as configuration information ARM: ux500: Stop passing UART's platform data for Device Tree boots dmaengine: ste_dma40: Don't configure runtime configurable setup during allocate dmaengine: ste_dma40: Remove unnecessary call to d40_phy_cfg() dmaengine: ste_dma40: Separate Logical Global Interrupt Mask (GIM) unmasking ARM: ux500: Pass remnant platform data though to DMA40 driver dmaengine: ste_dma40: Supply full Device Tree parsing support dmaengine: ste_dma40: Allow driver to be probe()able when DT is enabled ... Signed-off-by: Olof Johansson <olof@lixom.net>	2013-05-27 20:10:04 -07:00
Gu Zheng	3c6e6ae770	PCI: Introduce pci_alloc_dev(struct pci_bus) to replace alloc_pci_dev() Here we introduce a new interface to replace alloc_pci_dev(): struct pci_dev pci_alloc_dev(struct pci_bus bus) It takes a "struct pci_bus " argument, so we can alloc a PCI device on a target PCI bus, and it acquires a reference on the pci_bus. We use pci_alloc_dev(NULL) to simplify the old alloc_pci_dev(), and keep it for a while but mark it as __deprecated. Holding a reference to the pci_bus ensures that referencing pci_dev->bus is valid as long as the pci_dev is valid. [bhelgaas: keep existing "return error early" structure in pci_alloc_dev()] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2013-05-27 16:23:28 -06:00
Jiang Liu	fe830ef62a	PCI: Introduce pci_bus_{get\|put}() to manage PCI bus reference count Introduce helper functions pci_bus_{get\|put}() to manage PCI bus reference count. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2013-05-27 16:22:09 -06:00
Viresh Kumar	2361be2366	cpufreq: Don't create empty /sys/devices/system/cpu/cpufreq directory When we don't have any file in cpu/cpufreq directory we shouldn't create it. Specially with the introduction of per-policy governor instance patchset, even governors are moved to cpu/cpu*/cpufreq/governor-name directory and so this directory is just not required. Lets have it only when required. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:24:02 +02:00
Viresh Kumar	72a4ce340a	cpufreq: Move get_cpu_idle_time() to cpufreq.c Governors other than ondemand and conservative can also use get_cpu_idle_time() and they aren't required to compile cpufreq_governor.c. So, move these independent routines to cpufreq.c instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:20:56 +02:00
Viresh Kumar	944e9a0316	cpufreq: governors: Move get_governor_parent_kobj() to cpufreq.c get_governor_parent_kobj() can be used by any governor, generic cpufreq governors or platform specific ones and so must be present in cpufreq.c instead of cpufreq_governor.c. This patch moves it to cpufreq.c. This also adds EXPORT_SYMBOL_GPL(get_governor_parent_kobj) so that modules can use this function too. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:20:56 +02:00
Soren Brinkmann	30e1e28598	arm: zynq: Migrate platform to clock controller Migrate the Zynq platform and its drivers to use the new clock controller driver. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: linux-serial@vger.kernel.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Mike Turquette <mturquette@linaro.org>	2013-05-27 09:21:22 +02:00
Greg Kroah-Hartman	45f6bc5ff9	Merge 3.10-rc3 into usb-next We want these fixes. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-05-27 11:00:52 +09:00
Greg Kroah-Hartman	8095e4e81b	Merge 3.10-rc3 into tty-next We want these fixes. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-05-27 10:57:53 +09:00
Adrian Hunter	f0710a557c	mmc: sdhci: add ability to stay runtime-resumed if the card is powered up If card power is dependent on SD bus power then the host controller must not be runtime suspended while the card is powered up. Add the ability to stay runtime-resumed in that case and enable it with a new quirk SDHCI_QUIRK2_CARD_ON_NEEDS_BUS_ON. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:23 -04:00
Guennadi Liakhovetski	eec95ee226	mmc: sdhi/tmio: switch to using dmaengine_slave_config() This removes the deprecated use of the .private member of struct dma_chan and switches the sdhi / tmio mmc driver to using the dmaengine_slave_config() channel configuration method. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski+renesas@gmail.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:20 -04:00
Guennadi Liakhovetski	03a0675b2a	mmc: sdhi/tmio: make DMA filter implementation specific So far only the SDHI implementation uses TMIO MMC with DMA. That way a DMA channel filter function, defined in the TMIO driver wasn't a problem. However, such a filter function is DMA controller specific. Since the SDHI glue is only running on systems with the SHDMA DMA controller, the filter function can safely be provided by it. Move it into SDHI. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski+renesas@gmail.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:19 -04:00
Fredrik Soderstedt	6044371219	mmc: core: Fix select power class after resume Use the saved values in card->ext_csd when selecting power class. By doing this the power class will be selected even if mmc_init_card is called with oldcard != NULL, which is the case after a suspend/resume. Today ext_csd is NULL if mmc_init_card is called with oldcard != NULL and power class will not be selected. According to the eMMC specification the POWER_CLASS value is reset after power failure, H/W reset assertion and any CMD0 reset. Signed-off-by: Fredrik Soderstedt <fredrik.soderstedt@stericsson.com> Reviewed-by: Johan Rudholm <jrudholm@gmail.com> Acked By: Girish K S <girish.shivananjappa@linaro.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:18 -04:00
Ulf Hansson	07a6821608	mmc: core: Restructure and simplify code for mmc sleep\|awake The mmc_card_sleep\|awake APIs are not being used since the support is already properly encapsulated within the suspend sequence. Sleep\|awake command is also specific for eMMC. We remove the sleep\|awake bus_ops, the mmc_card_sleep\|awake APIs and move the code into the mmc specific core instead. This also includes the mmc ops function, mmc_sleepawake. All releated functions have then become static and we have got far less code to maintain. Additionally this patch also simplifies the code from mmc_sleepawake, since it is only used to put the card to sleep and not awake. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:17 -04:00
Ulf Hansson	c4d770d724	mmc: core: Support aggressive power management for (e)MMC/SD Aggressive power management is suitable when saving power is essential. At request inactivity timeout, aka pm runtime autosuspend timeout, the card will be suspended. Once a new request arrives, the card will be re-initalized and thus the first request will suffer from a latency. This latency is card-specific, experiments has shown in general that SD-cards has quite poor initialization time, around 300ms-1100ms. eMMC is not surprisingly far better but still a couple of hundreds of ms has been observed. Except for the request latency, it is important to know that suspending the card will also prevent the card from executing internal house-keeping operations in idle mode. This could mean degradation in performance. To use this feature make sure the request inactivity timeout is chosen carefully. This has not been done as a part of this patch. Enable this feature by using host cap MMC_CAP_AGGRESSIVE_PM and by setting CONFIG_MMC_UNSAFE_RESUME. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:17 -04:00
Ulf Hansson	e94cfef698	mmc: block: Enable runtime pm for mmc blkdevice Once the mmc blkdevice is being probed, runtime pm will be enabled. By using runtime autosuspend, the power save operations can be done when request inactivity occurs for a certain time. Right now the selected timeout value is set to 3 s. Obviously this value will likely need to be configurable somehow since it needs to be trimmed depending on the power save algorithm. For SD-combo cards, we are still leaving the enablement of runtime PM to the SDIO init sequence since it depends on the capabilities of the SDIO func driver. Moreover, when the blk device is being suspended, we make sure the device will be runtime resumed. The reason for doing this is that we want the host suspend sequence to be unaware of any runtime power save operations done for the card in this phase. Thus it can just handle the suspend as the card is fully powered from a runtime perspective. Finally, this patch prepares to make it possible to move BKOPS handling into the runtime callbacks for the mmc bus_ops. Thus IDLE BKOPS can be accomplished. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:16 -04:00
Maya Erez	775a9362b5	mmc: card: Adding support for sanitize in eMMC 4.5 The sanitize support is added as a user-app ioctl call, and was removed from the block-device request, since its purpose is to be invoked not via File-System but by a user. This feature deletes the unmap memory region of the eMMC card, by writing to a specific register in the EXT_CSD. unmap region is the memory region that was previously deleted (by erase, trim or discard operation). In order to avoid timeout when sanitizing large-scale cards, the timeout for sanitize operation is 240 seconds. Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:13 -04:00
Ulf Hansson	b689167984	mmc: core: Re-use code for MMC_CAP2_DETECT_ON_ERR in polling mode Previously the MMC_CAP2_DETECT_ON_ERR was invented for detecting slow card removal. In was never a realy good solution and a proper fix has been merged using gpio debouncing instead. We remove this cap in this patch. Although when using polling card detect mode, the code invented for MMC_CAP2_DETECT_ON_ERR is re-used to complete card removal in an earlier phase. There are no need waiting for the polling timeout to elapse in this case. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Liu <kliu5@marvell.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2013-05-26 14:23:12 -04:00
Jiri Pirko	42e52bf9e3	net: add netnotifier event for upper device change Now when upper device is changed, event is not propagated via RT Netlink to userspace. Userspace might never now about the change. Fix this by adding upper-device-change notifier event. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 23:12:19 -07:00
Linus Torvalds	27a24cfa04	Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma Pull slave-dma fixes from Vinod Koul: "We have two patches from Andy & Rafael fixing the Lynxpoint dma" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: ACPI / LPSS: register clock device for Lynxpoint DMA properly dma: acpi-dma: parse CSRT to extract additional resources	2013-05-25 20:30:31 -07:00
Lars-Peter Clausen	b6b5e76bb8	ASoC: Add ssm2518 support This patch adds a ASoC CODEC driver for the SSM2516. The SSM2516 is a stereo Class-D audio amplifier with an I2S interface for audio in and a built-in dynamic range control processor. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2013-05-25 10:33:30 -04:00
Linus Torvalds	9cf1848278	Merge branch 'akpm' (incoming from Andrew Morton) Merge fixes from Andrew Morton: "A bunch of fixes and one simple fbdev driver which missed the merge window because people will still talking about it (to no great effect)." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (30 commits) aio: fix kioctx not being freed after cancellation at exit time mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap() random: fix accounting race condition with lockless irq entropy_count update drivers/char/random.c: fix priming of last_data mm/memory_hotplug.c: fix printk format warnings nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary drivers/block/brd.c: fix brd_lookup_page() race fbdev: FB_GOLDFISH should depend on HAS_DMA drivers/rtc/rtc-pl031.c: pass correct pointer to free_irq() auditfilter.c: fix kernel-doc warnings aio: fix io_getevents documentation revert "selftest: add simple test for soft-dirty bit" drivers/leds/leds-ot200.c: fix error caused by shifted mask mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer linux/kernel.h: fix kernel-doc warning mm compaction: fix of improper cache flush in migration code rapidio/tsi721: fix bug in MSI interrupt handling hfs: avoid crash in hfs_bnode_create ...	2013-05-24 18:12:15 -07:00
David S. Miller	e6ff4c75f9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge net into net-next because some upcoming net-next changes build on top of bug fixes that went into net. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-24 16:48:28 -07:00
Linus Torvalds	00cec111ac	ARM: SoC fixes for 3.10-rc We didn't have any fixes sent up for -rc2, so this is a slightly larger batch. A bit all over the place platform-wise; OMAP, at91, marvell, renesas, sunxi, ux500, etc. I tried to summarize highlights but there isn't a whole lot to point out. Lots of little things fixed all over. A couple of defconfig updates due to new/changing options. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJRn+66AAoJEIwa5zzehBx3f/4P/3sqK2z7u5SSa+tpkKYkxezO MykUOpUc4tuwrKiuUEeXPh89pjIrclQVzKYDqdaXIcezKB7IXFfQSyLNxDzGM7Y3 NrqrURvNpDmUi6F/xP89gXWkbvg2zr563mxSMpkF4G8HTYxvCv7sY0W/PDzb48Qg q3Efc/AfQsOGM0Zl8WXX9jZBdCNTquZYWd7YEFxCe9oVRGfmNlIYKirPOLnk9MAZ iIrbfFQG+cOos0NKrjM+tbtQNnPUBQeZdy3MvR7DtXCpKNdxs5EJL9EyHHqGGzPk Jd1CG3hNrPiuKVhmVSDkELNDYNT7J9+/rya1Mc33pwMrReAIKWUU/sitvsOkKypi PnTvlJkUv3UM2fOtoF8mOhQLTGdKSWfaF0BfHNmnCqJFDPs8vuQjx7O1WGKKUdC4 SbYZetUnL3AoMrEbza5EoMyqQwpTXiALWp4/6MunLhNyXo0OQvsqexvT6QIrhKyQ 0iExuSSbXDZAw2GXsqzOlCJecxHYG4qkGbf1DmeTW31GRQQ3nhpiu0GVAvHIEAhT SJGD57P0gbEXMpnSIKLNKIAYlLdFkGx6AhX1/Xb+AL7Jsod+UEwgz2ya4GMI4JI/ 0lUe8fPglU3ws8Up1y5FcIq4gjXTsvsEy317zeW/oq2vac0ACKBika2EVukdMD/5 SYr+m40EzQtVdTKuaZ+e =8tMi -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "We didn't have any fixes sent up for -rc2, so this is a slightly larger batch. A bit all over the place platform-wise; OMAP, at91, marvell, renesas, sunxi, ux500, etc. I tried to summarize highlights but there isn't a whole lot to point out. Lots of little things fixed all over. A couple of defconfig updates due to new/changing options." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits) ARM: at91/sama5: fix incorrect PMC pcr div definition ARM: at91/dt: fix macb pinctrl_macb_rmii_mii_alt definition ARM: at91: at91sam9n12: move external irq declatation to DT ARM: shmobile: marzen: Use error values in usb_power_* ARM: tegra: defconfig fixes ARM: nomadik: fix IRQ assignment for SMC ethernet ARM: vt8500: Add missing NULL terminator in dt_compat clk: tegra: add ac97 controller clock clk: tegra: remove USB from clk init table ARM: dts: mvebu: Fix wrong the address reg value for the L2-cache node ARM: plat-orion: Fix num_resources and id for ge10 and ge11 ARM: OMAP2+: hwmod: Remove sysc slave idle and auto idle apis SERIAL: OMAP: Remove the slave idle handling from the driver ARM: OMAP2+: serial: Remove the un-used slave idle hooks ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes ARM: OMAP2+: hwmod: Add a new flag to handle SIDLE in SWSUP only in active ARM: OMAP2+: hwmod: Fix sidle programming in _enable_sysc()/_idle_sysc() arm: mvebu: fix the 'ranges' property to handle PCIe ARM: mvebu: select ARCH_REQUIRE_GPIOLIB for mvebu platform ARM: AM33XX: Add missing .clkdm_name to clkdiv32k_ick clock ...	2013-05-24 16:27:37 -07:00
Randy Dunlap	7450231fb3	linux/kernel.h: fix kernel-doc warning Fix kernel-doc warning in <linux/kernel.h>: Warning(include/linux/kernel.h:590): No description found for parameter 'ip' scripts/kernel-doc cannot handle macros, functions, or function prototypes between the function or macro that is being documented and its definition, so move these prototypes above the function that is being documented. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-24 16:22:51 -07:00
Imre Deak	4c663cfc52	wait: fix false timeouts when using wait_event_timeout() Many callers of the wait_event_timeout() and wait_event_interruptible_timeout() expect that the return value will be positive if the specified condition becomes true before the timeout elapses. However, at the moment this isn't guaranteed. If the wake-up handler is delayed enough, the time remaining until timeout will be calculated as 0 - and passed back as a return value - even if the condition became true before the timeout has passed. Fix this by returning at least 1 if the condition becomes true. This semantic is in line with what wait_for_condition_timeout() does; see commit `bb10ed09` ("sched: fix wait_for_completion_timeout() spurious failure under heavy load"). Daniel said "We have 3 instances of this bug in drm/i915. One case even where we switch between the interruptible and not interruptible wait_event_timeout variants, foolishly presuming they have the same semantics. I very much like this." One such bug is reported at https://bugs.freedesktop.org/show_bug.cgi?id=64133 Signed-off-by: Imre Deak <imre.deak@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Jones <davej@redhat.com> Cc: Lukas Czerner <lczerner@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-24 16:22:50 -07:00
Alexandre Bounine	bc8fcfea18	rapidio: add enumeration/discovery start from user space Add RapidIO enumeration/discovery start from user space. User space start allows to defer RapidIO fabric scan until the moment when all participating endpoints are initialized avoiding mandatory synchronized start of all endpoints (which may be challenging in systems with large number of RapidIO endpoints). Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@Prodrive.nl> Cc: Micha Nelissen <micha.nelissen@Prodrive.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-24 16:22:50 -07:00
Alexandre Bounine	a11650e110	rapidio: make enumeration/discovery configurable Systems that use RapidIO fabric may need to implement their own enumeration and discovery methods which are better suitable for needs of a target application. The following set of patches is intended to simplify process of introduction of new RapidIO fabric enumeration/discovery methods. The first patch offers ability to add new RapidIO enumeration/discovery methods using kernel configuration options. This new configuration option mechanism allows to select statically linked or modular enumeration/discovery method(s) from the list of existing methods or use external module(s). This patch also updates the currently existing enumeration/discovery code to be used as a statically linked or modular method. The corresponding configuration option is named "Basic enumeration/discovery" method. This is the only one configuration option available today but new methods are expected to be introduced after adoption of provided patches. The second patch address a long time complaint of RapidIO subsystem users regarding fabric enumeration/discovery start sequence. Existing implementation offers only a boot-time enumeration/discovery start which requires synchronized boot of all endpoints in RapidIO network. While it works for small closed configurations with limited number of endpoints, using this approach in systems with large number of endpoints is quite challenging. To eliminate requirement for synchronized start the second patch introduces RapidIO enumeration/discovery start from user space. For compatibility with the existing RapidIO subsystem implementation, automatic boot time enumeration/discovery start can be configured in by specifying "rio-scan.scan=1" command line parameter if statically linked basic enumeration method is selected. This patch: Rework to implement RapidIO enumeration/discovery method selection combined with ability to use enumeration/discovery as a kernel module. This patch adds ability to introduce new RapidIO enumeration/discovery methods using kernel configuration options. Configuration option mechanism allows to select statically linked or modular enumeration/discovery method from the list of existing methods or use external modules. If a modular enumeration/discovery is selected each RapidIO mport device can have its own method attached to it. The existing enumeration/discovery code was updated to be used as statically linked or modular method. This configuration option is named "Basic enumeration/discovery" method. Several common routines have been moved from rio-scan.c to make them available to other enumeration methods and reduce number of exported symbols. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@Prodrive.nl> Cc: Micha Nelissen <micha.nelissen@Prodrive.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-24 16:22:50 -07:00
Linus Torvalds	eb3d33900a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "It's been a while since my last pull request so quite a few fixes have piled up." Indeed. 1) Fix nf_{log,queue} compilation with PROC_FS disabled, from Pablo Neira Ayuso. 2) Fix data corruption on some tg3 chips with TSO enabled, from Michael Chan. 3) Fix double insertion of VLAN tags in be2net driver, from Sarveshwar Bandi. 4) Don't have TCP's MD5 support pass > PAGE_SIZE page offsets in scatter-gather entries into the crypto layer, the crypto layer can't handle that. From Eric Dumazet. 5) Fix lockdep splat in 802.1Q MRP code, also from Eric Dumazet. 6) Fix OOPS in netfilter log module when called from conntrack, from Hans Schillstrom. 7) FEC driver needs to use netif_tx_{lock,unlock}_bh() rather than the non-BH disabling variants. From Fabio Estevam. 8) TCP GSO can generate out-of-order packets, fix from Eric Dumazet. 9) vxlan driver doesn't update 'used' field of fdb entries when it should, from Sridhar Samudrala. 10) ipv6 should use kzalloc() to allocate inet6 socket cork options, otherwise we can OOPS in ip6_cork_release(). From Eric Dumazet. 11) Fix races in bonding set mode, from Nikolay Aleksandrov. 12) Fix checksum generation regression added by "r8169: fix 8168evl frame padding.", from Francois Romieu. 13) ip_gre can look at stale SKB data pointer, fix from Eric Dumazet. 14) Fix checksum handling when GSO is enabled in bnx2x driver with certain chips, from Yuval Mintz. 15) Fix double free in batman-adv, from Martin Hundebøll. 16) Fix device startup synchronization with firmware in tg3 driver, from Nithin Sujit. 17) perf networking dropmonitor doesn't work at all due to mixed up trace parameter ordering, from Ben Hutchings. 18) Fix proportional rate reduction handling in tcp_ack(), from Nandita Dukkipati. 19) IPSEC layer doesn't return an error when a valid state is detected, causing an OOPS. Fix from Timo Teräs. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits) be2net: bug fix on returning an invalid nic descriptor tcp: xps: fix reordering issues net: Revert unused variable changes. xfrm: properly handle invalid states as an error virtio_net: enable napi for all possible queues during open tcp: bug fix in proportional rate reduction. net: ethernet: sun: drop unused variable net: ethernet: korina: drop unused variable net: ethernet: apple: drop unused variable qmi_wwan: Added support for Cinterion's PLxx WWAN Interface perf: net_dropmonitor: Remove progress indicator perf: net_dropmonitor: Use bisection in symbol lookup perf: net_dropmonitor: Do not assume ordering of dictionaries perf: net_dropmonitor: Fix symbol-relative addresses perf: net_dropmonitor: Fix trace parameter order net: fec: use a more proper compatible string for MVF type device qlcnic: Fix updating netdev->features qlcnic: remove netdev->trans_start updates within the driver qlcnic: Return proper error codes from probe failure paths tg3: Update version to 3.132 ...	2013-05-24 08:27:32 -07:00
Tejun Heo	75501a6d59	cgroup: update iterators to use cgroup_next_sibling() This patch converts cgroup_for_each_child(), cgroup_next_descendant_pre/post() and thus cgroup_for_each_descendant_pre/post() to use cgroup_next_sibling() instead of manually dereferencing ->sibling.next. The only reason the iterators couldn't allow dropping RCU read lock while iteration is in progress was because they couldn't determine the next sibling safely once RCU read lock is dropped. Using cgroup_next_sibling() removes that problem and enables all iterators to allow dropping RCU read lock in the middle. Comments are updated accordingly. This makes the iterators easier to use and will simplify controllers. Note that @cgroup argument is renamed to @cgrp in cgroup_for_each_child() because it conflicts with "struct cgroup" used in the new macro body. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Reviewed-by: Michal Hocko <mhocko@suse.cz>	2013-05-24 10:55:38 +09:00
Tejun Heo	53fa526174	cgroup: add cgroup->serial_nr and implement cgroup_next_sibling() Currently, there's no easy way to find out the next sibling cgroup unless it's known that the current cgroup is accessed from the parent's children list in a single RCU critical section. This in turn forces all iterators to require whole iteration to be enclosed in a single RCU critical section, which sometimes is too restrictive. This patch implements cgroup_next_sibling() which can reliably determine the next sibling regardless of the state of the current cgroup as long as it's accessible. It currently is impossible to determine the next sibling after dropping RCU read lock because the cgroup being iterated could be removed anytime and if RCU read lock is dropped, nothing guarantess its ->sibling.next pointer is accessible. A removed cgroup would continue to point to its next sibling for RCU accesses but stop receiving updates from the sibling. IOW, the next sibling could be removed and then complete its grace period while RCU read lock is dropped, making it unsafe to dereference ->sibling.next after dropping and re-acquiring RCU read lock. This can be solved by adding a way to traverse to the next sibling without dereferencing ->sibling.next. This patch adds a monotonically increasing cgroup serial number, cgroup->serial_nr, which guarantees that all cgroup->children lists are kept in increasing serial_nr order. A new function, cgroup_next_sibling(), is implemented, which, if CGRP_REMOVED is not set on the current cgroup, follows ->sibling.next; otherwise, traverses the parent's ->children list until it sees a sibling with higher ->serial_nr. This allows the function to always return the next sibling regardless of the state of the current cgroup without adding overhead in the fast path. Further patches will update the iterators to use cgroup_next_sibling() so that they allow dropping RCU read lock and blocking while iteration is in progress which in turn will be used to simplify controllers. v2: Typo fix as per Serge. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>	2013-05-24 10:55:38 +09:00
Tejun Heo	bdc7119f1b	cgroup: make cgroup_is_removed() static cgroup_is_removed() no longer has external users and it shouldn't grow any - controllers should deal with cgroup_subsys_state on/offline state instead of cgroup removal state. Make it static. While at it, make it return bool. Signed-off-by: Tejun Heo <tj@kernel.org>	2013-05-24 10:55:38 +09:00

... 134 135 136 137 138 ...

42289 commits