This patch introduces device tree bindings for
describing hardware thermal behavior and limits.
It also presents a parser that reads and interprets the
data and feeds it into the thermal framework.
This patch introduces a thermal data parser for device
tree. The parsed data is used to build thermal zones
and thermal binding parameters. The output data
can then be used to deploy thermal policies.
This patch also adds documentation for this
API and for defining tree nodes that use
this infrastructure.
Note that, in order to control sensor registration
on the DT thermal zone, it was necessary to allow
changing the thermal zone .get_temp callback. For
this reason, this patch also removes the 'const'
modifier from the .ops field of thermal zone devices.
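As a rough user-space illustration of why the 'const' had to go, the
sketch below models a zone whose .get_temp callback is replaced when a
sensor registers. All names here (struct zone, zone_register_sensor()
and so on) are hypothetical and are not the kernel's thermal API.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a thermal zone; not the kernel's types. */
struct zone;

struct zone_ops {
	int (*get_temp)(struct zone *z, long *temp);
};

struct zone {
	struct zone_ops *ops;	/* non-const, so callbacks can be swapped later */
	void *sensor_data;
};

/* Placeholder callback used until a sensor is registered. */
static int no_sensor_get_temp(struct zone *z, long *temp)
{
	(void)z;
	(void)temp;
	return -1;
}

/* Example sensor callback: reports the value stored in sensor_data. */
static int demo_sensor_get_temp(struct zone *z, long *temp)
{
	*temp = *(long *)z->sensor_data;
	return 0;
}

/* Registering a sensor replaces .get_temp; a const ops pointer forbids this. */
static void zone_register_sensor(struct zone *z, void *data,
				 int (*get_temp)(struct zone *, long *))
{
	z->sensor_data = data;
	z->ops->get_temp = get_temp;
}
```

With a 'const struct zone_ops *ops' field, the assignment in
zone_register_sensor() would not compile, which is the constraint the
note above describes.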
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Pull dynticks updates from Frederic Weisbecker:
* Fix a bug where posix cpu timers requeued due to an interval got ignored on full
dynticks CPUs (not a regression though, as it only impacts full dynticks and the
bug has been there since we merged full dynticks).
* Optimizations and cleanups on the use of per CPU APIs to improve code readability,
performance and debuggability in the nohz subsystem
* Optimize posix cpu timers by skipping the stub workqueue queueing when full dynticks is off
* Rename some functions to add a *_this_cpu() suffix for clarity
* Refine the naming of some context tracking subsystem state accessors
* Trivial spelling fix by Paul Gortmaker
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'ux500-devicetree-v3.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into next/dt
From Linus Walleij:
Ux500 device tree patches for v3.14, take one:
- Fix up tc3589x bindings so this chip works again.
- Remove SSP platform devices, as we now boot from device tree
exclusively.
- Delete surplus AB8500/DB8500 platform data, not obtained from
the device tree.
- Add DMA config for the MSP devices.
- A series of 21 patches moving pin control config for the
on-chip Nomadik pin controller from the board file
to the device tree, step by step.
- Two patches to the STE DMA40 driver regarding the high-prio
DMA channel so this can be moved to the device tree. Both have
Vinod's ACK.
- Decommission of the non-device tree boot path for the timer
initialization code.
- Deletion of the non-devicetree probe path from the MTU timer
driver, as all platforms using it are now using device tree.
This has Daniel Lezcano's ACK.
* tag 'ux500-devicetree-v3.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson: (39 commits)
ARM: ux500: decomission custom SMP TWD timer init
clksrc: delete nomadik MTU non-DT boot path
ARM: ux500: decomission the non-DT MTU init sequence
dma: ste_dma40: Parse flags property for new 'high priority channel' request
dma: ste_dma40: Expand DT binding to accept 'high-priority channel' flag
ARM: ux500: Remove checking for DT during timer init
ARM: ux500: Clean-up legacy extern prototype
ARM: ux500: Remove unused call to register AMBA devices
ARM: ux500: Clean-up non-DT IRQ initialisation
pinctrl: nomadik: decomission non-DT boot path
pinctrl: nomadik: move platform data handling into driver
ARM: ux500: get rid of unused header
ARM: ux500: delete Nomadik pinctrl AUXDATA
ARM: ux500: delete remnant pin config macros
ARM: ux500: move snowball pin configs to device tree
ARM: ux500: move snowball LED pin control to device tree
ARM: ux500: convert Snowball SPI pin reference
ARM: ux500: move snowball ethernet config to device tree
ARM: ux500: move HREFv60plus pin configs to device tree
ARM: ux500: move final HREFv60 LCD pins to device tree
...
Signed-off-by: Olof Johansson <olof@lixom.net>
Merge tag 'iio-fixes-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
Second round of IIO fixes for the 3.13 cycle.
2 fixes here.
* The gp2ap020a00f is a simple missing kconfig dependency.
* The hid sensors hub fix is a workaround for an issue introduced by hardware
changes due to a certain large software vendor having an 'interesting'
interpretation of the specification and hence indexing some arrays from 1
rather than 0. The fix takes advantage of the logical min and max reading
facilities introduced by the precursor patch to figure out whether we have
a 0 indexed or 1 indexed device and to adjust appropriately. It also
drops a previous kconfig option that allowed this issue to be worked around
at build time.
Added usage id processing for Inclinometer 3D. This uses the IIO
interfaces for triggered buffers to present data to user
mode. It uses the HID sensor framework for registering callback
events from the sensor hub.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
There are already humidity sensors in the hwmon subsystem,
so we use their unit (milli percent) here as well.
Signed-off-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
This patch adds a new data_available() callback to the iio_buffer_access_funcs
struct. The callback is used to indicate whether data is available in the buffer
for reading. It is meant to replace the stufftoread flag from the iio_buffer
struct. The reasoning is that the buffer implementation can usually
determine whether data is available rather easily based on its state. Updating
the stufftoread flag in a race-free way, on the other hand, can be rather
tricky.
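A minimal sketch of the idea, using a hypothetical ring buffer rather
than the real IIO structures: readiness is derived from the buffer
state on demand, so there is no separate flag to keep in sync.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring buffer; names are illustrative, not the IIO structures. */
struct demo_buffer {
	size_t head;	/* next slot to write */
	size_t tail;	/* next slot to read */
	size_t size;	/* number of slots in use, at most 8 */
	int slots[8];
};

/* Derive readiness from the buffer state instead of a separately kept flag. */
static bool demo_data_available(const struct demo_buffer *b)
{
	return b->head != b->tail;
}

static bool demo_push(struct demo_buffer *b, int v)
{
	size_t next = (b->head + 1) % b->size;

	if (next == b->tail)
		return false;	/* full */
	b->slots[b->head] = v;
	b->head = next;
	return true;
}

static bool demo_pop(struct demo_buffer *b, int *v)
{
	if (!demo_data_available(b))
		return false;	/* empty */
	*v = b->slots[b->tail];
	b->tail = (b->tail + 1) % b->size;
	return true;
}
```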
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Traditionally the "get" functions increment the reference count of the
object that is returned, which does not happen with vme_slot_get. The
function vme_slot_get returns the physical VME slot associated with a
particular struct vme_dev. Rename it to vme_slot_num to avoid any confusion.
Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The match function for vme_user is completely wrong. It will blindly bind
against the first VME slot on each bus (at this point that would be just the
first bus as the driver can only handle one bus).
The original intention (before some major subsystem changes) was that the
driver bind against the slot to which the bridge was attached in the VME
system and to the bus(es) provided via the "bus" module parameter.
To do this cleanly (i.e. without poking around in the subsystem's internal
structures), functionality has been added to provide access to the bus
enumeration.
Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add declaration of 'struct of_phandle_args' to avoid the following
warning:
In file included from arch/arm/mach-tegra/board-paz00.c:21:0:
include/linux/gpio/driver.h:102:17: warning: 'struct of_phandle_args' declared inside parameter list
include/linux/gpio/driver.h:102:17: warning: its scope is only this definition or declaration, which is probably not what you want
Also proactively add other definitions/includes that could be missing
in other contexts.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reported-by: Stephen Warren <swarren@wwwdotorg.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The PIN_CONFIG_OUTPUT parameter is really tricky to understand
and needs an explicit pointer to the documentation.
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
When multiple wireless USB devices are connected and one of the devices
disconnects, the host will distribute a new group key to the remaining
devices using wusbhc_gtk_rekey. wusbhc_gtk_rekey takes the
wusbhc->mutex and holds it while it submits a URB to set the new key.
This causes a deadlock in wa_urb_enqueue when it calls a device lookup
helper function that takes the same lock.
This patch changes wusbhc_gtk_rekey to submit a work item to set the GTK
so that the URB is submitted without holding wusbhc->mutex.
Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 018c5bba05.
It causes regressions for people using chips driven by the sungem
driver. Suspicion is that the skb->csum value isn't being adjusted
properly.
The change also has a bug in that if __pskb_trim() fails, we'll leave
a corrupted skb->csum value in there. We would really need to revert
it to its original value in that case.
Signed-off-by: David S. Miller <davem@davemloft.net>
In the original HID sensor hub firmware, all named array enums were
0-based. The most recent hubs, however, implement them as 1-based,
following the implementation by one of the major OS vendors.
Use the logical minimum of the field as the base of the enum: add the
logical minimum to the selector values before setting those fields.
Some sensor hub firmware has already changed the logical minimum from
0 to 1 to reflect this, and we hope every other vendor will follow.
There is no easy way to add a common HID quirk for NAry elements:
even though the standard specifies these fields as NAry, the collection
used to describe selectors is still just "logical".
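The adjustment amounts to the following, shown as a hypothetical
helper (demo_selector_to_raw() is not a function in the driver):

```c
/* Hypothetical helper: map a 0-based selector to the value the hub expects,
 * using the field's logical minimum as the enum base. */
static int demo_selector_to_raw(int selector, int logical_min)
{
	return selector + logical_min;
}
```

A hub reporting a logical minimum of 0 receives the selector
unchanged; one reporting 1 receives selector + 1.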
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Export the logical minimum and maximum of HID fields as part of the
hid sensor attribute info. This can be used for range checking and
to calculate the enumeration base for NAry fields of a HID sensor hub.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Modify tg3_chip_reset() and tg3_close() to check if the PCI network
adapter device is accessible at all in order to skip poking it or
trying to handle a carrier loss in vain when that's not the case.
Introduce a special PCI helper function pci_device_is_present()
for this purpose.
Of course, this uncovers the lack of the appropriate RTNL locking
in tg3_suspend() and tg3_resume(), so add that locking in there
too.
These changes prevent tg3 from burning a CPU at 100% load for
several solid seconds after the Thunderbolt link is disconnected from
a Matrox DS1 docking station.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB
can only occur at a boundary between underlying USB frames (512 bytes for
high speed devices).
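Expressed as a check, and assuming the boundary is simply a multiple
of the frame size counted from the start of the transfer, the
constraint looks roughly like this (demo_link_trb_allowed() is a
hypothetical helper, not part of the xHCI driver):

```c
#include <stdbool.h>
#include <stddef.h>

/* True if a link TRB may follow after bytes_queued bytes of a transfer,
 * i.e. the data queued so far ends exactly on a frame boundary. */
static bool demo_link_trb_allowed(size_t bytes_queued, size_t frame_size)
{
	return frame_size != 0 && bytes_queued % frame_size == 0;
}
```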
If this isn't done the USB frames aren't formatted correctly and, for example,
the USB3 ethernet ax88179_178a card will stop sending (while still receiving)
when running a netperf tcp transmit test with (say) an 8k buffer.
This should be a candidate for stable: the ax88179_178a driver defaults to
gso and tso enabled, so it passes a lot of fragmented skbs to the USB stack.
Notes from Sarah:
Discussion: http://marc.info/?l=linux-usb&m=138384509604981&w=2
This patch fixes a long-standing xHCI driver bug that was revealed by a
change in 3.12 in the usb-net driver. Commit
638c5115a7 "USBNET: support DMA SG" added
support to use bulk endpoint scatter-gather (urb->sg). Only the USB
ethernet drivers trigger this bug, because the mass storage driver sends
sg list entries in page-sized chunks.
This patch only fixes the issue for bulk endpoint scatter-gather. The
problem will still occur for periodic endpoints, because hosts will
interpret no-op transfers as a request to skip a service interval, which
is not what we want.
Luckily, the USB core isn't set up for scatter-gather on isochronous
endpoints, and no USB drivers use scatter-gather for interrupt
endpoints. Document this known limitation so that developers won't try
to use urb->sg for interrupt endpoints until this issue is fixed. The
more comprehensive fix would be to allow link TRBs in the middle of the
endpoint ring and revert this patch, but that fix would touch too much
code to be allowed in for stable.
This patch should be backported to kernels as old as 3.12 that contain
commit 638c5115a7 "USBNET: support DMA
SG". Without this patch, the USB network device gets wedged, and stops
sending packets. Mark Lord confirms this patch fixes the regression:
http://marc.info/?l=linux-netdev&m=138487107625966&w=2
Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Tested-by: Mark Lord <mlord@pobox.com>
Cc: stable@vger.kernel.org
We currently have a confusing pair of API names with the existing
context_tracking_active() and context_tracking_is_enabled().
Let's keep the latter, context_tracking_is_enabled(), for the global
context tracking state check and use context_tracking_cpu_is_enabled()
for the local state check.
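The split between the two accessors can be pictured with this
user-space model. The demo_* names and the plain array standing in for
per-CPU data are assumptions for illustration only.

```c
#include <stdbool.h>

#define DEMO_NR_CPUS 4

static bool demo_tracking_enabled;		/* global state */
static bool demo_cpu_tracking[DEMO_NR_CPUS];	/* stand-in for per-CPU state */

/* Enabling tracking on any CPU also turns on the global state. */
static void demo_tracking_enable_cpu(int cpu)
{
	demo_cpu_tracking[cpu] = true;
	demo_tracking_enabled = true;
}

/* Global check: is context tracking in use anywhere? */
static bool demo_tracking_is_enabled(void)
{
	return demo_tracking_enabled;
}

/* Local check: is context tracking active on this particular CPU? */
static bool demo_tracking_cpu_is_enabled(int cpu)
{
	return demo_tracking_is_enabled() && demo_cpu_tracking[cpu];
}
```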
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Use a function with a meaningful name to check the global context
tracking state. static_key_false() is a bit confusing for reviewers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
A few functions use remote per CPU access APIs when they
deal with local values.
Just do the right conversion to improve performance, code
readability and debug checks.
While at it, let's extend some of these function names with a *_this_cpu()
suffix in order to display their purpose more clearly.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Pull irq fixes from Thomas Gleixner:
- Correction of fuzzy and fragile IRQ_RETVAL macro
- IRQ related resume fix affecting only XEN
- ARM/GIC fix for chained GIC controllers
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Gic: fix boot for chained gics
irq: Enable all irqs unconditionally in irq_resume
genirq: Correct fuzzy and fragile IRQ_RETVAL() definition
Pull scheduler fixes from Ingo Molnar:
"Various smaller fixlets, all over the place"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/doc: Fix generation of device-drivers
sched: Expose preempt_schedule_irq()
sched: Fix a trivial typo in comments
sched: Remove unused variable in 'struct sched_domain'
sched: Avoid NULL dereference on sd_busy
sched: Check sched_domain before computing group power
MAINTAINERS: Update file patterns in the lockdep and scheduler entries
Pull perf fixes from Ingo Molnar:
"Misc kernel and tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools lib traceevent: Fix conversion of pointer to integer of different size
perf/trace: Properly use u64 to hold event_id
perf: Remove fragile swevent hlist optimization
ftrace, perf: Avoid infinite event generation loop
tools lib traceevent: Fix use of multiple options in processing field
perf header: Fix possible memory leaks in process_group_desc()
perf header: Fix bogus group name
perf tools: Tag thread comm as overriden
Binary was written as binay, probably by mistake. Fix it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch adds new at91 pll clock implementation using common clk framework.
The pll clock layout describes the PLLX register layout.
There are four pll clock layouts:
- at91rm9200
- at91sam9g20
- at91sam9g45
- sama5d3
PLL clocks are given characteristics:
- min/max clock source rate
- ranges of valid clock output rates
- values to set in out and icpll fields for each supported output range
These characteristics are checked during rate change to avoid
over/underclocking.
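A minimal sketch of such a check follows. The structure and field
names are hypothetical and do not mirror the driver's definitions.

```c
#include <stddef.h>

struct demo_range {
	unsigned long min;
	unsigned long max;
};

/* Hypothetical characteristics; field names are illustrative only. */
struct demo_pll_characteristics {
	struct demo_range input;		/* valid clock source rates */
	const struct demo_range *output;	/* valid output rate ranges */
	size_t num_output;
};

/* Returns the index of the output range containing out_rate, or -1 if the
 * requested configuration would over- or underclock the PLL. */
static int demo_pll_check(const struct demo_pll_characteristics *c,
			  unsigned long in_rate, unsigned long out_rate)
{
	size_t i;

	if (in_rate < c->input.min || in_rate > c->input.max)
		return -1;
	for (i = 0; i < c->num_output; i++)
		if (out_rate >= c->output[i].min && out_rate <= c->output[i].max)
			return (int)i;
	return -1;
}
```

The returned index could then select the per-range out and icpll
values listed above.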
These characteristics are described in Atmel's SoC datasheets, in the
"Electrical Characteristics" section.
Signed-off-by: Boris BREZILLON <b.brezillon@overkiz.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
This patch moves at91_pmc.h header from machine specific directory
(arch/arm/mach-at91/include/mach/at91_pmc.h) to clk include directory
(include/linux/clk/at91_pmc.h).
We need this to avoid reference to machine specific headers in clk
drivers.
Signed-off-by: Boris BREZILLON <b.brezillon@overkiz.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
We have a problem where the big_key key storage implementation uses a
shmem backed inode to hold the key contents. Because of this detail of
implementation LSM checks are being done between processes trying to
read the keys and the tmpfs backed inode. The LSM checks are already
being handled on the key interface level and should not be enforced at
the inode level (since the inode is an implementation detail, not a
part of the security model).
This patch implements a new function shmem_kernel_file_setup() which
returns the equivalent to shmem_file_setup() only the underlying inode
has S_PRIVATE set. This means that all LSM checks for the inode in
question are skipped. It should only be used for kernel internal
operations where the inode is not exposed to userspace without proper
LSM checking. It is possible that some other users of
shmem_file_setup() should use the new interface, but this has not been
explored.
Reproducing this bug is a little bit difficult. The steps I used on
Fedora are:
(1) Turn off selinux enforcing:
setenforce 0
(2) Create a huge key
k=`dd if=/dev/zero bs=8192 count=1 | keyctl padd big_key test-key @s`
(3) Access the key in another context:
runcon system_u:system_r:httpd_t:s0-s0:c0.c1023 keyctl print $k >/dev/null
(4) Examine the audit logs:
ausearch -m AVC -i --subject httpd_t | audit2allow
If the last command's output includes a line that looks like:
allow httpd_t user_tmpfs_t:file { open read };
then there was an inode check between httpd and the tmpfs filesystem. With
this patch no such denial will be seen. (NOTE! you should clear your
audit log if you have tested for this previously)
(Please return your box to enforcing)
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hugh Dickins <hughd@google.com>
cc: linux-mm@kvack.org
If sufficient keys (or keyrings) are added into a keyring such that a node in
the associative array's tree overflows (each node has a capacity N, currently
16) and such that all N+1 keys have the same index key segment for that level
of the tree (the level'th nibble of the index key), then assoc_array_insert()
calls ops->diff_objects() to indicate at which bit position the two index keys
vary.
However, __key_link_begin() passes a NULL object to assoc_array_insert() with
the intention of supplying the correct pointer later before we commit the
change. This means that keyring_diff_objects() is given a NULL pointer as one
of its arguments which it does not expect. This results in an oops like the
attached.
With the previous patch to fix the keyring hash function, this can be forced
much more easily by creating a keyring and only adding keyrings to it. Add any
other sort of key and a different insertion path is taken - all 16+1 objects
must want to cluster in the same node slot.
This can be tested by:
r=`keyctl newring sandbox @s`
for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done
This should work fine, but oopses when the 17th keyring is added.
Since ops->diff_objects() is always called with the first pointer pointing to
the object to be inserted (ie. the NULL pointer), we can fix the problem by
changing the to-be-inserted object pointer to point to the index key passed
into assoc_array_insert() instead.
Whilst we're at it, we also switch the arguments so that they are the same as
for ->compare_object().
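As a simplified illustration of the fixed calling convention, the
callback below compares an existing object against an index key rather
than against a second (possibly NULL) object. The types and the
single-word key are assumptions; real index keys are longer.

```c
/* Hypothetical, single-word stand-ins for an index key and a stored object. */
struct demo_index_key {
	unsigned long hash;
};

struct demo_object {
	struct demo_index_key key;
};

/* Returns the lowest bit position at which the object's key and index_key
 * differ, or -1 if they are identical. Argument order is (object, index_key). */
static int demo_diff_objects(const struct demo_object *object,
			     const struct demo_index_key *index_key)
{
	unsigned long diff = object->key.hash ^ index_key->hash;
	int bit = 0;

	if (diff == 0)
		return -1;
	while (!(diff & 1UL)) {
		diff >>= 1;
		bit++;
	}
	return bit;
}
```

Because the second argument is the index key itself, the callback never
has to dereference the not-yet-supplied object pointer.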
BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
RIP: 0010:[<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
Call Trace:
[<ffffffff81191f9d>] keyring_diff_objects+0x21/0xd2
[<ffffffff811f09ef>] assoc_array_insert+0x3b6/0x908
[<ffffffff811929a7>] __key_link_begin+0x78/0xe5
[<ffffffff81191a2e>] key_create_or_update+0x17d/0x36a
[<ffffffff81192e0a>] SyS_add_key+0x123/0x183
[<ffffffff81400ddb>] tracesys+0xdd/0xe2
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
This patch adds a notifier chain to power_supply, which lets drivers
in other subsystems listen to changes in the power supply subsystem.
This helps those drivers take action when power supply properties
change. One such scenario is to increase/decrease system
performance based on the battery capacity/voltage. Another scenario is to
adjust the h/w peak current detection voltage/current thresholds based on
battery voltage/capacity. The notifier helps drivers listen to changes
in the power_supply subsystem without polling the power_supply properties.
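The pattern can be sketched in plain C as below. This is a toy chain
with hypothetical demo_* names, not the kernel's notifier API, and it
omits the locking a real chain needs.

```c
#include <stddef.h>

struct demo_notifier {
	int (*call)(struct demo_notifier *nb, unsigned long event, void *data);
	struct demo_notifier *next;
};

struct demo_chain {
	struct demo_notifier *head;
};

/* Add a listener at the front of the chain. */
static void demo_chain_register(struct demo_chain *c, struct demo_notifier *nb)
{
	nb->next = c->head;
	c->head = nb;
}

/* Call every listener; returns how many were invoked. */
static int demo_chain_notify(struct demo_chain *c, unsigned long event,
			     void *data)
{
	struct demo_notifier *nb;
	int n = 0;

	for (nb = c->head; nb; nb = nb->next) {
		nb->call(nb, event, data);
		n++;
	}
	return n;
}

/* Example listener: records the last reported battery capacity. */
static int demo_last_capacity = -1;

static int demo_capacity_listener(struct demo_notifier *nb,
				  unsigned long event, void *data)
{
	(void)nb;
	(void)event;
	demo_last_capacity = *(int *)data;
	return 0;
}
```

A listener registered this way is told about each change as it
happens, so it has no need to poll.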
Signed-off-by: Jenny TC <jenny.tc@intel.com>
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Acked-by: Jenny TC <jenny.tc@intel.com>
Signed-off-by: Anton Vorontsov <anton@enomsg.org>
fs/sysfs/symlink.c::sysfs_delete_link() tests @sd->s_flags for
SYSFS_FLAG_NS. Let's add kernfs_ns_enabled() so that sysfs doesn't
have to test sysfs_dirent flag directly. This makes things tidier for
kernfs proper too.
This is purely cosmetic.
v2: To avoid possible NULL deref, use noop dummy implementation which
always returns false when !CONFIG_SYSFS.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs_dirent includes some information which should be available to
kernfs users - the type, flags, name and parent pointer. This patch
moves sysfs_dirent definition from kernfs/kernfs-internal.h to
include/linux/kernfs.h so that kernfs users can access them.
The type part of flags is exported as enum kernfs_node_type, the flags
kernfs_node_flag, sysfs_type() and kernfs_enable_ns() are moved to
include/linux/kernfs.h and the former is updated to return the enum
type. sysfs_dirent->s_parent and ->s_name are marked explicitly as
public.
This patch doesn't introduce any functional changes.
v2: Flags exported too and kernfs_enable_ns() definition moved.
v3: While moving kernfs_enable_ns() to include/linux/kernfs.h, v1 and
v2 put the definition outside CONFIG_SYSFS replacing the dummy
implementation with the actual implementation too. Unfortunately,
this can lead to oops when !CONFIG_SYSFS because
kernfs_enable_ns() may be called on a NULL @sd and now tries to
dereference @sd instead of not doing anything. This issue was
reported by Yuanhan Liu.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly. This patch
rearranges mount path so that the kernfs and sysfs parts are separate.
* As sysfs_super_info won't be visible outside kernfs proper,
kernfs_super_ns() is added to allow kernfs users to access a
super_block's namespace tag.
* Generic mount operation is separated out into kernfs_mount_ns().
sysfs_mount() now just performs sysfs-specific permission check,
acquires namespace tag, and invokes kernfs_mount_ns().
* Generic superblock release is separated out into kernfs_kill_sb()
which can be used directly as file_system_type->kill_sb(). As sysfs
needs to put the namespace tag, sysfs_kill_sb() wraps
kernfs_kill_sb() with ns tag put.
* sysfs_dir_cachep init and sysfs_inode_init() are separated out into
kernfs_init(). kernfs_init() uses only a small amount of memory and
trying to handle and propagate kernfs_init() failure doesn't make
much sense. Use SLAB_PANIC for sysfs_dir_cachep and make
sysfs_inode_init() panic on failure.
After this change, kernfs_init() should be called before
sysfs_init(); fs/namespace.c::mnt_init() is modified accordingly.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs is being updated to allow multiple sysfs_dirent hierarchies so
that it can also be used by other users. Currently, inode number is
allocated using a global ida, sysfs_ino_ida; however, inos for
different hierarchies should be handled separately.
This patch makes ino allocation per kernfs_root. sysfs_ino_ida is
replaced by kernfs_root->ino_ida and sysfs_new_dirent() is updated to
take @root and allocate ino from it. ida_simple_get/remove() are used
instead of sysfs_ino_lock and sysfs_alloc/free_ino().
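To illustrate per-root allocation, here is a toy allocator in which
each root owns its own ID space. The 64-entry bitmap and the demo_*
names are assumptions; the patch itself uses the kernel's ida.

```c
#include <stdint.h>

/* Hypothetical root: bit n set means ino n + 1 is in use in this hierarchy. */
struct demo_root {
	uint64_t used;
};

/* Hand out the lowest free ino (starting at 1) from this root, or -1. */
static int demo_ino_alloc(struct demo_root *r)
{
	int i;

	for (i = 0; i < 64; i++) {
		if (!(r->used & (UINT64_C(1) << i))) {
			r->used |= UINT64_C(1) << i;
			return i + 1;
		}
	}
	return -1;
}

/* Return an ino to the root it was allocated from. */
static void demo_ino_free(struct demo_root *r, int ino)
{
	if (ino >= 1 && ino <= 64)
		r->used &= ~(UINT64_C(1) << (ino - 1));
}
```

Two roots can hand out the same ino without conflict, which a single
global allocator cannot do.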
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There currently is a single kernfs hierarchy in the whole system which
is used for sysfs. kernfs needs to support multiple hierarchies to
allow other users. This patch introduces struct kernfs_root which
serves as the root of each kernfs hierarchy and implements
kernfs_create/destroy_root().
* Each kernfs_root is associated with a root sd (sysfs_dentry). The
root is freed when the root sd is released and kernfs_destroy_root()
simply invokes kernfs_remove() on the root sd. sysfs_remove_one()
is updated to handle release of the root sd. Note that ps_iattr
update in sysfs_remove_one() is trivially updated for readability.
* Root sd's are now dynamically allocated using sysfs_new_dirent().
Update sysfs_alloc_ino() so that it gives out ino from 1 so that the
root sd still gets ino 1.
* While kernfs currently only points to the root sd, it'll soon grow
fields which are specific to each hierarchy. As determining a given
sd's root will be necessary, sd->s_dir.root is added. This backlink
fits better as a separate field in sd; however, sd->s_dir is inside
union with space to spare, so use it to save space and provide
kernfs_root() accessor to determine the root sd.
* As hierarchies may be destroyed now, each mount needs to hold onto
the hierarchy it's attached to. Update sysfs_fill_super() and
sysfs_kill_sb() so that they get and put the kernfs_root
respectively.
* sysfs_root is replaced with kernfs_root which is dynamically created
by invoking kernfs_create_root() from sysfs_init().
This patch doesn't introduce any visible behavior changes.
v2: kernfs_create_root() forgot to set @sd->priv. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs interface for finding, getting and putting
sysfs_dirents.
* sysfs_find_dirent() is renamed to kernfs_find_ns() and lockdep
assertion for sysfs_mutex is added.
* sysfs_get_dirent_ns() is renamed to kernfs_find_and_get().
* The macro/inline dance around __sysfs_get/put() is removed and
kernfs_get/put() are made proper functions implemented in
fs/sysfs/dir.c.
While the conversions are mostly equivalent, there's one difference -
kernfs_get() doesn't return the input param as its return value. This
change is intentional. While passing through the input increases
writability in some areas, it is unnecessary and has been shown to
cause confusion regarding how the last ref is handled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, sysfs_dirent active_ref lockdep annotation uses
attribute->[s]key as the lockdep key, which forces
kernfs_create_file_ns() to assume that sysfs_dirent->priv is pointing
to a struct attribute which may not be true for non-sysfs users. This
patch restructures the lockdep annotation such that
* kernfs_ops contains lockdep_key which is used by default for files
created by kernfs_create_file_ns().
* kernfs_create_file_ns_key() is introduced which takes an extra @key
argument. The created file will use the specified key for
active_ref lockdep annotation. If NULL is specified, lockdep for
the file is disabled.
* sysfs_add_file_mode_ns() is updated to use
kernfs_create_file_ns_key() with the appropriate key from the
attribute or NULL if ignore_lockdep is set.
This makes the lockdep annotation properly contained in kernfs while
allowing sysfs to cleanly keep its current behavior. This patch
doesn't introduce any behavior differences.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs interface to wake up poll(2) which takes and returns
sysfs_dirents.
sysfs_notify_dirent() is renamed to kernfs_notify() and sysfs_notify()
is updated so that it doesn't directly grab sysfs_mutex but acquires
the target sysfs_dirents using sysfs_get_dirent().
sysfs_notify_dirent() is reimplemented as a dumb inline wrapper around
kernfs_notify().
This patch doesn't introduce any behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs_ops currently only supports single_open() behavior which is
pretty restrictive. Add optional callbacks ->seq_{start|next|stop}()
which, when implemented, are invoked for seq_file traversal. This
allows full seq_file functionality for kernfs users. This currently
doesn't have any user and doesn't change any behavior.
v2: Refreshed on top of the updated "sysfs, kernfs: prepare read path
for kernfs".
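The difference between the single_open()-style path and the optional iterator callbacks can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct demo_ops, demo_read() and the item_*() helpers are hypothetical names, the callback signatures are simplified, and the sketch assumes start(), next() and stop() are provided together.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical ops table; start/next/stop are optional. */
struct demo_ops {
    int   (*show)(char *buf, size_t len, void *v);
    void *(*start)(long *pos);
    void *(*next)(void *v, long *pos);
    void  (*stop)(void *v);
};

/* Fill buf using either a single show() call or a full traversal. */
static size_t demo_read(const struct demo_ops *ops, char *buf, size_t len)
{
    size_t used = 0;
    long pos = 0;
    void *v;
    int n;

    if (!ops->start) {
        /* No iterator callbacks: one show() call emits everything. */
        n = ops->show(buf, len, NULL);
        return (n > 0 && (size_t)n < len) ? (size_t)n : 0;
    }

    /* Iterator callbacks present: show() runs once per element. */
    for (v = ops->start(&pos); v; v = ops->next(v, &pos)) {
        n = ops->show(buf + used, len - used, v);
        if (n <= 0 || (size_t)n >= len - used)
            break; /* error or buffer full */
        used += (size_t)n;
    }
    if (ops->stop)
        ops->stop(v);
    return used;
}

/* Demo data and callbacks used to exercise both paths. */
static const char *const demo_items[] = { "a", "b", "c" };
#define DEMO_NITEMS 3

static void *item_start(long *pos)
{
    return *pos < DEMO_NITEMS ? (void *)&demo_items[*pos] : NULL;
}

static void *item_next(void *v, long *pos)
{
    (void)v;
    ++*pos;
    return *pos < DEMO_NITEMS ? (void *)&demo_items[*pos] : NULL;
}

static void item_stop(void *v)
{
    (void)v;
}

static int show_item(char *buf, size_t len, void *v)
{
    return snprintf(buf, len, "%s\n", *(const char *const *)v);
}

static int show_whole(char *buf, size_t len, void *v)
{
    (void)v;
    return snprintf(buf, len, "whole\n");
}
```

An ops table that sets only show (here show_whole) takes the first path, while one that also sets item_start, item_next and item_stop walks the three demo items one show_item() call at a time.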
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs interface to create a file which takes and returns
sysfs_dirents.
The actual file creation part is separated out from
sysfs_add_file_mode_ns() into kernfs_create_file_ns(). The former now
only decides the kernfs_ops to use and the file's size and invokes the
latter.
This patch doesn't introduce behavior changes.
v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly. This patch
introduces kernfs_ops which hosts methods kernfs users implement and
updates fs/sysfs/file.c such that sysfs_kf_*() functions populate
kernfs_ops and kernfs_file_*() functions call the matching entries
from kernfs_ops.
kernfs_ops contains the following groups of methods.
* seq_show() - for kernfs files which use seq_file for reads.
* read() - for direct read implementations. Used iff seq_show() is
not implemented.
* write() - for writes.
* mmap() - for mmaps.
Notes:
* sysfs_elem_attr->ops is added so that kernfs_ops can be accessed
from sysfs_dirent. kernfs_ops() helper is added to verify locking
and access the field.
* SYSFS_FLAG_HAS_(SEQ_SHOW|MMAP) added. sd->s_attr->ops is accessible
only while holding active_ref and there are cases where we want to
take different actions depending on which ops are implemented.
These two flags cache whether the two ops are implemented for those.
* kernfs_file_*() no longer test sysfs type but choose different
behaviors depending on which methods in kernfs_ops are implemented.
The conversions are trivial except for the open path. As
kernfs_file_open() now decides whether to allow read/write accesses
depending on the kernfs_ops implemented, the presence of methods in
kobjs and attribute_bin should be propagated to kernfs_ops.
sysfs_add_file_mode_ns() is updated so that it propagates presence /
absence of the callbacks through _empty, _ro, _wo, _rw kernfs_ops.
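The idea of deciding access at open time from the callbacks that are present can be sketched in userspace as follows. All names are hypothetical (struct demo_ops, demo_open(), the DEMO_MODE_* bits and the four demo tables), the callback signatures are placeholders, and treating mmap() as implying both read and write support is an assumption of this sketch rather than something stated above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table with placeholder callback signatures. */
struct demo_ops {
    int (*seq_show)(void);
    int (*read)(void);
    int (*write)(void);
    int (*mmap)(void);
};

#define DEMO_MODE_READ  0x1u
#define DEMO_MODE_WRITE 0x2u

/* Allow an access mode only if a matching callback is implemented. */
static int demo_open(const struct demo_ops *ops, unsigned int mode)
{
    /* Assumption: mmap counts as both a read and a write method. */
    bool has_read = ops->seq_show || ops->read || ops->mmap;
    bool has_write = ops->write || ops->mmap;

    if ((mode & DEMO_MODE_WRITE) && !has_write)
        return -EACCES;
    if ((mode & DEMO_MODE_READ) && !has_read)
        return -EACCES;
    return 0;
}

static int demo_cb(void)
{
    return 0;
}

/* One table per combination, mirroring the _empty/_ro/_wo/_rw idea. */
static const struct demo_ops demo_ops_empty = { 0 };
static const struct demo_ops demo_ops_ro = { .seq_show = demo_cb };
static const struct demo_ops demo_ops_wo = { .write = demo_cb };
static const struct demo_ops demo_ops_rw = {
    .seq_show = demo_cb,
    .write = demo_cb,
};
```

Selecting one of four static tables lets the open path infer what the underlying attribute supports without looking at the attribute itself.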
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs_open_file will be used as the primary handle for kernfs methods.
Move its definition from fs/sysfs/file.c to include/linux/kernfs.h and
mark the public and private fields.
This is pure relocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs interface to manipulate a directory which takes and
returns sysfs_dirents.
create_dir() is renamed to kernfs_create_dir_ns() and its arguments
and return value are updated. create_dir() usages are replaced with
kernfs_create_dir_ns() and sysfs_create_subdir() usages are replaced
with kernfs_create_dir(). Dup warnings are handled explicitly by
sysfs users of the kernfs interface.
sysfs_enable_ns() is renamed to kernfs_enable_ns().
This patch doesn't introduce any behavior changes.
v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
v3: kernfs_enable_ns() added.
v4: Refreshed on top of "sysfs: drop kobj_ns_type handling, take #2"
so that this patch removes sysfs_enable_ns().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some reason, tasks and cgroup.procs guarantee that the result is
sorted. This is the only reason this whole pidlist logic is necessary
instead of just iterating through sorted member tasks. We can't do
anything about the existing interface but at least ensure that such
expectation doesn't exist for the new interface so that pidlist logic
may be removed in the distant future.
This patch scrambles the sort order if sane_behavior so that the
output is usually not sorted in the new interface.
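One possible way to scramble the order while keeping a deterministic, duplicate-friendly sort is to sort by a permuted key. The text above does not say how the scrambling is done; the adjacent-bit swap below is an assumption chosen for illustration, the names demo_fry() and demo_scramble_sort() are hypothetical, and the sketch assumes a 32-bit unsigned int.

```c
#include <stdlib.h>

/*
 * Swap each pair of adjacent bits. The mapping is its own inverse and
 * maps distinct inputs to distinct keys, so ordering by it is a valid
 * total order that differs from the numeric one.
 */
static unsigned int demo_fry(unsigned int pid)
{
    unsigned int a = pid & 0x55555555u; /* even-numbered bits */
    unsigned int b = pid & 0xAAAAAAAAu; /* odd-numbered bits */

    return (a << 1) | (b >> 1);
}

/* qsort() comparator ordering pids by their scrambled key. */
static int demo_cmp_fried(const void *x, const void *y)
{
    unsigned int a = demo_fry(*(const unsigned int *)x);
    unsigned int b = demo_fry(*(const unsigned int *)y);

    return (a > b) - (a < b);
}

/* Sort pids in scrambled-key order instead of numeric order. */
static void demo_scramble_sort(unsigned int *pids, size_t n)
{
    qsort(pids, n, sizeof(*pids), demo_cmp_fried);
}
```

For the input 1, 2, 3, 4 the keys are 2, 1, 3, 8, so the resulting order is 2, 1, 3, 4. The output is still grouped consistently but no longer ascending, which discourages callers from relying on sortedness.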
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Now that pidlist files don't use cftype->release(), it doesn't have
any user left. Remove it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Currently, when the pstore filesystem is mounted, the read callback of
the efi_pstore driver runs multiple times, as follows.
- In the first read callback, scan efivar_sysfs_list from the head and
pass the kmsg buffer of an entry to the upper pstore layer.
- In the second read callback, rescan efivar_sysfs_list from that entry
and pass another kmsg buffer to it.
- Repeat the scan and pass until the end of efivar_sysfs_list.
In this process, an entry is read across multiple read function
calls. To avoid a race between the read and erasure, the whole process
above is protected by a spinlock, which is taken in open() and released
in close().
At the same time, kmemdup() is called during the scan to pass the buffer
to the pstore filesystem. This triggers the lockdep warning shown below.
To allow the dynamic memory allocation to run without the spinlock held,
hold off the deletion of a sysfs entry if it happens while the entry is
being scanned via efi_pstore, and delete it after the scan is completed.
To implement this, this patch introduces two flags, scanning and deleting,
in efivar_entry.
Judging from the code alone, it may seem that the scanning and deleting
logic is not needed because __efivars->lock is not dropped when reading
from the EFI variable store.
However, the scanning and deleting logic is still needed because
efi-pstore and the pstore filesystem work as follows.
When an entry(A) is found, its pointer is saved to psi->data, and
efi_pstore_read() passes the entry(A) to the pstore filesystem,
releasing __efivars->lock.
The pstore filesystem then calls efi_pstore_read() again, and the same
entry(A), saved in psi->data, is used to resume scanning the sysfs list.
So the logic is needed to protect the entry(A) in the meantime.
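The deferred-deletion idea behind the two flags can be sketched in userspace as follows. This is not the driver code: struct demo_entry and the demo_*() helpers are hypothetical names, the list is a simple singly linked list, and locking is omitted entirely, whereas the real flags would be examined with __efivars->lock held.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical list entry carrying the two flags. */
struct demo_entry {
    struct demo_entry *next;
    int id;
    bool scanning; /* a reader is parked on this entry */
    bool deleting; /* deletion was requested during a scan */
};

/* Push a new entry at the head of the list. */
static struct demo_entry *demo_add(struct demo_entry **head, int id)
{
    struct demo_entry *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->id = id;
    e->next = *head;
    *head = e;
    return e;
}

/* Unlink e from the list and free it. */
static void demo_unlink_free(struct demo_entry **head, struct demo_entry *e)
{
    struct demo_entry **pp;

    for (pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            free(e);
            return;
        }
    }
}

/* Returns true if freed now, false if deletion was deferred. */
static bool demo_delete(struct demo_entry **head, struct demo_entry *e)
{
    if (e->scanning) {
        e->deleting = true; /* the scanner will finish the job */
        return false;
    }
    demo_unlink_free(head, e);
    return true;
}

/* Mark e as in use by a scan that may pause and resume on it. */
static void demo_scan_begin(struct demo_entry *e)
{
    e->scanning = true;
}

/* Finish with e, perform any deferred deletion, return the next entry. */
static struct demo_entry *demo_scan_end(struct demo_entry **head,
                                        struct demo_entry *e)
{
    struct demo_entry *next = e->next;

    e->scanning = false;
    if (e->deleting)
        demo_unlink_free(head, e);
    return next;
}

/* Count the entries currently on the list. */
static size_t demo_len(const struct demo_entry *head)
{
    size_t n = 0;

    for (; head; head = head->next)
        n++;
    return n;
}
```

Because the saved pointer stays valid until demo_scan_end() runs, a reader can release its lock, allocate memory, and later resume from the same entry even if a deletion was requested in between.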
[ 1.143710] ------------[ cut here ]------------
[ 1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
[ 1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[ 1.144058] Modules linked in:
[ 1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
[ 1.144058] 0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
[ 1.144058] ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
[ 1.144058] 00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
[ 1.144058] Call Trace:
[ 1.144058] [<ffffffff816614a5>] dump_stack+0x54/0x74
[ 1.144058] [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
[ 1.144058] [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
[ 1.144058] [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
[ 1.144058] [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
[ 1.144058] [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
[ 1.144058] [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
[ 1.144058] [<ffffffff8115b260>] kmemdup+0x20/0x50
[ 1.144058] [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
[ 1.144058] [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
[ 1.144058] [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
[ 1.144058] [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
[ 1.144058] [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
[ 1.144058] [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
[ 1.158207] [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
[ 1.158207] [<ffffffff8128ce30>] ? parse_options+0x80/0x80
[ 1.158207] [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
[ 1.158207] [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
[ 1.158207] [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
[ 1.158207] [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
[ 1.158207] [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
[ 1.158207] [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
[ 1.158207] [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
[ 1.158207] [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
[ 1.158207] [<ffffffff811cc373>] SyS_mount+0x83/0xc0
[ 1.158207] [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
[ 1.158207] ---[ end trace 61981bc62de9f6f4 ]---
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Tested-by: Madper Xie <cxie@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>