Commit graph

68773 commits

Author SHA1 Message Date
Lu Baolu
95587a75de iommu/vt-d: Add per-device IOMMU feature ops entries
This adds the iommu ops entries for aux-domain per-device
feature query and enable/disable.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-11 17:15:48 +02:00
Lu Baolu
d7cbc0f322 iommu/vt-d: Make intel_iommu_enable_pasid() more generic
This moves intel_iommu_enable_pasid() out of the scope of
CONFIG_INTEL_IOMMU_SVM with more and more features requiring
pasid function.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-11 17:15:48 +02:00
Jean-Philippe Brucker
26b25a2b98 iommu: Bind process address spaces to devices
Add bind() and unbind() operations to the IOMMU API.
iommu_sva_bind_device() binds a device to an mm, and returns a handle to
the bond, which is released by calling iommu_sva_unbind_device().

Each mm bound to devices gets a PASID (by convention, a 20-bit system-wide
ID representing the address space), which can be retrieved with
iommu_sva_get_pasid(). When programming DMA addresses, device drivers
include this PASID in a device-specific manner, to let the device access
the given address space. Since the process memory may be paged out, device
and IOMMU must support I/O page faults (e.g. PCI PRI).

Using iommu_sva_set_ops(), device drivers provide an mm_exit() callback
that is called by the IOMMU driver if the process exits before the device
driver called unbind(). In mm_exit(), device driver should disable DMA
from the given context, so that the core IOMMU can reallocate the PASID.
Whether the process exited or nor, the device driver should always release
the handle with unbind().

To use these functions, device driver must first enable the
IOMMU_DEV_FEAT_SVA device feature with iommu_dev_enable_feature().

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-11 17:08:52 +02:00
Lu Baolu
a3a195929d iommu: Add APIs for multiple domains per device
Sharing a physical PCI device in a finer-granularity way
is becoming a consensus in the industry. IOMMU vendors
are also engaging efforts to support such sharing as well
as possible. Among the efforts, the capability of support
finer-granularity DMA isolation is a common requirement
due to the security consideration. With finer-granularity
DMA isolation, subsets of a PCI function can be isolated
from each others by the IOMMU. As a result, there is a
request in software to attach multiple domains to a physical
PCI device. One example of such use model is the Intel
Scalable IOV [1] [2]. The Intel vt-d 3.0 spec [3] introduces
the scalable mode which enables PASID granularity DMA
isolation.

This adds the APIs to support multiple domains per device.
In order to ease the discussions, we call it 'a domain in
auxiliary mode' or simply 'auxiliary domain' when multiple
domains are attached to a physical device.

The APIs include:

* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Detect both IOMMU and PCI endpoint devices supporting
    the feature (aux-domain here) without the host driver
    dependency.

* iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
  - Check the enabling status of the feature (aux-domain
    here). The aux-domain interfaces are available only
    if this returns true.

* iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Enable/disable device specific aux-domain feature.

* iommu_aux_attach_device(domain, dev)
  - Attaches @domain to @dev in the auxiliary mode. Multiple
    domains could be attached to a single device in the
    auxiliary mode with each domain representing an isolated
    address space for an assignable subset of the device.

* iommu_aux_detach_device(domain, dev)
  - Detach @domain which has been attached to @dev in the
    auxiliary mode.

* iommu_aux_get_pasid(domain, dev)
  - Return ID used for finer-granularity DMA translation.
    For the Intel Scalable IOV usage model, this will be
    a PASID. The device which supports Scalable IOV needs
    to write this ID to the device register so that DMA
    requests could be tagged with a right PASID prefix.

This has been updated with the latest proposal from Joerg
posted here [5].

Many people involved in discussions of this design.

Kevin Tian <kevin.tian@intel.com>
Liu Yi L <yi.l.liu@intel.com>
Ashok Raj <ashok.raj@intel.com>
Sanjay Kumar <sanjay.k.kumar@intel.com>
Jacob Pan <jacob.jun.pan@linux.intel.com>
Alex Williamson <alex.williamson@redhat.com>
Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Joerg Roedel <joro@8bytes.org>

and some discussions can be found here [4] [5].

[1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
[3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[4] https://lkml.org/lkml/2018/7/26/4
[5] https://www.spinics.net/lists/iommu/msg31874.html

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-11 17:02:51 +02:00
Jinyu Qi
14bd9a607f iommu/iova: Separate atomic variables to improve performance
In struct iova_domain, there are three atomic variables, the former two
are about TLB flush counters which use atomic_add operation, anoter is
used to flush timer that use cmpxhg operation.
These variables are in the same cache line, so it will cause some
performance loss under the condition that many cores call queue_iova
function, Let's isolate the two type atomic variables to different
cache line to reduce cache line conflict.

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Jinyu Qi <jinyuqi@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-11 15:42:54 +02:00
Russell King
e6818d29ea gpio: gpio-omap: configure edge detection for level IRQs for idle wakeup
The GPIO block can enter idle independently of the CPU power management
calls via smart-idle.  When the GPIO block enters idle, level detection
stops working due to clocks being shut off, and an alternative form of
edge detection is used.  However, this needs the edge detection
registers set to mark the appropriate edges.

Arrange to configure the edge detection enables along with the level
detection to ensure that any transition to active interrupt state that
occurs while the block is idle is detected as a wake-up event.

Since we enable the edge detection when configuring the IRQ, both
omap2_gpio_enable_level_quirk() nor omap2_gpio_disable_level_quirk()
become redundant, which also means OMAP_GPIO_QUIRK_IDLE_REMOVE_TRIGGER
can be removed. This can be now done without regressions as patch
"gpio: gpio-omap: fix level interrupt idling" allows level interrupts
to idle on omap4 without a workaround.

Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
[tony@atomide.com: update description for the fix dependency]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-04-11 15:13:33 +02:00
Anson Huang
851826c756 firmware: imx: enable imx scu general irq function
The System Controller Firmware (SCFW) controls RTC, thermal
and WDOG etc., these resources' interrupt function are managed
by SCU. When any IRQ pending, SCU will notify Linux via MU general
interrupt channel #3, and Linux kernel needs to call SCU APIs
to get IRQ status and notify each module to handle the interrupt.

Since there is no data transmission for SCU IRQ notification, so
doorbell mode is used for this MU channel, and SCU driver will
use notifier mechanism to broadcast to every module which registers
the SCU block notifier.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2019-04-11 15:31:02 +08:00
Si-Wei Liu
8065a779f1 failover: allow name change on IFF_UP slave interfaces
When a netdev appears through hot plug then gets enslaved by a failover
master that is already up and running, the slave will be opened
right away after getting enslaved. Today there's a race that userspace
(udev) may fail to rename the slave if the kernel (net_failover)
opens the slave earlier than when the userspace rename happens.
Unlike bond or team, the primary slave of failover can't be renamed by
userspace ahead of time, since the kernel initiated auto-enslavement is
unable to, or rather, is never meant to be synchronized with the rename
request from userspace.

As the failover slave interfaces are not designed to be operated
directly by userspace apps: IP configuration, filter rules with
regard to network traffic passing and etc., should all be done on master
interface. In general, userspace apps only care about the
name of master interface, while slave names are less important as long
as admin users can see reliable names that may carry
other information describing the netdev. For e.g., they can infer that
"ens3nsby" is a standby slave of "ens3", while for a
name like "eth0" they can't tell which master it belongs to.

Historically the name of IFF_UP interface can't be changed because
there might be admin script or management software that is already
relying on such behavior and assumes that the slave name can't be
changed once UP. But failover is special: with the in-kernel
auto-enslavement mechanism, the userspace expectation for device
enumeration and bring-up order is already broken. Previously initramfs
and various userspace config tools were modified to bypass failover
slaves because of auto-enslavement and duplicate MAC address. Similarly,
in case that users care about seeing reliable slave name, the new type
of failover slaves needs to be taken care of specifically in userspace
anyway.

It's less risky to lift up the rename restriction on failover slave
which is already UP. Although it's possible this change may potentially
break userspace component (most likely configuration scripts or
management software) that assumes slave name can't be changed while
UP, it's relatively a limited and controllable set among all userspace
components, which can be fixed specifically to listen for the rename
events on failover slaves. Userspace component interacting with slaves
is expected to be changed to operate on failover master interface
instead, as the failover slave is dynamic in nature which may come and
go at any point.  The goal is to make the role of failover slaves less
relevant, and userspace components should only deal with failover master
in the long run.

Fixes: 30c8bd5aa8 ("net: Introduce generic failover module")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10 22:12:26 -07:00
Kimberly Brown
14948e3944 Drivers: hv: vmbus: Fix race condition with new ring_buffer_info mutex
Fix a race condition that can result in a ring buffer pointer being set
to null while a "_show" function is reading the ring buffer's data. This
problem was discussed here: https://lkml.org/lkml/2018/10/18/779

To fix the race condition, add a new mutex lock to the
"hv_ring_buffer_info" struct. Add a new function,
"hv_ringbuffer_pre_init()", where a channel's inbound and outbound
ring_buffer_info mutex locks are initialized.

Acquire/release the locks in the "hv_ringbuffer_cleanup()" function,
which is where the ring buffer pointers are set to null.

Acquire/release the locks in the four channel-level "_show" functions
that access ring buffer data. Remove the "const" qualifier from the
"vmbus_channel" parameter and the "rbi" variable of the channel-level
"_show" functions so that the locks can be acquired/released in these
functions.

Acquire/release the locks in hv_ringbuffer_get_debuginfo(). Remove the
"const" qualifier from the "hv_ring_buffer_info" parameter so that the
locks can be acquired/released in this function.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-10 18:58:56 -04:00
David Müller
7c2e071300 clk: x86: Add system specific quirk to mark clocks as critical
Since commit 648e921888 ("clk: x86: Stop marking clocks as
CLK_IS_CRITICAL"), the pmc_plt_clocks of the Bay Trail SoC are
unconditionally gated off. Unfortunately this will break systems where these
clocks are used for external purposes beyond the kernel's knowledge. Fix it
by implementing a system specific quirk to mark the necessary pmc_plt_clks as
critical.

Fixes: 648e921888 ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL")
Signed-off-by: David Müller <dave.mueller@gmx.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-10 15:54:12 -07:00
Chris Packham
ecb0abc1d8 of: use correct function prototype for of_overlay_fdt_apply()
When CONFIG_OF_OVERLAY is not enabled the fallback stub for
of_overlay_fdt_apply() does not match the prototype for the case when
CONFIG_OF_OVERLAY is enabled. Update the stub to use the correct
function prototype.

Fixes: 39a751a4cb ("of: change overlay apply input data from unflattened to FDT")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-04-10 16:33:34 -05:00
Jason Gunthorpe
5331fa0db7 Merge branch 'mlx5-next' into rdma.git for-next
From
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Required for dependencies on the next series

* branch 'mlx5-next':
  net/mlx5: E-Switch, add a new prio to be used by the RDMA side
  net/mlx5: E-Switch, don't use hardcoded values for FDB prios
  net/mlx5: Fix false compilation warning
  net/mlx5: Expose MPEIN (Management PCIE INfo) register layout
  net/mlx5: Add rate limit print macros
  net/mlx5: Add explicit bar address field
  net/mlx5: Replace dev_err/warn/info by mlx5_core_err/warn/info
  net/mlx5: Use dev->priv.name instead of dev_name
  net/mlx5: Make mlx5_core messages independent from mdev->pdev
  net/mlx5: Break load_one into three stages
  net/mlx5: Function setup/teardown procedures
  net/mlx5: Move health and page alloc init to mdev_init
  net/mlx5: Split mdev init and pci init
  net/mlx5: Remove redundant init functions parameter
  net/mlx5: Remove spinlock support from mlx5_write64
  net/mlx5: Remove unused MLX5_*_DOORBELL_LOCK macros

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 14:59:27 -03:00
Jann Horn
0b9dc6c9f0 keys: safe concurrent user->{session,uid}_keyring access
The current code can perform concurrent updates and reads on
user->session_keyring and user->uid_keyring. Add a comment to
struct user_struct to document the nontrivial locking semantics, and use
READ_ONCE() for unlocked readers and smp_store_release() for writers to
prevent memory ordering issues.

Fixes: 69664cf16a ("keys: don't generate user and user session keyrings unless they're accessed")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
2019-04-10 10:29:50 -07:00
Jann Horn
5c7e372caa security: don't use RCU accessors for cred->session_keyring
sparse complains that a bunch of places in kernel/cred.c access
cred->session_keyring without the RCU helpers required by the __rcu
annotation.

cred->session_keyring is written in the following places:

 - prepare_kernel_cred() [in a new cred struct]
 - keyctl_session_to_parent() [in a new cred struct]
 - prepare_creds [in a new cred struct, via memcpy]
 - install_session_keyring_to_cred()
  - from install_session_keyring() on new creds
  - from join_session_keyring() on new creds [twice]
  - from umh_keys_init()
   - from call_usermodehelper_exec_async() on new creds

All of these writes are before the creds are committed; therefore,
cred->session_keyring doesn't need RCU protection.

Remove the __rcu annotation and fix up all existing users that use __rcu.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
2019-04-10 10:28:21 -07:00
Tony Lindgren
35667d812c Merge branch 'omap-for-v5.2/am4-ddr3' into omap-for-v5.2/am4-pm-v2 2019-04-10 09:06:01 -07:00
Ming Lei
1b8f21b74c blk-mq: introduce blk_mq_complete_request_sync()
In NVMe's error handler, follows the typical steps of tearing down
hardware for recovering controller:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
	blk_mq_tagset_busy_iter(tags, cancel_request, ...)
cancel_request():
	mark the request as abort
	blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be race between #3 and #4, because blk_mq_complete_request()
may run q->mq_ops->complete(rq) remotelly and asynchronously, and
->complete(rq) may be run after #4.

This patch introduces blk_mq_complete_request_sync() for fixing the
above race.

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10 09:57:33 -06:00
Waiman Long
364f784f04 locking/rwsem: Optimize rwsem structure for uncontended lock acquisition
For an uncontended rwsem, count and owner are the only fields a task
needs to touch when acquiring the rwsem. So they are put next to each
other to increase the chance that they will share the same cacheline.

On a ThunderX2 99xx (arm64) system with 32K L1 cache and 256K L2
cache, a rwsem locking microbenchmark with one locking thread was
run to write-lock and write-unlock an array of rwsems separated 2
cachelines apart in a 1M byte memory block. The locking rates (kops/s)
of the microbenchmark when the rwsems are at various "long" (8-byte)
offsets from beginning of the cacheline before and after the patch were
as follows:

  Cacheline Offset   Pre-patch    Post-patch
  ----------------   ---------    ----------
        0             17,449        16,588
        1             17,450        16,465
	2             17,450        16,460
	3             17,453        16,462
	4             14,867        16,471
	5             14,867        16,470
	6             14,853        16,464
	7             14,867        13,172

Before the patch, the count and owner are 4 "long"s apart. After the
patch, they are only 1 "long" apart.

The rwsem data have to be loaded from the L3 cache for each access. It
can be seen that the locking rates are more consistent after the patch
than before. Note that for this particular system, the performance
drop happens whenever the count and owner are at an odd multiples of
"long"s apart. No performance drop was observed when only a single rwsem
was used (hot cache). So the drop is likely just an idiosyncrasy of the
cache architecture of this chip than an inherent problem with the patch.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20190404174320.22416-12-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10 10:56:06 +02:00
Waiman Long
12a30a7fc1 locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h
We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20190404174320.22416-4-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10 10:56:00 +02:00
Ingo Molnar
54bbfe75cb Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10 09:14:42 +02:00
Andrew-sh.Cheng
2f36bde0fc OPP: Introduce dev_pm_opp_find_freq_ceil_by_volt()
This patch introduces a new helper routine in the OPP core, which
returns the OPP with the highest frequency which has voltage less than
or equal to the target voltage passed to the helper.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
[ Viresh: Massaged the commit log and renamed the helper with some
cleanups. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2019-04-10 12:13:31 +05:30
Mark Bloch
d9cb06759e net/mlx5: E-Switch, add a new prio to be used by the RDMA side
Create a new prio in the FDB, it will be used when inserting steering rules
into the FDB from the RDMA side. We create a new PRIO so rules from the
net side and rules from the RDMA side won't be inserted to the same PRIO,
each side has it's own sandbox to play in.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-04-10 09:29:11 +03:00
Mark Bloch
b6d9ccb112 net/mlx5: E-Switch, don't use hardcoded values for FDB prios
When creating the FDB prios, use the enum values already defined and not
the hardcoded values.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-04-10 09:28:41 +03:00
Daniel Borkmann
2824ecb701 bpf: allow for key-less BTF in array map
Given we'll be reusing BPF array maps for global data/bss/rodata
sections, we need a way to associate BTF DataSec type as its map
value type. In usual cases we have this ugly BPF_ANNOTATE_KV_PAIR()
macro hack e.g. via 38d5d3b3d5 ("bpf: Introduce BPF_ANNOTATE_KV_PAIR")
to get initial map to type association going. While more use cases
for it are discouraged, this also won't work for global data since
the use of array map is a BPF loader detail and therefore unknown
at compilation time. For array maps with just a single entry we make
an exception in terms of BTF in that key type is declared optional
if value type is of DataSec type. The latter LLVM is guaranteed to
emit and it also aligns with how we regard global data maps as just
a plain buffer area reusing existing map facilities for allowing
things like introspection with existing tools.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
Daniel Borkmann
87df15de44 bpf: add syscall side map freeze support
This patch adds a new BPF_MAP_FREEZE command which allows to
"freeze" the map globally as read-only / immutable from syscall
side.

Map permission handling has been refactored into map_get_sys_perms()
and drops FMODE_CAN_WRITE in case of locked map. Main use case is
to allow for setting up .rodata sections from the BPF ELF which
are loaded into the kernel, meaning BPF loader first allocates
map, sets up map value by copying .rodata section into it and once
complete, it calls BPF_MAP_FREEZE on the map fd to prevent further
modifications.

Right now BPF_MAP_FREEZE only takes map fd as argument while remaining
bpf_attr members are required to be zero. I didn't add write-only
locking here as counterpart since I don't have a concrete use-case
for it on my side, and I think it makes probably more sense to wait
once there is actually one. In that case bpf_attr can be extended
as usual with a flag field and/or others where flag 0 means that
we lock the map read-only hence this doesn't prevent to add further
extensions to BPF_MAP_FREEZE upon need.

A map creation flag like BPF_F_WRONCE was not considered for couple
of reasons: i) in case of a generic implementation, a map can consist
of more than just one element, thus there could be multiple map
updates needed to set the map into a state where it can then be
made immutable, ii) WRONCE indicates exact one-time write before
it is then set immutable. A generic implementation would set a bit
atomically on map update entry (if unset), indicating that every
subsequent update from then onwards will need to bail out there.
However, map updates can fail, so upon failure that flag would need
to be unset again and the update attempt would need to be repeated
for it to be eventually made immutable. While this can be made
race-free, this approach feels less clean and in combination with
reason i), it's not generic enough. A dedicated BPF_MAP_FREEZE
command directly sets the flag and caller has the guarantee that
map is immutable from syscall side upon successful return for any
future syscall invocations that would alter the map state, which
is also more intuitive from an API point of view. A command name
such as BPF_MAP_LOCK has been avoided as it's too close with BPF
map spin locks (which already has BPF_F_LOCK flag). BPF_MAP_FREEZE
is so far only enabled for privileged users.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
Daniel Borkmann
591fe9888d bpf: add program side {rd, wr}only support for maps
This work adds two new map creation flags BPF_F_RDONLY_PROG
and BPF_F_WRONLY_PROG in order to allow for read-only or
write-only BPF maps from a BPF program side.

Today we have BPF_F_RDONLY and BPF_F_WRONLY, but this only
applies to system call side, meaning the BPF program has full
read/write access to the map as usual while bpf(2) calls with
map fd can either only read or write into the map depending
on the flags. BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG allows
for the exact opposite such that verifier is going to reject
program loads if write into a read-only map or a read into a
write-only map is detected. For read-only map case also some
helpers are forbidden for programs that would alter the map
state such as map deletion, update, etc. As opposed to the two
BPF_F_RDONLY / BPF_F_WRONLY flags, BPF_F_RDONLY_PROG as well
as BPF_F_WRONLY_PROG really do correspond to the map lifetime.

We've enabled this generic map extension to various non-special
maps holding normal user data: array, hash, lru, lpm, local
storage, queue and stack. Further generic map types could be
followed up in future depending on use-case. Main use case
here is to forbid writes into .rodata map values from verifier
side.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
Daniel Borkmann
d8eca5bbb2 bpf: implement lookup-free direct value access for maps
This generic extension to BPF maps allows for directly loading
an address residing inside a BPF map value as a single BPF
ldimm64 instruction!

The idea is similar to what BPF_PSEUDO_MAP_FD does today, which
is a special src_reg flag for ldimm64 instruction that indicates
that inside the first part of the double insns's imm field is a
file descriptor which the verifier then replaces as a full 64bit
address of the map into both imm parts. For the newly added
BPF_PSEUDO_MAP_VALUE src_reg flag, the idea is the following:
the first part of the double insns's imm field is again a file
descriptor corresponding to the map, and the second part of the
imm field is an offset into the value. The verifier will then
replace both imm parts with an address that points into the BPF
map value at the given value offset for maps that support this
operation. Currently supported is array map with single entry.
It is possible to support more than just single map element by
reusing both 16bit off fields of the insns as a map index, so
full array map lookup could be expressed that way. It hasn't
been implemented here due to lack of concrete use case, but
could easily be done so in future in a compatible way, since
both off fields right now have to be 0 and would correctly
denote a map index 0.

The BPF_PSEUDO_MAP_VALUE is a distinct flag as otherwise with
BPF_PSEUDO_MAP_FD we could not differ offset 0 between load of
map pointer versus load of map's value at offset 0, and changing
BPF_PSEUDO_MAP_FD's encoding into off by one to differ between
regular map pointer and map value pointer would add unnecessary
complexity and increases barrier for debugability thus less
suitable. Using the second part of the imm field as an offset
into the value does /not/ come with limitations since maximum
possible value size is in u32 universe anyway.

This optimization allows for efficiently retrieving an address
to a map value memory area without having to issue a helper call
which needs to prepare registers according to calling convention,
etc, without needing the extra NULL test, and without having to
add the offset in an additional instruction to the value base
pointer. The verifier then treats the destination register as
PTR_TO_MAP_VALUE with constant reg->off from the user passed
offset from the second imm field, and guarantees that this is
within bounds of the map value. Any subsequent operations are
normally treated as typical map value handling without anything
extra needed from verification side.

The two map operations for direct value access have been added to
array map for now. In future other types could be supported as
well depending on the use case. The main use case for this commit
is to allow for BPF loader support for global variables that
reside in .data/.rodata/.bss sections such that we can directly
load the address of them with minimal additional infrastructure
required. Loader support has been added in subsequent commits for
libbpf library.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
Al Viro
ab1152dd56 unexport d_alloc_pseudo()
No modular uses since introducion of alloc_file_pseudo(),
and the only non-modular user not in alloc_file_pseudo()
had actually been wrong - should've been d_alloc_anon().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-09 19:20:46 -04:00
Al Viro
5467a68cbf dcache: sort the freeing-without-RCU-delay mess for good.
For lockless accesses to dentries we don't have pinned we rely
(among other things) upon having an RCU delay between dropping
the last reference and actually freeing the memory.

On the other hand, for things like pipes and sockets we neither
do that kind of lockless access, nor want to deal with the
overhead of an RCU delay every time a socket gets closed.

So delay was made optional - setting DCACHE_RCUACCESS in ->d_flags
made sure it would happen.  We tried to avoid setting it unless
we knew we need it.  Unfortunately, that had led to recurring
class of bugs, in which we missed the need to set it.

We only really need it for dentries that are created by
d_alloc_pseudo(), so let's not bother with trying to be smart -
just make having an RCU delay the default.  The ones that do
*not* get it set the replacement flag (DCACHE_NORCU) and we'd
better use that sparingly.  d_alloc_pseudo() is the only
such user right now.

FWIW, the race that finally prompted that switch had been
between __lock_parent() of immediate subdirectory of what's
currently the root of a disconnected tree (e.g. from
open-by-handle in progress) racing with d_splice_alias()
elsewhere picking another alias for the same inode, either
on outright corrupted fs image, or (in case of open-by-handle
on NFS) that subdirectory having been just moved on server.
It's not easy to hit, so the sky is not falling, but that's
not the first race on similar missed cases and the logics
for settinf DCACHE_RCUACCESS has gotten ridiculously
convoluted.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-09 19:18:04 -04:00
Ulf Hansson
6f9b83ac87 cpuidle: Export the next timer expiration for CPUs
To be able to predict the sleep duration for a CPU entering idle, it
is essential to know the expiration time of the next timer.  Both the
teo and the menu cpuidle governors already use this information for
CPU idle state selection.

Moving forward, a similar prediction needs to be made for a group of
idle CPUs rather than for a single one and the following changes
implement a new genpd governor for that purpose.

In order to support that feature, add a new function called
tick_nohz_get_next_hrtimer() that will return the next hrtimer
expiration time of a given CPU to be invoked after deciding
whether or not to stop the scheduler tick on that CPU.

Make the cpuidle core call tick_nohz_get_next_hrtimer() right
before invoking the ->enter() callback provided by the cpuidle
driver for the given state and store its return value in the
per-CPU struct cpuidle_device, so as to make it available to code
outside of cpuidle.

Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(),
the governor's ->select() callback has already returned and indicated
whether or not the tick should be stopped, so in fact the value
returned by tick_nohz_get_next_hrtimer() always is the next hrtimer
expiration time for the given CPU, possibly including the tick (if
it hasn't been stopped).

Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-10 00:32:34 +02:00
Ulf Hansson
eb594b7325 PM / Domains: Add support for CPU devices to genpd
To enable a CPU device to be attached to a PM domain managed by genpd,
make a few changes to it for convenience.

To be able to quickly find out what CPUs are attached to a genpd,
which typically becomes useful from a genpd governor as subsequent
changes are about to show, add a cpumask to struct generic_pm_domain
to be updated when a CPU device gets attached to the genpd containing
that cpumask. Also, propagate the cpumask changes upwards in the
domain hierarchy to the master PM domains. This way, the cpumask for
a genpd hierarchically reflects all CPUs attached to the topology
below it.

Finally, make this an opt-in feature, to avoid having to manage CPUs
and the cpumask for a genpd that don't need it. To that end, add
a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN.

Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-10 00:32:34 +02:00
Ulf Hansson
49a27e2790 PM / Domains: Add generic data pointer to struct genpd_power_state
Add a data pointer to the genpd_power_state struct, to allow a genpd
backend driver to store per-state specific data. To introduce the
pointer, change the way genpd deals with freeing of the corresponding
allocated data.

More precisely, clarify the responsibility of whom that shall free the
data, by adding a ->free_states() callback to the generic_pm_domain
structure. The one allocating the data will be expected to set the
callback, to allow genpd to invoke it from genpd_remove().

Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-10 00:32:34 +02:00
Tobin C. Harding
8c1007fdc7 docs: Add colon clearing sphinx warning
Sphinx emits various warnings all caused by a missing colon before code
block:

	WARNING: Block quote ends without a blank line; unexpected unindent.
	ERROR: Unexpected indentation.
	WARNING: Block quote ends without a blank line; unexpected unindent.

Add the colon, clearing sphinx warnings.

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-04-09 15:14:49 -06:00
Dave Gerlach
6c110561eb memory: ti-emif-sram: Add ti_emif_run_hw_leveling for DDR3 hardware leveling
In certain situations, such as when returning from low power modes, the
EMIF must re-run hardware leveling to properly restore DDR3 access.

This is accomplished by introducing a new ti-emif-sram-pm call,
ti_emif_run_hw_leveling, to check if DDR3 is in use and if so, trigger
the full write and read leveling processes.

Suggested-by: Brad Griffis <bgriffis@ti.com>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2019-04-09 08:31:51 -07:00
Paul E. McKenney
6cdbc07a5a Merge branches 'consolidate.2019.04.09a', 'doc.2019.03.26b', 'fixes.2019.03.26b', 'srcu.2019.03.26b', 'stall.2019.03.26b' and 'torture.2019.03.26b' into HEAD
consolidate.2019.04.09a: Lingering RCU flavor consolidation cleanups.
doc.2019.03.26b: Documentation updates.
fixes.2019.03.26b: Miscellaneous fixes.
srcu.2019.03.26b: SRCU updates.
stall.2019.03.26b: RCU CPU stall warning updates.
torture.2019.03.26b: Torture-test updates.
2019-04-09 08:08:13 -07:00
David S. Miller
310655b07a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-08 23:39:36 -07:00
Tobin C. Harding
458a3bf82d lib/string: Add strscpy_pad() function
We have a function to copy strings safely and we have a function to copy
strings and zero the tail of the destination (if source string is
shorter than destination buffer) but we do not have a function to do
both at once.  This means developers must write this themselves if they
desire this functionality.  This is a chore, and also leaves us open to
off by one errors unnecessarily.

Add a function that calls strscpy() then memset()s the tail to zero if
the source string is shorter than the destination buffer.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
2019-04-08 16:44:21 -06:00
Heiner Kallweit
3b8b11f966 net: phy: improve link partner capability detection
genphy_read_status() so far checks phydev->supported, not the actual
PHY capabilities. This can make a difference if the supported speeds
have been limited by of_set_phy_supported() or phy_set_max_speed().

It seems that this issue only affects the link partner advertisements
as displayed by ethtool. Also this patch wouldn't apply to older
kernels because linkmode bitmaps have been introduced recently.
Therefore net-next.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-08 15:18:00 -07:00
David S. Miller
8bb309e67f mlx5-updates-2019-04-02
This series provides misc updates to mlx5 driver
 
 1) Aya Levin (1): Handle event of power detection in the PCIE slot
 
 2) Eli Britstein (6):
   Some TC VLAN related updates and fixes to the previous VLAN modify action
   support patchset.
   Offload TC e-switch rules with egress/ingress VLAN devices
 
 3) Max Gurtovoy (1): Fix double mutex initialization in esiwtch.c
 
 4) Tariq Toukan (3): Misc small updates
   A write memory barrier is sufficient in EQ ci update
   Obsolete param field holding a constant value
   Unify logic of MTU boundaries
 
 5) Tonghao Zhang (4): Misc updates to en_tc.c
   Make the log friendly when decapsulation offload not supported
   Remove 'parse_attr' argument in parse_tc_fdb_actions()
   Deletes unnecessary setting of esw_attr->parse_attr
   Return -EOPNOTSUPP when attempting to offload an unsupported action
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcp8RLAAoJEEg/ir3gV/o+etQH/ArD5o0gKsQdro02oLIQ97t8
 t3DrT07jv+C3sHzV1uVK76mikZdao7Dgjj132quB5HLEnZfpJ0HWbbQ1ZGAd124P
 3vaONL47bDAxJM/5P4JM18dtQrLNJEJ9vPS3fK5HyR6qpnjbXSVKnwdN5cFtidoj
 B+CGxDFizx9WuYaRugrW5NVatHvZIgfigYf1LctrDyVV8yzJLwb+5xiDMJ9c6v28
 QONVpvfuwk294T/Hs1mN3z1V4IrypV1ZuSKcbXIklFdabV+p0tdn6OYTOmtyQ0U7
 XwIomQIn0QqU5CHPAMdgANymle2Qb+qx9fRZ+4hpuPdLIFM/BAP35ZEofVNVMfg=
 =2qTS
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mamameed says:

====================
mlx5-updates-2019-04-02

This series provides misc updates to mlx5 driver

1) Aya Levin (1): Handle event of power detection in the PCIE slot

2) Eli Britstein (6):
  Some TC VLAN related updates and fixes to the previous VLAN modify action
  support patchset.
  Offload TC e-switch rules with egress/ingress VLAN devices

3) Max Gurtovoy (1): Fix double mutex initialization in esiwtch.c

4) Tariq Toukan (3): Misc small updates
  A write memory barrier is sufficient in EQ ci update
  Obsolete param field holding a constant value
  Unify logic of MTU boundaries

5) Tonghao Zhang (4): Misc updates to en_tc.c
  Make the log friendly when decapsulation offload not supported
  Remove 'parse_attr' argument in parse_tc_fdb_actions()
  Deletes unnecessary setting of esw_attr->parse_attr
  Return -EOPNOTSUPP when attempting to offload an unsupported action
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-08 14:31:25 -07:00
Florian Westphal
3b0a081db1 netfilter: make two functions static
They have no external callers anymore.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08 23:28:33 +02:00
Fernando Fernandez Mancera
22c7652cda netfilter: nft_osf: Add version option support
Add version option support to the nftables "osf" expression.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08 23:27:12 +02:00
Cornelia Huck
cf94db2190 virtio: Honour 'may_reduce_num' in vring_create_virtqueue
vring_create_virtqueue() allows the caller to specify via the
may_reduce_num parameter whether the vring code is allowed to
allocate a smaller ring than specified.

However, the split ring allocation code tries to allocate a
smaller ring on allocation failure regardless of what the
caller specified. This may cause trouble for e.g. virtio-pci
in legacy mode, which does not support ring resizing. (The
packed ring code does not resize in any case.)

Let's fix this by bailing out immediately in the split ring code
if the requested size cannot be allocated and may_reduce_num has
not been specified.

While at it, fix a typo in the usage instructions.

Fixes: 2a2d1382fe ("virtio: Add improved queue allocation API")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2019-04-08 17:05:52 -04:00
Florian Westphal
4806e97572 netfilter: replace NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)
NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is
enabled.  Now that the af-specific nat configuration switches have been
removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08 23:02:52 +02:00
Florian Westphal
c1deb065cf netfilter: nf_tables: merge route type into core
very little code, so it really doesn't make sense to have extra
modules or even a kconfig knob for this.

Merge them and make functionality available unconditionally.
The merge makes inet family route support trivial, so add it
as well here.

Before:
   text	   data	    bss	    dec	    hex	filename
    835	    832	      0	   1667	    683 nft_chain_route_ipv4.ko
    870	    832	      0	   1702	    6a6	nft_chain_route_ipv6.ko
 111568	   2556	    529	 114653	  1bfdd	nf_tables.ko

After:
   text	   data	    bss	    dec	    hex	filename
 113133	   2556	    529	 116218	  1c5fa	nf_tables.ko

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08 23:01:42 +02:00
Li RongQing
01902f8c85 netfilter: optimize nf_inet_addr_cmp
optimize nf_inet_addr_cmp by 64bit xor computation
similar to ipv6_addr_equal()

Signed-off-by: Yuan Linsi <yuanlinsi01@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08 22:58:16 +02:00
Li RongQing
3b15d09f7e time: Introduce jiffies64_to_msecs()
there is a similar helper in net/netfilter/nf_tables_api.c,
this maybe become a common request someday, so move it to
time.c

Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08 22:56:14 +02:00
Keerthy
44c22a2d12 ARM: OMAP2+: pm33xx: Add support for rtc+ddr in self refresh mode
Add support for rtc+ddr in self refresh mode. Add addtional
pm hooks for save/restore and rtc suspend/resume.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2019-04-08 10:39:01 -07:00
Keerthy
6256f7f7f2 rtc: OMAP: Add support for rtc-only mode
Prepare rtc driver for rtc-only with DDR in self-refresh mode.
omap_rtc_power_off now should cater to two features:

1) RTC plus DDR in self-refresh is power a saving mode where in the
entire system including the different voltage rails from PMIC are
shutdown except the ones feeding on to RTC and DDR. DDR is kept in
self-refresh hence the contents are preserved. RTC ALARM2 is connected
to PMIC_EN line once we the ALARM2 is triggered we enter the mode with
DDR in self-refresh and RTC Ticking. After a predetermined time an RTC
ALARM1 triggers waking up the system[1]. The control goes to bootloader.
The bootloader then checks RTC scratchpad registers to confirm it was an
rtc_only wakeup and follows a different path, configure bare minimal
clocks for ddr and then jumps to the resume address in another RTC
scratchpad registers and transfers the control to Kernel. Kernel then
restores the saved context. omap_rtc_power_off_program does the ALARM2
programming part.

     [1] http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf Page 2884

2) Power-off: This is usual poweroff mode. omap_rtc_power_off calls the
above omap_rtc_power_off_program function and in addition to that
programs the OMAP_RTC_PMIC_REG for any external wake ups for PMIC like
the pushbutton and shuts off the PMIC.

Hence the split in omap_rtc_power_off.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
[tony@atomide.com: folded in a fix for compile warning]
Signed-off-by: Tony Lindgren <tony@atomide.com>
2019-04-08 10:22:12 -07:00
Paolo Abeni
fd69c399c7 datagram: remove rendundant 'peeked' argument
After commit a297569fe0 ("net/udp: do not touch skb->peeked unless
really needed") the 'peeked' argument of __skb_try_recv_datagram()
and friends is always equal to !!'flags & MSG_PEEK'.

Since such argument is really a boolean info, and the callers have
already 'flags & MSG_PEEK' handy, we can remove it and clean-up the
code a bit.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-08 09:51:54 -07:00
Christoph Hellwig
d7e02a9312 dma-mapping: remove leftover NULL device support
Most dma_map_ops implementations already had some issues with a NULL
device, or did simply crash if one was fed to them.  Now that we have
cleaned up all the obvious offenders we can stop to pretend we
support this mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-04-08 17:52:46 +02:00
Ming Lei
1200e07f3a block: don't use for-inside-for in bio_for_each_segment_all
Commit 6dc4f100c1 ("block: allow bio_for_each_segment_all() to
iterate over multi-page bvec") changes bio_for_each_segment_all()
to use for-inside-for.

This way breaks all bio_for_each_segment_all() call with error out
branch via 'break', since now 'break' can only break from the inner
loop.

Fixes this issue by implementing bio_for_each_segment_all() via
single 'for' loop, and now the logic is very similar with normal
bvec iterator.

Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: Omar Sandoval <osandov@fb.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-and-Tested-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Fixes: 6dc4f100c1 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-08 08:19:41 -06:00