I noticed that sometime I2C clock is kept enabled during suspend-resume.
This happens because runtime PM defers dynamic suspension and thus it may
happen that runtime PM is in active state when system enters into suspend.
In particular I2C controller that is used for CPU's DVFS is often kept ON
during suspend because CPU's voltage scaling happens quite often.
Fixes: 8ebf15e9c8 ("i2c: tegra: Move suspend handling to NOIRQ phase")
Cc: <stable@vger.kernel.org> # v5.4+
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Now that Gen4 is properly supported, advertise support in the module's
device ID table and add the same IDs to the list of switchtec quirks.
[logang@deltatee.com: add commit message and quirk IDs]
Link: https://lore.kernel.org/r/20200115035648.2578-8-logang@deltatee.com
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Gen4 hardware provides new MRPC commands to read and write directly from
any address in the PCI BAR (which Microsemi refers to as GAS). Since
accessing BARs can be dangerous and break the driver, we don't want
unprivileged users to have this ability.
Therefore, require CAP_SYS_ADMIN for the local and remote GAS access MRPC
commands. Privileged processes will already have access to the BAR through
the sysfs resource file so this doesn't give userspace any capabilities it
didn't already have.
[logang@deltatee.com: rework commit message]
Link: https://lore.kernel.org/r/20200106190337.2428-11-logang@deltatee.com
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add the new flash_info registers struct and the implementation of
ioctl_flash_part_info() for the new Gen4 hardware.
[logang@deltatee.com: rewrote commit message]
Link: https://lore.kernel.org/r/20200115035648.2578-7-logang@deltatee.com
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add the Gen4-specific system info registers and ensure their usage is
guarded by a check on the device's generation.
Link: https://lore.kernel.org/r/20200115035648.2578-6-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since the sys_info and flash_info registers differ significantly in Gen4
hardware, separate out the Gen3 registers into their own structure with a
union in the main structure.
No functional changes intended.
Link: https://lore.kernel.org/r/20200115035648.2578-5-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Refactor ioctl_flash_part_info() into a Gen3-specific function because the
registers for flash partition information have changed significantly in
Gen4 and will require a completely different implementation.
No functional changes intended.
Co-developed-by: Kelvin Cao <kelvin.cao@microchip.com>
Link: https://lore.kernel.org/r/20200115035648.2578-4-logang@deltatee.com
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a generation variable passed through the device ID table and test for
Gen3-specific registers. This will allow us to add Gen4 and other devices
that extend the programming model.
Link: https://lore.kernel.org/r/20200115035648.2578-3-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Gen4 hardware will have different values for the SWITCHTEC_X_RUNNING and
SWITCHTEC_IOCTL_NUM_PARTITIONS, so rename them with GEN3 in their name.
No functional changes intended.
Link: https://lore.kernel.org/r/20200115035648.2578-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The event ID check doesn't depend on anything in the mask_all_events() to
mask_event() path. Do it in switchtec_event_isr() to avoid the CSR read in
mask_event().
Link: https://lore.kernel.org/r/20200106190337.2428-6-logang@deltatee.com
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add support for the Inter Fabric Manager Communication (Intercomm) Notify
event in PAX variants of Switchtec hardware and the Upstream Error
Containment port in the MR1 release of Gen3 firmware.
Link: https://lore.kernel.org/r/20200106190337.2428-4-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Rather than open-code the extraction of the E0PD field from the MMFR2
register, we can use the cpuid_feature_extract_unsigned_field() helper
instead.
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Now that the decision to use non-global mappings is stored in a variable,
the check to avoid enabling them for the terminally broken ThunderX1
platform can be simplified so that it is only keyed off the MIDR value.
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
xfs_check_ondisk_structs() verifies that the sizes of the data types
used by xfs are correct via the XFS_CHECK_STRUCT_SIZE() macro.
Since the structures padding can vary depending on the ABI (e.g. on
ARM OABI structures are padded to multiple of 32 bits), it may happen
that xfs_dir2_sf_entry_t size check breaks the compilation with the
assertion below:
In file included from linux/include/linux/string.h:6,
from linux/include/linux/uuid.h:12,
from linux/fs/xfs/xfs_linux.h:10,
from linux/fs/xfs/xfs.h:22,
from linux/fs/xfs/xfs_super.c:7:
In function ‘xfs_check_ondisk_structs’,
inlined from ‘init_xfs_fs’ at linux/fs/xfs/xfs_super.c:2025:2:
linux/include/linux/compiler.h:350:38:
error: call to ‘__compiletime_assert_107’ declared with attribute
error: XFS: sizeof(xfs_dir2_sf_entry_t) is wrong, expected 3
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
Restore the correct behavior adding __packed to the structure definition.
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Use the new 'as-instr' Kconfig macro to define CONFIG_BROKEN_GAS_INST
directly, making it available everywhere.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
[will: Drop redundant 'y if' logic]
Signed-off-by: Will Deacon <will@kernel.org>
The SMMUv3 driver, which may be built without CONFIG_PCI, will soon gain
PASID support. Partially revert commit c6e9aefbf9 ("PCI/ATS: Remove
unused PRI and PASID stubs") to re-introduce the PASID stubs, and avoid
adding more #ifdefs to the SMMU driver.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Although we WARN in arm_smmu_add_device() if the device being added has
been added already without a subsequent call to arm_smmu_remove_device(),
we still continue half-heartedly, initialising the stream-table for any
new StreamIDs that may have magically appeared and re-establishing device
links that should still be there from last time.
Given that calling ->add_device() twice without removing the device in the
meantime is indicative of an error in the caller, just return -EBUSY after
warning.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jean Philippe-Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Let add_device() clean up after itself. The iommu_bus_init() function
does call remove_device() on error, but other sites (e.g. of_iommu) do
not.
Don't free level-2 stream tables because we'd have to track if we
allocated each of them or if they are used by other endpoints. It's not
worth the hassle since they are managed resources.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
If, for some bizarre reason, the compiler decided to split up the write
of STE DWORD 0, we could end up making a partial structure valid.
Although this probably won't happen, follow the example of the
context-descriptor code and use WRITE_ONCE() to ensure atomicity of the
write.
Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
The SMMU can support up to 20 bits of SSID. Add a second level of page
tables to accommodate this. Devices that support more than 1024 SSIDs now
have a table of 1024 L1 entries (8kB), pointing to tables of 1024 context
descriptors (64kB), allocated on demand.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Second-level context descriptor tables will be allocated lazily in
arm_smmu_write_ctx_desc(). Help with handling allocation failure by
moving the CD write into arm_smmu_domain_finalise_s1().
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[will: Add comment per discussion on list]
Signed-off-by: Will Deacon <will@kernel.org>
Now that we support substream IDs, initialize s1cdmax with the number of
SSID bits supported by a master and the SMMU.
Context descriptor tables are allocated once for the first master
attached to a domain. Therefore attaching multiple devices with
different SSID sizes is tricky, and we currently don't support it.
As a future improvement it would be nice to at least support attaching a
SSID-capable device to a domain that isn't using SSID, by reallocating
the SSID table. This would allow supporting a SSID-capable device that
is in the same IOMMU group as a bridge, for example. Varying SSID size
is less of a concern, since the PCIe specification "highly recommends"
that devices supporting PASID implement all 20 bits of it.
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
At the moment, the SMMUv3 driver implements only one stage-1 or stage-2
page directory per device. However SMMUv3 allows more than one address
space for some devices, by providing multiple stage-1 page directories. In
addition to the Stream ID (SID), that identifies a device, we can now have
Substream IDs (SSID) identifying an address space. In PCIe, SID is called
Requester ID (RID) and SSID is called Process Address-Space ID (PASID).
A complete stage-1 walk goes through the context descriptor table:
Stream tables Ctx. Desc. tables Page tables
+--------+ ,------->+-------+ ,------->+-------+
: : | : : | : :
+--------+ | +-------+ | +-------+
SID->| STE |---' SSID->| CD |---' IOVA->| PTE |--> IPA
+--------+ +-------+ +-------+
: : : : : :
+--------+ +-------+ +-------+
Rewrite arm_smmu_write_ctx_desc() to modify context descriptor table
entries. To keep things simple we only implement one level of context
descriptor tables here, but as with stream and page tables, an SSID can
be split to index multiple levels of tables.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
The test_cgcore_no_internal_process_constraint_on_threads selftest when
running with subsystem controlling noise triggers two warnings:
> [ 597.443115] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3131 cgroup_apply_control_enable+0xe0/0x3f0
> [ 597.443413] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3177 cgroup_apply_control_disable+0xa6/0x160
Both stem from a call to cgroup_type_write. The first warning was also
triggered by syzkaller.
When we're switching cgroup to threaded mode shortly after a subsystem
was disabled on it, we can see the respective subsystem css dying there.
The warning in cgroup_apply_control_enable is harmless in this case
since we're not adding new subsys anyway.
The warning in cgroup_apply_control_disable indicates an attempt to kill
css of recently disabled subsystem repeatedly.
The commit prevents these situations by making cgroup_type_write wait
for all dying csses to go away before re-applying subtree controls.
When at it, the locations of WARN_ON_ONCE calls are moved so that
warning is triggered only when we are about to misuse the dying css.
Reported-by: syzbot+5493b2a54d31d6aea629@syzkaller.appspotmail.com
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The trace event class workqueue_work now has only one consumer, so get
rid of it. No functional change.
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
It's surprising that workqueue_execute_end includes only the work when
its counterpart workqueue_execute_start has both the work and the worker
function.
You can't set a tracing filter or trigger based on the function, and
postprocessing scripts interested in specific functions are harder to
write since they have to remember the work from _start and match it up
with the same field in _end.
Add the function name, taking care to use the copy stashed in the
worker since the work is no longer safe to touch.
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Support for SSID will require allocating context descriptor tables. Move
the context descriptor allocation to separate functions.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
When adding SSID support to the SMMUv3 driver, we'll need to manipulate
leaf pasid tables and context descriptors. Extract the context
descriptor structure and align with the way stream tables are handled.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Named component nodes in the IORT tables describe the number of
Substream ID bits (aka. PASID) supported by the device. Propagate this
value to the fwspec structure in order to enable PASID for platform
devices.
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
For platform devices that support SubstreamID (SSID), firmware provides
the number of supported SSID bits. Restrict it to what the SMMU supports
and cache it into master->ssid_bits, which will also be used for PCI
PASID.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
On Arm systems, some platform devices behind an SMMU may support the PASID
feature, which offers multiple address space. Let the firmware tell us
when a device supports PASID.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Since commit 518a2f1925 ("dma-mapping: zero memory returned from
dma_alloc_*"), dma_alloc_* always initializes memory to zero, so there
is no need to use dma_zalloc_* or pass the __GFP_ZERO flag anymore.
The flag was introduced by commit 04fa26c71b ("iommu/arm-smmu: Convert
DMA buffer allocations to the managed API"), since the managed API
didn't provide a dmam_zalloc_coherent() function.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Function name cgroup_rstat_cpu_pop_upated() in comment should be
cgroup_rstat_cpu_pop_updated().
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We find a bug when running test under nfsv3 as below.
1)
chacl u::r--,g::rwx,o:rw- file1
2)
chmod u+w file1
3)
chacl -l file1
We expect u::rw-, but it shows u::r--, more likely it returns the
cached acl in inode.
We dig the code find that the code path is different.
chacl->..->__nfs3_proc_setacls->nfs_zap_acl_cache
Then nfs_zap_acl_cache clears the NFS_INO_INVALID_ACL in
NFS_I(inode)->cache_validity.
chmod->..->nfs3_proc_setattr
Because NFS_INO_INVALID_ACL has been cleared by chacl path,
nfs_zap_acl_cache wont be called.
nfs_setattr_update_inode will set NFS_INO_INVALID_ACL so let it
before nfs_zap_acl_cache call.
Signed-off-by: Su Yanjun <suyanjun218@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Ever since the commit 0e0cb35b41, it's possible to lose an open stateid
while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens,
operations that require openstateid fail with EAGAIN which is propagated
to the application then tests like generic/446 and generic/168 fail with
"Resource temporarily unavailable".
Instead of returning this error, initiate state recovery when possible to
recover the open stateid and then try calling nfs4_select_rw_stateid()
again.
Fixes: 0e0cb35b41 ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
For the krb5i and krb5p mount, it was problematic to truncate the
received ACL to the provided buffer because an integrity check
could not be preformed.
Instead, provide enough pages to accommodate the largest buffer
bounded by the largest RPC receive buffer size.
Note: I don't think it's possible for the ACL to be truncated now.
Thus NFS4_ACL_TRUNC flag and related code could be possibly
removed but since I'm unsure, I'm leaving it.
v2: needs +1 page.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Add a mount option 'softreval' that allows attribute revalidation 'getattr'
calls to time out, and causes them to fall back to using the cached
attributes.
The use case for this option is for ensuring that we can still (slowly)
traverse paths and use cached information even when the server is down.
Once the server comes back up again, the getattr calls start succeeding,
and the caches will revalidate as usual.
The 'softreval' mount option is automatically enabled if you have
specified 'softerr'. It can be turned off using the options
'nosoftreval', or 'hard'.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If we've already revalidated the inode once then don't distrust the
access cache unless the NFS_INO_INVALID_ACCESS flag is actually set.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The 'hdr->good_bytes' is defined as the number of bytes we expect to
read or write starting at offset hdr->io_start. In the case of a partial
read/write we may end up adjusting hdr->args.offset and hdr->args.count
to skip I/O for data that was already read/written, and so we must ensure
the calculation takes that into account.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If we're resending a write due to a short read or write, ensure we
reset the reply count to zero.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
On exit from nfs_do_access(), record the mask representing the requested
permissions, as well as the server-supplied set of access rights for
this user.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up the generic file commit tracepoints to use a 64-bit value
for the verifier, and to display the pNFS filehandle, if it exists.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up the generic writeback tracepoints so they do pass the
full structures as arguments. Also ensure we report the number
of bytes actually written.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up the generic file read tracepoints so they do pass the
full structures as arguments. Also ensure we report the number
of bytes actually read.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If the attempt to do pNFS fails, then record what action we
take to recover (resend, reset to pnfs or reset to mds).
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Casting a negative value to an unsigned long is not the same as
converting it to its absolute value.
Fixes: 96650e2eff ("NFS: Fix show_nfs_errors macros again")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Ensure we always return the number of bytes read/written. Also display
the pnfs filehandle if it is in use.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>