Unlike the Armada XP and the Armada 370, this SoC uses a Cortex A9
core. Consequently, the procedure to enter the idle state is
different: interaction with the SCU, not disabling snooping, etc.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-16-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
This commit introduces the cpuidle support for Armada 370. The main
difference compared to the already supported Armada XP is that the
Armada 370 has an issue caused by "a slow exit process from the deep
idle state due to heavy L1/L2 cache cleanup operations performed by
the BootROM software" (cf errata GL-BootROM-10).
To work around this issue, we replace the restart code of the BootROM
by some custom code located in an internal SRAM. For this purpose, we
use the common function mvebu_boot_addr_wa() introduced in the commit
"ARM: mvebu: Add a common function for the boot address work around".
The message in case of failure to suspend the system was switched from
the warn level to the debug level. Indeed due to the "slow exit
process from the deep idle state" in Armada 370, this situation
happens quite often. Using the debug level avoids spamming the kernel
logs, but still allows to enable it if needed.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-15-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
This commit adds the list of cpuidle states supported by the Armada
38x SoC in the cpuidle-mvebu-v7 driver, as well as the necessary logic
around it to support this SoC.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lkml.kernel.org/r/1406120453-29291-14-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
This commit adds the list of cpuidle states supported by the Armada
370 SoC in the cpuidle-mvebu-v7 driver, as well as the necessary logic
around it to support this SoC.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lkml.kernel.org/r/1406120453-29291-13-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
This driver will be able to manage the cpuidle for more SoCs than just
Armada 370 and XP. It will also support Armada 38x and potentially
other SoC of the Marvell Armada EBU family. To take this into account,
this patch renames the driver and its symbols.
It also changes the driver name from cpuidle-armada-370-xp to
cpuidle-armada-xp, because separate platform drivers will be
registered for the other SoC types. This change must be done
simultaneously in the cpuidle driver and in the PMSU code in order to
remain bisectable.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lkml.kernel.org/r/1406120453-29291-12-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
The SCU address will be needed in other files than board-v7.c,
especially in pmsu.c for cpuidle related activities. So this patch
adds a function that allows to retrieve the virtual address at which
the SCU has been mapped.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-10-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
On some mvebu v7 SoCs (the ones using a Cortex-A9 core and not a PJ4B
core), the snoop disabling feature does not exist as the hardware
coherency is handled in a different way. Therefore, in preparation to
the introduction of the cpuidle support for those SoCs, this commit
modifies the mvebu_v7_psmu_idle_prepare() function to take several
flags, which allow to decide whether snooping should be disabled, and
whether we should use the deep idle mode or not.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-9-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
The resume address used by the cpuidle code will not always be the
same depending on the SoC. Using a local variable to store the resume
address allows to keep the same function for the PM notifier but with
a different address. This address will be set during the
initialization of the cpuidle logic in pmsu.c.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-8-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
In preparation to the addition of the cpuidle support for more SoCs,
this patch moves the Armada XP specific initialization to a separate
function.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-7-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Most of the function related to the PMSU are not specific to the
Armada 370 or Armada XP SoCs. They can also be used for most of the
other mvebu ARMv7 SoCs, and will actually be used to support cpuidle
on Armada 38x.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-6-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Use the common function mvebu_setup_boot_addr_wa() introduced in the
commit "ARM: mvebu: Add a common function for the boot address work
around" instead of the dedicated version for Armada 375.
This commit also moves the workaround in the system-controller
module. Indeed the workaround on 375 is really related to setting the
boot address which is done by the system controller.
As a bonus we no longer use an harcoded value to access the register
storing the boot address.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-5-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
On some of the mvebu SoCs and due to internal BootROM issue, the CPU
initial jump code must be placed in the SRAM memory of the SoC. In
order to achieve this, we have to unmap the BootROM and at some
specific location where the BootROM was placed, create a dedicated
MBus window for the SRAM. This SRAM is initialized with a few
instructions of code that allows to jump to the real secondary CPU
boot address. The SRAM used is the Crypto engine one.
This work around is currently needed for booting SMP on Armada 375 Z1
and will be needed for cpuidle support on Armada 370. Instead of
duplicating the same code, this commit introduces a common function to
handle it: mvebu_setup_boot_addr_wa().
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-4-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Sorting the headers in alphabetic order will help to reduce conflicts
when adding new headers later.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-3-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
do_armada_370_xp_cpu_suspend() and armada_370_xp_pmsu_idle_prepare(),
have been merged into a single function called
armada_370_xp_pmsu_idle_enter() by the commit "bbb92284b6 ARM:
mvebu: slightly refactor/rename PMSU idle related functions", in
prepare for the introduction of the CPU hotplug support for Armada XP.
But for cpuidle the prepare function will be common to all the mvebu
SoCs that use the PMSU, while the suspend function will be specific to
each SoC. Keeping the prepare function separate will help reducing
code duplication while new SoC support is added.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1406120453-29291-2-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Copy desc and buffer data even for ARQ events which return error status.
Previously, a check for NVM related AQ commands which is done later in this
function would not recognize that such a command was received and would
not clear nvm_busy flag. This would block access to NVM until a driver reset.
This will fix that.
Change-ID: If69ad74e165b56081c0686b97402511d2e2880c0
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We are intended to check up uflags against FS_PROJ_QUOTA rather than
FS_USER_UQUOTA once more, it looks to me like a typo, but might cause
the project quota metadata space can not be removed.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Remove the XFS_IS_OQUOTA_ON macros as it is obsoleted.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
* Fix SD2CKCR register address of r8a7791 (R-Car M2) SoC
This corrects a bug introduced in v3.14 by
59e79895b9 ("ARM: shmobile: r8a7791: Add clocks").
However, it does not manifest in mainline code until
SDHI devices were enabled on the Koelsch board in v3.15 by
2c60a7df72 ("ARM: shmobile: Add SDHI devices for Koelsch DTS").
It also manifests on the Henninger board when
SDHI devices were enabled in v3.16-rc1 by
1299df03d7 ("ARM: shmobile: henninger: add SDHI0/2 DT support")
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT0FapAAoJENfPZGlqN0++socQAI25b++UQ89EiUQE/98yB/vq
ma63gl346dXdv6lb8iN+rx9Vk3wtzNuroozds2iulmAnutMTdQ83huektaYdPkQ4
T+RxOl4liJUbQRBg2+8UBtiaTB+oGQWbRkqt/MOvniTbOqYKs71Oym901ULlQMKL
9RSyWORUx5x2c+zZMJ0eU8T6Bl9hf0P0ikHwUv9ZYIQzrYxjJubdmyVoNI6slxF9
S4O4sUZfOSVqfSis5rEfWgG29jBYjlGdTyFXp1+kE19J02wikGRTjgM9bWkxarrF
wfa+ZQXjMywsdY8rCSxeNEYFuS3NW7s4ylRIC4UfpvSg1cvNVVE5m4BK0E2IOIyJ
kW/zSimYfh/RJSX+HMZ2hfipvFDE6dDOL5PHpwJ8fhWfM5Yinn1Wgfa2mUDxsPFe
CAfhtiTB56Rhzo+ln9UcNwaakZs/uSsG0jq3lI2zNTUd+1VMS+7xB/3pdKQjDGo/
75+UKvdcqJ4yprfcvKwIp0LsJCKWIm1hQTqSFWuYfpVaNbM+9CFQculw0YZxOWxc
xFr3LznFidUJCHKiM4ngnPBsLfDCdjVU5MyqDXcW9iVEKeSQJJ01CCORA4MjTUT5
niZEMcAoIMaeXeuVnL5FlRK5Owd6SgryWpy4XUlf+ioGLLE5EzuLqA9TJe4eo41P
caa5gpyUDbTjH1J4GktY
=SQyi
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes2-for-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Merge "Second Round of Renesas ARM Based SoC Fixes for v3.16" from Simon Horman
* Fix SD2CKCR register address of r8a7791 (R-Car M2) SoC
This corrects a bug introduced in v3.14 by
59e79895b9 ("ARM: shmobile: r8a7791: Add clocks").
However, it does not manifest in mainline code until
SDHI devices were enabled on the Koelsch board in v3.15 by
2c60a7df72 ("ARM: shmobile: Add SDHI devices for Koelsch DTS").
It also manifests on the Henninger board when
SDHI devices were enabled in v3.16-rc1 by
1299df03d7 ("ARM: shmobile: henninger: add SDHI0/2 DT support")
* tag 'renesas-fixes2-for-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7791: Fix SD2CKCR register address
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The CCN driver makes no sense without PERF_EVENTS, and trying to
build it when that option is disabled results in compile errors,
so it's best to just add a strong Kconfig dependency.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In function cap_task_prctl(), we would allocate a credential
unconditionally and then check if we support the requested function.
If not we would release this credential with abort_creds() by using
RCU method. But on some archs such as powerpc, the sys_prctl is heavily
used to get/set the floating point exception mode. So the unnecessary
allocating/releasing of credential not only introduce runtime overhead
but also do cause OOM due to the RCU implementation.
This patch removes abort_creds() from cap_task_prctl() by calling
prepare_creds() only when we need to modify it.
Reported-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
xt_hashlimit cannot be used with large hash tables, because garbage
collector is run from a timer. If table is really big, its possible
to hold cpu for more than 500 msec, which is unacceptable.
Switch to a work queue, and use proper scheduling points to remove
latencies spikes.
Later, we also could switch to a smoother garbage collection done
at lookup time, one bucket at a time...
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We really don't need to delay an entire millisecond just to get into our
critical section. A microsecond will be sufficient, thank you.
Change-ID: I2d02ece6610007d98cabcb3f42df9a774bb54e59
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
xfs_set_inode32() caught my eye because it had weird spacing around
the "-1's". In cleaning that up, I realized that the assignment in
the declaration of "ino" is never used; it's rewritten before it
gets read.
Drop the ino initializer from its declaration since it's not used,
and move the agino initialization into the body of the function,
mostly so that we can have pretty whitespace and not exceed 80
columns. :)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Today, if we perform an xfs_growfs which adds allocation groups,
mp->m_maxagi is not properly updated when the growfs is complete.
Therefore inodes will continue to be allocated only in the
AGs which existed prior to the growfs, and the new space
won't be utilized.
This is because of this path in xfs_growfs_data_private():
xfs_growfs_data_private
xfs_initialize_perag(mp, nagcount, &nagimax);
if (mp->m_flags & XFS_MOUNT_32BITINODES)
index = xfs_set_inode32(mp);
else
index = xfs_set_inode64(mp);
if (maxagi)
*maxagi = index;
where xfs_set_inode* iterates over the (old) agcount in
mp->m_sb.sb_agblocks, which has not yet been updated
in the growfs path. So "index" will be returned based on
the old agcount, not the new one, and new AGs are not available
for inode allocation.
Fix this by explicitly passing the proper AG count (which
xfs_initialize_perag() already has) down another level,
so that xfs_set_inode* can make the proper decision about
acceptable AGs for inode allocation in the potentially
newly-added AGs.
This has been broken since 3.7, when these two
xfs_set_inode* functions were added in commit 2d2194f.
Prior to that, we looped over "agcount" not sb_agblocks
in these calculations.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
xfs_qm_quotacheck() is not used outside of xfs_qm.c. Mark it static
and move it around in the file to avoid a forward declaration.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When the CIL checkpoint is fully written to the log, the LSN of the checkpoint
commit record is written into the CIL context structure. This allows log force
waiters to correctly detect when the checkpoint they are waiting on have been
fully written into the log buffers.
However, the initial context after mount is initialised with a non-zero commit
LSN, so appears to waiters as though it is complete even though it may not have
even been pushed, let alone written to the log buffers. Hence a log force
immediately after a filesystem is mounted may not behave correctly, nor does
commit record ordering if multiple CIL pushes interleave immediately after
mount.
To fix this, make sure the initial context commit LSN is not touched until the
first checkpointis actually pushed.
[dchinner: rewrite commit message]
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The hardware design requires that the driver avoid indicating
checksum offload success on some ipv6 frames with extension
headers.
The code needs to just check for the IPV6EXADD bit and if
it is set punt the checksum to the stack. I don't know why
the code was checking TCP on inner protocol, as that code
doesn't make any sense to me but seems wrong, so remove it.
Change-ID: I10d3aacdbb1819fb60b4b0eb80e6cc67ef2c9599
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-By: Jim Young <jamesx.m.young@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This implements a state machine intended to support the userland tool for
updating the device eeprom. The state machine implements one-shot reads,
writes, multi-step write sessions, and checksum requests. If we're in the middle
of a multi-step write session, no one should fire off other writes, however, one
shot reads are valid. The userland tool is expected to keep track of its session
status, arrange the placement and ordering of the writes, and deal with the
checksum requirement.
This patch also adds nvmupdate support to ethtool callbacks.
The get_eeprom() and set_eeprom() services in ethtool are used here to
facilitate the userland NVMUpdate tool. The 'magic' value in the get and
set commands is used to pass additional control information for managing
the read and write steps.
The read operation works both as normally expected in the standard ethtool
method, as well as with the extra NVM controls. The write operation
works only for the expanded NVM functions - the normal ethtool method is
not allowed because of the NVM semaphore management needed for multipart
writes, as well as the checksum requirement.
Change-ID: I1d84a170153a9f437906744e2e350fd68fe7563d
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
spotted by cppcheck
Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently umount on symlink blocks following umount:
/vz is separate mount
# ls /vz/ -al | grep test
drwxr-xr-x. 2 root root 4096 Jul 19 01:14 testdir
lrwxrwxrwx. 1 root root 11 Jul 19 01:16 testlink -> /vz/testdir
# umount -l /vz/testlink
umount: /vz/testlink: not mounted (expected)
# lsof /vz
# umount /vz
umount: /vz: device is busy. (unexpected)
In this case mountpoint_last() gets an extra refcount on path->mnt
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Ian Kent <raven@themaw.net>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
The following warnings:
fs/direct-io.c: In function ‘__blockdev_direct_IO’:
fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/direct-io.c:913:16: note: ‘to’ was declared here
fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/direct-io.c:913:10: note: ‘from’ was declared here
are false positive because dio_get_page() either fails, or sets both
'from' and 'to'.
Paul Bolle said ...
Maybe it's better to move initializing "to" and "from" out of
dio_get_page(). That _might_ make it easier for both the the reader and
the compiler to understand what's going on. Something like this:
Christoph Hellwig said ...
The fix of moving the code definitively looks nicer, while I think
uninitialized_var is horrible wart that won't get anywhere near my code.
Boaz Harrosh: I agree with Christoph and Paul
Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Bump version number.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During suspend we call sched_clock_poll() to update the epoch and
accumulated time and reprogram the sched_clock_timer to fire
before the next wrap-around time. Unfortunately,
sched_clock_poll() doesn't restart the timer, instead it relies
on the hrtimer layer to do that and during suspend we aren't
calling that function from the hrtimer layer. Instead, we're
reprogramming the expires time while the hrtimer is enqueued,
which can cause the hrtimer tree to be corrupted. Furthermore, we
restart the timer during suspend but we update the epoch during
resume which seems counter-intuitive.
Let's fix this by saving the accumulated state and canceling the
timer during suspend. On resume we can update the epoch and
restart the timer similar to what we would do if we were starting
the clock for the first time.
Fixes: a08ca5d108 "sched_clock: Use an hrtimer instead of timer"
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1406174630-23458-1-git-send-email-john.stultz@linaro.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds a check and prints the error cause register value when
the hardware detects a malformed packet. This is a very unlikely
scenario but has been seen occasionally, so printing the message to
assist the user.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
From: Brian Foster <bfoster@redhat.com>
Commit 4d559a3b introduced heavy prealloc. squashing to catch the case
of requesting too large a prealloc on smaller filesystems, leading to
repeated flush and retry cycles that occur on ENOSPC. Now that we issue
eofblocks scans on EDQUOT/ENOSPC, squash the prealloc against the
minimum available free space across all applicable quotas as well to
avoid a similar problem of repeated eofblocks scans.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
From: Brian Foster <bfoster@redhat.com>
Speculative preallocation and and the associated throttling metrics
assume we're working with large files on large filesystems. Users have
reported inefficiencies in these mechanisms when we happen to be dealing
with large files on smaller filesystems. This can occur because while
prealloc throttling is aggressive under low free space conditions, it is
not active until we reach 5% free space or less.
For example, a 40GB filesystem has enough space for several files large
enough to have multi-GB preallocations at any given time. If those files
are slow growing, they might reserve preallocation for long periods of
time as well as avoid the background scanner due to frequent
modification. If a new file is written under these conditions, said file
has no access to this already reserved space and premature ENOSPC is
imminent.
To handle this scenario, modify the buffered write ENOSPC handling and
retry sequence to invoke an eofblocks scan. In the smaller filesystem
scenario, the eofblocks scan resets the usage of preallocation such that
when the 5% free space threshold is met, throttling effectively takes
over to provide fair and efficient preallocation until legitimate
ENOSPC.
The eofblocks scan is selective based on the nature of the failure. For
example, an EDQUOT failure in a particular quota will use a filtered
scan for that quota. Because we don't know which quota might have caused
an allocation failure at any given time, we include each applicable
quota determined to be under low free space conditions in the scan.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
From: Brian Foster <bfoster@redhat.com>
The eofblocks scan inode filter uses intersection logic by default.
E.g., specifying both user and group quota ids filters out inodes that
are not covered by both the specified user and group quotas. This is
suitable for behavior exposed to userspace.
Scans that are initiated from within the kernel might require more broad
semantics, such as scanning all inodes under each quota associated with
an inode to alleviate low free space conditions in each.
Create the XFS_EOF_FLAGS_UNION flag to support a conditional union-based
filtering algorithm for eofblocks scans. This flag is intentionally left
out of the valid mask as it is not supported for scans initiated from
userspace.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
This patch prevents the display of the minimum link qualification check
if we might be in a virtual machine. This check is incorrect and
misleading in this case, since we actually don't really know what the
available bandwidth is. To do so, we simply check whether each function
on the bus matches our device id. If it doesn't the most likely scenario
is that we're directly assigned to a virtual machine.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
From: Brian Foster <bfoster@redhat.com>
The scan owner field represents an optional inode number that is
responsible for the current scan. The purpose is to identify that an
inode is under iolock and as such, the iolock shouldn't be attempted
when trimming eofblocks. This is an internal only field.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Fix a bug in the misuse of the list_for_each macro to loop over every
entry in the bus_list. Instead of attempting to loop over the list from
a random entry point, go up to the bus and use the real list_head entry
point. This prevents the possible read or write of unallocated or
incorrectly addressed memory.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The architecture specification states that both DSB and ISB are required
between page table modifications and subsequent memory accesses using the
corresponding virtual address. When TLB invalidation takes place, the
tlb_flush_* functions already have the necessary barriers. However, there are
other functions like create_mapping() for which this is not the case.
The patch adds the DSB+ISB instructions in the set_pte() function for
valid kernel mappings. The invalid pte case is handled by tlb_flush_*
and the user mappings in general have a corresponding update_mmu_cache()
call containing a DSB. Even when update_mmu_cache() isn't called, the
kernel can still cope with an unlikely spurious page fault by
re-executing the instruction.
In addition, the set_pmd, set_pud() functions gain an ISB for
architecture compliance when block mappings are created.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Change some uses of strncpy to use the more appropriate strlcpy
when clearing is not needed to prevent information leakage. Also
change some length arguments to use the preferred sizeof form.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Whilst I strongly advise against doing so for the implicit coherency
issues between the multiple buffer objects accessing the same backing
store, it nevertheless is a valid use case, akin to mmaping the same
file multiple times.
The reason why we forbade it earlier was that our use of the interval
tree for fast invalidation upon vma changes excluded overlapping
objects. So in the case where the user wishes to create such pairs of
overlapping objects, we degrade the range invalidation to walkin the
linear list of objects associated with the mm.
A situation where overlapping objects could arise is the lax implementation
of MIT-SHM Pixmaps in the xserver. A second situation is where the user
wishes to have different access modes to a region of memory (e.g. access
through a read-only userptr buffer and through a normal userptr buffer).
v2: Compile for mmu-notifiers after tweaking
v3: Rename is_linear/has_linear
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Li, Victor Y" <victor.y.li@intel.com>
Cc: "Kelley, Sean V" <sean.v.kelley@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Akash Goel <akash.goel@intel.com>
Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's march ahead with the deprecation plan laid out in
commit b30324adaf
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Nov 13 22:11:25 2013 +0100
drm/i915: Deprecated UMS support
Thus far no regression report yet, so the transparent fallback plan
seems to pan out.
Cc: Dave Airlie <airlied@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Suggested-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The addition of the "select OF if ARM64" has led to a Kconfig
recursive dependency error when "make ARCH=sh rsk7269_defconfig"
was run. Since OF is selected by ARM64 and the of_property_read_bool
is defined no matter what, delete the Kconfig line that selects OF.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
From: Jie Liu <jeff.liu@oracle.com>
Introduce xfs_bulkstat_grab_ichunk() to look up an inode chunk in where
the given inode resides, then grab the record. Update the data for the
pointed-to record if the inode was not the last in the chunk and there
are some left allocated, return the grabbed inode count on success.
Refactor xfs_bulkstat() with it.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
From: Jie Liu <jeff.liu@oracle.com>
Introduce xfs_bulkstat_ichunk_ra() to loop over all clusters in the
next inode chunk, then performs readahead if there are any allocated
inodes in that cluster.
Refactor xfs_bulkstat() with it.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
From: Jie Liu <jeff.liu@oracle.com>
We should not ignore the btree operation errors at xfs_bulkstat() but
to propagate them if any. This patch fix two places in this function
and the remaining things will be fixed with code refactoring thereafter.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>