prefer min_t() macro over two open-coded logical tests
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Resolve new strict checkpatch hits
CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around ...
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove include of a no longer necessary header file.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix new checkpatch hits:
CHECK:LINE_SPACING: Please use a blank line after
function/struct/union/enum declarations
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove unnecessary return code variables and change function types
accordingly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change formal parameters to not clash with global names to
eliminate many W=2 warnings.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Due to a missing newline in the I-cache policy detection log output,
it's possible to get some ratehr unfortunate output at boot time:
CPU1: Booted secondary processor
Detected VIPT I-cache on CPU1CPU2: Booted secondary processor
Detected VIPT I-cache on CPU2CPU3: Booted secondary processor
Detected VIPT I-cache on CPU3CPU4: Booted secondary processor
Detected PIPT I-cache on CPU4CPU5: Booted secondary processor
Detected PIPT I-cache on CPU5Brought up 6 CPUs
SMP: Total of 6 processors activated.
This patch adds the missing newline to the format string, cleaning up
the output.
Fixes: 59ccc0d41b ("arm64: cachetype: report weakest cache policy")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The large segment table entry format has block of bits for the
ACC/F values for the large page. These bits are valid only if
another bit (AV bit 0x10000) of the segment table entry is set.
The ACC/F bits do not have a meaning if the AV bit is off.
This allows to put the THP splitting bit, the segment young bit
and the new segment dirty bit into the ACC/F bits as long as
the AV bit stays off. The dirty and young information is only
available if the pmd is large.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix printk format warnings (seen on i386 builds):
../drivers/scsi/u14-34f.c: In function 'port_detect':
../drivers/scsi/u14-34f.c:630:28: warning: format '%u' expects argument of type 'unsigned int', but argument 4 has type 'u64' [-Wformat=]
../drivers/scsi/u14-34f.c: In function 'u14_34f_queuecommand_lck':
../drivers/scsi/u14-34f.c:1290:25: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type 'int' [-Wformat=]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The current implementation may mix the negative value returned from
pm8001_set_nvmd with count. -(-ENOMEM) could be interpreted as bytes
programmed, this patch fixes it.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Suresh Thiagarajan <Suresh.Thiagarajan@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The loopcount is calculated by using some weird magic. Use instead a boring
macro.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Suresh Thiagarajan <Suresh.Thiagarajan@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Update pmcs mail list for pm8001 driver support
Signed-off-by: Suresh Thiagarajan <Suresh.Thiagarajan@pmcs.com>
Acked-by: Jack Wang <xjtuwjp@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a iscsi nop as ping timedout we were failing with the
common connection error code, ISCSI_ERR_CONN_FAILED. This
patch adds a new error code for this problem so can properly
track/distinguish in userspace.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When the get_host_stats call was not supported we were
returing EINVAL. This has us return ENOSYS, because for
software iscsi drivers where there is no host it is ok to not
have this callout.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
iscsi_get_host_stats was dropping the error code returned
by drivers like qla4xxx.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
qla4xxx was not always returning -EXYZ error codes when
qla4xxx_get_host_stats failed.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Commit f0a3eaff71 (ARM64: KVM: fix big endian issue in
access_vm_reg for 32bit guest) changed the way we handle CP15
VM accesses, so that all 64bit accesses are done via vcpu_sys_reg.
This looks like a good idea as it solves indianness issues in an
elegant way, except for one small detail: the register index is
doesn't refer to the same array! We end up corrupting some random
data structure instead.
Fix this by reverting to the original code, except for the introduction
of a vcpu_cp15_64_high macro that deals with the endianness thing.
Tested on Juno with 32bit SMP guests.
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
The bits-per-color is provided by the EDID normally, but if we're using
panels, we need to store it somewhere. So we add a field to the panel
descriptor for it.
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
commit ec66ad66a0 (s390/mm: enable
split page table lock for PMD level) activated the split pmd lock
for s390. Turns out that we missed one place: We also have to take
the pmd lock instead of the page table lock when we reallocate the
page tables (==> changing entries in the PMD) during sie enablement.
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rename enable_PAV to enable_pav.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix coding style problems in arch/arm/mach-omap2/control.c.
Signed-off-by: Jeremy Vial <jvial@adeneo-embedded.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
According to the comment “restore_es3: applies to 34xx >= ES3.0" in
"arch/arm/mach-omap2/sleep34xx.S”, omap3_restore_es3 should be used
if the revision of an OMAP34xx is ES3.1.2.
Signed-off-by: Jeremy Vial <jvial@adeneo-embedded.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tony Lindgren <tony@atomide.com>
Fix the broken check for calling sys_fallocate() on an active swapfile,
introduced by commit 0790b31b69 ("fs: disallow all fallocate
operation on active swapfile").
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The direct-io.c rewrite to use the iov_iter infrastructure stopped updating
the size field in struct dio_submit, and thus rendered the check for
allowing asynchronous completions to always return false. Fix this by
comparing it to the count of bytes in the iov_iter instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Tested-by: Tim Chen <tim.c.chen@linux.intel.com>
In current R-Car rsnd driver,
the SND_SOC_DAIFMT_xB_xF flags are used to HW default behavior,
but, it should be used to specific format.
The waveforms of LEFT_J/RIGHT_J format with
SND_SOC_DAIFMT_NB_NF flag will be
started from "falling edge" without this patch.
But, it should be started from "rising edge".
Reported-by: Jun Watanabe <jun.watanabe.ue@renesas.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
When performing segmentation, the mac_len value is copied right
out of the original skb. However, this value is not always set correctly
(like when the packet is VLAN-tagged) and we'll end up copying a bad
value.
One way to demonstrate this is to configure a VM which tags
packets internally and turn off VLAN acceleration on the forwarding
bridge port. The packets show up corrupt like this:
16:18:24.985548 52:54:00🆎be:25 > 52:54:00:26:ce:a3, ethertype 802.1Q
(0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0,
0x0000: 8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d.
0x0010: 0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0..
0x0020: 29e3 01c9 f871 0000 0101 080a 000a e833)....q.........3
0x0030: 000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp
0x0040: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
0x0050: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
0x0060: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
...
This also leads to awful throughput as GSO packets are dropped and
cause retransmissions.
The solution is to set the mac_len using the values already available
in then new skb. We've already adjusted all of the header offset, so we
might as well correctly figure out the mac_len using skb_reset_mac_len().
After this change, packets are segmented correctly and performance
is restored.
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel says:
====================
xen-netfront: more multiqueue fixes
A few more xen-netfront fixes for the multiqueue support added in
3.16-rc1. It would be great if these could make it into 3.16 but I
suspect it's a little late for that now.
The second patch fixes a significant resource leak that prevents
guests from migrating more than a handful of times.
These have been tested by repeatedly migrating a guest over 250 times
(it would previously fail with this guest after only 8 iterations).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When less than the requested number of queues could be created, include
the actual number in the warning (instead of the requested number).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since netfront may reconnect to a backend with a different number of
queues, all per-queue Rx and Tx resources (skbs and grant references)
should be freed when disconnecting.
Without this fix, the Tx and Rx grant refs are not released and
netfront will exhaust them after only a few reconnections. netfront
will fail to connect when no free grant references are available.
Since all Rx bufs are freed and reallocated instead of reused this
will add some additional delay to the reconnection but this is
expected to be small compared to the time taken by any backend hotplug
scripts etc.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If no queues could be created when connecting to the backend, one of the
error paths would deadlock.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) We don't allocate enough space for the NUL terminator so we end up
corrupting one character beyond the end of the buffer.
2) The "len - 1" should just be "len". The code is trying to copy a
word from a buffer up to a comma or the last word in the buffer.
Say you have the buffer, "foo,bar,baz", then this code truncates the
last letter off each word so you get "fo", "ba", and "ba". You would
hope this kind of bug would get noticed in testing...
I'm not very familiar with this code and I can't test it, but I think
we should copy the final character.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Macvlan devices do not initialize vlan_features. As a result,
any vlan devices configured on top of macvlans perform very poorly.
Initialize vlan_features based on the vlan features of the lower-level
device.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions to support ethtool EEE manipulating, and the EEE
is disabled in default setting to enhance the compatibility
with certain switch.
Signed-off-by: Freddy Xin <freddy@asix.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PAGE_ALIGNED(...) instead of IS_ALIGNED(..., PAGE_SIZE).
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dealing with ICMPv[46] Error Message, function icmp_socket_deliver()
and icmpv6_notify() do some valid checks on packet's length, but then some
protocols check packet's length redaudantly. So remove those duplicated
statements, and increase counter ICMP_MIB_INERRORS/ICMP6_MIB_INERRORS in
function icmp_socket_deliver() and icmpv6_notify() respectively.
In addition, add missed counter in udp6/udplite6 when socket is NULL.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SCTP socket extensions API document describes the v4mapping option as
follows:
8.1.15. Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)
This socket option is a Boolean flag which turns on or off the
mapping of IPv4 addresses. If this option is turned on, then IPv4
addresses will be mapped to V6 representation. If this option is
turned off, then no mapping will be done of V4 addresses and a user
will receive both PF_INET6 and PF_INET type addresses on the socket.
See [RFC3542] for more details on mapped V6 addresses.
This description isn't really in line with what the code does though.
Introduce addr_to_user (renamed addr_v4map), which should be called
before any sockaddr is passed back to user space. The new function
places the sockaddr into the correct format depending on the
SCTP_I_WANT_MAPPED_V4_ADDR option.
Audit all places that touched v4mapped and either sanely construct
a v4 or v6 address then call addr_to_user, or drop the
unnecessary v4mapped check entirely.
Audit all places that call addr_to_user and verify they are on a sycall
return path.
Add a custom getname that formats the address properly.
Several bugs are addressed:
- SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
addresses to user space
- The addr_len returned from recvmsg was not correct when
returning AF_INET on a v6 socket
- flowlabel and scope_id were not zerod when promoting
a v4 to v6
- Some syscalls like bind and connect behaved differently
depending on v4mapped
Tested bind, getpeername, getsockname, connect, and recvmsg for proper
behaviour in v4mapped = 1 and 0 cases.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Net_device is a vast and important structure, but it has no kernel-doc
compliant documentation. This patch extracts the comments from the structure
to clean it up, and let the scripts extract documentation from it. I know that
the patch is big, but it's just reordering of comments into the appropriate
form, and adding a few more, for the missing members.
Signed-off-by: Karoly Kemeny <karoly.kemeny@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate rawntlmssp authentication from CIFS_SessSetup(). Also cleanup
CIFS_SessSetup() since we no longer do any auth within it.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
In preparation for splitting CIFS_SessSetup() into smaller more
manageable chunks, we first add helper functions.
We then proceed to split out lanman auth out of CIFS_SessSetup()
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The functionality provided by free_rsp_buf() is duplicated in a number
of places. Replace these instances with a call to free_rsp_buf().
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The bitfield allocation function returns error condition
as a negative value, but in two cases its result
was assigned to an unsigned member of the hw_perf_event
structure, thus the error would not be ever detected.
Fixed by using an intermediate, signed variable.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
This is based on tags/exynos-power because this DT changes
are depending PMU cleanup.
Fixes boot for exynos5260 and exynos5410,
- Since exynos cannot boot without obtaining PMU address via
DT from now on, add PMU node for exynos5260 and exynos5410
For preparing exynos5250-spring,
- move max77686 and cypress,cyapa trackpad from exynos5250-
cros-common to exynos5250-snow DT file
(Note exynos5250-spring is not included in this branch yet)
For exynos3250,
- add TMU node and remove duplicated interrupt-parent
- add missing pinctrl property for uart0 and uart1
For exynos5250-smdk5250 board
- add max77686 pmic interrupt property which is connected to
gpx3
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJT2of/AAoJEA0Cl+kVi2xqNtkP/RzgKbqZfBVFVj7j9gC/c8qs
0Cnh3l6JHXRQXMLvL7vI9MXS41kH0jaLWxSDgZlf87pARINT2zDP6KNAYleJJz3j
+VX++YP6e4Yhw72Jzxuzqjjdu0IMNoo0JL7T5w/aHK5C186YKM7mEuG98+/qbwBh
nF2pEPlQ9x3D78XawyC8IdGBkB6YlpqWWYThPDVGN2wvm14Jf9CRJFVyDjcjO281
y4K7cXnxKYzc51Oih5ktzT0K4oHaHag5biLLOwBCCQFJVr6ZWbXG/Nd849v8chG+
Cfg9ckkch+CGre8NsdW4qfu1UGIK3qubHw9fKZzm8j9/0PANgLMfrX4/JKFhPbsQ
2Hluc3u4UFkNheDJ4xfY127MQCvIlXwUZLAhDS3PGvm7nP/BiPjpX4HsLsBq11we
wRCEtiD1aWb5XcVUyJQfUIETEfQDoc7bU9YjSEpz30ilTc/F5rLs2/zAgrz4/yLE
rhFyGl4qWW8LbveD6X7QjZ2mLEG5kYOd0job2SfJeXN0uQDJV3JpZbhJ7pvHDQTk
3OiR7xtrpfOsMGjxW3kpCs/kz9h7MLwbIdlUa5cFl3kpTTLmexo+0TagzcGIus+z
Yzt02gnM5zLUYmy0YjDI2YqvL17urTnLvrQnUeM/WlXOHPJkSnHvh9ywbsm9SrIe
iGQG4cuxEtVk+s6ECVUf
=aXEe
-----END PGP SIGNATURE-----
Merge tag 'samsung-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/dt
Merge "Samsung DT 2nd updates for v3.17" from Kukjin Kim:
This is based on tags/exynos-power because this DT changes
are depending PMU cleanup.
Fixes boot for exynos5260 and exynos5410,
- Since exynos cannot boot without obtaining PMU address via
DT from now on, add PMU node for exynos5260 and exynos5410
For preparing exynos5250-spring,
- move max77686 and cypress,cyapa trackpad from exynos5250-
cros-common to exynos5250-snow DT file
(Note exynos5250-spring is not included in this branch yet)
For exynos3250,
- add TMU node and remove duplicated interrupt-parent
- add missing pinctrl property for uart0 and uart1
For exynos5250-smdk5250 board
- add max77686 pmic interrupt property which is connected to
gpx3
* tag 'samsung-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: (28 commits)
ARM: dts: Add missing pinctrl for uart0/1 for exynos3250
ARM: dts: Remove duplicate 'interrput-parent' property for exynos3250
ARM: dts: Add TMU dt node to monitor the temperature for exynos3250
ARM: dts: Specify MAX77686 pmic interrupt for exynos5250-smdk5250
ARM: dts: cypress,cyapa trackpad is exynos5250-Snow only
ARM: dts: max77686 is exynos5250-snow only
ARM: EXYNOS: Add exynos5260 PMU compatible string to DT match table
ARM: dts: Add PMU DT node for exynos5260 SoC
ARM: EXYNOS: Add support for Exynos5410 PMU
ARM: dts: Add PMU to exynos5410
ARM: dts: Document exynos5410 PMU
ARM: EXYNOS: Move cpufreq and cpuidle device registration to init_machine
ARM: EXYNOS: Refactored code for using PMU address via DT
ARM: EXYNOS: Support cluster power off on exynos5420/5800
ARM: EXYNOS: populate suspend and powered_up callbacks for mcpm
ARM: EXYNOS: do not allow cpuidle registration for exynos5420
cpuidle: big.LITTLE: init driver for exynos5420
cpuidle: big.LITTLE: Add ARCH_EXYNOS entry in config
ARM: EXYNOS: add generic function to calculate cpu number
cpuidle: big.LITTLE: add of_device_id structure
...
Signed-off-by: Olof Johansson <olof@lixom.net>
INT3438 is the ADSP device on Wildcat Point platform
with 2 DW DMA engines built In. The DMA engines are
used for DSP FW loading and audio data transferring.
These DMA engine probing need the clock, without it,
probing may failed and can't go forward.
Add LPSS device "INT3438" for Wildcat Point PCH, to
provide clock for its ADSP DMA engine probing.
Signed-off-by: Jie Yang <yang.jie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Kenton Varda <kenton@sandstorm.io> discovered that by remounting a
read-only bind mount read-only in a user namespace the
MNT_LOCK_READONLY bit would be cleared, allowing an unprivileged user
to the remount a read-only mount read-write.
Upon review of the code in remount it was discovered that the code allowed
nosuid, noexec, and nodev to be cleared. It was also discovered that
the code was allowing the per mount atime flags to be changed.
The first naive patch to fix these issues contained the flaw that using
default atime settings when remounting a filesystem could be disallowed.
To avoid this problems in the future add tests to ensure unprivileged
remounts are succeeding and failing at the appropriate times.
Cc: stable@vger.kernel.org
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Since March 2009 the kernel has treated the state that if no
MS_..ATIME flags are passed then the kernel defaults to relatime.
Defaulting to relatime instead of the existing atime state during a
remount is silly, and causes problems in practice for people who don't
specify any MS_...ATIME flags and to get the default filesystem atime
setting. Those users may encounter a permission error because the
default atime setting does not work.
A default that does not work and causes permission problems is
ridiculous, so preserve the existing value to have a default
atime setting that is always guaranteed to work.
Using the default atime setting in this way is particularly
interesting for applications built to run in restricted userspace
environments without /proc mounted, as the existing atime mount
options of a filesystem can not be read from /proc/mounts.
In practice this fixes user space that uses the default atime
setting on remount that are broken by the permission checks
keeping less privileged users from changing more privileged users
atime settings.
Cc: stable@vger.kernel.org
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
While invesgiating the issue where in "mount --bind -oremount,ro ..."
would result in later "mount --bind -oremount,rw" succeeding even if
the mount started off locked I realized that there are several
additional mount flags that should be locked and are not.
In particular MNT_NOSUID, MNT_NODEV, MNT_NOEXEC, and the atime
flags in addition to MNT_READONLY should all be locked. These
flags are all per superblock, can all be changed with MS_BIND,
and should not be changable if set by a more privileged user.
The following additions to the current logic are added in this patch.
- nosuid may not be clearable by a less privileged user.
- nodev may not be clearable by a less privielged user.
- noexec may not be clearable by a less privileged user.
- atime flags may not be changeable by a less privileged user.
The logic with atime is that always setting atime on access is a
global policy and backup software and auditing software could break if
atime bits are not updated (when they are configured to be updated),
and serious performance degradation could result (DOS attack) if atime
updates happen when they have been explicitly disabled. Therefore an
unprivileged user should not be able to mess with the atime bits set
by a more privileged user.
The additional restrictions are implemented with the addition of
MNT_LOCK_NOSUID, MNT_LOCK_NODEV, MNT_LOCK_NOEXEC, and MNT_LOCK_ATIME
mnt flags.
Taken together these changes and the fixes for MNT_LOCK_READONLY
should make it safe for an unprivileged user to create a user
namespace and to call "mount --bind -o remount,... ..." without
the danger of mount flags being changed maliciously.
Cc: stable@vger.kernel.org
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>