Provide generic non-voltage sensing socket support for StrongARM
platforms using the gpiolib and regulator subsystems to obtain the
resources to control the socket.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (for drivers/pcmcia)
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
After previous refactoring, there is only one user in the same file
left. Make the function static now.
[wsa: added 'int' to bare 'unsigned']
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
According to documentation, Bit 7 of ICMSR is unused and 0 should be
written to it. Fix the mask accordingly.
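As a rough illustration of the idea (the macro and helper names below are hypothetical and not taken from the driver), masking a value so that bit 7 is always written as 0 could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: bits 6..0 may be written, bit 7 must stay 0. */
#define ICMSR_VALID_MASK 0x7fu

/* Return the part of 'requested' that is safe to write to ICMSR. */
static uint32_t icmsr_sanitize(uint32_t requested)
{
	uint32_t val = requested & ICMSR_VALID_MASK;

	assert((val & 0x80u) == 0); /* bit 7 is never set after masking */
	return val;
}
```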
Signed-off-by: Hiromitsu Yamasaki <hiromitsu.yamasaki.ym@renesas.com>
[wsa: edited commit message]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Merge tag 'at24-4.17-updates-for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-4.17
"three new special cases for device tree compatible strings"
Now that the i2c-pca-platform driver is using the device managed API for
GPIOs, there is no need for the reset GPIO to be specified via
i2c_pca9564_pf_platform_data.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Use device_property_read_u32 instead of of_property_read_u32_index to
look up the "clock-frequency" property.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Allow for the reset-gpios property to be defined in the device tree
or via a GPIO lookup table.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Define the GPIO connected to the PCA9564 using a GPIO lookup table. This
will allow the i2c-pca-platform driver to use the device managed APIs to
look up the GPIO instead of using platform_data.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The following are the major issues in the current driver code:
1. The current driver simply assumes transfer completion
whenever it gets any non-error interrupt and then simply polls
the available/free bytes in the FIFO.
2. Block mode is not working properly since no handling is
being done for OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ.
3. An i2c transfer can contain multiple messages, and QUP v2
supports reconfiguration during run, in which the mode should be the
same for all the sub-transfers. Currently the mode is being programmed
before every sub-transfer, which is functionally wrong. If one message
is shorter than the FIFO length and another message is longer than the
FIFO length, then transfers will fail.
Because of the above, i2c v2 transfers of size greater than 64 are
failing with the following error message:
i2c_qup 78b6000.i2c: timeout for fifo out full
To make block mode work properly and move to interrupts
instead of polling, major code reorganization is required. The following
are the major changes done in this patch:
1. Remove the polling of TX FIFO free space and RX FIFO available
bytes and move to interrupts completely. QUP has QUP_MX_OUTPUT_DONE,
QUP_MX_INPUT_DONE, OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ
interrupts to handle FIFOs properly, so check all these interrupts.
2. Determine the mode for the transfer before starting by checking
the tx/rx data length of each message. The complete message can be
transferred either in DMA mode or in programmed IO by FIFO/block mode.
In DMA mode, both TX and RX use the same mode, but in PIO mode TX and
RX can be in different modes.
3. During write, in FIFO mode the TX FIFO can be written directly
without checking for FIFO space. In block mode, the QUP will generate
an OUT_BLOCK_WRITE_REQ interrupt whenever it has a block size of
available space.
4. During read, both TX and RX FIFOs will be used. TX will be used
for writing tags and RX will be used for receiving the data. In QUP,
TX and RX can operate in separate modes, so configure modes accordingly.
5. In read FIFO mode, wait for the QUP_MX_INPUT_DONE interrupt, which
will be generated after all the bytes have been copied into the RX FIFO.
In read block mode, QUP will generate IN_BLOCK_READ_REQ interrupts
whenever it has a block size of available data.
6. Split the transfer into chunks of one QUP block size (256 bytes)
and schedule each block separately. QUP v2 supports reconfiguration
during run, in which QUP can transfer multiple blocks without issuing a
stop event.
7. Port the SMBus block read support to the new code.
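Points 2 and 6 above can be sketched in plain userspace C. This is only a simplified model under stated assumptions: the function and enum names are hypothetical, the FIFO length is passed in by the caller, and tag bytes that the real hardware also places in the FIFO are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode names for this sketch. */
enum xfer_mode { MODE_FIFO, MODE_BLOCK };

#define QUP_BLOCK_LEN 256u /* "one QUP block size" as stated in the text */

/*
 * Decide the PIO mode once, before the transfer starts, by looking at
 * every message: FIFO mode is only chosen if the longest message fits.
 */
static enum xfer_mode pick_mode(const size_t *msg_len, size_t nmsgs,
				size_t fifo_len)
{
	size_t i, max_len = 0;

	assert(fifo_len > 0);
	for (i = 0; i < nmsgs; i++) {
		if (msg_len[i] > max_len)
			max_len = msg_len[i];
	}
	return max_len <= fifo_len ? MODE_FIFO : MODE_BLOCK;
}

/* Number of QUP_BLOCK_LEN sized chunks needed to move 'len' bytes. */
static size_t num_chunks(size_t len)
{
	return (len + QUP_BLOCK_LEN - 1) / QUP_BLOCK_LEN;
}
```

Choosing the mode from the longest message is what keeps it identical for all sub-transfers, which is the requirement named in issue 3.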
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The following are the major issues in the current driver code:
1. The current driver simply assumes transfer completion
whenever it gets any non-error interrupt and then simply polls
the available/free bytes in the FIFO.
2. Block mode is not working properly since no handling is
being done for OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ.
Because of the above, i2c v1 transfers of size greater than 32 are
failing with the following error message:
i2c_qup 78b6000.i2c: timeout for fifo out full
To make block mode work properly and move to interrupts
instead of polling, major code reorganization is required. The following
are the major changes done in this patch:
1. Remove the polling of TX FIFO free space and RX FIFO available
bytes and move to interrupts completely. QUP has QUP_MX_OUTPUT_DONE,
QUP_MX_INPUT_DONE, OUT_BLOCK_WRITE_REQ and IN_BLOCK_READ_REQ
interrupts to handle FIFOs properly, so check all these interrupts.
2. During write, in FIFO mode the TX FIFO can be written directly
without checking for FIFO space. In block mode, the QUP will generate
an OUT_BLOCK_WRITE_REQ interrupt whenever it has a block size of
available space.
3. During read, both TX and RX FIFOs will be used. TX will be used
for writing tags and RX will be used for receiving the data. In QUP,
TX and RX can operate in separate modes, so configure modes accordingly.
4. In read FIFO mode, wait for the QUP_MX_INPUT_DONE interrupt, which
will be generated after all the bytes have been copied into the RX FIFO.
In read block mode, QUP will generate IN_BLOCK_READ_REQ interrupts
whenever it has a block size of available data.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
According to the I2C specification, “If a master-receiver sends a
repeated START condition, it sends a not-acknowledge (A) just
before the repeated START condition”. QUP v2 supports sending
a NACK without stop using QUP_TAG_V2_DATARD_NACK, so use it.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The BAM mode requires a buffer for start tag data and the tx, rx SG
lists. Currently, this is sized for the maximum transfer length
(65K). But an I2C transfer can have multiple messages and each
message can be of this maximum length, so a buffer overflow will
happen in this case. Increasing the buffer length is not
feasible since an I2C transfer can contain any number of messages,
so this patch makes the following changes to make i2c transfers work
for the multiple-message case.
1. Calculate the required buffers for 2 maximum length messages
(65K * 2).
2. Split the descriptor formation and descriptor scheduling.
The idea is to fit as many messages as possible in one DMA transfer for
the 65K threshold value (max_xfer_sg_len). Whenever sg_cnt
crosses this, schedule the BAM transfer, and the subsequent
transfer will again start from zero.
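The split between descriptor formation and scheduling can be modelled roughly as follows. This is a userspace sketch, not the driver code: the function name is hypothetical, and each message is reduced to an abstract cost that is accumulated against a caller-supplied threshold.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many DMA transfers would be scheduled. cost[i] is the amount
 * message i adds to the running count; 'threshold' plays the role of the
 * 65K limit. A batch is scheduled as soon as the running count reaches
 * the threshold, and the next batch starts again from zero.
 */
static size_t count_dma_batches(const size_t *cost, size_t nmsgs,
				size_t threshold)
{
	size_t i, running = 0, batches = 0;

	assert(threshold > 0);
	for (i = 0; i < nmsgs; i++) {
		running += cost[i];		/* descriptor formation */
		if (running >= threshold) {	/* threshold crossed */
			batches++;		/* descriptor scheduling */
			running = 0;
		}
	}
	if (running > 0)			/* final partial batch */
		batches++;
	return batches;
}
```

Because the check happens after a message has been added, a batch can exceed the threshold by up to one message, which is consistent with sizing the buffers for two maximum length messages.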
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Currently the completion timeout is derived from the
maximum transfer length, which is too high if SCL is operating at
a high frequency. This patch calculates the timeout on the basis of
the one-byte transfer time and uses that for the completion timeout.
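The message does not spell out the formula, so the following is only a sketch of one way to derive a timeout from the one-byte transfer time. It assumes 9 SCL clocks per byte (8 data bits plus the ACK bit) and a caller-chosen fixed margin; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CLOCKS_PER_BYTE 9u		/* assumption: 8 data bits + ACK */
#define USEC_PER_SEC	1000000u

/* Timeout in microseconds for 'nbytes' at 'scl_hz', plus a fixed margin. */
static uint64_t xfer_timeout_us(uint32_t scl_hz, uint32_t nbytes,
				uint32_t margin_us)
{
	uint64_t one_byte_us;

	assert(scl_hz > 0);
	/* Round up so a fast clock never yields a zero per-byte time. */
	one_byte_us = ((uint64_t)CLOCKS_PER_BYTE * USEC_PER_SEC + scl_hz - 1)
		      / scl_hz;
	return one_byte_us * nbytes + margin_us;
}
```

With this shape, the timeout scales with both the bus frequency and the actual length instead of being fixed at the worst case.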
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Currently each message length in the complete transfer is
checked to determine DMA mode, and if any message length
is less than the FIFO length then non-DMA mode is used, which
increases overhead. DMA can be used for any length, and the decision
should be made on the complete transfer length. Now, this
patch selects DMA mode if the total length is greater than the FIFO
length.
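A minimal sketch of the new selection rule, assuming a hypothetical helper name and a caller-supplied FIFO length:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Select DMA when the sum of all message lengths exceeds the FIFO length. */
static bool should_use_dma(const size_t *msg_len, size_t nmsgs,
			   size_t fifo_len)
{
	size_t i, total = 0;

	assert(fifo_len > 0);
	for (i = 0; i < nmsgs; i++)
		total += msg_len[i];
	return total > fifo_len;
}
```

Two 40-byte messages with a 64-byte FIFO therefore select DMA, although neither message alone exceeds the FIFO length.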
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Currently the i2c error handling in BAM mode is not working
properly under stress conditions.
1. After an error, the FIFOs are being written with FLUSH and
EOT tags, which should not be required since these tags
have already been written in the BAM descriptor itself.
2. The QUP state is being moved to RESET in the IRQ handler in case
of error. When the QUP HW encounters an error in BAM mode, it
moves the QUP STATE to the PAUSE state. In this case, the I2C_FLUSH
command needs to be executed while moving to RUN_STATE by writing
to the QUP_STATE register with the I2C_FLUSH bit set to 1.
3. In the error case, QUP sometimes generates more than one
interrupt, which will trigger the complete again. After an error,
the flush operation will be scheduled after doing
reinit_completion, which should be triggered by the BAM IRQ callback.
If the second QUP IRQ comes during this time, then it will call
complete and the transfer function will assume that all the
BAM HW descriptors have been completed.
4. The DMA release is being called after each error, which
frees the DMA tx and rx channels. Errors like NACK are very
common in I2C transfers and this adds overhead every time. Now that
the error handling is correct, this channel release can be
avoided completely.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
In case of FLUSH operation, BAM copies INPUT EOT FLUSH (0x94)
instead of normal EOT (0x93) tag in input data stream when an
input EOT tag is received during flush operation. So only one tag
will be written instead of 2 separate tags.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The role of FLUSH and EOT tag is to flush already scheduled
descriptors in BAM HW in case of error. EOT is required only
when descriptors are scheduled in RX FIFO. If all the messages
are WRITE, then only FLUSH tag will be used.
A single BAM transfer can have multiple read and write messages.
The EOT and FLUSH tags should be scheduled at the end of the BAM HW
descriptors. Since READ and WRITE messages can be present in any order,
these tags are not being written correctly in some cases.
The following is one example:
READ, READ, READ, READ
Currently EOT and FLUSH tags are being written after each READ.
If QUP gets a NACK for the first READ itself, then flush will be
triggered. It will look for the first FLUSH tag in the TX FIFO and will
stop there, so only the descriptors for the first READ will be
flushed. All the scheduled descriptors should be cleared to
generate BAM DMA completion.
Now this patch schedules FLUSH and EOT only once after all the
descriptors. So, flush will clear all the scheduled descriptors and
BAM will generate the completion interrupt.
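The rule for which trailing tags are needed can be sketched as a small decision function. The types and names here are hypothetical and only model the rule stated above: FLUSH is scheduled once at the end, and EOT only when at least one message is a READ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum msg_dir { MSG_WRITE, MSG_READ };

/* Which tags to append once, after all descriptors of the transfer. */
struct tail_tags {
	bool eot;	/* needed only if something was scheduled for RX */
	bool flush;	/* needed for any non-empty transfer */
};

static struct tail_tags pick_tail_tags(const enum msg_dir *dir, size_t nmsgs)
{
	struct tail_tags t = { .eot = false, .flush = false };
	size_t i;

	assert(dir != NULL || nmsgs == 0);
	if (nmsgs == 0)
		return t;

	t.flush = true;
	for (i = 0; i < nmsgs; i++) {
		if (dir[i] == MSG_READ)
			t.eot = true;
	}
	return t;
}
```

For the READ, READ, READ, READ example this yields a single EOT and a single FLUSH for the whole transfer rather than one pair per message.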
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The rx_nents and tx_nents fields are redundant. rx_buf and tx_buf can
be used for the total number of SG entries. Since rx_buf and tx_buf
give the impression of being buffers instead of counts, rename
them to tx_cnt and rx_cnt to make the names more meaningful.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
1. Assigns use_dma in qup_dev structure itself which will
help in subsequent patches to determine the mode in IRQ handler.
2. Does minor code reorganization for loops to reduce the
unnecessary comparison and assignment.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The QUP BLSP BAM sometimes generates the following error if the
current I2C DMA transfer fails and the flush operation has been
scheduled:
“bam-dma-engine 7884000.dma: Cannot free busy channel”
If any I2C error occurs during a BAM DMA transfer, then the QUP I2C
interrupt will be generated and the flush operation will be
carried out to make I2C consume all scheduled DMA transfers.
Currently, the completion structure that has already completed is
being reused for the BAM transfer without reinit. This makes the
flush operation's wait_for_completion_timeout return immediately
and proceed to freeing the DMA resources while the
descriptors are still in process.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Acked-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Austin Christ <austinwc@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The file has been updated from 2016 to 2018 so fixed the
copyright years.
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
DHCP connectivity issues can currently occur if the following conditions
are met:
1) A DHCP packet from a client to a server
2) This packet has a multicast destination
3) This destination has a matching entry in the translation table
(FF:FF:FF:FF:FF:FF for IPv4, 33:33:00:01:00:02/33:33:00:01:00:03
for IPv6)
4) The orig-node determined by TT for the multicast destination
does not match the orig-node determined by best-gateway-selection
In this case the DHCP packet will be dropped.
The "gateway-out-of-range" check is supposed to only be applied to
unicasted DHCP packets to a specific DHCP server.
In that case dropping the unicasted frame forces the client to
retry via a broadcasted one, but now directed to the new best
gateway.
A DHCP packet with broadcast/multicast destination is already ensured to
always be delivered to the best gateway. Dropping a multicasted
DHCP packet here will only prevent completing DHCP as there is no
other fallback.
So far, it seems the unicast check was implicitly performed by
expecting the batadv_transtable_search() to return NULL for multicast
destinations. However, a multicast address could have always ended up in
the translation table and in fact is now common.
To fix this potential loss of a DHCP client-to-server packet to a
multicast address this patch adds an explicit multicast destination
check to reliably bail out of the gateway-out-of-range check for such
destinations.
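The explicit check amounts to testing the group bit of the destination MAC address. The following userspace sketch uses hypothetical function names; in the kernel, the helper is_multicast_ether_addr() performs the same bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Multicast and broadcast MACs have the low bit of the first octet set. */
static bool is_multicast_mac(const uint8_t mac[6])
{
	assert(mac != NULL);
	return (mac[0] & 0x01u) != 0;
}

/*
 * Only unicast DHCP client-to-server frames should go through the
 * gateway-out-of-range check; multicast destinations bail out early.
 */
static bool needs_out_of_range_check(const uint8_t dst[6])
{
	return !is_multicast_mac(dst);
}
```

Both FF:FF:FF:FF:FF:FF and the IPv6 DHCP groups starting with 33:33 have that bit set, so they skip the check regardless of what the translation table contains.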
The issue and fix were tested in the following three node setup:
- Line topology, A-B-C
- A: gateway client, DHCP client
- B: gateway server, hop-penalty increased: 30->60, DHCP server
- C: gateway server, code modifications to announce FF:FF:FF:FF:FF:FF
Without this patch, A would never transmit its DHCP Discover packet
due to an always "out-of-range" condition. With this patch,
a full DHCP handshake between A and B was possible again.
Fixes: be7af5cf9c ("batman-adv: refactoring gateway handling code")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
For multicast frames AP isolation is only supposed to be checked on
the receiving nodes and never on the originating one.
Furthermore, the isolation or wifi flag bits should only be interpreted
as such for unicast and never multicast TT entries.
By injecting flags to the multicast TT entry claimed by a single
target node it was verified in tests that this multicast address
becomes unreachable, leading to packet loss.
Omitting the "src" parameter to the batadv_transtable_search() call
successfully skipped the AP isolation check and made the target
reachable again.
Fixes: 1d8ab8d3c1 ("batman-adv: Modified forwarding behaviour for multicast packets")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Allow the device tree to specify a watchdog to fail over to
the alternate boot source.
The aspeed watchdog can set a latch directing flash chip select 0 to
chip select 1, allowing boot from an alternate media if the watchdog
is not reset in time. On the ast2400 bank 1 also goes to flash bank 1,
while on the ast2500 the chip selects are swapped.
Also clear the secondary boot bit during the machine restart operation.
Otherwise, the system will switch to the alternate boot after every
reboot, which is not desired.
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The Nuvoton NPCM750 has a watchdog implemented as a single register
inside the timer peripheral.
This driver exposes that watchdog as a standard watchdog device with
coarse timeout intervals, limited by the combination of prescaler and
counter that is provided by the hardware. The calculation is taken from
the Nuvoton vendor tree.
The watchdog is left running if a bootloader had it going. The rate is
the one specified in the device tree, or the default value (obtained
from the datasheet).
There is a pre-timeout IRQ that is wired up. This timeout always occurs
1024 clocks before the timeout.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
These bindings describe the watchdog IP as used by the Nuvoton NPCM750
(Poleg) BMC SoC.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Make the "clock valid" control a global control instead of a mixer
so that it doesn't appear in mixer applications.
Additionally, remove the check for writeability prohibited by spec, and
use common code to read the control value.
Tested with a UAC2 Audio device that presents a clock validity
control. The control still shows up in /proc usbmixer but not
in alsamixer.
Signed-off-by: Andrew Chant <achant@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This implements UAC2 jack detection support, presenting
jack status as a boolean read-only mono mixer.
The presence of any channel in the UAC2_TE_CONNECTOR
control for a terminal will result in the mixer saying
the jack is connected.
Mixer naming follows the convention in sound/core/ctljack.c,
terminating the mixer with " Jack".
For additional clues as to which jack is being presented,
the name is suffixed with " - Input Jack" or " - Output Jack"
depending on whether it's an input or output terminal.
This is required because terminal names are ambiguous
between inputs and outputs and often duplicated -
Bidirectional terminal types (0x400 -> 0x4FF)
"... may be used separately for input only or output only.
These types require two Terminal descriptors. Both have the same type."
(quote from "USB Device Class Definition for Terminal Types")
Since bidirectional terminal types are common for headphone adapters,
this distinguishes between two otherwise identically-named
jack controls.
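The naming and the boolean state described above can be sketched like this. It is a userspace illustration with hypothetical helper names; how the channel count is read from the UAC2_TE_CONNECTOR control is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Any channel reported by the connector control means "connected". */
static bool jack_connected(unsigned int nr_channels)
{
	return nr_channels != 0;
}

/*
 * Build "<terminal name> - Input Jack" or "<terminal name> - Output Jack".
 * Returns what snprintf() returns: the length the full name would have.
 */
static int jack_ctl_name(char *buf, size_t len, const char *term_name,
			 bool is_input)
{
	assert(buf != NULL && term_name != NULL && strlen(term_name) > 0);
	return snprintf(buf, len, "%s - %s Jack", term_name,
			is_input ? "Input" : "Output");
}
```

A bidirectional "Headset" terminal pair would thus produce "Headset - Input Jack" and "Headset - Output Jack", which keeps the two controls distinguishable.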
Tested with a UAC2 audio device with connector control capability.
Signed-off-by: Andrew Chant <achant@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
With the cherry-picked perf/urgent commit merged separately we can now
merge all the fixes without conflicts.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use u16 in place of __be16 to suppress the following sparse warnings:
net/sched/act_vlan.c:150:26: warning: incorrect type in assignment (different base types)
net/sched/act_vlan.c:150:26: expected restricted __be16 [usertype] push_vid
net/sched/act_vlan.c:150:26: got unsigned short
net/sched/act_vlan.c:151:21: warning: restricted __be16 degrades to integer
net/sched/act_vlan.c:208:26: warning: incorrect type in assignment (different base types)
net/sched/act_vlan.c:208:26: expected unsigned short [unsigned] [usertype] tcfv_push_vid
net/sched/act_vlan.c:208:26: got restricted __be16 [usertype] push_vid
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_idr_cleanup() is no longer used, so remove it.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In net commit 8175f7c4736f ("mlxsw: spectrum: Prevent duplicate
mirrors") we prevented the user from mirroring more than once from a
single binding point (port-direction pair).
The fix was essentially reverted in a merge conflict resolution when net
was merged into net-next. Restore it.
Fixes: 03fe2debbb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can only get into the branch if CRCs are enabled, so there's no
need to check inside the branch for CRCs being enabled....
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
We recently came across a V4 filesystem causing memory corruption
due to a newly allocated inode being setup twice and being added to
the superblock inode list twice. From code inspection, the only way
this could happen is if a newly allocated inode was not marked as
free on disk (i.e. di_mode wasn't zero).
Running the metadump on an upstream debug kernel fails during inode
allocation like so:
XFS: Assertion failed: ip->i_d.di_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 838
------------[ cut here ]------------
kernel BUG at fs/xfs/xfs_message.c:114!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 11 PID: 3496 Comm: mkdir Not tainted 4.16.0-rc5-dgc #442
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
RIP: 0010:assfail+0x28/0x30
RSP: 0018:ffffc9000236fc80 EFLAGS: 00010202
RAX: 00000000ffffffea RBX: 0000000000004000 RCX: 0000000000000000
RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff8227211b
RBP: ffffc9000236fce8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000bec R11: f000000000000000 R12: ffffc9000236fd30
R13: ffff8805c76bab80 R14: ffff8805c77ac800 R15: ffff88083fb12e10
FS: 00007fac8cbff040(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffa6783ff8 CR3: 00000005c6e2b003 CR4: 00000000000606e0
Call Trace:
xfs_ialloc+0x383/0x570
xfs_dir_ialloc+0x6a/0x2a0
xfs_create+0x412/0x670
xfs_generic_create+0x1f7/0x2c0
? capable_wrt_inode_uidgid+0x3f/0x50
vfs_mkdir+0xfb/0x1b0
SyS_mkdir+0xcf/0xf0
do_syscall_64+0x73/0x1a0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Extracting the inode number we crashed on from an event trace and
looking at it with xfs_db:
xfs_db> inode 184452204
xfs_db> p
core.magic = 0x494e
core.mode = 0100644
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 1
core.onlink = 0
.....
Confirms that it is not a free inode on disk. xfs_repair
also trips over this inode:
.....
zero length extent (off = 0, fsbno = 0) in ino 184452204
correcting nextents for inode 184452204
bad attribute fork in inode 184452204, would clear attr fork
bad nblocks 1 for inode 184452204, would reset to 0
bad anextents 1 for inode 184452204, would reset to 0
imap claims in-use inode 184452204 is free, would correct imap
would have cleared inode 184452204
.....
disconnected inode 184452204, would move to lost+found
And so we have a situation where the directory structure and the
inobt think the inode is free, but the inode on disk thinks it is
still in use. Where this corruption came from is not possible to
diagnose, but we can detect it and prevent the kernel from oopsing
on lookup. The reproducer now results in:
$ sudo mkdir /mnt/scratch/{0,1,2,3,4,5}{0,1,2,3,4,5}
mkdir: cannot create directory ‘/mnt/scratch/00’: File exists
mkdir: cannot create directory ‘/mnt/scratch/01’: File exists
mkdir: cannot create directory ‘/mnt/scratch/03’: Structure needs cleaning
mkdir: cannot create directory ‘/mnt/scratch/04’: Input/output error
mkdir: cannot create directory ‘/mnt/scratch/05’: Input/output error
....
And this corruption shutdown:
[ 54.843517] XFS (loop0): Corruption detected! Free inode 0xafe846c not marked free on disk
[ 54.845885] XFS (loop0): Internal error xfs_trans_cancel at line 1023 of file fs/xfs/xfs_trans.c. Caller xfs_create+0x425/0x670
[ 54.848994] CPU: 10 PID: 3541 Comm: mkdir Not tainted 4.16.0-rc5-dgc #443
[ 54.850753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 54.852859] Call Trace:
[ 54.853531] dump_stack+0x85/0xc5
[ 54.854385] xfs_trans_cancel+0x197/0x1c0
[ 54.855421] xfs_create+0x425/0x670
[ 54.856314] xfs_generic_create+0x1f7/0x2c0
[ 54.857390] ? capable_wrt_inode_uidgid+0x3f/0x50
[ 54.858586] vfs_mkdir+0xfb/0x1b0
[ 54.859458] SyS_mkdir+0xcf/0xf0
[ 54.860254] do_syscall_64+0x73/0x1a0
[ 54.861193] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 54.862492] RIP: 0033:0x7fb73bddf547
[ 54.863358] RSP: 002b:00007ffdaa553338 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 54.865133] RAX: ffffffffffffffda RBX: 00007ffdaa55449a RCX: 00007fb73bddf547
[ 54.866766] RDX: 0000000000000001 RSI: 00000000000001ff RDI: 00007ffdaa55449a
[ 54.868432] RBP: 00007ffdaa55449a R08: 00000000000001ff R09: 00005623a8670dd0
[ 54.870110] R10: 00007fb73be72d5b R11: 0000000000000246 R12: 00000000000001ff
[ 54.871752] R13: 00007ffdaa5534b0 R14: 0000000000000000 R15: 00007ffdaa553500
[ 54.873429] XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1024 of file fs/xfs/xfs_trans.c. Return address = ffffffff814cd050
[ 54.882790] XFS (loop0): Corruption of in-memory data detected. Shutting down filesystem
[ 54.884597] XFS (loop0): Please umount the filesystem and rectify the problem(s)
Note that this crash is only possible on v4 filesystems or v5
filesystems mounted with the ikeep mount option. For all other V5
filesystems, this problem cannot occur because we don't read inodes
we are allocating from disk - we simply overwrite them with the new
inode information.
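The detection described above boils down to refusing to reuse an inode whose on-disk mode is not zero. The sketch below is a simplified userspace model with hypothetical type and function names, not the actual kernel change:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the on-disk fields that matter to this check. */
struct disk_inode {
	uint16_t mode;		/* 0 means "free on disk" */
	uint64_t nblocks;
};

enum ialloc_check { INODE_OK = 0, INODE_CORRUPT = -1 };

/*
 * Called for an inode the allocator believes is free. A non-zero mode
 * means the on-disk inode still claims to be in use, so report
 * corruption instead of setting the inode up a second time.
 */
static enum ialloc_check check_free_inode(const struct disk_inode *dip)
{
	assert(dip != NULL);
	if (dip->mode != 0)
		return INODE_CORRUPT;
	return INODE_OK;
}
```

Applied to the inode from the xfs_db output (core.mode = 0100644), the check reports corruption, which corresponds to the "Structure needs cleaning" error in the reproducer output.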
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Tested-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
In xfs_scrub_iallocbt_xref_rmap_inodes we're checking inodes against
rmap records, so we should use xfs_scrub_btree_xref_set_corrupt if we
encounter discrepancies here so that we know that it's a cross
referencing error, not necessarily a corruption in the inobt itself.
The userspace xfs_scrub program will try to repair outright corruptions
in the agi/inobt prior to phase 3 so that the inode scan will proceed.
If only a cross-referencing error is noted, the repair program defers
the repair attempt until it can check the other space metadata at least
once.
It is therefore essential that the inobt scrubber can correctly
distinguish between corruptions and "unable to cross-reference something
else with this inobt". The same reasoning applies to "xfs: record inode
buf errors as a xref error in inobt scrubber".
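Why the distinction matters to userspace can be sketched as a small decision function. The flag names and the enum are hypothetical stand-ins for this illustration and are not the real scrub interface:

```c
#include <assert.h>

/* Hypothetical result flags reported by a scrubber. */
#define SKETCH_CORRUPT	(1u << 0)	/* the structure itself is bad */
#define SKETCH_XCORRUPT	(1u << 1)	/* only a cross-reference failed */

enum repair_plan { REPAIR_NONE, REPAIR_NOW, REPAIR_DEFER };

/*
 * Outright corruption is repaired immediately so the inode scan can
 * proceed; a cross-referencing error alone is deferred until the other
 * space metadata has been checked at least once.
 */
static enum repair_plan plan_repair(unsigned int flags)
{
	assert((flags & ~(SKETCH_CORRUPT | SKETCH_XCORRUPT)) == 0);
	if (flags & SKETCH_CORRUPT)
		return REPAIR_NOW;
	if (flags & SKETCH_XCORRUPT)
		return REPAIR_DEFER;
	return REPAIR_NONE;
}
```

If the scrubber reported a cross-referencing problem as plain corruption, this kind of logic would attempt the repair too early.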
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
If a directory's parent inode pointer doesn't point to an inode, the
directory should be flagged as corrupt. Enable IGET_UNTRUSTED here so
that _iget will return -EINVAL if the inobt does not confirm that the
inode is present and allocated and we can flag the directory corruption.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
When we're verifying inode buffers, sanity-check the unlinked pointer.
We don't want to run the risk of trying to purge something that's
obviously broken.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Extent size hint validation is used by scrub to decide if there's an
error, and it will be used by repair to decide to remove the hint.
Since these use the same validation functions, move them to libxfs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
During the inode btree scrubs we try to confirm the freemask bits
against the inode records. If the inode buffer read fails, this is a
cross-referencing error, not a corruption of the inode btree itself.
Use the xref_process_error call here. Found via core.version middlebit
fuzz in xfs/415.
The userspace xfs_scrub program will try to repair outright corruptions
in the agi/inobt prior to phase 3 so that the inode scan will proceed.
If only a cross-referencing error is noted, the repair program defers
the repair attempt until it can check the other space metadata at least
once.
It is therefore essential that the inobt scrubber can correctly
distinguish between corruptions and "unable to cross-reference something
else with this inobt".
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Now that we no longer do raw inode buffer scrubbing, the bp parameter is
no longer used anywhere we're dealing with an inode, so remove it and
all the useless NULL parameters that go with it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
The inode scrubber tries to _iget the inode prior to running checks.
If that _iget call fails with corruption errors that's an automatic
fail, regardless of whether it was the inode buffer read verifier,
the ifork verifier, or the ifork formatter that errored out.
Therefore, get rid of the raw mode scrub code because it's not needed.
Found by trying to fix some test failures in xfs/379 and xfs/415.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
When we're scanning an extent mapping inode fork, ensure that every rmap
record for this ifork has a corresponding bmbt record too. This
(mostly) provides the ability to cross-reference rmap records with bmap
data. The rmap scrubber cannot do the xref on its own because that
requires taking an ilock with the agf lock held, which violates our
locking order rules (inode, then agf).
Note that we only do this for forks that are in btree format due to the
increased complexity; or forks that should have data but suspiciously
have zero extents because the inode could have just had its iforks
zapped by the inode repair code and now we need to reclaim the old
extents.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>