Commit graph

252938 commits

Author SHA1 Message Date
Nicholas Bellinger
53ab6709b4 [SCSI] target: Fix interrupt context bug with stats_lock and core_tmr_alloc_req
This patch fixes two bugs with respect to the interrupt context usage of target
core with HW target mode drivers.  It first converts the usage of struct
se_device->stats_lock in transport_get_lun_for_cmd() and core_tmr_lun_reset()
to properly use spin_lock_irq() to address a BUG with CONFIG_LOCKDEP_SUPPORT=y
enabled.

This patch also adds an 'in_interrupt()' check to allow GFP_ATOMIC usage from
core_tmr_alloc_req() to fix a 'sleeping in interrupt context' BUG with HW
target fabrics that require this logic to function.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:58:17 -04:00
Nicholas Bellinger
97868c8905 [SCSI] target: Fix multi task->task_sg[] chaining logic bug
This patch fixes a bug in transport_do_task_sg_chain() used by HW target
mode modules with sg_chain() to provide a single sg_next() walkable memory
layout for use with pci_map_sg() and friends.  This patch addresses an
issue with mapping multiple small block max_sector tasks across multiple
struct se_task->task_sg[] mappings for HW target mode operation.

This was causing OOPs with (cmd->t_task->t_tasks_no > 1) I/O traffic for
HW target drivers using transport_do_task_sg_chain(), and has been tested
so far with tcm_fc(openfcoe), tcm_qla2xxx, and ib_srpt fabrics with
t_tasks_no > 1 IBLOCK backends using a smaller max_sectors to trigger the
original issue.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Acked-by: Kiran Patil <kiran.patil@intel.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:56:58 -04:00
David Jeffery
3eef6257de [SCSI] Reduce error recovery time by reducing use of TURs
In error recovery, most scsi error recovery stages will send a TUR command
for every bad command when a driver's error handler reports success.  When
several bad commands target the same device, this results in the device
being probed multiple times.

This becomes very problematic if the device or connection is in a state
where the device still doesn't respond to commands even after a recovery
function returns success.  The error handler must wait for the test
commands to time out.  The time waiting for the redundant commands can
drastically lengthen error recovery.

This patch alters the scsi mid-layer's error routines to send test commands
once per device instead of once per bad command.  This can drastically
lower error recovery time.
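
As a rough userspace illustration of the "once per device" idea (not the mid-layer's actual code), the sketch below collects the distinct devices behind a list of failed commands, so that one test command per device would suffice. The struct failed_cmd type and the probes_per_device() helper are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a failed command; only the owning device matters. */
struct failed_cmd {
    int device_id;
};

/*
 * Collect the distinct device ids referenced by cmds[0..n-1] into probed[]
 * (which must have room for n entries) and return how many were found.
 * Each entry in probed[] corresponds to one test command that would be sent.
 */
size_t probes_per_device(const struct failed_cmd *cmds, size_t n, int *probed)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        bool seen = false;

        for (size_t j = 0; j < count; j++) {
            if (probed[j] == cmds[i].device_id) {
                seen = true;
                break;
            }
        }
        if (!seen)
            probed[count++] = cmds[i].device_id;
    }
    return count;
}
```

With five failed commands spread over two devices, this yields two probes rather than five, which is where the time saving comes from when each probe has to time out.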

[jejb: fixed up whitespace and formatting]
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:51:53 -04:00
Steve Wise
c337374bf2 RDMA/cxgb4: Use completion objects for event blocking
There exists a race condition when using wait_queue_head_t objects
that are declared on the stack.  This was being done in a few places
where we are sending work requests to the FW and awaiting replies, but
we don't have an endpoint structure with an embedded c4iw_wr_wait
struct.  So the code was allocating it locally on the stack.  Bad
design.  The race is:

  1) thread on cpuX declares the wait_queue_head_t on the stack, then
     posts a firmware WR with that wait object ptr as the cookie to be
     returned in the WR reply.  This thread will proceed to block in
     wait_event_timeout() but before it does:

  2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
     this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
     variable in the c4iw_wr_wait object to TRUE and will call
     wake_up(), but before it calls wake_up():

  3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
     wait_event_timeout().  The wait_event_timeout() macro checks the
     condition variable and returns immediately since it is TRUE.  So
     this thread never blocks/sleeps. The function then returns
     effectively deallocating the c4iw_wr_wait object that was on the
     stack.

  4) So at this point cpuY has a pointer to the c4iw_wr_wait object
     that is no longer valid.  Further, it's pointing to a stack frame
     that might now be in use by some other context/thread.  So cpuY
     continues execution and calls wake_up() on a ptr to a wait object
     that has been effectively deallocated.

This race, when it hits, can cause a crash in wake_up(), which I've
seen under heavy stress. It can also corrupt the referenced stack
which can cause any number of failures.

The fix:

Use struct completion, which supports on-stack declarations.
Completions use a spinlock around setting the condition to true and
the wake up so that steps 2 and 4 above are atomic and step 3 can
never happen in-between.
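
The locking argument can be sketched in userspace with POSIX threads. This is only an analogy to the kernel's struct completion, not its implementation; the completion_sketch type and its three helpers are hypothetical names. The point it illustrates is that the flag update and the wake-up happen under one lock, and the waiter must take the same lock before it can observe the flag and return.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace analogy of a completion: a flag guarded by a lock. */
struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            done;
};

void completion_init(struct completion_sketch *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wake, NULL);
    c->done = false;
}

/* Set the condition and wake the waiter as one atomic step. */
void completion_signal(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->wake);
    pthread_mutex_unlock(&c->lock);
}

/*
 * The waiter cannot see done == true until the signaller has dropped the
 * lock, so it cannot return (and release its stack frame) in between the
 * flag update and the wake-up.
 */
void completion_wait(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->wake, &c->lock);
    pthread_mutex_unlock(&c->lock);
}
```

A caller would declare a struct completion_sketch on its stack, call completion_init(), hand the pointer to the other thread, and then block in completion_wait().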

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2011-05-24 09:47:38 -07:00
Luben Tuikov
0bcaa11154 [SCSI] Retrieve the Caching mode page (version 2)
Some kernel transport drivers unconditionally disable
retrieval of the Caching mode page. One such for example is
the BBB/CBI transport over USB. Such a restraint is too
harsh as some devices do support the Caching mode
page. Unconditionally enabling the retrieval of this mode
page over those transports at their transport code level may
result in some devices failing and becoming unusable.

This patch implements a method of retrieving the Caching
mode page without unconditionally enabling it in the
transports which unconditionally disable it. The idea is to
ask for all supported pages, page code 0x3F, and then search
for the Caching mode page in the mode parameter data
returned. The sd driver already asks for all the mode pages
supported by the attached device by setting the page code to
0x3F in order to find out if the media is write protected by
reading the WP bit in the Device Specific Parameter
field. It then attempts to retrieve only the Caching mode
page by setting the page code to 8 and actually attempting
to retrieve it if and only if the transport allows it.

The method implemented here is that if the transport doesn't
allow retrieval of the Caching mode page and the device is
not RBC, then we ask for all pages supported by setting the
page code to 0x3F (similarly to how the WP bit is retrieved
above), and then we search for the Caching mode page in the
mode parameter data returned.
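
The search step can be sketched as below. This is a minimal illustration, not the sd driver's code: it assumes a MODE SENSE(6) response (4-byte mode parameter header followed by block descriptors and then the mode pages), and find_mode_page6() is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

#define MODE_PAGE_CACHING 0x08

/*
 * Walk the mode pages in a MODE SENSE(6) response and return the byte
 * offset of the requested page (page_0 format only), or -1 if absent.
 */
long find_mode_page6(const uint8_t *buf, size_t len, uint8_t page)
{
    size_t avail, off;

    if (len < 4)
        return -1;

    /* Byte 0 is the mode data length, which does not count itself. */
    avail = (size_t)buf[0] + 1;
    if (avail > len)
        avail = len;

    /* Skip the 4-byte header and the block descriptors (length in byte 3). */
    off = 4 + (size_t)buf[3];

    while (off + 2 <= avail) {
        int spf = buf[off] & 0x40;          /* sub-page format bit */
        uint8_t code = buf[off] & 0x3F;     /* page code */
        size_t plen;

        if (spf) {
            if (off + 4 > avail)
                break;
            plen = 4 + (((size_t)buf[off + 2] << 8) | buf[off + 3]);
        } else {
            plen = 2 + (size_t)buf[off + 1];
        }
        if (off + plen > avail)
            break;                          /* truncated page */
        if (!spf && code == page)
            return (long)off;
        off += plen;
    }
    return -1;
}
```

If the Caching mode page is found at offset off, the write cache enable (WCE) bit is bit 2 of buf[off + 2]; a return of -1 corresponds to the "No Caching mode page present" case shown in the logs.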

With this patch, devices over SATA report this (no change):

Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Smart devices report their Caching mode page. This is a
change from the previous behavior, where the kernel assumed
that the device's cache was write-through:

Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] 610472646 4096-byte logical blocks: (2.50 TB/2.27 TiB)
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Mode Sense: 47 00 10 08
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA

And "dumb" devices over BBB are correctly shown not to
support reporting the Caching mode page:

Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] 15663104 512-byte logical blocks: (8.01 GB/7.46 GiB)
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Write Protect is off
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Mode Sense: 23 00 00 00
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] No Caching mode page present
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Assuming drive cache: write through

Version 2 adds this:

Some devices don't support page code 0x3F, and others require a
fixed transfer length of 192 bytes. This single commit includes a
patch by Alan Stern which fixes this.

Reported-and-tested-by: Richard Senior <richard@r-senior.demon.co.uk>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:43:52 -04:00
Eddie Wai
9ae58e144d [SCSI] bnx2i: Optimized the iSCSI offload performance
Modified the event coalescing code for iSCSI offload to combat both
corner cases and optimize performance as follows:

1. Added a mechanism to loop back a second time to process any leftover
CQEs that were generated by the hardware while the driver was
busy processing previous CQEs in the bh.  This not only helps
performance but also fixes the corner case when no more CQEs are being
generated in the pipeline, so those leftover CQEs will get a chance
to be processed.

2. Added ARM_CQE_FP to distinguish between fast path arming versus
slow path arming.  This change will guarantee that the CQEs will
always get a chance to be re-armed during fast path completions.

3. Removed the inline event coalescing division for perf optimization.
Also fixed a division-by-zero error when the event_coal_div module
param was set to 0.

4. Changed the default SQ WQEs size from 256 to 128 to match chip
default.

5. Changed the cmd_per_lun from 32 to 24.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:41:10 -04:00
Eddie Wai
d5307a078b [SCSI] bnx2i: Updated the connection shutdown/cleanup timeout
Reduced the 10s wait time for inflight offload connections to
advance to the next state to 2s, based on test results.
Increased the 20s shutdown timeout to 30s, based on test results.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:57 -04:00
Eddie Wai
7287c63e98 [SCSI] bnx2i: Fixed packet error created when the sq_size is set to 16
The number of the chip's internal command cells, which are used to generate
SCSI cmd packets to the target, was not initialized correctly by
the driver when the sq_size is changed from the default 128.
This, in turn, creates a problem where the chip's transmit pipe
erroneously reuses an old command cell that is no longer valid.
The fix is to correctly initialize the chip's command cells upon setup.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:45 -04:00
Vikas Chaudhary
6b278656f2 [SCSI] qla4xxx: Update driver version to 5.02.00-k7
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:34 -04:00
Harish Zunjarrao
7ad633c06b [SCSI] qla4xxx: Added vendor specific sysfs attributes
Added fw_version, serial_num, iscsi version and boot loader version
sysfs attributes.

Signed-off-by: Harish Zunjarrao <harish.zunjarrao@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:23 -04:00
Vikas Chaudhary
8f0722cae6 [SCSI] qla4xxx: Remove host_lock in queuecommand function
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:12 -04:00
Lalit Chandivade
1b46807e0b [SCSI] qla4xxx: Remove AF_DPC_SCHEDULED flag from ha.
Since queue_work does not requeue, there is no need to check
whether work is in progress using the AF_DPC_SCHEDULED flag.
If work is already pending, queue_work returns without adding the
work, and do_dpc gets invoked again from qla4xxx_timer if
any DPC flags are still set.

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:02 -04:00
Vikas Chaudhary
977f46a4bb [SCSI] qla4xxx: Don't check FW alive if ISP82XX reset is in progress
Corrected the logic so that it does not check whether the F/W is alive
if a reset is already in progress for ISP82XX.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:52 -04:00
Vikas Chaudhary
0160ef1269 [SCSI] qla4xxx: Don't process mbx interrupt unconditionally
Do not process interrupts unconditionally during mailbox processing, which can
lead to spurious interrupts. Mailbox completions are now polled if interrupts are
disabled; if they are enabled, the driver waits for the interrupt to come in.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:42 -04:00
Prasanna Mumbai
6d78bd56be [SCSI] qla4xxx: Complete the cmd if sense_len is zero
Complete the cmd if sense length is zero. For cases where sense
data spans across multiple iocb's by FW, we need to hold on to the
I/O (ha->status_srb != NULL) till we have processed them all and
copied the sense data from internal buffer to scsi_cmd sense buffer.

Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:29 -04:00
Vikas Chaudhary
68d92ebf59 [SCSI] qla4xxx: Dump HW/FW reg to figure out what caused FW to be hung for ISP82XX
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:20 -04:00
Vikas Chaudhary
cb74428ee3 [SCSI] qla4xxx: Updated the reset sequence for ISP82xx
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:57 -04:00
Prasanna Mumbai
185f107ef9 [SCSI] qla4xxx: update function qla4xxx_isr_decode_mailbox()
- Added MBOX_ASTS_DUPLICATE_IP AEN handling.
- Update MBOX_AEN_REG_COUNT to 8 so that driver will save status
  of all mbox registers in aen_q

Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:46 -04:00
Martin K. Petersen
c498bf1a1b [SCSI] scsi_trace: Decode UNMAP bit in WRITE SAME(10)
As of SBC3r26 WRITE SAME(10) supports the UNMAP bit.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:36 -04:00
Martin K. Petersen
756aca7edd [SCSI] mpt2sas: Fix missing reference tag seed with Type 2 devices
Ensure that the initial reference tag is passed on to the HBA firmware
for DIF Type 2 devices.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Kashyap Desai <Kashyap.Desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:25 -04:00
Martin K. Petersen
2a8cfad06e [SCSI] sd: Unmap discard alignment needs to be converted to bytes
The block layer discard alignment is reported in bytes, not in units of
the logical block size.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:15 -04:00
Jing Huang
45d7f0cc58 [SCSI] bfa: kdump fix
Root cause: When the kernel crashes, the bfa IOC state machine and FW don't get
a notification and hence are not cleanly shut down. So registers holding
driver/IOC state information are not reset back to valid disabled/parking
values. This causes subsequent driver initialization to hang during kdump
kernel boot.

Fix description: during the initialization of the first PCI function, reset
the corresponding registers when an unclean shutdown is detected by reading chip
registers. This will make sure that ioc/fw gets a clean re-initialization.

Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:02 -04:00
Wayne Boyer
a5442ba4a4 [SCSI] ipr: fix possible false positive detection of stuck interrupt
If the driver is getting flooded with interrupts, there's a possibility
that the interrupt service routine could falsely detect a stuck interrupt
condition and reset the adapter.

This patch changes the logic such that the routine will loop back into
the command processing code one more time after detecting the stuck
interrupt signature.  If there are no commands to process after that pass,
and the interrupt is still not cleared, then the driver will print the
"Error clearing HRRQ" message and reset the adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:50 -04:00
Robert Love
d85e607b34 [SCSI] libfcoe: Remove unnecessary module state checks
libfcoe's interface consists of create, destroy, enable,
disable and create_vn2vn. These are currently module
parameters added during the module initialization. A
concern arose that the module parameters were being added
with write permissions before the module had completed
initialization. The following code was added to each
sysfs store file.

/*
 * Make sure the module has been initialized, and is not about to be
 * removed.  Module parameter sysfs files are writable before the
 * module_init function is called and after module_exit.
 */
if (THIS_MODULE->state != MODULE_STATE_LIVE)
    goto out_nodev;

This check was called out as unhelpful, as the module can
go dead at any time and therefore its state isn't a reliable
thing to look at as a sign of stability and initialization
completion. It was also noted that functional interfaces like these
should be added after module initialization.

This patch removes the unnecessary checks and hopes to
disprove the concern about initialization ordering.

Recent fcoe transport rework changes now require fcoe
transports to register with libfcoe before any operation
can take place. libfcoe may access some static variables
but nothing that could cause a problem. Once a fcoe transport
is registered, libfcoe is usable and any interface calls will
be functional.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:35 -04:00
Yi Zou
8467b96c03 [SCSI] libfc: do not immediately retry the cmd when seq_send fails in fc_fcp_send_data
Currently, when seq_send() fails in fc_fcp_send_data(),
fc_fcp_retry_cmd() would complete this failed I/O directly and let
scsi-ml retry. However, the target side is not notified, which may hang the
target. Instead, we should just bail out from fc_fcp_send_data()
and let scsi-ml time out and abort this I/O.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:25 -04:00
Vasu Dev
0a219edb26 [SCSI] libfc: fix race in SRR response
In this case fsp was freed before the error handler was invoked.
This is fixed by having the SRR fsp reference freed by the exch
destructor so that fsp is always held until its exch
is freed.

Also don't reset fsp->recov_seq since this is needed by
SRR error handler to do exch done.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:15 -04:00
Vasu Dev
8d23f4ba38 [SCSI] libfc: don't call resp handler after FC_EX_TIMEOUT
If an exch has already timed out, the exch layer could
end up calling the resp handler again for a response frame
received after the timeout, even though the fc_exch_timeout
handler would have already called resp with FC_EX_TIMEOUT.

This would cause the REC response handler to release its
fsp pkt hold twice instead of once, and possibly cause similar issues
with other ELS exchanges in this race.

To avoid this race, have resp updated under the exch lock
in the rx path. The resp gets set to NULL in case
of FC_EX_TIMEOUT under the same lock to prevent a resp
callback after FC_EX_TIMEOUT.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:03 -04:00
Yi Zou
6a716a8535 [SCSI] libfc: release DDP context if frame_send() fails
In case frame_send() fails, make sure to let the underlying HW release the DDP
context that has already been set up before calling frame_send().

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:51 -04:00
Hillf Danton
83383dd11a [SCSI] libfc: fix mm leak in handling incoming request for target discovery
When handling an incoming request, if the operation code carried by the
received frame is not RSCN, the frame should be freed as in the RSCN
case; otherwise memory is leaked.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:41 -04:00
Neerav Parikh
bdf252183e [SCSI] fcoe: Prevent creation of an NPIV port with duplicate WWPN
This patch adds a validation step before allowing creation of a new NPIV port.
It checks whether the WWPN passed for the new NPIV port to be created is unique
for the given physical port.
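
The validation step amounts to a membership test over the port names already in use. The sketch below is a standalone illustration rather than the fcoe driver's code; wwpn_is_unique() is a hypothetical helper, and the caller is assumed to pass the WWPNs already associated with the physical port.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if candidate does not match any of the n WWPNs already in
 * use on the physical port, i.e. creating the new NPIV port is allowed.
 */
bool wwpn_is_unique(const uint64_t *in_use, size_t n, uint64_t candidate)
{
    for (size_t i = 0; i < n; i++) {
        if (in_use[i] == candidate)
            return false;
    }
    return true;
}
```

A creation path would reject the request when this returns false, before allocating anything for the new port.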

Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:29 -04:00
Bhanu Prakash Gollapudi
c051ad2e57 [SCSI] libfcoe: Incorrect CVL handling for NPIV ports
The host doesn't handle CVL to NPIV instantiated ports correctly.
- As per FC-BB-5 Rev 2 CVLs with no VN_Port descriptors shall be treated as
  implicit logout of ALL vn_ports.
- CVL for NPIV ports should be handled before physical port even if descriptor
  for physical port appears before NPIV ports

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:17 -04:00
adam radford
4f788dce0b [SCSI] megaraid_sas: Version and Changelog update
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:06 -04:00
adam radford
3cc6851f9a [SCSI] megaraid_sas: Add 1078 OCR support
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:56 -04:00
adam radford
495c560970 [SCSI] megaraid_sas: Convert 6,10,12 byte CDB's for FastPath IO
The following patch for megaraid_sas converts 6,10,12 byte CDB's to 16
byte CDB for large LBA's for FastPath IO.
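
For illustration, a conversion of this kind can be sketched from the standard SCSI READ/WRITE CDB layouts. This is not the megaraid_sas code: cdb_to_16() is a hypothetical helper, it handles only READ and WRITE opcodes, and it leaves the group number and other optional fields at zero. When and why the driver chooses to convert is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];
}

/*
 * Convert a READ/WRITE (6), (10) or (12) CDB into the equivalent
 * READ(16)/WRITE(16) CDB. Returns 0 on success, -1 if unsupported.
 */
int cdb_to_16(const uint8_t *in, size_t in_len, uint8_t out[16])
{
    uint64_t lba;
    uint32_t nblocks;
    uint8_t flags = 0, control;
    int is_write, i;

    switch (in_len) {
    case 6:
        if (in[0] != 0x08 && in[0] != 0x0A)
            return -1;
        is_write = (in[0] == 0x0A);
        /* 21-bit LBA; a transfer length of 0 means 256 blocks. */
        lba = ((uint32_t)(in[1] & 0x1F) << 16) | ((uint32_t)in[2] << 8) | in[3];
        nblocks = in[4] ? in[4] : 256;
        control = in[5];
        break;
    case 10:
        if (in[0] != 0x28 && in[0] != 0x2A)
            return -1;
        is_write = (in[0] == 0x2A);
        lba = get_be32(in + 2);
        nblocks = ((uint32_t)in[7] << 8) | in[8];
        flags = in[1] & 0xF8;   /* protection, DPO and FUA bits */
        control = in[9];
        break;
    case 12:
        if (in[0] != 0xA8 && in[0] != 0xAA)
            return -1;
        is_write = (in[0] == 0xAA);
        lba = get_be32(in + 2);
        nblocks = get_be32(in + 6);
        flags = in[1] & 0xF8;
        control = in[11];
        break;
    default:
        return -1;
    }

    memset(out, 0, 16);
    out[0] = is_write ? 0x8A : 0x88;
    out[1] = flags;
    for (i = 0; i < 8; i++)         /* 64-bit big-endian LBA in bytes 2..9 */
        out[2 + i] = (uint8_t)(lba >> (56 - 8 * i));
    for (i = 0; i < 4; i++)         /* 32-bit transfer length in bytes 10..13 */
        out[10 + i] = (uint8_t)(nblocks >> (24 - 8 * i));
    out[15] = control;
    return 0;
}
```

Once the command is in 16-byte form, the LBA field in bytes 2..9 has room for a 64-bit value, which is what makes large LBAs representable.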

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:46 -04:00
adam radford
541f90b7c6 [SCSI] megaraid_sas: Fix bug where AENs could be lost in probe() and resume()
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:34 -04:00
adam radford
46fd256e05 [SCSI] megaraid_sas: Disable interrupts/free_irq() in megasas_shutdown()
The following patch for megaraid_sas disables interrupts and calls
free_irq() in megasas_shutdown().

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:09 -04:00
adam radford
7e70e73365 [SCSI] megaraid_sas: Check MFI_REG_STATE.fault.resetAdapter
The following patch for megaraid_sas fixes the function
megasas_reset_fusion() and makes the reset code check
MFI_REG_STATE.fault.resetAdapter.

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:59 -04:00
adam radford
70d031f36f [SCSI] megaraid_sas: Remove un-used function
The following patch for megaraid_sas removes un-used function
megasas_return_cmd_for_smid().

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:47 -04:00
adam radford
3f1abce4ab [SCSI] megaraid_sas: Remove MSI-X black list, use MFI_REG_STATE instead
This patch for megaraid_sas removes the MSI-X black list and uses
MFI_REG_STATE.ready.msiEnable instead.

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:12 -04:00
Xiangliang Yu
bb650a1bef [SCSI] libsas: fix SATA NCQ error
The current version of libsas cannot handle SATA NCQ errors.
This patch handles SATA NCQ errors as AHCI does.

Signed-off-by: Xiangliang Yu <yuxiangl@marvell.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:01 -04:00
Kashyap, Desai
5fd5cc83a8 [SCSI] mpt2sas: Driver version upgrade 08.100.00.02
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:33:35 -04:00
Kashyap, Desai
3ace8e052b [SCSI] mpt2sas: move event handling of MPT2SAS_TURN_ON_FAULT_LED into process context
The driver was sending a SEP request in interrupt context, which
required going to sleep.

The fix is to rearrange the code so a fake event
MPT2SAS_TURN_ON_FAULT_LED is fired from interrupt context, then later
during the kernel worker threads processing, the SEP request is issued
to firmware.

Cc: stable@kernel.org
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:33:01 -04:00
Arun Sharma
0bd41dfc9f kbuild: Create a kernel-headers RPM
To compile binaries which depend on new kernel interfaces, we need a
kernel-headers RPM

Signed-off-by: Arun Sharma <asharma@fb.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24 18:28:29 +02:00
Christoph Hellwig
55a7bc5a30 xfs: do not discard alloc btree blocks
Blocks for the allocation btree are allocated from and released to
the AGFL, and thus frequently reused.  Even worse we do not have an
easy way to avoid using an AGFL block when it is discarded due to
the simple FILO list of free blocks, and thus can frequently stall
on blocks that are currently undergoing a discard.

Add a flag to the busy extent tracking structure to skip the discard
for allocation btree blocks.  In normal operation these blocks are
reused frequently enough that there is no need to discard them
anyway, but if they spill over to the allocation btree as part of a
balance we "leak" blocks that we would otherwise discard.  We could
fix this by adding another flag and keeping these blocks in the
rbtree even after they aren't busy any more so that we could discard
them when they migrate out of the AGFL.  Given that this would cause
significant overhead I don't think it's worthwhile for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-05-24 11:17:22 -05:00
Christoph Hellwig
e84661aa84 xfs: add online discard support
Now that we have reliable tracking of deleted extents in a
transaction we can easily implement "online" discard support
which calls blkdev_issue_discard once a transaction commits.

The actual discard is a two stage operation as we first have
to mark the busy extent as not available for reuse before we
can start the actual discard.  Note that we don't bother
supporting discard for the non-delaylog mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-05-24 11:17:13 -05:00
Jan Kara
93628ffb9b ext4: fix waiting and sending of a barrier in ext4_sync_file()
jbd2_log_start_commit() returns 1 only when we really start a
transaction.  But we also need to wait for a transaction when the
commit is already running.  Fix this problem by waiting for
transaction commit unconditionally (which is just a quick check if the
transaction is already committed).

Also we have to be more careful with sending of a barrier because when
transaction is being committed in parallel to ext4_sync_file()
running, we cannot be sure that the barrier the journalling code sends
happens after we wrote all the data for fsync (note that not every
data writeout needs to trigger metadata changes thus commit of some
metadata changes can be running while other data is still written
out). So use the jbd2_trans_will_send_data_barrier() helper to detect the
common cases when we can be sure the barrier will be issued by the commit
code, and issue the barrier ourselves in the remaining cases.

Reported-by: Edward Goggin <egoggin@vmware.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-24 12:00:54 -04:00
Jan Kara
bbd2be3691 jbd2: Add function jbd2_trans_will_send_data_barrier()
Provide a function which returns whether a transaction with given tid
will send a flush to the filesystem device.  The function will be used
by ext4 to detect whether fsync needs to send a separate flush or not.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-24 11:59:18 -04:00
Jan Kara
81be12c817 jbd2: fix sending of data flush on journal commit
In data=ordered mode, it's theoretically possible (however rare) that
an inode is filed to transaction's t_inode_list and a flusher thread
writes all the data and inode is reclaimed before the transaction
starts to commit.  In such a case, we could erroneously omit sending a
flush to file system device when it is different from the journal
device (because data can still be in disk cache only).

Fix the problem by setting a flag in a transaction when some inode is added
to it and then send disk flush in the commit code when the flag is set.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-24 11:52:40 -04:00
Michal Marek
857c7e4387 rpm-pkg: Fix when current directory is a symlink
The better fix would be to stop using the parent directory (principle of
least surprise), but as long as we use it, use it consistently.

Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-05-24 17:44:00 +02:00
Yongqiang Yang
b221349fa8 ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly
To get delayed-extent information, ext4_ext_fiemap_cb() looks up the
pagecache, and thus collects information starting from a page's
head block.

If blocksize < pagesize, the beginning blocks of a page may lie
before the request range. So ext4_ext_fiemap_cb() should proceed
ignoring them, because they have been handled before. If no mapped
buffer in the range is found in the 1st page, we need to look up
the 2nd page, otherwise delayed-extents after a hole will be ignored.

Without this patch, xfstests 225 will hang on ext4 with 1K blocks.

Reported-by: Amir Goldstein <amir73il@users.sourceforge.net>
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-24 11:36:58 -04:00