This change places a netdev pointer directly into the ring structure. This
way we can avoid having to determine which netdev we are supposed to be
using and can just access the one on the ring directly.
As a result of this change further collapse of the code is possible by
dropping the adapter from ixgbe_alloc_rx_buffers, and the netdev pointer
from ixgbe_xmit_frame_ring_adv and ixgbe_maybe_stop_tx.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change moved some of the RX and TX stats into separate structures and
them placed those structures in a union in order to help reduce the size of
the ring structure.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to simplify DMA map/unmap by providing a device
pointer. As a result the adapter pointer can be dropped from many of
the calls.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change drops ring->head since it is not used in any hot-path and can
easily be determined using IXGBE_[RT]DH(ring->reg_idx).
It also changes ring->tail into a true pointer so we can avoid unnecessary
pointer math to find the location of the tail.
In addition I also dropped the setting of head and tail in
ixgbe_clean_[rx|tx]_ring. The only location that should be setting the head
and tail values is ixgbe_configure_[rx|tx]_ring and that is only while the
queue is disabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change re-orders alloc_rx_buffers to make better use of the packet
split enabled flag. The new setup should require less branching in the
code since now we are down to fewer if statements since we either are
handling packet split or aren't.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change simplifies the work being done by the TX interrupt handler and
pushes it into the tx_map call. This allows for fewer cache misses since
the TX cleanup now accesses almost none of the skb members.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is no need to reset the adapter when changing the Rx checksum
settings. Since the only change is a software flag we can disable it
without needing to reset the entire adapter.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The maximum credits per traffic class only needs to be greater
then the TSO size for 82598 devices. The 82599 devices do not
have this requirement so only do this test for 82598 devices.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the high and low water marks for PFC are being set
conservatively for jumbo frames. This means the RX buffers
are being underutilized in the default 1500 MTU. This patch
fixes this so that the water marks are set as described in
the data sheet considering the MTU size.
The equation used is,
RTT * 1.44 + MTU * 1.44 + MTU
Where RTT is the round trip time and MTU is the max frame size
in KB. To avoid floating point arithmetic FC_HIGH_WATER is
defined
((((RTT + MTU) * 144) + 99) / 100) + MTU
This changes how the hardware field fc.low_water and
fc.high_water are used. With this change they are no longer
storing the actual low water and high water markers but are
storing the required head room in the buffer. This simplifies
the logic and we do not need to account for the size of the
buffer when setting the thresholds.
Testing with iperf and 16 threads showed a slight uptick in
throughput over a single traffic class .1-.2Gbps and a reduction
in pause frames. Without the patch a 30 second run would show
~10-15 pause frames being transmitted with the patch ~2-5 are
seen. Test were run back to back with 82599.
Note RXPBSIZE is in KB and low and high water marks fields are
also in KB. However the FCRT* registers are 32B granularity and
right shifted 5 into the register,
(((rx_pbsize - water_mark) * 1024) / 32) << 5
is the most explicit conversion here we simplify
(rx_pbsize - water_mark) * 32 << 5 = (rx_pbsize - water_mark) << 10
This patch updates the PFC thresholds and legacy FC thresholds.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update version string and copyright notice.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
"cat /proc/net/dev" uses RCU protection only.
Its quite possible we call a driver get_stats() method while device is
dismantling and freeing its data structures.
So get_stats() methods must be very careful not accessing driver private
data without appropriate locking.
In ixgbe case, we access rx_ring pointers. These pointers are freed in
ixgbe_clear_interrupt_scheme() and set to NULL, this can trigger NULL
dereference in ixgbe_get_stats64()
A possible fix is to use RCU locking in ixgbe_get_stats64() and defer
rx_ring freeing after a grace period in ixgbe_clear_interrupt_scheme()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Tantilov, Emil S <emil.s.tantilov@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Making /proc/kallsyms readable only for root by default makes it
slightly harder for attackers to write generic kernel exploits by
removing one source of knowledge where things are in the kernel.
This is the second submit, discussion happened on this on first submit
and mostly concerned that this is just one hole of the sieve ... but
one of the bigger ones.
Changing the permissions of at least System.map and vmlinux is also
required to fix the same set, but a packaging issue.
Target of this starter patch and follow ups is removing any kind of
kernel space address information leak from the kernel.
[ Side note: the default of root-only reading is the "safe" value, and
it's easy enough to then override at any time after boot. The /proc
filesystem allows root to change the permissions with a regular
chmod, so you can "revert" this at run-time by simply doing
chmod og+r /proc/kallsyms
as root if you really want regular users to see the kernel symbols.
It does help some tools like "perf" figure them out without any
setup, so it may well make sense in some situations. - Linus ]
Signed-off-by: Marcus Meissner <meissner@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Eugene Teo <eugeneteo@kernel.org>
Reviewed-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
nfs: Ignore kmemleak false positive in nfs_readdir_make_qstr
SUNRPC: Simplify rpc_alloc_iostats by removing pointless local variable
nfs: trivial: remove unused nfs_wait_event macro
NFS: readdir shouldn't read beyond the reply returned by the server
NFS: Fix a couple of regressions in readdir.
Revert "NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR"
Regression: fix mounting NFS when NFSv3 support is not compiled
NLM: Fix a regression in lockd
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix cross-sched-class wakeup preemption
sched: Fix runnable condition for stoptask
sched: Use group weight, idle cpu metrics to fix imbalances during idle
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM / PM QoS: Fix reversed min and max
PM / OPP: Hide OPP configuration when SoCs do not provide an implementation
PM: Allow devices to be removed during late suspend and early resume
This prevents firewire-net from submitting write requests in fast
succession until failure due to all 64 transaction labels were used up
for unfinished split transactions. The netif_stop/wake_queue API is
used for this purpose.
Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted. Plus, tlabel exhaustion by firewire-net
also prevented other unrelated outbound transactions to be initiated.
The chosen queue depth was checked by me to hit the maximum possible
throughput with an OS X peer whose receive DMA is good enough to never
reject requests due to busy inbound request FIFO. Current Linux peers
show a mixed picture of -5%...+15% change in bandwidth; their current
bottleneck are RCODE_BUSY situations (fewer or more, depending on TX
queue depth) due to too small AR buffer in firewire-ohci.
Maxim Levitsky tested this change with similar watermarks with a Linux
peer and some pending firewire-ohci improvements that address the
RCODE_BUSY problem and confirmed that these TX queue limits are good.
Note: This removes some netif_wake_queue from reception code paths.
They were apparently copy&paste artefacts from a nonsensical
netif_wake_queue use in the older eth1394 driver. This belongs only
into the transmit path.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
The current transmit code does not at all make use of
- fwnet_device.packet_list
and only very limited use of
- fwnet_device.broadcasted_list,
- fwnet_device.queued_packets.
Their current function is to track whether the TX soft-IRQ finished
dealing with an skb when the AT-req tasklet takes over, and to discard
pending tx datagrams (if there are any) when the local node is removed.
The latter does actually contain a race condition bug with TX soft-IRQ
and AT-req tasklet.
Instead of these lists and the corresponding link in fwnet_packet_task,
- a flag in fwnet_packet_task to track whether fwnet_tx is done,
- a counter of queued datagrams in fwnet_device
do the job as well.
The above mentioned theoretic race condition is resolved by letting
fwnet_remove sleep until all datagrams were flushed. It may sleep
almost arbitrarily long since fwnet_remove is executed in the context of
a multithreaded (concurrency managed) workqueue.
The type of max_payload is changed to u16 here to avoid waste in struct
fwnet_packet_task. This value cannot exceed 4096 per IEEE 1394:2008
table 16-18 (or 32678 per specification of packet headers, if there is
ever going to be something else than beta mode).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
a) fwnet_transmit_packet_done used to poison ptask->pt_link by list_del.
If fwnet_send_packet checked later whether it was responsible to clean
up (in the border case that the TX soft IRQ was outpaced by the AT-req
tasklet on another CPU), it missed this because ptask->pt_link was no
longer shown as empty.
b) If fwnet_write_complete got an rcode other than RCODE_COMPLETE, we
missed to free the skb and ptask entirely.
Also, count stats.tx_dropped and stats.tx_errors when rcode != 0.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The per-cpu event channel masks can be updated unlocked from multiple
CPUs, so use the locked variant.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
To bind all event channels to CPU#0, it is not sufficient to set all
of its cpu_evtchn_mask[] bits; all other CPUs also need to get their
bits cleared. Otherwise, evtchn_do_upcall() will start handling
interrupts on CPUs they're not intended to run on, which can be
particularly bad for per-CPU ones.
[ linux-2.6.18-xen.hg 7de7453dee36 ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
This patch (as1434) cleans up the uses of usb_mark_last_busy() in
usbcore. The function will be called when a device is resumed and
whenever a usage count is decremented. A call that was missing from
the hub driver is added: A hub is used whenever one of its ports gets
suspended (this prevents hubs from suspending immediately after their
last child).
In addition, the call to disable autosuspend support for new devices
by default is moved from usb_detect_quirks() (where it doesn't really
belong) into usb_new_device() along with all the other runtime-PM
initializations. Finally, an extra pm_runtime_get_noresume() is added
to prevent new devices from autosuspending while they are being
registered.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1428) converts USB over to the new runtime-PM core
autosuspend framework. One slightly awkward aspect of the conversion
is that USB devices will now have two suspend-delay attributes: the
old power/autosuspend file and the new power/autosuspend_delay_ms
file. One expresses the delay time in seconds and the other in
milliseconds, but otherwise they do the same thing. The old attribute
can be deprecated and then removed eventually.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since the runtime-PM core already defines a .last_busy field in
device.power, this patch uses it to replace the .last_busy field
defined in usb_device and uses pm_runtime_mark_last_busy to implement
usb_mark_last_busy.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1426) makes use of the new sysfs_merge_group() and
sysfs_unmerge_group() routines to simplify the handling of power
attributes for USB devices.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Call pm_runtime_no_callbacks to set no_callbacks flag for USB
interfaces. Since interfaces cannot be power-managed separately from
their parent devices, there's no reason for the runtime-PM core to
invoke any callbacks for them.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds the USB device driver of EG20T(Topcliff) PCH.
EG20T PCH is the platform controller hub that is going to be used in
Intel's upcoming general embedded platform. All IO peripherals in
EG20T PCH are actually devices sitting on AMBA bus.
EG20T PCH has USB device I/F. Using this I/F, it is able to access system
devices connected to USB device.
Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Acked-by: Michał Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This commit removes custom printk() wrappers from the f_fs.c
file. They served little purpose above what pr_*() family of
macros provides. Only FVDBG() has been left but renamed to
pr_vdebug() to match other uses.
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This commit changes FunctionFS as to make it more compliant
with coding style as well as fixes several typos.
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds support for ehci and ohci controller in the SPEAr platform.
Changes since V2:
added clear_tt_buffer_complete in ehci_spear_hc_driver
Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The calibration data variable size is based on the number of
channels available in the ath9k driver.
Signed-off-by: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Subtract of jiffies is fine even if one variable overwrap.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We can simplify length calculation in iwlagn_tx_skb, that function
is enough complex, without fuzz it more than necessary.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Chipsets with hardware based connection monitoring need to autonomically
send directed probe-request frames to the AP (in the event of beacon loss,
for example.)
For the hardware to be able to do this, it requires a template for the frame
to transmit to the AP, filled in with the BSSID and SSID of the AP, but also
the supported rate IE's.
This patch adds a function to mac80211, which allows the hardware driver to
fetch this template after association, so it can be configured to the hardware.
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Merge ath_tx_send_normal and ath_tx_send_ht_normal.
Move the paprd state initialization and sequence number assignment
to reduce the number of redundant checks.
This not only simplifies buffer allocation error handling, but also
removes a small inconsistency in the buffer HT flag.
This flag should only be set if the frame is also a QoS data frame.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
An earlier review suggested moving the code in a small
method that was only called once inline. This patch
accomplishes that.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
TX underruns were noticed when RTS/CTS preceded aggregates.
This issue was noticed in ar93xx family of chipsets only.
The workaround involves padding the RTS or CTS length up
to the min packet length of 256 bytes required by the
hardware by adding delimiters to the fist descriptor of
the aggregate.
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There is a roundng error in delimiter padding computation
which causes severe throughput drop with some of AR9003.
signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Cc:stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Also round off interpolated values this would improve power
accuracy by 0.5dB in some cases.
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>