In its receive path, mlx4_en driver maps each page chunk that it pushes
to the hardware and unmaps it when pushing it up the stack. This limits
throughput to about 3Gbps on a Power7 8-core machine.
One solution is to map the entire allocated page at once. However, this
requires that we keep track of every page fragment we give to a
descriptor. We also need to work with the discipline that all fragments will
be released (in the sense that it will not be reused by the driver
anymore) in the order they are allocated to the driver.
This requires that we don't reuse any fragments, every single one of
them must be reallocated. We do that by releasing all the fragments that
are processed and only after finished processing the descriptors, we
start the refill.
We also must somehow guarantee that we either refill all fragments in a
descriptor or none at all, without resorting to giving up a page
fragment that we would have already given. Otherwise, we would break the
discipline of only releasing the fragments in the order they were
allocated.
This has passed page allocation fault injections (restricted to the
driver by using required-start and required-end) and device hotplug
while 16 TCP streams were able to deliver more than 9Gbps.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dynamically allocated sysfs attributes must be initialized using
sysfs_attr_init(), otherwise lockdep complains:
BUG: key <address> not in .data!
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the references to bridge utilities and web pages
to current locations
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 9ac32e1b firmware: convert e100 driver to request_firmware()
did a straight conversion of the in-driver ucode to external
files. This introduced the possibility of the driver failing
to enable an interface due to missing ucode. There was no
evaluation of the importance of the ucode at the time.
Based on comments in earlier versions of this driver, and in
the source code for the FreeBSD fxp driver, we can assume that
the ucode implements the "CPU Cycle Saver" feature on supported
adapters. Although generally wanted, this is an optional
feature. The ucode source is not available, preventing it from
being included in free distributions. This creates unnecessary
problems for the end users. Doing a network install based on a
free distribution installer requires the user to download and
insert the ucode into the installer.
Making the ucode optional when possible improves the user
experience and driver usability.
The ucode for some adapters include a bugfix, making it
essential. We continue to fail for these adapters unless the
ucode is available.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 16626b0cc3 the asix
driver depends on the phylib. Select phylib when the asix driver is
selected.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: kernel-janitors@vger.kernel.org
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the asix_set_eeprom() function to provide support for
programming the configuration EEPROM via ethtool.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code for reading the EEPROM via ethtool in the asix
driver has a few issues. It cannot handle odd length values
(accesses must be aligned at 16 bit boundaries) and interprets the
offset provided by ethtool as 16 bit word offset instead as byte offset.
The new code for asix_get_eeprom() introduced by this patch is
modeled after the code in
drivers/net/ethernet/atheros/atl1e/atl1e_ethtool.c
and provides read access to the entire EEPROM with arbitrary
offsets and lengths.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because there are multiple variants to the stmmac/dwmac driver, the
dts bindings should be updated to include version of the IP used.
Signed-off-by: Dinh Nguyen <dinguyen@altera.com>
Acked-by: Stefan Roese <sr@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb3 interface has a bad performance when VLAN is set. On my current
setup, a PowerLinux 7R2, I am able to get around 7 Gbps on a TCP_STREAM
(8 instances, 4k message).
With this patch, I am able to reach 9.5 Gbps.
Signed-off-by: Breno Leitao <brenohl@br.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Ethernet II wrapper is only used by IPX protocol, may have once
been used by Appletalk but not currently. Therefore it makes sense to
move it to the IPX dust bin and drop the exports.
Build tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stack_not_used() function in <linux/sched.h> assumes that stacks
grow downwards. This is not true on IA64 or PARISC, so this function
would walk off in the wrong direction and into the weeds.
Found on IA64 because of a compilation failure with recursive dependencies
on IA64_TASKSIZE and IA64_THREAD_INFO_SIZE.
Fixing the code is possible, but should be combined with other
infrastructure additions to set up the "canary" at the end of the stack.
Reported-by: Fengguang Wu <fengguang.wu@intel.com> (failed allmodconfig build)
Signed-off-by: Tony Luck <tony.luck@intel.com>
tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
per network namespace.
This leads to bad behavior on multiqueue NICS, because many cpus
contend for the socket lock and once socket lock is acquired, extra
false sharing on various socket fields slow down the operations.
To better resist to attacks, we use a percpu socket. Each cpu can
run without contention, using appropriate memory (local node)
Additional features :
1) We also mirror the queue_mapping of the incoming skb, so that
answers use the same queue if possible.
2) Setting SOCK_USE_WRITE_QUEUE socket flag speedup sock_wfree()
3) We now limit the number of in-flight RST/ACK [1] packets
per cpu, instead of per namespace, and we honor the sysctl_wmem_default
limit dynamically. (Prior to this patch, sysctl_wmem_default value was
copied at boot time, so any further change would not affect tcp_sock
limit)
[1] These packets are only generated when no socket was matched for
the incoming packet.
Reported-by: Bill Sommerfeld <wsommerfeld@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use global seqlock for the nh_exceptions. Call
fnhe_oldest with the right hash chain. Correct the diff
value for dst_set_expires.
v2: after suggestions from Eric Dumazet:
* get rid of spin lock fnhe_lock, rearrange update_or_create_fnhe
* continue daddr search in rt_bind_exception
v3:
* remove the daddr check before seqlock in rt_bind_exception
* restart lookup in rt_bind_exception on detected seqlock change,
as suggested by David Miller
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use RFS infrastructure and flow steering in HW to keep CPU
affinity of rx interrupts and application per TCP stream.
A flow steering filter is added to the HW whenever the RFS
ndo callback is invoked by core networking code.
Because the invocation takes place in interrupt context, the
actual setup of HW is done using workqueue. Whenever new filter
is added, the driver checks for expiry of existing filters.
Since there's window in time between the point where the core
RFS code invoked the ndo callback, to the point where the HW
is configured from the workqueue context, the 2nd, 3rd etc
packets from that stream will cause the net core to invoke
the callback again and again.
To prevent inefficient/double configuration of the HW, the filters
are kept in a database which is indexed using hash function to enable
fast access.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable callers of mlx4_assign_eq to supply a pointer to cpu_rmap.
If supplied, the assigned IRQ is tracked using rmap infrastructure.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define this macro is one common place instead of duplicating it over the code
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_options_compile can be called for forwarded packets,
make sure the specific-destionation address is a local one as
specified in RFC 1812, 4.2.2.2 Addresses in Options
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move fib_compute_spec_dst at the only place where it
is needed.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the bugs was introduced in 3.5-rc1. Others have
been there for longer.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIVAwUAUAeiTznsnt1WYoG5AQLzcRAAqAIBtmGRzTpCTGzfJxF3ciDptQLfDKzx
JDwcKsmD3+70bjkKUsHhu22EK/Fgdi2T7XirjlJTnMZFtwfQFbRBNFe+6AneefzI
fJvTwEOsvgNBEJvEUtp9hboZBZBZBmujdXYuEf9NbSEoK+bOcxtTh6V+CwcCKAEI
ulreMNCBX/e9RRP/ayUsj33TJGvDGEJWmFOoEj/3sZ9soKC3GYFkr5I50FcRhrgh
J78mX64Qf1KCsIG2zwN5w/pE9Nnz5mJ4iBElhl3xQT6nDikhe4AZv2Z51s6UMQ9K
oQSVgJg9IkAH+Vl/IFzvK/4mU1/xnCydA/Q+CEXxLXor0kFnl9XxpSwHhjQTQ/Ag
l5cA+U5RR0wkCS7OGv0mxwY2rfw0wg7I+v9GQu9hg+XyeZTjWC4rU1EiANAWDHiE
eCoUTi4MkwAA5vkL+G7B2I9fXD11eigTPFbiHvm5a3SNzqiKklMQgh7SoonnRZ9L
iL3qfIBwhpQAbAChqTS92WofYvNKTfE6qTrQUfEomgr4EWtr2teiR36shDnfKRCw
vdmL7d+ql9qHbD83NfV7tE08clK4h5MtKDmJtHoOdeeGK1UUI+VmMhQQSHEX3UAW
TduJnL4bhaw5Z4QCrpS5IO5R/mTBX1VRrbzjcJl+Te4HrmRl/ccQKachvn1N597L
ah3SlO5LBy0=
=zpJG
-----END PGP SIGNATURE-----
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md
Pull three md bugfixes from NeilBrown:
"One of the bugs was introduced in 3.5-rc1. Others have been there for
longer."
* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md/raid1: close some possible races on write errors during resync
md: avoid crash when stopping md array races with closing other open fds.
md: fix bug in handling of new_data_offset
Pull networking changes from David Miller:
"Ok, we should be good to go now"
1) We have to statically initialize the init_net device list head rather
than do so in an initcall, otherwise netprio_cgroup crashes if it's
built statically rather than modular (Mark D. Rustad)
2) Fix SKB null oopser in CIPSO ipv4 option processing (Paul Moore)
3) Qlogic maintainers update (Anirban Chakraborty)
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Statically initialize init_net.dev_base_head
MAINTAINERS: Changes in qlcnic and qlge maintainers list
cipso: don't follow a NULL pointer when setsockopt() is called
Pull HID update from Jiri Kosina:
"A final round of changes for HID for 3.5: just device ID additions."
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hid-multitouch: add support for Zytronic panels
HID: add Sennheiser BTD500USB device support
HID: add battery quirk for Apple Wireless ANSI
The strcpy was being used to set the name of the board. Since the
destination char* was read-only and the name is set statically at
compile time; this was both wrong and redundant.
The type of char* is changed to const char* to prevent future errors.
Reported-by: Radek Masin <radek@masin.eu>
Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com>
[ Taking directly due to vacations - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: "David S. Miller" <davem@davemloft.net>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Tony Lindgren <tony@atomide.com>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Felipe Balbi <balbi@ti.com>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Felipe Balbi <balbi@ti.com>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule.
The flag was only commented-out in the driver, but we should just
remove it altogether.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Felipe Balbi <balbi@ti.com>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Anton Vorontsov <cbou@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Ben Dooks" <ben-linux@fluff.org>
Cc: "Wolfram Sang" <w.sang@pengutronix.de>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Samuel Ortiz <sameo@linux.intel.com>
With the new interrupt sampling system, we are no longer using the
timer_rand_state structure in the irq descriptor, so we can stop
initializing it now.
[ Merged in fixes from Sedat to find some last missing references to
rand_initialize_irq() ]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
wm831x devices contain a unique ID value. Feed this into the newly added
device_add_randomness() to add some per device seed data to the pool.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
The tamper evident features of the RTC include the "write counter" which
is a pseudo-random number regenerated whenever we set the RTC. Since this
value is unpredictable it should provide some useful seeding to the random
number generator.
Only do this on boot since the goal is to seed the pool rather than add
useful entropy.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Matt Mackall stepped down as the /dev/random driver maintainer last
year, so Theodore Ts'o is taking back the /dev/random driver.
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
resulting improved on disk layout greatly speeds up operations in cases
which would have resulted in interlaced allocation of blocks previously.
A typical example of this is 10 parallel dd processes, each writing to a
file in a common dirctory.
The implementation uses an rbtree of reservations attached to each
resource group (and each inode).
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>