Or Gerlitz says:
====================
mlx4: Add CHECKSUM_COMPLETE support
These patches from Shani, Matan and myself add support for
CHECKSUM_COMPLETE reporting on non TCP/UDP packets such as
GRE and ICMP. I'd like to deeply thank Jerry Chu for his
innovation and support in that effort.
Based on the feedback from Eric and Ido Shamay, in V2 we dropped
the patch which removed the calls to napi_gro_frags() and added
a patch which makes the RX code to go through that path
regardless of the checksum status.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).
Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.
In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can call napi_gro_frags for all the received traffic regardless
of the checksum status. Specifically, received packets whose status
is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE)
are eligible for napi_gro_frags as well.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
blk-mq is using preempt_disable/enable in order to ensure that the
queue runners are placed on the right CPU. This does not work with
the RT patches, because __blk_mq_run_hw_queue takes a non-raw
spinlock with the preemption-disabled region. If there is contention
on the lock, this violates the rules for preemption-disabled regions.
While this should be easily fixable within the RT patches just by doing
migrate_disable/enable, we can do better and document _why_ this
particular region runs with disabled preemption. After the previous
patch, it is trivial to switch it to get/put_cpu; the RT patches then
can change it to get_cpu_light, which lets virtio-blk run under RT
kernels.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Clark Williams <williams@redhat.com>
Tested-by: Clark Williams <williams@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
preempt_disable/enable surrounds every call to blk_mq_run_hw_queue,
except the one in blk-flush.c. In fact that one is always asynchronous,
and it does not need smp_processor_id().
We can do the same for all other calls, avoiding preempt_disable when
async is true. This avoids peppering blk-mq.c with preemption-disabled
regions.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Clark Williams <williams@redhat.com>
Tested-by: Clark Williams <williams@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Currently spidev allows callers to set the default speed by overriding the
max_speed_hz in the underlying device. This achieves the immediate goal but
is not what devices expect and can easily lead to userspace trying to set
unsupported speeds and succeeding, apart from anything else drivers can't
set a limit on the speed using max_speed_hz as they'd expect and any other
devices on the bus will be affected.
Instead store the default speed in the spidev struct and fill this in on
each transfer.
Signed-off-by: Mark Brown <broonie@kernel.org>
Eric Dumazet says:
====================
net: SO_INCOMING_CPU support
SO_INCOMING_CPU socket option (read by getsockopt()) provides
an alternative to RPS/RFS for high performance servers using
multi queues NIC.
TCP should use sk_mark_napi_id() for established sockets only.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alternative to RPS/RFS is to use hardware support for multiple
queues.
Then split a set of million of sockets into worker threads, each
one using epoll() to manage events on its own socket pool.
Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
know after accept() or connect() on which queue/cpu a socket is managed.
We normally use one cpu per RX queue (IRQ smp_affinity being properly
set), so remembering on socket structure which cpu delivered last packet
is enough to solve the problem.
After accept(), connect(), or even file descriptor passing around
processes, applications can use :
int cpu;
socklen_t len = sizeof(cpu);
getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
And use this information to put the socket into the right silo
for optimal performance, as all networking stack should run
on the appropriate cpu, without need to send IPI (RPS/RFS).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_mark_napi_id() is used to record for a flow napi id of incoming
packets for busypoll sake.
We should do this only on established flows, not on listeners.
This was 'working' by virtue of the socket cloning, but doing
this on SYN packets in unecessary cache line dirtying.
Even if we move sk_napi_id in the same cache line than sk_lock,
we are working to make SYN processing lockless, so it is desirable
to set sk_napi_id only for established flows.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only code that references tracing_sched_switch_trace() and
tracing_sched_wakeup_trace() is the wakeup latency tracer. Those
two functions use to belong to the sched_switch tracer which has
long been removed. These functions were left behind because the
wakeup latency tracer used them. But since the wakeup latency tracer
is the only one to use them, they should be static functions inside
that code.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
After the previous patch it is clear that "tracer_enabled" can never be
true, we can remove the "if (tracer_enabled)" code in probe_sched_switch()
and probe_sched_wakeup(). Plus we can obviously remove tracer_enabled,
ctx_trace, and sched_stopped as well.
Link: http://lkml.kernel.org/p/20140723193503.GA30217@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
tracing_{start,stop}_sched_switch_record() have no callers since
87d80de280 "tracing: Remove obsolete sched_switch tracer".
The last caller of tracing_sched_switch_assign_trace() was removed
by 30dbb20e68 "tracing: Remove boot tracer".
Link: http://lkml.kernel.org/p/20140723193501.GA30214@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
With the introduction of the dynamic trampolines, it is useful that if
things go wrong that ftrace_bug() produces more information about what
the current state is. This can help debug issues that may arise.
Ftrace has lots of checks to make sure that the state of the system it
touchs is exactly what it expects it to be. When it detects an abnormality
it calls ftrace_bug() and disables itself to prevent any further damage.
It is crucial that ftrace_bug() produces sufficient information that
can be used to debug the situation.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When the static ftrace_ops (like function tracer) enables tracing, and it
is the only callback that is referencing a function, a trampoline is
dynamically allocated to the function that calls the callback directly
instead of calling a loop function that iterates over all the registered
ftrace ops (if more than one ops is registered).
But when it comes to dynamically allocated ftrace_ops, where they may be
freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
to free the trampoline. If a task was preempted while executing on the
trampoline, there's currently no way to know when it will be off that
trampoline.
But this is not true when it comes to !CONFIG_PREEMPT. The current method
of calling schedule_on_each_cpu() will force tasks off the trampoline,
becaues they can not schedule while on it (kernel preemption is not
configured). That means it is safe to free a dynamically allocated
ftrace ops trampoline when CONFIG_PREEMPT is not configured.
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The file edac_pci_sysfs.c is dependent on CONFIG_PCI. This is already
modelled in the Makefile, but edac_pci_sysfs.o is still contained in
the list of files compiled even without CONFIG_PCI.
This change removes edac_pci_sysfs.o from the list of built objects
when not having CONFIG_PCI enabled and removes the then-unnecessary
ifdef from the source file.
Signed-off-by: Andreas Ruprecht <rupran@einserver.de>
Link: http://lkml.kernel.org/r/1407697803-3837-1-git-send-email-rupran@einserver.de
Signed-off-by: Borislav Petkov <bp@suse.de>
My static checker complains because the "e->location" has up to 256
characters but we are copying it into the "pvt->detail_location" which
only has space for 240 characters. That's not counting the surrounding
text and the "e->other_detail" string which can be over 80 characters
long.
I am not familiar with this code but presumably it normally works.
Let's add a limit though for safety.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/20140801082514.GD28869@mwanda
Signed-off-by: Borislav Petkov <bp@suse.de>
M-audio FastTrack Ultra quirk doesn't release the kzalloc'ed memory.
This patch adds the private_free callback to release it properly.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Previous versions of the espfix had a single function which did setup
the pagetables. It was later split into BSP and AP version. Drop unused
leftovers after that split.
Signed-off-by: Borislav Petkov <bp@suse.de>
Put duplicate detection into its own RX handler, and separate
out the conditions a bit to make the code more readable.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When aborting d0i3 due to taken refs, other threads might
already wait on d0i3_exit_waitq for IWL_MVM_STATUS_IN_D0I3
to get cleared (it's somewhat likely, as synchronize_rcu()
might take a while), so make sure to wake them up.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
In some rare cases, the firmware can put the device to
sleep after the driver requested the access. This is
because the access request can take a short time to be
propagated to the firmware.
If that happens, the driver may think that it has access
since the firmware hasn't put the device to sleep yet, but
right after the driver's check, the firmware might put the
device to sleep.
Warn when this happens by allowing the firmware to finish
the "put the device sleep" flow so that the driver will
not get access to the device. This will make the issue
visible.
This still doesn't fix the race, but at least it makes
it more visible.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Currently, the firmware only sends temperature notificaitions inside
RX statistics notifications, which are tied to beacon filtering. This
is a problem because beacon filtering is not used with vifs that don't
receive beacons (e.g. P2P GO and AP), so the driver doesn't receive
temperature notifications in those cases. To solve that, the firmware
will be changed so that it sends DTS_MEASUREMENT_NOTIFICATIONs,
independently from the beacon filtering flows.
To support that, the driver needs to also handle unsolicited
DTS_MEASUREMENT_NOTIFICATIONs, that are not triggered by
DTS_TRIGGER_CMD_FLAGS_TEMP requests.
This change is backwards compatible and will simply not be used with
firmware versions that do not send the notifications.
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Refactor the temperature handling code so that it is easier to reuse
it with other notification flows.
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
The code to capture firmware errors works during the reconfiguration
phase after an error. As the D3->D0 transition uses the same flow to
get the D0 image reconfigured, this triggered and caused a firmware
coredump to be collected. This in turn, if it isn't picked up by
userspace, can cause module unloading to fail, which is how the bug
was detected.
To fix this issue, introduce a new status flag (D3_RECONFIG) and use
it to detect that during reconfiguration no coredump should be taken
and reported.
Reported-by: Avi Kraif <avix.kraif@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
When we need only one antenna, we should refrain from using
the antenna that is shared with BT if BT load is high.
Fix this.
Reviewed-by: Eyal Shapira <eyal@wizery.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
When Tx STBC is being used and RS switches to a search column
using the alternate antenna from the current one there is a
problem with using rs_rate_match to figure out which table
is the active and which one is the search one. The root cause
is because in STBC the antenna mask in the ucode rate is set
to ANT_AB and in this specific scenario it matches both the
active and search table (e.g. SISO_ANT_A and SISO_ANT_B).
This leads to tx stats being updated in the wrong table and later
on to getting stuck in a test window and not moving on to other
columns. If this happens during the initial search cycle we
never end it and therefore never enable aggregation which leads
naturally to severe througput degradation.
Fix it by deducing which table is which by knowing whether we're
in a search or not like it's being done in rs_rate_scale_perform
Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Other contexts might call iwl_mvm_ref_sync() right before
we set IWL_MVM_STATUS_IN_D0I3, and then assume the fw/bus
is not in d0i3 state.
However, since we currently don't check for held references
in the d0i3_enter flow, we might enter d0i3 although there
is an active reference.
Solve it by aborting the d0i3 enter flow if there is an
active reference. Since users are assumed to use
iwl_mvm_ref_sync, which takes a ref before checking the
flag, we don't need further locking.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
This was taken care of in case we're doing STBC with HT
but not when working with VHT.
Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Even if running the driver with param init_dbg=1 - on INIT
image error - iwl_trans_stop_device() was still called. This
patch fixes that and calls iwl_trans_stop_device() on INIT
image failure only if init_dbg=0.
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
It is wrong that sp2 device uses the i2c adapter from m88ds3103 return.
sp2 device sits on the same i2c bus with m88ds3103, not behind m88ds3103.
Signed-off-by: Nibble Max <nibble.max@gmail.com>
Reviewed-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
The apb2 clocks are actually the same as apb1 clocks on the other sunxi
platforms, hence compatible with "allwinner,sun4i-a10-apb1-clk".
Update the dtsi to use the new unified apb1 clk.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
With the new factors infrastructure in place, we can unify apb1 and
apb1_mux as a single clock now.
Signed-off-by: Emilio López <emilio@elopez.com.ar>
[wens@csie.org: Change apb1 node label to "apb1"; reword commit title]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
This commit unifies the APB1 mux with the APB1 clock, using the new
factors infrastructure.
Signed-off-by: Emilio López <emilio@elopez.com.ar>
[wens@csie.org: Add mux mask bits]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Split off the setting of the RSS key into its own function. This
will help when we add support for X550 which can have different
RSS keys per pool.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver. New functionality and enablement support will
be added in following patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move setting of drop enable to support function. This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up functionality in ixgbe_ndo_set_vf_vlan that will simplify later
patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch provides duplex support for the Digidesign Mbox 1 sound
card and has been a work in progress for about a year.
Users have confirmed on my website that previous versions of this patch
have worked on the hardware and I have been testing extensively.
It also enables the mixer control for providing clock source
selector based on the previous patch.
The sample rate has been hardcoded to 48kHz because it works better with
the S/PDIF sync mode when the sample rate is locked. This is the
highest rate that the device supports and no loss of functionality
is observed by restricting the sample rate apart from the inability to selec
a lower rate.
Signed-off-by: Damien Zammit <damien@zamaudio.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch provides the infrastructure for the Digidesign Mbox 1
to have a mixer control for selecting the clock source.
Valid options are Internal and S/PDIF external sync.
A non-documented command is sent to the device to enable this feature
found by reverse engineering and bus snooping.
Signed-off-by: Damien Zammit <damien@zamaudio.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
On topologies including few levels of PCIe switching X540 can run into an
unexpected completion error. We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent. We then
poll the pending transaction bit to ensure we received the completion. Only
then do we go on to clear the buffers.
Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It's kind of silly to configure and attempt to use a bunch of queue
pairs when you're running on a single (virtual) CPU. Instead of
unconditionally configuring all of the queues that the PF gives us,
clamp the number of queue pairs to the number of CPUs.
Change-ID: I321714c9e15072ee76de8f95ab9a81f86ed347d1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In early init, if we get an unexpected message from the PF (such as link
status), we just kick an error back to the init task, causing it to
restart its state machine and delaying initialization.
Make the early init AQ message receive code more robust by handling
messages in a loop, and ignoring those that we aren't interested in.
This also gets rid of some scary log messages that really didn't
indicate a problem.
Change-ID: I620e8c72e49c49c665ef33eeab2425dd10e721cf
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt throttle rate minimum is actually 2us, so
fix that define and while we are there, remove some unused defines.
Change some strings in the function to be a bit less wrappy, and
express the correct limits.
Change-ID: I96829bbc77935e0b57c6f0fc1439fb4152b2960a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ARQ events cause a service_task execution, and we do a link_status
check and full stats gathering for each service_task. However, when
there are a lot of ARQ events, such as when doing an NVM update, we end up
doing 10's if not 100's of these per second, thereby heavily abusing the
PCI bus and especially the Firmware. This patch adds a check to keep the
service_task from running these periodic tasks more than once per second,
while still allowing quick action to service the events.
Change-ID: Iec7670c37bfae9791c43fec26df48aea7f70b33e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code was polling the firmware tail register for completion every
10 microseconds, which is way faster than the firmware can respond.
This changes the poll interval to 1ms, which reduces polling CPU
utilization, and the number of times we loop.
The maximum delay is still 100ms.
Change-ID: I4bbfa6b66d802890baf8b4154061e55942b90958
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Output driver has two parameters that can be configured to reduce
pop noise: power-on delay and ramp-up step time. Two new kcontrols
have been added to set these parameters.
Signed-off-by: Misael Lopez Cruz <misael.lopez@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
It is easier to find the relevant enums in the code. Use the
SOC_ENUM_*_DECL macro for the individual items.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>