Commit graph

932869 commits

Author SHA1 Message Date
Anson Huang
9254bf1007 dt-bindings: clock: Convert i.MX6Q clock to json-schema
Convert the i.MX6Q clock binding to DT schema format using json-schema.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-27 19:30:24 -06:00
Sandeep Maheswaram
dffe511504 dt-bindings: usb: qcom,dwc3: Add compatible for SC7180
Add compatible for SC7180 in usb dwc3 bindings.

Signed-off-by: Sandeep Maheswaram <sanm@codeaurora.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-27 19:30:24 -06:00
Sandeep Maheswaram
3828026c9e dt-bindings: usb: qcom,dwc3: Convert USB DWC3 bindings
Convert USB DWC3 bindings to DT schema format using json-schema.

Signed-off-by: Sandeep Maheswaram <sanm@codeaurora.org>
[robh: fixup example warnings]
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-27 19:29:41 -06:00
Krzysztof Kazimierczak
3f0d97cdfe ice: Check UMEM FQ size when allocating bufs
If a UMEM is present on a queue when an interface/queue pair is being
enabled, the driver will try to prepare the Rx buffers in advance to
improve performance. However, if fill queue is shorter than HW Rx ring,
the driver will report failure after getting the last address from the
fill queue.

This still lets the driver process the packets correctly during the NAPI
poll, but leads to a constant NAPI rescheduling. Not allocating the
buffers in advance would result in a potential performance decrease.

Commit d57d76428a ("xsk: Add API to check for available entries in FQ")
provides an API that lets drivers check the number of addresses that the
fill queue holds.

Notify the user if fill queue is not long enough to prepare all buffers
before packet processing starts, and allocate the buffers during the
NAPI poll. If the fill queue size is sufficient, prepare Rx buffers in
advance.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 18:13:59 -07:00
Alex Vesker
ed03a418ab net/mlx5: DR, Split RX and TX lock for parallel insertion
Change the locking flow to support RX and TX locks, splitting
the single lock to two will allow inserting rules in parallel
for RX and TX parts of the FDB.

Locking the dr_domain will be done by locking the RX domain
and the TX domain locks, this is mostly used for control operations
on the dr_domain. When inserting rules for RX or TX the single
nic_doamin RX or TX lock will be used. Splitting the lock is safe since
RX and TX domains are logically separated from each other, shared
objects such the send-ring and memory pool are protected by locks.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:52 -07:00
Alex Vesker
cedb28191f net/mlx5: DR, Add a spinlock to protect the send ring
Adding this lock will allow writing steering entries without
locking the dr_domain and allow parallel insertion.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:51 -07:00
Eli Britstein
fca533041a net/mlx5e: Optimize performance for IPv4/IPv6 ethertype
The HW is optimized for IPv4/IPv6. For such cases, pending capability,
avoid matching on ethertype, and use ip_version field instead.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:51 -07:00
Eli Britstein
4a5d5d7392 net/mlx5e: Helper function to set ethertype
Set ethertype match in a helper function as a pre-step towards
optimizing it.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Parav Pandit
810cbb2554 net/mlx5: Add missing mutex destroy
Add mutex destroy calls to balance with mutex_init() done in the init
path.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Vu Pham
9728366f53 net/mlx5e: Use change upper event to setup representors' bond_metadata
Use change upper event to detect slave representor from
enslaving/unslaving to/from lag device.

On enslaving event, call mlx5_enslave_rep() API to create, add
this slave representor shadow entry to the slaves list of
bond_metadata structure representing master lag device and use
its metadata to setup ingress acl metadata header.

On unslaving event, resetting the vport of unslaved representor
to use its default ingress/egress acls and rx rules with its
default_metadata.

The last slave will free the shared bond_metadata and its
unique metadata.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Vu Pham
88e96e533c net/mlx5e: Slave representors sharing unique metadata for match
Bonded slave representors' vports must share a unique metadata
for match.

On enslaving event of slave representor to lag device, allocate
new unique "bond_metadata" for match if this is the first slave.
The subsequent enslaved representors will share the same unique
"bond_metadata".

On unslaving event of slave representor, reset the slave
representor's vport to use its own default metadata.

Replace ingress acl and rx rules of the slave representors' vports
using new vport->bond_metadata.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Vu Pham
133dcfc577 net/mlx5: E-Switch, Alloc and free unique metadata for match
Introduce infrastructure to create unique metadata for match
for vport without depending on vport_num. Vport uses its
default metadata for match in standalone configuration but
will share a different unique "bond_metadata" for match with
other vports in bond configuration.

Using ida to generate unique metadata for match for vports
in default and bond configurations.

Introduce APIs to generate, free metadata for match.
Introduce APIs to set vport's bond_metadata and replace its
ingress acl rules with bond_metatada.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Vu Pham
d97555e145 net/mlx5e: Add bond_metadata and its slave entries
Adding bond_metadata and its slave entries to represent a lag device
and its slaves VF representors. Bond_metadata structure includes a
unique metadata shared by slaves VF respresentors, and a list of slaves
representors slave entries.

On enslaving event, create a bond_metadata structure representing
the upper lag device of this slave representor if it has not been
created yet. Create and add entry for the slave representor to the
slaves list.

On unslaving event, free the slave entry of the slave representor.
On the last unslave event, free the bond_metadata structure and its
resources.

Introduce APIs to create and remove bond_metadata and its resources,
enslave and unslave VF representor slave entries.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Or Gerlitz
d34eb2fcd0 net/mlx5e: Offload flow rules to active lower representor
When a bond device is created over one or more non uplink representors,
and when a flow rule is offloaded to such bond device, offload a rule
to the active lower device.

Assuming that this is active-backup lag, the rules should be offloaded
to the active lower device which is the representor of the direct
path (not the failover).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:48 -07:00
Vu Pham
553f932838 net/mlx5e: Support tc block sharing for representors
Currently offloading a rule over a tc block shared by multiple
representors fails because an e-switch global hashtable to keep
the mapping from tc cookies to mlx5e flow instances is used, and
tc block sharing offloads the same rule/cookie multiple times,
each time for different representor sharing the tc block.

Changing the implementation and behavior by acknowledging and returning
success if the same rule/cookie is offloaded again to other slave
representor sharing the tc block by setting, checking and comparing
the netdev that added the rule first.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:48 -07:00
Or Gerlitz
7e51891a23 net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule
Register a notifier block to handle netdev events for bond device
of non-uplink representors to support eswitch vports bonding.

When a non-uplink representor is a lower dev (slave) of bond and
becomes active, adding egress acl forward-to-vport rule of all slave
netdevs (active + standby) to forward to this representor's vport. Use
change lower netdev event to do this.

Use change upper event to detect slave representor unslaved from lag
device to delete its vport egress acl forward rule if any.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
bf773dc0e6 net/mlx5: E-Switch, Introduce APIs to enable egress acl forward-to-vport rule
By default, e-switch vport's egress acl just forward packets to its
counterpart NIC vport using existing egress acl table.

During port failover in bonding scenario where two VFs representors
are bonded, the egress acl forward-to-vport rule will be added to
the existing egress acl table of e-switch vport of passive/inactive
slave representor to forward packets to other NIC vport ie. the active
slave representor's NIC vport to handle egress "failover" traffic.

Enable egress acl and have APIs to create and destroy egress acl
forward-to-vport rule and group.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
07bab95026 net/mlx5: E-Switch, Refactor eswitch ingress acl codes
Restructure the eswitch ingress acl codes into eswitch directory
and different files:
. Acl ingress helper functions to acl_helper.c/h
. Acl ingress functions used in offloads mode to acl_ingress_ofld.c
. Acl ingress functions used in legacy mode to acl_ingress_lgy.c

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
ea651a86d4 net/mlx5: E-Switch, Refactor eswitch egress acl codes
Refactor the egress acl codes so that offloads and legacy modes
can configure specifically their own needs of egress acl table,
groups and rules. While at it, restructure the eswitch egress
acl codes into eswitch directory and different files:
. Acl egress helper functions to acl_helper.c/h
. Acl egress functions used in offloads mode to acl_egress_ofld.c
. Acl egress functions used in legacy mode to acl_egress_lgy.c

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:46 -07:00
Anirudh Venkataramanan
13f90b393f ice: Refactor Rx checksum checks
We don't need both rx_status and rx_error parameters, as the latter is
a subset of the former. Remove rx_error completely and check the right bit
in rx_status.

Rename rx_status to rx_status0, and rx_status_err1 to
rx_status1. This naming more closely reflects the specification.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 18:00:35 -07:00
Bruce Allan
7e34786a74 ice: avoid undefined behavior
When writing the driver's struct ice_tlan_ctx structure, do not write the
8-bit element int_q_state with the associated internal-to-hardware field
which is 122-bits, otherwise the helper function ice_write_byte() will use
undefined behavior when setting the mask used for that write.  This should
not cause any functional change and will avoid use of undefined behavior.
Also, update a comment to highlight this structure element is not written.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:58:21 -07:00
Paul Mackerras
11362b1bef KVM: PPC: Book3S HV: Close race with page faults around memslot flushes
There is a potential race condition between hypervisor page faults
and flushing a memslot.  It is possible for a page fault to read the
memslot before a memslot is updated and then write a PTE to the
partition-scoped page tables after kvmppc_radix_flush_memslot has
completed.  (Note that this race has never been explicitly observed.)

To close this race, it is sufficient to increment the MMU sequence
number while the kvm->mmu_lock is held.  That will cause
mmu_notifier_retry() to return true, and the page fault will then
return to the guest without inserting a PTE.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-05-28 10:56:42 +10:00
Marta Plantykow
ae15e0ba1b ice: Change number of XDP Tx queues to match number of Rx queues
In current implementation number of XDP Tx queues is the same as
the number of transmit queues, which is not always true. This
patch changes this number to match the number of receive queues.
XDP programs are running on Rx rings, so what we actually need to
provide is the XDP Tx ring per each Rx ring so that the whole XDP
ecosystem is functional, e.g. if the result of XDP prog is XDP_TX
then you have the need to access the XDP Tx ring.

Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:55:56 -07:00
Lubomir Rintel
725262d291 clk: mmp2: Add audio clock controller driver
This is a driver for a block that generates master and bit clocks for
the I2S interface. It's separate from the PMUs that generate clocks for
the peripherals.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-14-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
e787c5b725 dt-bindings: clock: Add Marvell MMP Audio Clock Controller binding
This describes the bindings for a controller that generates master and bit
clocks for the I2S interface.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-13-lkundrak@v3.sk
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
ee4df23634 clk: mmp2: Add support for power islands
Apart from the clocks and resets, the PMU hardware also controls power
to peripherals that are on separate power islands. On MMP2, that's the
GC860 GPU and the SSPA audio interface, while on MMP3 also the camera
interface is on a separate island, along with the pair of GC2000 and GC300
GPUs and the SSPA.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-12-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
17d43046fd dt-bindings: marvell,mmp2: Add ids for the power domains
On MMP2 the audio and GPU blocks are on separate power islands. On MMP3
the camera block's power is also controlled separately.

Add the numbers that we could use to refer to the power domains for
respective power islands from the device tree.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lkml.kernel.org/r/20200519224151.2074597-11-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
ec6bbddef6 dt-bindings: clock: Make marvell,mmp2-clock a power controller
This is a binding for the MMP2 power management units. As such apart from
providing the clocks, they also manage the power islands.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-10-lkundrak@v3.sk
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
232a313435 clk: mmp2: Add the audio clock
This clocks the Audio block.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-9-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
71d8254af9 clk: mmp2: Add the I2S clocks
A pair of fractional clock sources for PLLs and gates.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-8-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
2766c19815 clk: mmp2: Rename mmp2_pll_init() to mmp2_main_clk_init()
This is a trivial rename for a routine that registers more clock sources
than the PLLs -- there's also a XO.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-7-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:12 -07:00
Lubomir Rintel
8c2427b8f7 clk: mmp2: Move thermal register defines up a bit
A trivial change to keep the sorting sane. The APBC registers are happier
when they are grouped together, instead of mixed with the APMU ones.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-6-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:11 -07:00
Lubomir Rintel
c227df7a09 dt-bindings: marvell,mmp2: Add clock id for the Audio clock
This clocks the Audio block.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lkml.kernel.org/r/20200519224151.2074597-5-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:11 -07:00
Lubomir Rintel
edcec4a869 dt-bindings: marvell,mmp2: Add clock id for the I2S clocks
There are two of these on a MMP2.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lkml.kernel.org/r/20200519224151.2074597-4-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:11 -07:00
Lubomir Rintel
5278acc441 clk: mmp: frac: Allow setting bits other than the numerator/denominator
For the I2S fractional clocks, there are more bits that need to be set
for the clock to run. Their actual meaning is unknown.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-3-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:11 -07:00
Lubomir Rintel
06030c4e33 clk: mmp: frac: Do not lose last 4 digits of precision
While calculating the output rate of a fractional divider clock, the
value is divided and multipled by 10000, discarding the least
significant digits -- presumably to fit the intermediate value within 32
bits.

The precision we're losing is, however, not insignificant for things like
I2S clock. Maybe also elsewhere, now that since commit ea56ad6026 ("clk:
mmp2: Stop pretending PLL outputs are constant") the parent rates are more
precise and no longer rounded to 10000s.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://lkml.kernel.org/r/20200519224151.2074597-2-lkundrak@v3.sk
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-27 17:55:11 -07:00
Marta Plantykow
49d358e0e7 ice: Add XDP Tx to VSI ring stats
When XDP Tx program is loaded and packets are sent from
interface, VSI statistics are not updated. This patch adds
packets sent on Tx XDP ring to VSI ring stats.

Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:54:16 -07:00
Marta Plantykow
c8f135c6ee ice: Change number of XDP TxQ to 0 when destroying rings
When XDP Tx rings are destroyed the number of XDP Tx queues
is not changing. This patch is changing this number to 0.

Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:49:56 -07:00
Evan Swanson
b5c7f857e5 ice: Handle critical FW error during admin queue initialization
A race condition between FW and SW can occur between admin queue setup and
the first command sent. A link event may occur and FW attempts to notify a
non-existent queue. FW will set the critical error bit and disable the
queue. When this happens retry queue setup.

Signed-off-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:48:23 -07:00
Paul Mackerras
3d89c2ef24 KVM: PPC: Book3S HV: Remove user-triggerable WARN_ON
Although in general we do not expect valid PTEs to be found in
kvmppc_create_pte when we are inserting a large page mapping, there
is one situation where this can occur.  That is when dirty page
logging is turned off for a memslot while the VM is running.
Because the new memslots are installed before the old memslot is
flushed in kvmppc_core_commit_memory_region_hv(), there is a
window where a hypervisor page fault can try to install a 2MB
(or 1GB) page where there are already small page mappings which
were installed while dirty page logging was enabled and which
have not yet been flushed.

Since we have a situation where valid PTEs can legitimately be
found by kvmppc_unmap_free_pte, and which can be triggered by
userspace, just remove the WARN_ON_ONCE, since it is undesirable
to have userspace able to trigger a kernel warning.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-05-28 10:48:18 +10:00
Brett Creeley
1960827570 ice: Don't allow VLAN stripping change when pvid set
Currently, if the PVID is set in the VLAN handling section of the VSI
context the driver still allows VLAN stripping to be enabled/disabled.
VLAN stripping should only be modifiable when the PVID is not set. Fix
this by preventing VLAN stripping modification when PVID is set.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:46:00 -07:00
Brett Creeley
4f1fe43c92 ice: Add more Rx errors to netdev's rx_error counter
Currently we are only including illegal_bytes and rx_crc_errors in the
PF netdev's rx_error counter. There are many more causes of Rx errors
that the device supports and reports via Ethtool. Accumulate all Rx
errors in the PF netdev's rx_error counter.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:44:06 -07:00
Surabhi Boob
68d2707837 ice: Fix for memory leaks and modify ICE_FREE_CQ_BUFS
Handle memory leaks during control queue initialization and
buffer allocation failures. The macro ICE_FREE_CQ_BUFS is modified to
re-use for this fix.

Signed-off-by: Surabhi Boob <surabhi.boob@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:32:50 -07:00
Guo Ren
f36e0aab6f csky: Fixup CONFIG_DEBUG_RSEQ
Put the rseq_syscall check point at the prologue of the syscall
will break the a0 ... a7. This will casue system call bug when
DEBUG_RSEQ is enabled.

So move it to the epilogue of syscall, but before syscall_trace.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-28 00:18:36 +00:00
Guo Ren
20f69538b9 csky: Coding convention in entry.S
There is no fixup or feature in the patch, we only cleanup with:

 - Remove unnecessary reg used (r11, r12), just use r9 & r10 &
   syscallid regs as temp useage.
 - Add _TIF_SYSCALL_WORK and _TIF_WORK_MASK to gather macros.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-28 00:18:36 +00:00
Guo Ren
e0bbb53843 csky: Fixup abiv2 syscall_trace break a4 & a5
Current implementation could destory a4 & a5 when strace, so we need to get them
from pt_regs by SAVE_ALL.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-28 00:18:36 +00:00
Guo Ren
900897591b csky: Fixup CONFIG_PREEMPT panic
log:
[    0.13373200] Calibrating delay loop...
[    0.14077600] ------------[ cut here ]------------
[    0.14116700] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:3790 preempt_count_add+0xc8/0x11c
[    0.14348000] DEBUG_LOCKS_WARN_ON((preempt_count() < 0))Modules linked in:
[    0.14395100] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0 #7
[    0.14410800]
[    0.14427400] Call Trace:
[    0.14450700] [<807cd226>] dump_stack+0x8a/0xe4
[    0.14473500] [<80072792>] __warn+0x10e/0x15c
[    0.14495900] [<80072852>] warn_slowpath_fmt+0x72/0xc0
[    0.14518600] [<800a5240>] preempt_count_add+0xc8/0x11c
[    0.14544900] [<807ef918>] _raw_spin_lock+0x28/0x68
[    0.14572600] [<800e0eb8>] vprintk_emit+0x84/0x2d8
[    0.14599000] [<800e113a>] vprintk_default+0x2e/0x44
[    0.14625100] [<800e2042>] vprintk_func+0x12a/0x1d0
[    0.14651300] [<800e1804>] printk+0x30/0x48
[    0.14677600] [<80008052>] lockdep_init+0x12/0xb0
[    0.14703800] [<80002080>] start_kernel+0x558/0x7f8
[    0.14730000] [<800052bc>] csky_start+0x58/0x94
[    0.14756600] irq event stamp: 34
[    0.14775100] hardirqs last  enabled at (33): [<80067370>] ret_from_exception+0x2c/0x72
[    0.14793700] hardirqs last disabled at (34): [<800e0eae>] vprintk_emit+0x7a/0x2d8
[    0.14812300] softirqs last  enabled at (32): [<800655b0>] __do_softirq+0x578/0x6d8
[    0.14830800] softirqs last disabled at (25): [<8007b3b8>] irq_exit+0xec/0x128

The preempt_count of reg could be destroyed after csky_do_IRQ without reload
from memory.

After reference to other architectures (arm64, riscv), we move preempt entry
into ret_from_exception and disable irq at the beginning of
ret_from_exception instead of RESTORE_ALL.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Reported-by: Lu Baoquan <lu.baoquan@intellif.com>
2020-05-28 00:18:36 +00:00
Valentine Fatiev
1acba6a817 IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
When connected mode is set, and we have connected and datagram traffic in
parallel, ipoib might crash with double free of datagram skb.

The current mechanism assumes that the order in the completion queue is
the same as the order of sent packets for all QPs. Order is kept only for
specific QP, in case of mixed UD and CM traffic we have few QPs (one UD and
few CM's) in parallel.

The problem:
----------------------------------------------------------

Transmit queue:
-----------------
UD skb pointer kept in queue itself, CM skb kept in spearate queue and
uses transmit queue as a placeholder to count the number of total
transmitted packets.

0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
------------------------------------------------------------
NL ud1 UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
------------------------------------------------------------
    ^                                  ^
   tail                               head

Completion queue (problematic scenario) - the order not the same as in
the transmit queue:

  1  2  3  4  5  6  7  8  9
------------------------------------
 ud1 CM1 UD2 ud3 cm2 cm3 ud4 cm4 ud5
------------------------------------

1. CM1 'wc' processing
   - skb freed in cm separate ring.
   - tx_tail of transmit queue increased although UD2 is not freed.
     Now driver assumes UD2 index is already freed and it could be used for
     new transmitted skb.

0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
------------------------------------------------------------
NL NL  UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
------------------------------------------------------------
        ^   ^                       ^
      (Bad)tail                    head
(Bad - Could be used for new SKB)

In this case (due to heavy load) UD2 skb pointer could be replaced by new
transmitted packet UD_NEW, as the driver assumes its free.  At this point
we will have to process two 'wc' with same index but we have only one
pointer to free.

During second attempt to free the same skb we will have NULL pointer
exception.

2. UD2 'wc' processing
   - skb freed according the index we got from 'wc', but it was already
     overwritten by mistake. So actually the skb that was released is the
     skb of the new transmitted packet and not the original one.

3. UD_NEW 'wc' processing
   - attempt to free already freed skb. NUll pointer exception.

The fix:
-----------------------------------------------------------------------

The fix is to stop using the UD ring as a placeholder for CM packets, the
cyclic ring variables tx_head and tx_tail will manage the UD tx_ring, a
new cyclic variables global_tx_head and global_tx_tail are introduced for
managing and counting the overall outstanding sent packets, then the send
queue will be stopped and waken based on these variables only.

Note that no locking is needed since global_tx_head is updated in the xmit
flow and global_tx_tail is updated in the NAPI flow only.  A previous
attempt tried to use one variable to count the outstanding sent packets,
but it did not work since xmit and NAPI flows can run at the same time and
the counter will be updated wrongly. Thus, we use the same simple cyclic
head and tail scheme that we have today for the UD tx_ring.

Fixes: 2c104ea683 ("IB/ipoib: Get rid of the tx_outstanding variable in all modes")
Link: https://lore.kernel.org/r/20200527134705.480068-1-leon@kernel.org
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 21:14:09 -03:00
Surabhi Boob
1aaef2bc4e ice: Fix memory leak
Handle memory leak on filter management initialization failure.

Signed-off-by: Surabhi Boob <surabhi.boob@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:11:29 -07:00
Jesse Brandeburg
5df42c8267 ice: fix MAC write command
The manage MAC write command was implemented in an overly complex way
that actually didn't work, as it wasn't symmetric to the manage MAC
read command, and was feeding bytes out of order to the firmware. Fix
the implementation by just using a simple array to represent the MAC
address when it is being written via firmware command.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:06:44 -07:00