Commit graph

871037 commits

Author SHA1 Message Date
Jesse Brandeburg
2fb0821fd5 ice: clean up arguments
There are a couple of functions that don't need two arguments
passed in when the second argument already had access to
the pointer pointed to by the first.

Remove the unnecessary arguments.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan
ade78c2ec1 ice: Check root pointer for validity
ice_sched_get_tc_node uses pi->root without checking for NULL. Add a
check to prevent NULL pointer dereference.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan
208ff75135 ice: Add ice_get_main_vsi to get PF/main VSI
There are multiple places where we currently use ice_find_vsi_by_type
to get the PF (a.k.a. main) VSI. The PF VSI by definition is always
the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead
add and use a new helper function ice_get_main_vsi, which just returns
pf->vsi[0].

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Brett Creeley
34cdcb165b ice: Update fields in ice_vsi_set_num_qs when reconfiguring
Currently when vsi->req_txqs or vsi->req_rxqs are set we don't
correctly set the number of vsi->num_q_vectors. Fix this by
setting the number of queue vectors based on the max
between the vsi->alloc_txqs and vsi->alloc_rxqs.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Marc Zyngier
c9c96e30ec irqchip/gic-v3-its: Fix LPI release for Multi-MSI devices
When allocating a range of LPIs for a Multi-MSI capable device,
this allocation extended to the closest power of 2.

But on the release path, the interrupts are released one by
one. This results in not releasing the "extra" range, leaking
the its_device. Trying to reprobe the device will then fail.

Fix it by releasing the LPIs the same way we allocate them.

Fixes: 8208d1708b ("irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size")
Reported-by: Jiaxing Luo <luojiaxing@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/f5e948aa-e32f-3f74-ae30-31fee06c2a74@huawei.com
2019-09-05 16:03:48 +01:00
Helge Deller
b0a26f11ee parisc: Drop comments which are already in pci.h
Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05 16:41:11 +02:00
Helge Deller
ebee4b02d0 parisc: Convert eisa_enumerator to use pr_cont()
Clean up and beautify kernel output by using pr_cont() and printk
formats like %pR for resources.
This was noticed on a HP D350/2 machine.

Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05 16:37:38 +02:00
Tony Lindgren
d098913a10 bus: ti-sysc: Fix clock handling for no-idle quirks
NFSroot can fail on dra7 when cpsw is probed using ti-sysc interconnect
target module driver as reported by Keerthy.

Device clocks and the interconnect target module may or may not be
enabled by the bootloader on init, but we currently assume the clocks
and module are on from the bootloader for "ti,no-idle" and
"ti,no-idle-on-init" quirks as reported by Grygorii Strashko.

Let's fix the issue by always enabling clocks init, and
never disable them for "ti,no-idle" quirk. For "ti,no-idle-on-init"
quirk, we must decrement the usage count later on to allow PM
runtime to idle the module if requested.

Fixes: 1a5cd7c23c ("bus: ti-sysc: Enable all clocks directly during init to read revision")
Cc: Keerthy <j-keerthy@ti.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Reported-by: Keerthy <j-keerthy@ti.com>
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2019-09-05 07:37:22 -07:00
Helge Deller
4ccac58e56 parisc: Avoid warning when loading hppb driver
ccio_request_resource() may fail to allocate regions for the hppb
driver. Do not print a misleading warning in this case.
This was noticed on a HP D350/2 machine.

Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05 16:35:12 +02:00
Harald Freudenberger
9e323d45ba s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
With 'extra run-time crypto self tests' enabled, the selftest
for s390-xts fails with

  alg: skcipher: xts-aes-s390 encryption unexpectedly succeeded on
  test vector "random: len=0 klen=64"; expected_error=-22,
  cfg="random: inplace use_digest nosimd src_divs=[2.61%@+4006,
  84.44%@+21, 1.55%@+13, 4.50%@+344, 4.26%@+21, 2.64%@+27]"

This special case with nbytes=0 is not handled correctly and this
fix now makes sure that -EINVAL is returned when there is en/decrypt
called with 0 bytes to en/decrypt.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05 15:18:15 +02:00
Wei Yongjun
987ca7ca1f vfio-ccw: fix error return code in vfio_ccw_sch_init()
Fix to return negative error code -ENOMEM from the memory alloc failed
error handling case instead of 0, as done elsewhere in this function.

Fixes: 60e05d1cf0 ("vfio-ccw: add some logging")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link https://lore.kernel.org/kvm/20190904083315.105600-1-weiyongjun1@huawei.com/
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05 15:18:15 +02:00
Halil Pasic
024cdcdbf3 s390: vfio-ap: fix warning reset not completed
The intention seems to be to warn once when we don't wait enough for the
reset to complete. Let's use the right retry counter to accomplish that
semantic.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/20190903133618.9122-1-pasic@linux.ibm.com
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05 15:18:15 +02:00
Alan Stern
98375b86c7 HID: prodikeys: Fix general protection fault during probe
The syzbot fuzzer provoked a general protection fault in the
hid-prodikeys driver:

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.3.0-rc5+ #28
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: usb_hub_wq hub_event
RIP: 0010:pcmidi_submit_output_report drivers/hid/hid-prodikeys.c:300  [inline]
RIP: 0010:pcmidi_set_operational drivers/hid/hid-prodikeys.c:558 [inline]
RIP: 0010:pcmidi_snd_initialise drivers/hid/hid-prodikeys.c:686 [inline]
RIP: 0010:pk_probe+0xb51/0xfd0 drivers/hid/hid-prodikeys.c:836
Code: 0f 85 50 04 00 00 48 8b 04 24 4c 89 7d 10 48 8b 58 08 e8 b2 53 e4 fc
48 8b 54 24 20 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f
85 13 04 00 00 48 ba 00 00 00 00 00 fc ff df 49 8b

The problem is caused by the fact that pcmidi_get_output_report() will
return an error if the HID device doesn't provide the right sort of
output report, but pcmidi_set_operational() doesn't bother to check
the return code and assumes the function call always succeeds.

This patch adds the missing check and aborts the probe operation if
necessary.

Reported-and-tested-by: syzbot+1088533649dafa1c9004@syzkaller.appspotmail.com
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-09-05 15:02:59 +02:00
Ping Cheng
bbbe3ac8f9 HID: wacom: add new MobileStudio Pro 13 support
wacom_wac_pad_event is the only routine we need to update.

Signed-off-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-09-05 15:01:39 +02:00
Dan Carpenter
08b0c89160 drm/vmwgfx: Fix double free in vmw_recv_msg()
We recently added a kfree() after the end of the loop:

	if (retries == RETRIES) {
		kfree(reply);
		return -EINVAL;
	}

There are two problems.  First the test is wrong and because retries
equals RETRIES if we succeed on the last iteration through the loop.
Second if we fail on the last iteration through the loop then the kfree
is a double free.

When you're reading this code, please note the break statement at the
end of the while loop.  This patch changes the loop so that if it's not
successful then "reply" is NULL and we can test for that afterward.

Cc: <stable@vger.kernel.org>
Fixes: 6b7c3b86f0 ("drm/vmwgfx: fix memory leak when too many retries have occurred")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-09-05 14:44:28 +02:00
Roderick Colenbrander
2bcdacb703 HID: sony: Fix memory corruption issue on cleanup.
The sony driver is not properly cleaning up from potential failures in
sony_input_configured. Currently it calls hid_hw_stop, while hid_connect
is still running. This is not a good idea, instead hid_hw_stop should
be moved to sony_probe. Similar changes were recently made to Logitech
drivers, which were also doing improper cleanup.

Signed-off-by: Roderick Colenbrander <roderick.colenbrander@sony.com>
CC: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-09-05 14:27:14 +02:00
Daniel Borkmann
593f191a80 Merge branch 'bpf-af-xdp-barrier-fixes'
Björn Töpel says:

====================
This is a four patch series of various barrier, {READ, WRITE}_ONCE
cleanups in the AF_XDP socket code. More details can be found in the
corresponding commit message. Previous revisions: v1 [4] and v2 [5].

For an AF_XDP socket, most control plane operations are done under the
control mutex (struct xdp_sock, mutex), but there are some places
where members of the struct is read outside the control mutex. The
dev, queue_id members are set in bind() and cleared at cleanup. The
umem, fq, cq, tx, rx, and state member are all assigned in various
places, e.g. bind() and setsockopt(). When the members are assigned,
they are protected by the control mutex, but since they are read
outside the mutex, a WRITE_ONCE is required to avoid store-tearing on
the read-side.

Prior the state variable was introduced by Ilya, the dev member was
used to determine whether the socket was bound or not. However, when
dev was read, proper SMP barriers and READ_ONCE were missing. In order
to address the missing barriers and READ_ONCE, we start using the
state variable as a point of synchronization. The state member
read/write is paired with proper SMP barriers, and from this follows
that the members described above does not need READ_ONCE statements if
used in conjunction with state check.

To summarize: The members struct xdp_sock members dev, queue_id, umem,
fq, cq, tx, rx, and state were read lock-less, with incorrect barriers
and missing {READ, WRITE}_ONCE. After this series umem, fq, cq, tx,
rx, and state are read lock-less. When these members are updated,
WRITE_ONCE is used. When read, READ_ONCE are only used when read
outside the control mutex (e.g. mmap) or, not synchronized with the
state member (XSK_BOUND plus smp_rmb())

[1] https://lore.kernel.org/bpf/beef16bb-a09b-40f1-7dd0-c323b4b89b17@iogearbox.net/
[2] https://lwn.net/Articles/793253/
[3] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
[4] https://lore.kernel.org/bpf/20190822091306.20581-1-bjorn.topel@gmail.com/
[5] https://lore.kernel.org/bpf/20190826061053.15996-1-bjorn.topel@gmail.com/

v2->v3:
  Minor restructure of commits.
  Improve cover and commit messages. (Daniel)
v1->v2:
  Removed redundant dev check. (Jonathan)
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 14:11:53 +02:00
Björn Töpel
25dc18ff9b xsk: lock the control mutex in sock_diag interface
When accessing the members of an XDP socket, the control mutex should
be held. This commit fixes that.

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Fixes: a36b38aa2a ("xsk: add sock_diag interface for AF_XDP")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 14:11:52 +02:00
Björn Töpel
42fddcc7c6 xsk: use state member for socket synchronization
Prior the state variable was introduced by Ilya, the dev member was
used to determine whether the socket was bound or not. However, when
dev was read, proper SMP barriers and READ_ONCE were missing. In order
to address the missing barriers and READ_ONCE, we start using the
state variable as a point of synchronization. The state member
read/write is paired with proper SMP barriers, and from this follows
that the members described above does not need READ_ONCE if used in
conjunction with state check.

In all syscalls and the xsk_rcv path we check if state is
XSK_BOUND. If that is the case we do a SMP read barrier, and this
implies that the dev, umem and all rings are correctly setup. Note
that no READ_ONCE are needed for these variable if used when state is
XSK_BOUND (plus the read barrier).

To summarize: The members struct xdp_sock members dev, queue_id, umem,
fq, cq, tx, rx, and state were read lock-less, with incorrect barriers
and missing {READ, WRITE}_ONCE. Now, umem, fq, cq, tx, rx, and state
are read lock-less. When these members are updated, WRITE_ONCE is
used. When read, READ_ONCE are only used when read outside the control
mutex (e.g. mmap) or, not synchronized with the state member
(XSK_BOUND plus smp_rmb())

Note that dev and queue_id do not need a WRITE_ONCE or READ_ONCE, due
to the introduce state synchronization (XSK_BOUND plus smp_rmb()).

Introducing the state check also fixes a race, found by syzcaller, in
xsk_poll() where umem could be accessed when stale.

Suggested-by: Hillf Danton <hdanton@sina.com>
Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com
Fixes: 77cd0d7b3f ("xsk: add support for need_wakeup flag in AF_XDP rings")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 14:11:52 +02:00
Björn Töpel
9764f4b301 xsk: avoid store-tearing when assigning umem
The umem member of struct xdp_sock is read outside of the control
mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid
potential store-tearing.

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Fixes: 423f38329d ("xsk: add umem fill queue support and mmap")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 14:11:52 +02:00
Björn Töpel
94a997637c xsk: avoid store-tearing when assigning queues
Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid
potential store-tearing. These members are read outside of the control
mutex in the mmap implementation.

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Fixes: 37b076933a ("xsk: add missing write- and data-dependency barrier")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 14:11:52 +02:00
Alexei Starovoitov
2339cd6cd0 bpf: fix precision tracking of stack slots
The problem can be seen in the following two tests:
0: (bf) r3 = r10
1: (55) if r3 != 0x7b goto pc+0
2: (7a) *(u64 *)(r3 -8) = 0
3: (79) r4 = *(u64 *)(r10 -8)
..
0: (85) call bpf_get_prandom_u32#7
1: (bf) r3 = r10
2: (55) if r3 != 0x7b goto pc+0
3: (7b) *(u64 *)(r3 -8) = r0
4: (79) r4 = *(u64 *)(r10 -8)

When backtracking need to mark R4 it will mark slot fp-8.
But ST or STX into fp-8 could belong to the same block of instructions.
When backtracing is done the parent state may have fp-8 slot
as "unallocated stack". Which will cause verifier to warn
and incorrectly reject such programs.

Writes into stack via non-R10 register are rare. llvm always
generates canonical stack spill/fill.
For such pathological case fall back to conservative precision
tracking instead of rejecting.

Reported-by: syzbot+c8d66267fd2b5955287e@syzkaller.appspotmail.com
Fixes: b5dc0163d8 ("bpf: precise scalar_value tracking")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 14:06:58 +02:00
Alexei Starovoitov
310f4204ee selftests/bpf: precision tracking tests
Add two tests to check that stack slot marking during backtracking
doesn't trigger 'spi > allocated_stack' warning.
One test is using BPF_ST insn. Another is using BPF_STX.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 13:55:50 +02:00
Oded Gabbay
6dc66f7c26 habanalabs: correctly cast variable to __le32
When using the macro le32_to_cpu(x), we need to correctly convert x to be
__le32 in case it is defined as u32 variable.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-09-05 14:55:28 +03:00
Oded Gabbay
307eae93d5 habanalabs: show correct id in error print
If the initialization of a device failed, the driver prints an error
message with the id of the device. The device index on the file system is
that id divided by 2.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:28 +03:00
Oded Gabbay
4c172bbfaa habanalabs: stop using the acronym KMD
We want to stop using the acronym KMD. Therefore, replace all locations
(except for register names we can't modify) where KMD is written to other
terms such as "Linux kernel driver" or "Host kernel driver", etc.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:27 +03:00
Oded Gabbay
0996bd1c74 habanalabs: display card name as sensors header
To allow the user to use a custom file for the HWMON lm-sensors library
per card type, the driver needs to register the HWMON sensors with the
specific card type name.

The card name is supplied by the F/W running on the device. If the F/W is
old and doesn't supply a card name, a default card name is displayed as
the sensors group name.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:27 +03:00
Oded Gabbay
e9730763a2 habanalabs: add uapi to retrieve aggregate H/W events
Add a new opcode to INFO IOCTL to retrieve aggregate H/W events. i.e. the
events counters are NOT cleared upon device reset, but count from the
loading of the driver.

Add the code to support it in the device event handling function.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:27 +03:00
Oded Gabbay
75b3cb2bb0 habanalabs: add uapi to retrieve device utilization
Users and sysadmins usually want to know what is the device utilization as
a level 0 indication if they are efficiently using the device.

Add a new opcode to the INFO IOCTL that will return the device utilization
over the last period of 100-1000ms. The return value is 0-100,
representing as percentage the total utilization rate.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:27 +03:00
Tomer Tayar
413cf576fd habanalabs: Make the Coresight timestamp perpetual
The Coresight timestamp is enabled for a specific debug session using
the HL_DEBUG_OP_TIMESTAMP opcode of the debug IOCTL.
In order to have a perpetual timestamp that would be comparable between
various debug sessions, this patch moves the timestamp enablement to be
part of the HW initialization.
The HL_DEBUG_OP_TIMESTAMP opcode turns to be deprecated and shouldn't be
used. Old user-space that will call it won't see any change in the
behavior of the debug session.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Dotan Barak
4fd2cb15cd habanalabs: explicitly set the queue-id enumerated numbers
When looking at kernel log messages and when debugging user applications,
we only see the queue id. This patch explicitly set the queue id in the
queue enumeration which will be helpful for finding the queue name when we
have its id.

Signed-off-by: Dotan Barak <dbarak@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Oded Gabbay
867b58ac94 habanalabs: print to kernel log when reset is finished
Now that we don't print the queue testing messages, we need to print when
the reset is finished so whoever looks at the kernel log will know the
reset process was finished successfully and the driver is not stuck.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:27 +03:00
Oded Gabbay
fe9a52c97f habanalabs: replace __le32_to_cpu with le32_to_cpu
In some files the driver uses __le32_to_cpu while in other it uses
le32_to_cpu. Replace all __le32_to_cpu instances with le32_to_cpu for
consistency.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Oded Gabbay
abca3a8224 habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64
In some files the code use __cpu_to_le32/64 while in other it use
cpu_to_le32/64. Replace all __cpu_to_le32/64 instances with
cpu_to_le32/64 for consistency.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Tomer Tayar
129b6a9324 habanalabs: Handle HW_IP_INFO if device disabled or in reset
The HW IP information is relevant even if the device is disabled or in
reset, so always handle the corresponding INFO IOCTL opcode.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Tomer Tayar
ea451f88ef habanalabs: Expose devices after initialization is done
The char devices are currently exposed to user before the device and
driver initialization are done.
This patch moves the cdev and device adding to the system to the end of
the initialization sequence, while keeping the creation of the
structures at the beginning to allow the usage of dev_*().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Omer Shpigelman
9b50f539ff habanalabs: improve security in Debug IOCTL
This patch improves the security in the Debug IOCTL.
It adds checks that:
- The register index value is in the allowed range for all opcodes.
- The event types number is in the allowed range in SPMU enable.
- The events number is in the allowed range in SPMU disable.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Omer Shpigelman
8d1759329d habanalabs: use default structure for user input in Debug IOCTL
This patch fixes a possible kernel crash when a user provides a too small
input structure to the Debug IOCTL.
The fix sets a default input structure and copies to it the user data.
In case the user provided as input a too small structure, the code will
use the default values taken from the default structure.
Note that in contrary to the input structure, the user can provide an
output structure with changing size or no size at all. Therefore the user
output structure validation is already done in the Debug logic later on.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:27 +03:00
Tomer Tayar
10d7de2cdb habanalabs: Add descriptive name to PSOC app status register
Add a meaningful name to the general PSOC application status register
which better describes its usage in keeping the HW state.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:26 +03:00
Tomer Tayar
4095a17657 habanalabs: Add descriptive names to PSOC scratch-pad registers
The PSOC scratch-pad registers are used for communication with the
device CPU. This patch adds new definitions for these registers which
are more descriptive than their general names.

The new set of definitions also gathers and documents the current usage
of the scratch-pad registers by the driver and the device CPU.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:26 +03:00
Oded Gabbay
4d6a7751f6 habanalabs: create two char devices per ASIC
This patch changes the driver to create two char devices for each ASIC
it discovers. This is done to allow system/monitoring applications to
query the device for stats, information, idle state and more, while also
allowing the deep-learning application to send work to the ASIC.

One char device is the original device, hlX. IOCTL calls through this
device file can perform any task on the device (compute, memory, queries).
The open function for this device will fail if it was called before but
the file-descriptor it created was not completely released yet (the
release callback function is not called from the kernel until all
instances of that FD are closed). The driver needs to keep this behavior
to support backward compatibility with existing userspace, which count
that the open will fail if the device is "occupied".

The second char device is called "hl_controlDx", where x is the same index
of the main device with a minor number of the original char device + 1.
Applications that open this device can only call the INFO IOCTL. There is
no limitation on the number of applications opening this device.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Oded Gabbay
b968eb1a84 habanalabs: change device_setup_cdev() to be more generic
This patch re-factors the device_setup_cdev() function to make it more
generic. It doesn't manipulate members of the driver's internal device
structure but instead works only on the arguments that are sent to it.

This is in preparation for using this function to create an additional
char device per ASIC.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Oded Gabbay
eb7caf84b0 habanalabs: maintain a list of file private data objects
This patch adds a new list to the driver's device structure. The list will
keep the file private data structures that the driver creates when a user
process opens the device.

This change is needed because it is useless to try to count how many FD
are open. Instead, track our own private data structure per open file and
once it is released, remove it from the list. As long as the list is not
empty, it means we have a user that can do something with our device.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Oded Gabbay
86d5307a6d habanalabs: rename user_ctx as compute_ctx
This patch renames the "user_ctx" field in the device structure to
"compute_ctx". This better reflects the meaning of this context.

In addition, we also check in the ctx_fini() that the debug mode should be
disabled only if the context being destroyed is the compute context. This
has no effect right now as we only have a single process and a single
context, but this makes the code more ready for multiple process support.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Oded Gabbay
02e921e42b habanalabs: show the process context dram usage
When the user query the dram usage of a context, show it the dram usage of
its context, not the user context that is currently running on the device.

This has no effect right now as we only have a single process and a single
context, but this makes the code more ready for multiple process support.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Oded Gabbay
4aecb05e52 habanalabs: kill user process after CS rollback
This patch calls the kill user process function after we rollback the
in-flight CSs. This is because the user process can't be closed while
there are open CSs. Therefore, there is no point of sending it a SIGKILL
before we do the rollback CS part.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Oded Gabbay
b888751a02 habanalabs: add handle field to context structure
This patch adds a field to the context's structure that will hold a unique
handle for the context.

This will be needed when the user will create the context.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05 14:55:26 +03:00
Chuhong Yuan
30f273222c habanalabs: Use dev_get_drvdata
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05 14:55:26 +03:00
Oded Gabbay
209257feab habanalabs: power management through sysfs is only for GOYA
The ability of setting power management properties by the system
administrator (through sysfs properties) is only relevant for the GOYA
ASIC. Therefore, move the relevant sysfs properties to the GOYA sysfs
specific file, to make the properties appear in sysfs only for GOYA cards.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:26 +03:00
Oded Gabbay
ed0fc50535 habanalabs: cap simulator timeout
In the driver timeout functions, we give the simulator a factor of 10
in the timeout. This was necessary when the requested timeout is small
but if it was a few seconds, this can result in a very large timeout which
is unnecessary.

This patch caps the maximum timeout of the simulator to 10 seconds, which
is our largest timeout in the code. That is more then enough for anything
the simulator is doing.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05 14:55:26 +03:00