947024 commits

Author SHA1 Message Date
Moti Haimovski
a9855a2d91 habanalabs: check for DMA errors when clearing memory
In GAUDI we use QMAN0 DMA for clearing the MMU memory region
at initialization. If this operation fails, it leaves the DMA in an error
state, and then when trying to initialize QMAN0 we fail and erroneously
assume it's the QMAN that failed.

This commit adds a check and clear of such DMA errors at initialization so
we will have a better understanding of what went wrong.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Ofir Bitton
22cb855598 habanalabs: verify queue can contain all cs jobs
In order for the user to be aware of wrong inputs, we must return
an error if the number of jobs per cs exceeds the corresponding
queue size.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Ofir Bitton
5574cb2194 habanalabs: Assign each CQ with its own work queue
We identified a possible race during job completion when working
with a single multi-threaded work queue. In order to overcome this
race we suggest using a single threaded work queue per completion
queue, which guarantees that jobs complete in order.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Oded Gabbay
c83c417193 habanalabs: halt device CPU only upon certain reset
Currently the driver halts the device CPU in the halt engines function,
which halts all the engines of the ASIC. The problem is that if later on we
stop the reset process (due to inability to clean memory mappings in time),
the CPU will remain in halt mode. This creates many issues, such as
thermal/power control and FLR handling.

Therefore, move the halting of the device CPU to the very end of the reset
process, just before writing to the registers to initiate the reset. In
addition, the driver now needs to send a message to the device F/W to
disable it from sending interrupts to the host machine because during halt
engines function the driver disables the MSI/MSI-X interrupts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:36 +03:00
Omer Shpigelman
9158c47e20 habanalabs: remove unused hash
Remove an old hash that is not in use anymore.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Ofir Bitton
79b1894c41 habanalabs: use queue pi/ci in order to determine queue occupancy
Instead of using the free slots amount on the compute CQ to determine
whether we can submit work to queues, use the queues pi/ci.

This is needed in future ASICs where we don't have CQ per queue.
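
A minimal sketch of deriving occupancy from pi/ci alone, assuming
free-running 32-bit producer/consumer counters. The struct and function
names are hypothetical and are not taken from the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue state; field names are illustrative only. */
struct hw_queue_sketch {
	uint32_t pi;     /* producer index: entries submitted so far (free-running) */
	uint32_t ci;     /* consumer index: entries completed so far (free-running) */
	uint32_t length; /* number of slots in the queue */
};

/*
 * Free slots computed only from pi/ci. Unsigned subtraction keeps the
 * result correct when the counters wrap around.
 */
static uint32_t queue_free_slots(const struct hw_queue_sketch *q)
{
	uint32_t in_flight = q->pi - q->ci;

	return q->length - in_flight;
}

/* True if num_entries more entries fit without overrunning the queue. */
static bool queue_can_accept(const struct hw_queue_sketch *q,
			     uint32_t num_entries)
{
	return num_entries <= queue_free_slots(q);
}
```

With this shape no completion queue is consulted at all, which is the
property the commit message asks for.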

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Ofir Bitton
3abc99bb7d habanalabs: configure maximum queues per asic
Currently the maximum number of queues is statically configured.
Using a static value causes redundant cycles when traversing
all queues and consumes more memory than actually needed.
In this patch we configure each asic with the exact number of
queues needed.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Oded Gabbay
12ae3133d2 habanalabs: remove soft-reset support from GAUDI
Soft-reset isn't supported in GAUDI. Remove the code that performs it and
print an error in case the user tries to do it via sysfs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:36 +03:00
Ofir Bitton
f4cbfd2445 habanalabs: PCIe iATU refactoring
Divide iATU initialization into inbound/outbound methods.
We must separate it in order to enable different match mode
per PCIe region.
In addition, added support for PCI address match mode.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Oded Gabbay
fcc6a4e606 habanalabs: Extract ECC information from FW
ECC (Error Correcting Code) interrupts are going to be handled
by the FW. Hence, we define an interface in which the driver can
obtain the relevant ECC information.
This information is needed for monitoring and can also lead
to a hard reset if ECC error is not correctable.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Ofir Bitton
db491e4f08 habanalabs: Add dropped cs statistics info struct
Add command submission statistics structure which can be obtained
through the info ioctl. Each drop counter describes the reason for
which the command submission was dropped.
This information is needed for the user to be aware of the specific
reason for which the submitted work was dropped. The user can then
utilize the driver more efficiently.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Christine Gharzuzi
c8f9b49d2d habanalabs: extract cpu boot status lookup
Extract detection of the cpu boot status to a function
to allow code reuse

Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Oded Gabbay
0eab4f89d6 habanalabs: rephrase error messages
Rephrase some error/warning/notice messages to make them more accessible to
ordinary users.

There is no need to print context ASID as the driver currently doesn't
support multiple contexts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:35 +03:00
Ofir Bitton
dd9efabd0a habanalabs: Increase queues depth
After recent concurrent cs amount increase, we must also
increase queues depth since much more concurrent work can be done.
All external queue depths were increased to 4096 as gaudi's
internal queue depths were also increased to 1024.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Omer Shpigelman
917b79b096 habanalabs: rephrase error message
Rephrase F/W error message to make it more understandable to ordinary
users.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Adam Aharon
e8edded693 habanalabs: calculate trace frequency from PLL
The profiler needs to know the PLL values for correctly showing the
profiling data. Because our firmware can use different PLL configurations,
we need to read the PLL values from the ASIC to pass them to the profiler.

Signed-off-by: Adam Aharon <aaharon@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Oded Gabbay
6ced91170d habanalabs: align armcp_packet structure to 8 bytes
Once there is a 64-bit field in a structure, GCC compiler for ARM aligns
the structure to 8 bytes. In order to avoid confusion when these
structures are being passed between CPUs from different architectures, we
explicitly align the structure to 8 bytes.
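
A small sketch of the idea, assuming GCC-style attributes. The struct is
hypothetical and is not the real armcp_packet layout. On some 32-bit ABIs
(i386, for instance) a 64-bit member only requires 4-byte alignment, so
without the attribute the same declaration can be 12 bytes on one side
and 16 on the other:

```c
#include <stdint.h>

/* Hypothetical packet; not the real armcp_packet definition. */
struct packet_sketch {
	uint64_t value; /* the 64-bit field that triggers 8-byte alignment on ARM */
	uint32_t ctl;
} __attribute__((aligned(8)));

/* Fail the build if either side of the link would see a different layout. */
_Static_assert(_Alignof(struct packet_sketch) == 8,
	       "packet must be 8-byte aligned");
_Static_assert(sizeof(struct packet_sketch) == 16,
	       "packet size must not depend on the ABI");
```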

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Oded Gabbay
3bf1c021e3 uapi/habanalabs: fix some comments
MAP/UNMAP are done also for device memory.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Ofir Bitton
6c07bab34b habanalabs: Use mask instead of shift in sync stream registers
Use proper bitfield masks instead of shifting values when configuring
packets sent to device.
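
To illustrate the difference, here is a user-space sketch. The helper
below is hypothetical; in kernel code FIELD_PREP() from
<linux/bitfield.h> plays this role. A bare shift lets an oversized value
spill into the neighbouring field, while a mask confines it:

```c
#include <stdint.h>

/* Hypothetical 8-bit field occupying bits 15:8 of a packet word. */
#define SKETCH_FIELD_SHIFT 8
#define SKETCH_FIELD_MASK  0x0000FF00u

/*
 * Place val into the field described by mask. (mask & (~mask + 1)) is the
 * lowest set bit of the mask, so multiplying by it shifts val into place;
 * the final AND drops any bits that do not belong to the field.
 */
static uint32_t field_prep_sketch(uint32_t mask, uint32_t val)
{
	return (val * (mask & (~mask + 1u))) & mask;
}

/* The shift-only variant, kept for comparison: nothing bounds the value. */
static uint32_t shift_only_sketch(uint32_t val)
{
	return val << SKETCH_FIELD_SHIFT;
}
```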

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Ofir Bitton
21e7a34634 habanalabs: sync stream generic functionality
Currently sync stream is limited to external queues. We want to
remove this constraint by adding a new queue property dedicated for sync
stream. In addition we move the initialization and reset methods to the
common code since we can re-use them with slight changes.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Ofir Bitton
c16d45f42b habanalabs: Use pending CS amount per ASIC
Training schemes require much more concurrent command submissions than
inference does. In addition, training command submissions can be completed
in a non serialized manner. Hence, we add support in which each ASIC will
be able to configure the amount of concurrent pending command submissions,
rather than use a predefined amount. This change will enhance performance
by allowing the user to add more concurrent work without waiting for the
previous work to be completed.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Oded Gabbay
0b168c8f1d habanalabs: remove rate limiters from GAUDI
We no longer need to initialize the rate limiters in GAUDI A1.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Thierry Reding
82aa68afa1 thermal: core: Fix thermal zone lookup by ID
When a thermal zone is looked up by an ID and no zone is found matching
that ID, the thermal_zone_get_by_id() function will return a pointer to
the thermal zone list head which isn't actually a valid thermal zone.

This can lead to a subsequent crash because a valid pointer is returned
to the caller, but dereferencing that pointer as struct thermal_zone is
not safe.
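
The pattern can be sketched in user space as follows. This is not the
kernel code: the list type, struct and function names are hypothetical
stand-ins. The point is that the loop cursor must not be returned
directly, because after a full traversal it refers to the list head
rather than to an element; a separate result pointer that starts as NULL
avoids that:

```c
#include <stddef.h>

/* Minimal circular list with a sentinel head, standing in for list_head. */
struct list_node {
	struct list_node *next;
};

/* Hypothetical zone object embedding a list node. */
struct zone_sketch {
	int id;
	struct list_node node;
};

/* Recover the containing zone from its embedded node. */
#define zone_of(ptr) \
	((struct zone_sketch *)((char *)(ptr) - offsetof(struct zone_sketch, node)))

/* Return the zone with the given id, or NULL if no zone matches. */
static struct zone_sketch *zone_get_by_id(struct list_node *head, int id)
{
	struct zone_sketch *match = NULL;
	struct list_node *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct zone_sketch *tz = zone_of(pos);

		if (tz->id == id) {
			match = tz;
			break;
		}
	}

	/* Only ever a real element or NULL, never the head container. */
	return match;
}
```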

Fixes: 329b064fbd ("thermal: core: Get thermal zone by id")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200724170105.2705467-1-thierry.reding@gmail.com
2020-07-24 19:11:47 +02:00
Andrei Vagin
9614cc576d arm64: enable time namespace support
CONFIG_TIME_NS depends on GENERIC_VDSO_TIME_NS.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200624083321.144975-7-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-24 18:06:52 +01:00
Andrei Vagin
bcf9964342 arm64/vdso: Restrict splitting VVAR VMA
Forbid splitting VVAR VMA resulting in a stricter ABI and reducing the
amount of corner-cases to consider while working further on VDSO time
namespace support.

As the offset from timens to VVAR page is computed at compile time, the pages
in VVAR should stay together and not be partially mremap()'ed.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200624083321.144975-6-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-24 18:06:52 +01:00
Andrei Vagin
ee3cda8e46 arm64/vdso: Handle faults on timens page
If a task belongs to a time namespace then the VVAR page which contains
the system wide VDSO data is replaced with a namespace specific page
which has the same layout as the VVAR page.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200624083321.144975-5-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-24 18:06:45 +01:00
Tzung-Bi Shih
3aecfc72d7 ASoC: dapm: don't call pm_runtime_* on card device
runtime_usage of sound card has been observed to grow without bound.
For example:
$ cat /sys/devices/platform/sound/power/runtime_usage
46
$ sox -n -t s16 -r 48000 -c 2 - synth 1 sine 440 vol 0.1 | \
  aplay -q -D hw:0,0 -f S16_LE -r 48000 -c 2
$ cat /sys/devices/platform/sound/power/runtime_usage
52

Commit 4e872a4682 ("ASoC: dapm: Don't force card bias level to be
updated") stopped forcing the bias_level update on the card.  If the card
doesn't provide a set_bias_level callback, snd_soc_dapm_set_bias_level()
is equivalent to a NOP for the card device.

As a result, dapm_pre_sequence_async() doesn't change the bias_level of
card device correctly.  Thus, pm_runtime_get_sync() would be called in
dapm_pre_sequence_async() without symmetric pm_runtime_put() in
dapm_post_sequence_async().

Don't call pm_runtime_* on card device.

Signed-off-by: Tzung-Bi Shih <tzungbi@google.com>
Link: https://lore.kernel.org/r/20200724070731.451377-1-tzungbi@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-24 17:27:53 +01:00
Armas Spann
293a92c1d9 ALSA: hda/realtek: typo_fix: enable headset mic of ASUS ROG Zephyrus G14(GA401) series with ALC289
This patch fixes a small typo I accidentally submitted with the initial patch. The board should be named GA401 not G401.

Fixes: ff53664daf ("ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G14(G401) series with ALC289")
Signed-off-by: Armas Spann <zappel@retarded.farm>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200724140837.302763-1-zappel@retarded.farm
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-07-24 18:25:22 +02:00
Armas Spann
4b43d05a19 ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G15(GA502) series with ALC289
This patch adds support for headset mic to the ASUS ROG Zephyrus
G15(GA502) notebook series by adding the corresponding
vendor/pci_device id, as well as adding a new fixup for the used
realtek ALC289. The fixup sets the correct pin to get the headset mic
correctly recognized on audio-jack.

Signed-off-by: Armas Spann <zappel@retarded.farm>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200724140616.298892-1-zappel@retarded.farm
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-07-24 18:21:31 +02:00
Greg Kroah-Hartman
54918b8ed1 interconnect changes for 5.9
Here are the interconnect changes for the 5.9-rc1 merge window
 consisting mostly of changes that give the core more flexibility
 in order to support some new provider drivers.
 
 Core changes:
 - Export of_icc_get_from_provider()
 - Relax requirement in of_icc_get_from_provider()
 - Allow inter-provider pairs to be configured
 - Mark all dummy functions as static inline
 
 Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>

Merge tag 'icc-5.9-rc1' of https://git.linaro.org/people/georgi.djakov/linux into char-misc-next

* tag 'icc-5.9-rc1' of https://git.linaro.org/people/georgi.djakov/linux:
  interconnect: Mark all dummy functions as static inline
  interconnect: Allow inter-provider pairs to be configured
  interconnect: Relax requirement in of_icc_get_from_provider()
  interconnect: Export of_icc_get_from_provider()
2020-07-24 17:24:09 +02:00
Jim Cromie
4c0d77828d dyndbg: export ddebug_exec_queries
Export ddebug_exec_queries() for use by modules.

This will allow module authors to control all their *pr_debug*s
dynamically.  And since ddebug_exec_queries() is what implements
"echo $query >control", it gives the same per-callsite control.

Virtues of this:
- simplicity. just an export.
- full control over any/all subsets of callsites.
- same "query/command-string" in code and console
- full callsite selectivity with module file line format

Format in particular deserves special attention; it is where
low-hanging fruit will be found.

Consider: drivers/gpu/drm/amd/display/include/logger_types.h:

  #define DC_LOG_SURFACE(...)          pr_debug("[SURFACE]:"__VA_ARGS__)
  #define DC_LOG_HW_LINK_TRAINING(...) pr_debug("[HW_LINK_TRAINING]:"__VA_ARGS__)
  .. 9 more ..

That's 11 string prefixes, used in 804 places in drivers/gpu/**
Clearly this is a systematized classification of those callsites.
And one I'd expect to see repeated often.

Using ddebug_exec_queries(), authors can select on those prefixes
as a unitary set, equivalent to:

    echo "module=MODULE_NAME format=^[SURFACE]: +p" >control

Trivially, those sets can be subsected with the other query terms too,
say file=foo, should the author see fit.

Perhaps as important, users can modify the set of enabled callsites,
presumably to aid debugging by enabling helpful debug callsites, and
disabling those that just clutter the info.

Authors could even alter [fmlt] flags, though I don't see a good reason
why they would.  Perhaps harnessed by bug-logging automation to get
fuller, or more minimal bug-reports.

DRM

drm has both drm.debug, which defines 32 categories of drm_printk
logging, and entirely separate uses of pr_debug, which are dynamic on
this i915 laptop, running mainline.  So I can observe and report on
both.

The i915 driver has 118 dyndbg callsites, with following
"classifications" defined in drivers/gpu/drm/i915/gvt/**

$ grep 915 /proc/dynamic_debug/control | cut -d= -f2 | cut -d: -f1,2 | sort -u
_ "gvt: cmd
_ "gvt: core
_ "gvt: dpy
_ "gvt: el
_ "gvt: irq
_ "gvt: mm
_ "gvt: mmio
_ "gvt: render
_ "gvt: sched
_ "%s for root hub!\012"
_ "Vendor defined info completion code %u\012"

This classification is entirely out-of-band for control by drm.debug,
and is only available to root user at the console.  But module authors
can activate them with ddebug_exec_queries(sprintf("format=^%s +p")),
and then decide how to expose the groups to the user for max utility.

drm.debug

drm.debug has 32 bit-flags, and matching enum drm_debug_category
values to classify the ~2943 DRM_DEBUG*() callsites in drivers/gpu

The drm.debug callback could invoke ddebug_exec_queries() with 32
different hardcoded query strings, needing only (bit) ? " +p" : " -p"
added.

I briefly enabled drm.debug=0xff on my i915 laptop, which yielded
these unique prefixes: (dmesg | cut -c17- | cut -d\] -f1 | sort -u)

[drm:drm_atomic_check_only [drm
[drm:drm_atomic_get_crtc_state [drm
[drm:drm_atomic_get_plane_state [drm
[drm:drm_atomic_nonblocking_commit [drm
[drm:drm_atomic_set_fb_for_plane [drm
[drm:drm_atomic_state_default_clear [drm
[drm:__drm_atomic_state_free [drm
[drm:drm_atomic_state_init [drm
[drm:drm_crtc_vblank_helper_get_vblank_timestamp_internal [drm
[drm:drm_handle_vblank [drm
[drm:drm_ioctl [drm
[drm:drm_mode_addfb2 [drm
[drm:drm_mode_object_get [drm
[drm:drm_mode_object_put.part.0 [drm
[drm:drm_update_vblank_count [drm
[drm:drm_vblank_enable [drm
[drm:drm_vblank_restore [drm
[drm:vblank_disable_fn [drm
i915 0000:00:02.0: [drm:gen9_set_dc_state [i915
i915 0000:00:02.0: [drm:intel_atomic_get_global_obj_state [i915
i915 0000:00:02.0: [drm:__intel_display_power_get_domain.part.0 [i915
i915 0000:00:02.0: [drm:__intel_display_power_put_domain [i915
i915 0000:00:02.0: [drm:intel_plane_atomic_calc_changes [i915
i915 0000:00:02.0: [drm:skl_enable_dc6 [i915

Several good format=^prefixes are apparent there, and some misses.

 ^[drm:drm_atomic_	# misses: [drm:__drm_atomic_state_free [drm
 ^[drm:drm_ioctl
 ^[drm:drm_mode
 ^[drm:drm_vblank_	# misses: [drm:drm_update_vblank_count & [drm:vblank_disable_fn

It's not a perfect 1:1 single format-match per class, but the misses
above can be covered with 1 & 2 additional queries, which can be
concatenated together with ";" separators and submitted with 1 call.

Benefits:

For drm, adapting DRM_DEBUG to use dynamic-debug inside could
replicate (and thereby obsolete) lots of bit-checking in current
DRM_DEBUG callsites, at least with JUMP_LABEL optimized code.
ddebug_exec_queries() and a handful of fixed query-strings can select
and thereby control the already classified callsites.

With the classes mapped to queries, the enum type and parameter can be
eliminated (folded away with macro magic), at least for DYNAMIC_DEBUG
& JUMP_LABEL builds.

Is it safe ?

ddebug_exec_queries() is currently exposed to user space in
several limited ways;

1 it is called from module-load callback, where it implements the
  $modname.dyndbg=+p "fake" parameter provided to all modules.

2 it handles query input via >control directly

IOW, it is "fully" exposed to local root user; exposing the same
functionality to other kernel modules is no additional risk.

The other standard issue to check is locking:

dyndbg has a single mutex, taken by ddebug_change to handle >control,
and by ddebug_proc_(start|stop) to span `cat control`.  Queries
submitted via export will typically have module specified, which
dramatically cuts the scan by ddebug_change vs "module=* +p".
ISTM this proposed export presents no locking problems.

TLDR;

It would be interesting to see how drm.dyndbg=$QUERY and
drm.debug=$HEXY would interact; it might be order dependent, as
if given as modprobe args or in /etc/modprobe.d/

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-19-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:10 +02:00
Jim Cromie
5aa9ffbbae dyndbg: shorten our logging prefix, drop __func__
For log-message output, reduce column space consumed by current
pr_fmt by dropping __func__ and shortening "dynamic_debug" to
"dyndbg".  This improves readability on narrow consoles, and better
matches other kernel boot info messages.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-18-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:10 +02:00
Jim Cromie
4b334484fa dyndbg: allow anchored match on format query term
This should work:

  echo module=amd* format=^[IF_TRACE]: +p  >/proc/dynamic_debug/control

consider drivers/gpu/drm/amd/display/include/logger_types.h:
It has 11 defines like:

  #define DC_LOG_IF_TRACE(...) pr_debug("[IF_TRACE]:"__VA_ARGS__)

These defines are used 804 times at recent count; they are a good use
case to evaluate existing format-message based classifications of
*pr_debug*.  Those macros prefix the supplied format with a fixed
string, I'd expect most existing message classification schemes to do
something similar.

Hence we want to be able to anchor our match to the beginning of the
format string, allowing easy construction of clear and precise
queries, leveraging the existing classification scheme to enable and
disable those callsites.

Note that unlike other search terms, formats are implicitly floating
substring matches, without the need for explicit wildcards.

This makes no attempt at wider regex features, just the one we need.
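
The matching rule described above can be sketched like this. The function
name is hypothetical and the code is a user-space illustration, not the
dyndbg implementation: a query starting with '^' must match at the start
of the format string, anything else is a floating substring match.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the callsite's format string satisfies the query.
 *  - "^prefix": the format must begin with "prefix"
 *  - "text":    "text" may occur anywhere in the format
 */
static bool format_matches(const char *format, const char *query)
{
	if (query[0] == '^') {
		const char *prefix = query + 1;

		return strncmp(format, prefix, strlen(prefix)) == 0;
	}

	return strstr(format, query) != NULL;
}
```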

TLDR: Using the anchor also means the []s are less helpful for
disambiguating the prefix from a random in-message occurrence, allowing
shorter prefixes.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-17-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:10 +02:00
Jim Cromie
84da83a6ff dyndbg: combine flags & mask into a struct, simplify with it
flags & mask are used together everywhere, and are passed around
together between multiple functions; they belong together in a struct,
call that struct flag_settings.

Use struct flag_settings to rework 3 functions:
 - ddebug_exec_query - declares query and flag-settings,
   		     calls other 2, passing flags
 - ddebug_parse_flags - fills flag_settings and returns
 - ddebug_change - test all callsites against query,
   		   modify passing sites.

benefits:
 - bit-banging always needs flags & mask, best together.
 - simpler function signatures
 - 1 less parameter, less stack overhead

no functional changes

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-16-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
14775b0496 dyndbg: accept query terms like file=bar and module=foo
Current code expects "keyword" "arg" as 2 words, space separated.
Change to also accept "keyword=arg" form as well, and drop !(nwords%2)
requirement.  Then in rest of function, use new keyword, arg variables
instead of word[i], word[i+1]
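
A sketch of the splitting step, under the assumption that each word is a
writable NUL-terminated string. The helper name is hypothetical and is
not the one used in lib/dynamic_debug.c:

```c
#include <string.h>

/*
 * Split "keyword=arg" in place.
 * Returns 1 and sets *keyword / *arg when the word contains '='.
 * Returns 0 and sets *arg to NULL otherwise, in which case the caller
 * falls back to the two-word form and takes the next word as the argument.
 */
static int split_keyword_arg(char *word, char **keyword, char **arg)
{
	char *eq = strchr(word, '=');

	*keyword = word;
	if (!eq) {
		*arg = NULL;
		return 0;
	}

	*eq = '\0';    /* terminate the keyword */
	*arg = eq + 1; /* the argument starts right after '=' */
	return 1;
}
```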

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-15-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
aaebe329bf dyndbg: accept 'file foo.c:func1' and 'file foo.c:10-100'
Accept these additional query forms:

   echo "file $filestr +_" > control

       path/to/file.c:100	# as from control, column 1
       path/to/file.c:1-100	# or any legal line-range
       path/to/file.c:func_A	# as from an editor/browser
       path/to/file.c:drm_*	# wildcards still work
       path/to/file.c:*_foo	# lead wildcard too

1st 2 examples are treated as line-ranges, 3-5 are treated as func's

Doc these changes, and sprinkle in a few extra wild-card examples and
trailing # explanation texts.
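
One way to tell the two tail forms apart is sketched below, assuming the
rule implied by the examples (a tail starting with a digit is a line or
line-range, anything else is a function pattern). The names are
hypothetical and this is not the kernel parser:

```c
#include <ctype.h>
#include <string.h>

enum tail_kind {
	TAIL_NONE,  /* plain "file.c" */
	TAIL_LINES, /* "file.c:100" or "file.c:1-100" */
	TAIL_FUNC   /* "file.c:func_A", "file.c:drm_*", "file.c:*_foo" */
};

/*
 * Split "path:tail" in place at the first ':' and classify the tail.
 * On return filestr holds only the path and *tail points at the rest
 * (or is NULL when there was no ':').
 */
static enum tail_kind split_file_tail(char *filestr, char **tail)
{
	char *colon = strchr(filestr, ':');

	*tail = NULL;
	if (!colon)
		return TAIL_NONE;

	*colon = '\0';
	*tail = colon + 1;

	return isdigit((unsigned char)**tail) ? TAIL_LINES : TAIL_FUNC;
}
```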

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-14-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
8037072d81 dyndbg: refactor parse_linerange out of ddebug_parse_query
Make the code-block reusable to later handle "file foo.c:101-200" etc.
This is a 99% code move, with reindent, function wrap&call, +pr_debug.

no functional changes.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-13-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
f62fc08fdc dyndbg: use gcc ?: to reduce word count
reduce word count via gcc ?: extension, no actual code change.
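
For readers unfamiliar with the extension: "a ?: b" is a GNU C shorthand
for "a ? a : b" in which a is evaluated only once. A tiny illustration
(the function is hypothetical, not taken from dyndbg):

```c
/*
 * Return name if it is non-NULL, otherwise a placeholder.
 * Equivalent to: name ? name : "(none)"
 */
static const char *name_or_default(const char *name)
{
	return name ?: "(none)";
}
```

This compiles with GCC and Clang; strictly conforming ISO C compilers may
reject it.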

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-12-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
47e9f5a823 dyndbg: make ddebug_tables list LIFO for add/remove_module
loadable modules are the last in on this list, and are the only
modules that could be removed.  ddebug_remove_module() searches from
head, but ddebug_add_module() uses list_add_tail().  Change it to
list_add() for a micro-optimization.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-11-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
9c9d0acbe2 dyndbg: prefer declarative init in caller, to memset in callee
ddebug_exec_query declares an auto var, and passes it to
ddebug_parse_query, which memsets it before using it.  Drop that
memset, instead initialize the variable in the caller; let the
compiler decide how to do it.
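
The two styles side by side, as a hedged sketch with a hypothetical struct
(not the real struct ddebug_query):

```c
#include <string.h>

/* Hypothetical query object, for illustration only. */
struct query_sketch {
	const char *filename;
	const char *module;
	unsigned int first_lineno;
	unsigned int last_lineno;
};

/* Old style: the callee clears the caller's variable at run time. */
static void parse_with_memset(struct query_sketch *q)
{
	memset(q, 0, sizeof(*q));
	/* ... parsing would fill in q here ... */
}

/* New style: the caller declares the variable already zeroed. */
static struct query_sketch make_query(void)
{
	struct query_sketch q = { 0 }; /* every member starts out zero/NULL */

	/* ... parsing would fill in q here ... */
	return q;
}
```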

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-10-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:09 +02:00
Jim Cromie
0b8f96be9b dyndbg: fix pr_err with empty string
this pr_err attempts to print the string after the OP, but the string
has been parsed and chopped up, so looks empty.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-9-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
f678ce8cc3 dyndbg: fix a BUG_ON in ddebug_describe_flags
ddebug_describe_flags() currently fills a caller provided string buffer,
after testing its size (also passed) in a BUG_ON.  Fix this by
replacing them with a known-big-enough string buffer wrapped in a
struct, and passing that instead.

Also simplify ddebug_describe_flags() flags parameter from a struct to
a member in that struct, and hoist the member deref up to the caller.
This makes the function reusable (soon) where flags are unpacked.
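
The buffer-in-a-struct idea can be sketched as follows. The flag bit
values and all names here are illustrative assumptions, not copied from
the kernel source. Because the buffer is sized from the flag table at
compile time, no run-time size check (and hence no BUG_ON) is needed:

```c
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative flag table; the bit assignments are assumptions. */
static const struct {
	unsigned int flag;
	char ch;
} flag_chars[] = {
	{ 1u << 0, 'p' },
	{ 1u << 1, 'm' },
	{ 1u << 2, 'f' },
	{ 1u << 3, 'l' },
	{ 1u << 4, 't' },
};

/* Known-big-enough buffer: one char per possible flag plus the NUL. */
struct flagsbuf_sketch {
	char buf[ARRAY_LEN(flag_chars) + 1];
};

/* Render flags as a short string such as "pm", or "_" when none are set. */
static char *describe_flags(unsigned int flags, struct flagsbuf_sketch *fb)
{
	char *p = fb->buf;
	size_t i;

	for (i = 0; i < ARRAY_LEN(flag_chars); i++)
		if (flags & flag_chars[i].flag)
			*p++ = flag_chars[i].ch;

	if (p == fb->buf)
		*p++ = '_';
	*p = '\0';

	return fb->buf;
}
```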

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-8-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
81d0c2c609 dyndbg: fix overcounting of ram used by dyndbg
during dyndbg init, verbose logging prints its ram overhead.  It
counted strlens of struct _ddebug's 4 string members, in all callsite
entries, which would be approximately correct if each had been
malloc'd.  But they are pointers into shared .rodata; for example, all
10 kobject callsites have identical filename, module values.

It's best not to count that memory at all, since we cannot know they
were linked in because of CONFIG_DYNAMIC_DEBUG=y, and we want to
report a number that reflects what ram is saved by deconfiguring it.

Also fix wording and size under-reporting of the __dyndbg section.

Here's my overhead, on a virtme-run VM on a fedora-31 laptop:

  dynamic_debug:dynamic_debug_init: 260 modules, 2479 entries \
    and 10400 bytes in ddebug tables, 138824 bytes in __dyndbg section

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-7-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
e5ebffe18e dyndbg: rename __verbose section to __dyndbg
dyndbg populates its callsite info into __verbose section, change that
to a more specific and descriptive name, __dyndbg.

Also, per checkpatch:
  simplify __attribute(..) to __section(__dyndbg) declaration.

and 1 spelling fix, decriptor

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-6-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
481c0e33f1 dyndbg: refine debug verbosity; 1 is basic, 2 more chatty
The verbose/debug logging done for `cat $MNT/dynamic_debug/control` is
voluminous (2 per control file entry + 2 per PAGE).  Moreover, it just
prints pointer and sequence, which is not useful to a dyndbg user.
So just drop them.

Also require verbose>=2 for several other debug printks that are a bit
too chatty for typical needs:

ddebug_change() prints changes, once per modified callsite.  Since
queries like "+p" will enable ~2300 callsites in a typical laptop, a
user probably doesn't need to see them often.  ddebug_exec_queries()
still summarizes with verbose=1.

ddebug_(add|remove)_module() also print 1 line per action on a module,
not needed by typical modprobe user.

This leaves verbose=1 better focussed on the >control parsing process.
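
The level gating described above might look roughly like this in user-space C. It is a sketch under assumptions: the helper name vnpr_info(), the use of stderr, and the emitted counter (present only so the effect can be observed) are all hypothetical and not taken from the patch.

```c
#include <stdarg.h>
#include <stdio.h>

static int verbose = 1;	/* 0 = quiet, 1 = basic, 2 = chatty */
static int emitted;	/* demo-only count of messages actually printed */

/* Print the message only when the current verbosity reaches 'level'. */
static void vnpr_info(int level, const char *fmt, ...)
{
	va_list ap;

	if (verbose < level)
		return;
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	emitted++;
}
```

Per-callsite and per-module messages would then be issued at level 2, while the per-query summary stays at level 1.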

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-5-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
1ff838487d dyndbg: drop obsolete comment on ddebug_proc_open
commit 4bad78c550 ("lib/dynamic_debug.c: use seq_open_private() instead of seq_open()")

The commit was one of a tree-wide set which replaced open-coded
boilerplate with a single tail-call.  It therefore obsoleted the
comment about that boilerplate; clean that up now.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-4-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
fa08052070 dyndbg-docs: initialization is done early, not arch
Since cf964976484 in 2012, initialization is done with early_initcall;
update the Docs, which still say arch_initcall.

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Jim Cromie
e20e310c81 dyndbg-docs: eschew file /full/path query in docs
Regarding:
commit 2b6783191d ("dynamic_debug: add trim_prefix() to provide source-root relative paths")
commit a73619a845 ("kbuild: use -fmacro-prefix-map to make __FILE__ a relative path")

The 2nd commit broke dynamic-debug's "file $fullpath" query form, but
nobody noticed because the 1st commit had trimmed prefixes from
control-file output, so the click-copy-pasting of fullpaths into new
queries had ceased; that query form became unused.

Removing the function is cleanest, but it could be useful in
old-compiler corner cases, where __FILE__ still has /full/path,
and it safely does nothing otherwise.

So instead, quietly deprecate "file /full/path" query form, by
removing all /full/paths examples in the docs.  I skipped adding a
back-compat note.
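
For illustration, prefix trimming of this kind can be sketched in user-space C. This is a simplified stand-in, not the kernel helper: here the source root is passed in as a hypothetical src_root parameter, whereas the in-kernel function determines the prefix on its own.

```c
#include <string.h>

/* Return 'path' relative to 'src_root' when it starts with that prefix;
 * otherwise return 'path' unchanged, so relative paths pass through. */
static const char *trim_prefix(const char *path, const char *src_root)
{
	size_t skip = strlen(src_root);

	if (strncmp(path, src_root, skip) != 0)
		skip = 0;	/* prefix mismatch, don't skip anything */
	return path + skip;
}
```

When the compiler already emits source-root relative names, the prefix never matches and the function is a no-op, which is the "safely does nothing" case mentioned above.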

Acked-by: <jbaron@akamai.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20200719231058.1586423-2-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-24 17:00:08 +02:00
Ashok Raj
3f9a7a13fe PCI/ATS: Add pci_pri_supported() to check device or associated PF
For SR-IOV, the PF PRI is shared between the PF and any associated VFs, and
the PRI Capability is allowed for PFs but not for VFs.  Searching for the
PRI Capability on a VF always fails, even if its associated PF supports
PRI.

Add pci_pri_supported() to check whether device or its associated PF
supports PRI.
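
The check can be pictured with a small user-space model. Everything here is an assumption made for illustration: struct dev, its fields, and the helper names are hypothetical and only mimic the relationship between a VF and its PF; this is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device model. */
struct dev {
	bool is_virtfn;		/* true for an SR-IOV virtual function */
	struct dev *physfn;	/* owning PF when is_virtfn is set */
	unsigned int pri_cap;	/* offset of the PRI Capability, 0 if absent */
};

/* Resolve a VF to its PF; any other device resolves to itself. */
static struct dev *physfn_of(struct dev *d)
{
	return (d->is_virtfn && d->physfn) ? d->physfn : d;
}

/* A VF never carries the PRI Capability itself, so consult its PF. */
static bool pri_supported(struct dev *d)
{
	return physfn_of(d)->pri_cap != 0;
}
```

In this model, a VF whose own pri_cap is 0 still reports support whenever its PF has the capability, which is the case the plain capability search missed.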

[bhelgaas: commit log, avoid "!!"]
Fixes: b16d0cb9e2 ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
Link: https://lore.kernel.org/r/1595543849-19692-1-git-send-email-ashok.raj@intel.com
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Cc: stable@vger.kernel.org	# v4.4+
2020-07-24 09:50:41 -05:00
Colin Ian King
1c026a18d4 xen: Remove redundant initialization of irq
The variable irq is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Link: https://lore.kernel.org/r/20200611123134.922395-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-07-24 09:50:22 -05:00