On the Loongson64 platform used with Radeon GPU, shutdown or reboot failed
when console=tty is in the boot cmdline.
radeon_suspend_kms() puts the hw in the suspend state, especially set fb
state as FBINFO_STATE_SUSPENDED:
if (fbcon) {
console_lock();
radeon_fbdev_set_suspend(rdev, 1);
console_unlock();
}
Then avoid to do any more fb operations in the related functions:
if (p->state != FBINFO_STATE_RUNNING)
return;
So call radeon_suspend_kms() in radeon_pci_shutdown() for Loongson64 to fix
this issue, it looks like some kind of workaround like powerpc.
Co-developed-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The ttm caching flags (ttm_cached, ttm_write_combined etc) are
used to determine a buffer object's mapping attributes in both
CPU page table and GPU page table (when that buffer is also
accessed by GPU). Currently the ttm caching flags are set in
function amdgpu_ttm_io_mem_reserve which is called during
DRM_AMDGPU_GEM_MMAP ioctl. This has a problem since the GPU
mapping of the buffer object (ioctl DRM_AMDGPU_GEM_VA) can
happen earlier than the mmap time, thus the GPU page table
update code can't pick up the right ttm caching flags to
decide the right GPU page table attributes.
This patch moves the ttm caching flags setting to function
amdgpu_vram_mgr_new - this function is called during the
first step of a buffer object create (eg, DRM_AMDGPU_GEM_CREATE)
so the later both CPU and GPU mapping function calls will
pick up this flag for CPU/GPU page table set up.
v2: rebase (Alex)
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Tested-by: Po Huang <Po.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
During GPU reset, when receiving a DMCUB OUTBUX0 interrupt,
DAL code will set it to be OUTBOX interrupt and sets hw interrupt.
However, OUTBOX interrupt is not registered yet, so a NULL pointer
access will be executed.
Call Trace:
dal_irq_service_set+0x30/0x90 [amdgpu]
dc_interrupt_set+0x24/0x30 [amdgpu]
amdgpu_dm_set_dmub_outbox_irq_state+0x22/0x30 [amdgpu]
amdgpu_irq_update+0x77/0xa0 [amdgpu]
amdgpu_irq_gpu_reset_resume_helper+0x67/0xa0 [amdgpu]
amdgpu_do_asic_reset+0x219/0x260 [amdgpu]
amdgpu_device_gpu_recover.cold+0x8c5/0xb64 [amdgpu]
amdgpu_debugfs_gpu_recover_show+0x2c/0x60 [amdgpu]
seq_read_iter+0xc2/0x450
? do_anonymous_page+0x22c/0x3b0
seq_read+0xf9/0x140
full_proxy_read+0x5c/0x90
vfs_read+0xaa/0x190
ksys_read+0x67/0xe0
__x64_sys_read+0x1a/0x20
Fixes: effbf6ca7e ("drm/amdgpu/display: remove an old DCN3 guard")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
valid DAL irq should be < DAL_IRQ_SOURCES_NUMBER.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Without driver loaded, SDMA0_UTCL1_PAGE.TMZ_ENABLE is set to 1
by default for all asic. On Raven/Renoir, the sdma goldsetting
changes SDMA0_UTCL1_PAGE.TMZ_ENABLE to 0.
This patch restores SDMA0_UTCL1_PAGE.TMZ_ENABLE to 1.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This reverts commit 33f409e60e.
The patch that we are reverting here was originally applied because it
fixes multiple IGT issues and flickering in Android. However, after a
discussion with Sean Paul and Mark, it looks like that this patch might
cause problems on ChromeOS. For this reason, we decided to revert this
patch.
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Mark Yacoub <markyacoub@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Added the Beige Goby capabilities in codec query.
v2: fix build error and indent (James)
Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The tmz functions are verified on yellow carp. So enable it by
default.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add helper function to get process device data structure from adev to
update counters.
Update vm faults, page_in, page_out counters will no be executed in
parallel, use WRITE_ONCE to avoid any form of compiler optimizations.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is part of SVM profiling API, export sysfs counters for
per-process, per-GPU vm retry fault, pages migrated in and out of GPU vram.
counters will not be updated in parallel in GPU retry fault handler and
migration to vram/ram path, use READ_ONCE to avoid compiler
optimization.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 cases of kobj leak, which causes memory leak:
kobj_type must have release() method to free memory from release
callback. Don't need NULL default_attrs to init kobj.
sysfs files created under kobj_status should be removed with kobj_status
as parent kobject.
Remove queue sysfs files when releasing queue from process MMU notifier
release callback.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No functionality change. Modify kfd_sysfs_create_file to use kobject as
parameter, so it becomes common helper function to remove duplicate code
and will simplify new kfd sysfs file create in future.
Move pr_warn to helper function if sysfs file create failed. Set helper
function as void return because caller doesn't use the helper function
return value.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clock gating setting is still performed even when the corresponding
CG feature is not supported. And the tricky part is disablement is
actually performed no matter for enablement or disablement request.
That seems not logically right.
Considering HW should already properly take care of the CG state, we
will just skip the corresponding clock gating setting when the feature
is not supported.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SMU had set all the necessary fields for a link width switch
but the width switch wasn't occurring because the link was idle
in the L1 state. Setting LC_L1_RECONFIG_EN=0x1 will allow width
switches to also be initiated while in L1 instead of waiting until
the link is back in L0.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
A lot of NAK-G being generated when link widht switching is happening.
WA for this issue is to program the SPC to 4 symbols per clock during
bootup when the native PCIE width is x4.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Fix TCP hang when a lightweight invalidation happens on Navi1x.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing settings for SQC bits. And correct some confusing logics
around active wgp bitmap calculation.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When unloading driver, if VCN is powered on, sending message
DisableAllSmuFeatures to SMU will cause SMU hang. We need to
power down VCN and JPEG before clean up SMU.
Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Sometimes, DP receiver chip power-controlled externally by an
Embedded Controller could be treated and used as eDP,
if it drives mobile display. In this case,
we shouldn't be doing power-sequencing, hence we can skip
waiting for T7-ready and T9-ready."
[How]
Added a feature mask to enable eDP no power sequencing feature.
To enable this, set 0x10 flag in amdgpu.dcfeaturemask on
Linux command line.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* devcoredump support for display errors
* dpu: irq cleanup/refactor
* dpu: dt bindings conversion to yaml
* dsi: dt bindings conversion to yaml
* mdp5: alpha/blend_mode/zpos support
* a6xx: cached coherent buffer support
* a660 support
* gpu iova fault improvements:
- info about which block triggered the fault, etc
- generation of gpu devcoredump on fault
* assortment of other cleanups and fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs4=qsGBBbyn-4JWqW4-YUSTKh67X3DsPQ=T2D9aXKqNA@mail.gmail.com
Instead of using static bandwidth setup, manage bandwidth dynamically,
depending on the amount of allocated planes, their format and
resolution.
Co-developed-with: James Willcox <jwillcox@squareup.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20210525131316.3117809-8-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Prior downstream kernels had "fudge factors" in devicetree which would
be applied to things like interconnect bandwidth calculations. Bring
some of those values back here.
Signed-off-by: James Willcox <jwillcox@squareup.com>
[DB: changed _ff to _inefficiency, fixed patch description]
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210525131316.3117809-7-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Hook alpha and pixel blend mode support to be exported as proper DRM
plane properties. This allows using this functionality from the
userspace.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210525131316.3117809-5-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Use generic helpers code to manage drm_plane_state part of mdp5_plane
state instead of manually coding all the details.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210525131316.3117809-2-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Move the call to dsi_mgr_phy_enable after checking whether the DSI
interface is slave, so that PHY enablement happens together with the
host enablement.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210609211211.2561090-1-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Handling of the interrupt callback lists is done in dpu_core_irq.c,
under the "cb_lock" spinlock. When these operations results in the need
for enableing or disabling the IRQ in the hardware the code jumps to
dpu_hw_interrupts.c, which protects its operations with "irq_lock"
spinlock.
When an interrupt fires, dpu_hw_intr_dispatch_irq() inspects the
hardware state while holding the "irq_lock" spinlock and jumps to
dpu_core_irq_callback_handler() to invoke the registered handlers, which
traverses the callback list under the "cb_lock" spinlock.
As such, in the event that these happens concurrently we'll end up with
a deadlock.
Prior to '1c1e7763a6d4 ("drm/msm/dpu: simplify IRQ enabling/disabling")'
the enable/disable of the hardware interrupt was done outside the
"cb_lock" region, optimitically by using an atomic enable-counter for
each interrupt and an warning print if someone changed the list between
the atomic_read and the time the operation concluded.
Rather than re-introducing the large array of atomics, this change
embraces the fact that dpu_core_irq and dpu_hw_interrupts are deeply
entangled and make them share the single "irq_lock".
Following this step it's suggested that we squash the two parts into a
single irq handling thing.
Fixes: 1c1e7763a6d4 ("drm/msm/dpu: simplify IRQ enabling/disabling")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210611170003.3539059-1-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Wire up support to stall the SMMU on iova fault, and collect a devcore-
dump snapshot for easier debugging of faults.
Currently this is a6xx-only, but mostly only because so far it is the
only one using adreno-smmu-priv.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210610214431.539029-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add, via the adreno-smmu-priv interface, a way for the GPU to request
the SMMU to stall translation on faults, and then later resume the
translation, either retrying or terminating the current translation.
This will be used on the GPU side to "freeze" the GPU while we snapshot
useful state for devcoredump.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-5-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Use the new adreno-smmu-priv fault info function to get more SMMU
debug registers and print the current TTBR0 to debug per-instance
pagetables and figure out which GPU block generated the request.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add a callback in adreno-smmu-priv to read interesting SMMU
registers to provide an opportunity for a richer debug experience
in the GPU driver.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Call report_iommu_fault() to allow upper-level drivers to register their
own fault handlers.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
While keeping the previous default value for hangcheck period,
we allow now the possibility of configuring its value via
debugfs.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Link: https://lore.kernel.org/r/20210607104441.184700-1-siglesias@igalia.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add adreno_is_{a660,a650_family} helpers and convert update existing
adreno_is_a650 usage based on downstream driver's logic (changing into
adreno_is_a650_family or adding adreno_is_a660).
And add the remaining changes required for A660, again based on
the downstream driver: missing GMU allocations, additional register init,
dummy hfi BW table, cp protect list, entry in gpulist table, hwcg table,
updated a6xx_ucode_check_version check.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Link: https://lore.kernel.org/r/20210608172808.11803-6-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
SM8250 AOP firmware already sets up PDC registers for us, and it only needs
to be enabled. This path will be used for other newer GPUs.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Link: https://lore.kernel.org/r/20210608172808.11803-3-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
These aren't used by anything anymore.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210608172808.11803-2-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
Based on mesa commit daa2ccff7a0201941db3901780d179e2634057d5
Small bit of .c churn in the phy code to adapt to split up of phy
related registers.
Signed-off-by: Rob Clark <robdclark@chromium.org>
The code does not really use dpu_hw_blk fields, so drop them, making
dpu_hw_blk empty structure.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210515190909.1809050-5-dmitry.baryshkov@linaro.org
Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>