bianbu-linux-6.6

mirror of https://gitee.com/bianbu-linux/linux-6.6 synced 2025-07-24 01:54:03 -04:00

Author	SHA1	Message	Date
Monk Liu	11c6b82afb	drm/amdgpu:cleanup stolen vga memory finish Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:49 -05:00
Michel Dänzer	299c776ceb	amdgpu: Don't use DRM_ERROR when failing to allocate a BO This can be triggered by userspace, e.g. trying to allocate too large a BO, so it shouldn't log anything by default. Callers need to handle failure anyway. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:49 -05:00
Roger He	424e2c8580	drm/amd/amdgpu: not allow gtt size exceed 75%*system memory size keep consistency with threshold of swapout Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:49 -05:00
David Panariti	02bab92328	drm/amdgpu: Add ability to determine and report if board supports ECC. Make initialization code check the ECC related registers, which are initialized by the VBIOS, to see if ECC is present and initialized and DRM_INFO() the result. Signed-off-by: David Panariti <David.Panariti@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:48 -05:00
Alex Deucher	56f3df448c	drm/amdgpu/gfx6: use cached values for raster config in clear state Use the cached values rather than hardcoding it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:47 -05:00
Alex Deucher	adfb81659c	drm/amdgpu/gfx7: use cached values for raster config in clear state Use the cached values rather than hardcoding it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:47 -05:00
Alex Deucher	93442184c0	drm/amdgpu/gfx8: use cached values for raster config in clear state Use the cached values rather than hardcoding it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:46 -05:00
Colin Ian King	288e46d398	drm/amdgpu/virt: remove redundant variable pf2vf_ver Variable pf2vf_ver is assigned but never read, it is redundant and hence can be removed. Cleans up clang warning: drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:310:3: warning: Value stored to 'pf2vf_ver' is never read Reivewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:23 -05:00
Piotr Redlewski	c1fe75c9e4	drm/amd/amdgpu: fix UVD mc offsets When UVD bo is created, its size is based on the information from firmware header (ucode_size_bytes). The same value should be be used when programming UVD mc controller offsets, otherwise it can happen that (mmUVD_VCPU_CACHE_OFFSET2 + mmUVD_VCPU_CACHE_SIZE2) will point AMDGPU_GPU_PAGE_SIZE bytes after the UVD bo end. Second issue is that when programming the mmUVD_VCPU_CACHE_SIZE0 register, AMDGPU_UVD_FIRMWARE_OFFSET should be taken into account. If it isn't, (mmUVD_VCPU_CACHE_OFFSET2 + mmUVD_VCPU_CACHE_SIZE2) will always point AMDGPU_UVD_FIRMWARE_OFFSET bytes after the UVD bo end. v2: move firmware size calculation into macro definition v3: align firmware size to the gpu page size Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Piotr Redlewski <predlewski@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:22 -05:00
Andrey Grodzovsky	79c631239a	drm/amdgpu: Implement BO size validation V2 Validates BO size against each requested domain's total memory. v2: Make GTT size check a MUST to allow fall back to GTT. Rmove redundant NULL check. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:22 -05:00
Christian König	fdd5faaa08	drm/amdgpu: cleanup vm_size handling It's pointless to have the same value twice, just always use max_pfn. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:21 -05:00
Christian König	c47b41a79a	drm/amdgpu: remove nonsense const u32 cast on ARRAY_SIZE result Not sure what that should originally been good for, but it doesn't seem to make any sense any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:21 -05:00
Chunming Zhou	6f16b4fb60	drm/amdgpu: use dep_sync for CS dependency/syncobj Otherwise, they could be optimized by scheduled fence. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:21 -05:00
Xiangliang.Yu	75737cb4eb	drm/amdgpu/gfx8: Fix compute ring failure after resetting Do ring clear before ring test, otherwise compute ring test will fail after gpu resetting. Still can't find the root cause, just workaround it. Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com> Acked-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:20 -05:00
Pixel Ding	1daee8b472	drm/amdgpu: revise retry init to fully cleanup driver Retry at drm_dev_register instead of amdgpu_device_init. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pixel Ding <Pixel.Ding@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:18 -05:00
Monk Liu	75bc6099bc	drm/amdgpu:read VRAMLOST from gim Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:45 -05:00
pding	0c03b912d7	drm/amdgpu: bypass FB resizing for SRIOV VF It introduces 900ms latency in exclusive mode which causes failure of driver loading. Host can resize the BAR before guest staring, so the resizing is not necessary here. Signed-off-by: Pixel Ding <Pixel.Ding@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:45 -05:00
pding	c6332b97fa	drm/amdgpu: release exclusive mode after hw_init Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:44 -05:00
pding	1884734a03	drm/amdkfd: initialise kfd inside amdgpu_device_init Also finalize kfd inside amdgpu_device_fini. kfd device_init needs SRIOV exclusive accessing. Try to gather exclusive accessing to reduce time consuming. Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:44 -05:00
Christian König	40575732b6	drm/amdgpu: don't use ttm_bo_move_ttm in amdgpu_ttm_bind v2 Just allocate the GART space and fill it. This prevents forcing the BO to be idle. v2: don't unbind/bind at all, just fill the allocated GART space Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:44 -05:00
Christian König	c5835bbb11	drm/amdgpu: rename amdgpu_ttm_bind to amdgpu_ttm_alloc_gart We actually don't bind here, but rather allocate GART space if necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:43 -05:00
Hawking Zhang	b2b7e457ba	drm/amdgpu: switch to use new SOC15 reg read/write macros for soc15 ih Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:43 -05:00
Christian König	d6895ad39f	drm/amdgpu: resize VRAM BAR for CPU access v6 Try to resize BAR0 to let CPU access all of VRAM. v2: rebased, style cleanups, disable mem decode before resize, handle gmc_v9 as well, round size up to power of two. v3: handle gmc_v6 as well, release and reassign all BARs in the driver. v4: rename new function to amdgpu_device_resize_fb_bar, reenable mem decoding only if all resources are assigned. v5: reorder resource release, return -ENODEV instead of BUG_ON(). v6: squash in rebase fix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:42 -05:00
Horace Chen	3c7388936a	drm/amdgpu: refine SR-IOV firmware VRAM reservation to protect data The previous solution will create a zero buffer on the system domain and then move the zeroes to the VRAM. This will break the original data on the VRAM. Refine the code to create bo on VRAM domain directly and then remove and re-create mem node to the exact position before bo_pin. This can avoid breaking the data and will not cause eviction. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:42 -05:00
pding	5ffa61c1bd	drm/amdgpu: retry init if exclusive mode request is failed This is caused of that hypervisor fails to handle request, one known issue is MMIO unblocking timeout. In theory we can retry init here. Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:42 -05:00
pding	f47110330c	drm/amdgpu: return error when sriov access requests get timeout Reported-by: Sun Gary <Gary.Sun@amd.com> Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:41 -05:00
Michel Dänzer	8fb0450c94	amdgpu: Remove AMDGPU_{HPD,CRTC_IRQ,PAGEFLIP_IRQ}_LAST Not used anymore. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:40 -05:00
Michel Dänzer	d794b9f827	amdgpu/dce: Use actual number of CRTCs and HPDs in set_irq_funcs Hardcoding the maximum numbers could result in spurious error messages from the IRQ state callbacks, e.g. on Polaris 11/12: [drm:dce_v11_0_set_pageflip_irq_state [amdgpu]] ERROR invalid pageflip crtc 5 [drm:amdgpu_irq_disable_all [amdgpu]] ERROR error disabling interrupt (-22) Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:40 -05:00
Christian König	c1c7ce8f56	drm/amdgpu: move GART recovery into GTT manager v2 The GTT manager handles the GART address space anyway, so it is completely pointless to keep the same information around twice. v2: rebased Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:33 -05:00
Christian König	3da917b6c6	drm/amdgpu: nuke amdgpu_ttm_is_bound() v2 Rename amdgpu_gtt_mgr_is_allocated() to amdgpu_gtt_mgr_has_gart_addr() and use that instead. v2: rename the function as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:32 -05:00
Monk Liu	34a4d2bf06	drm/amdgpu:fix random missing of FLR NOTIFY Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:32 -05:00
Monk Liu	77a3c96b1b	drm/amdgpu/sriov:fix memory leak in psp_load_fw for SR-IOV when doing gpu reset this routine shouldn't do resource allocating otherwise memory leak Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:31 -05:00
Monk Liu	503846e083	drm/amdgpu:cleanup ucode_init_bo 1,no sriov check since gpu recover is unified 2,need CPU_ACCESS_REQUIRED flag for VRAM if SRIOV because otherwise after following PIN the first allocated VRAM bo is wasted due to some TTM mgr reason. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:31 -05:00
Monk Liu	13a752e3a2	drm/amdgpu:cleanup in_sriov_reset and lock_reset since now gpu reset is unified with gpu_recover for both bare-metal and SR-IOV: 1)rename in_sriov_reset to in_gpu_reset 2)move lock_reset from adev->virt to adev Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:31 -05:00
Monk Liu	5740682e66	drm/amdgpu:implement new GPU recover(v3) 1,new imple names amdgpu_gpu_recover which gives more hint on what it does compared with gpu_reset 2,gpu_recover unify bare-metal and SR-IOV, only the asic reset part is implemented differently 3,gpu_recover will increase hang job karma and mark its entity/context as guilty if exceeds limit V2: 4,in scheduler main routine the job from guilty context will be immedialy fake signaled after it poped from queue and its fence be set with "-ECANCELED" error 5,in scheduler recovery routine all jobs from the guilty entity would be dropped 6,in run_job() routine the real IB submission would be skipped if @skip parameter equales true or there was VRAM lost occured. V3: 7,replace deprecated gpu reset, use new gpu recover Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:30 -05:00
Monk Liu	48f05f2955	amd/scheduler:imple job skip feature(v3) jobs are skipped under two cases 1)when the entity behind this job marked guilty, the job poped from this entity's queue will be dropped in sched_main loop. 2)in job_recovery(), skip the scheduling job if its karma detected above limit, and also skipped as well for other jobs sharing the same fence context. this approach is becuase job_recovery() cannot access job->entity due to entity may already dead. v2: some logic fix v3: when entity detected guilty, don't drop the job in the poping stage, instead set its fence error as -ECANCELED in run_job(), skip the scheduling either:1) fence->error < 0 or 2) there was a VRAM LOST occurred on this job. this way we can unify the job skipping logic. with this feature we can introduce new gpu recover feature. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:30 -05:00
Christian König	3a393cf96a	drm/amdgpu: fix indentation in amdgpu_display.h That was somehow completely of. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:29 -05:00
Rex Zhu	433f1aa786	drm/amdgpu: delete duplicated code. the variable ref_clock was assigned same value twice in same function. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:19 -05:00
Rex Zhu	d668942bb8	drm/amdgpu: add new pp function point notify_smu_memory_info Used to set up smu power logging. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:18 -05:00
Rex Zhu	c79563a316	drm/amdgpu: add header kgd_pp_interface.h move powerplay and amdgpu shared structures and definitions to kgd_pp_interface.h. This is the interface between the base driver and powerplay. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:18 -05:00
Rex Zhu	11dc9364bd	drm/amdgpu: move struct amd_powerplay to amdgpu.h Clean up the interface. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:17 -05:00
Christian König	4ff23be3d5	drm/amdgpu: remove extra parameter from amdgpu_ttm_bind() v2 We always use the BO mem now. v2: minor rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:16 -05:00
Christian König	2a018f28a8	drm/amdgpu: don't wait interruptible while binding GART space Display can't seem to handle this correctly. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:16 -05:00
Christian König	f5318959b5	drm/amdgpu: fix pin domain compatibility check We need to test if any domain fits, not all of them. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:16 -05:00
Christian König	ead282a4f5	drm/amdgpu: always bind pinned BOs We always need to bind pinned BOs, not just when the caller requested the address. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:15 -05:00
Christian König	5e91fb57eb	drm/amdgpu: use the actual placement for pin accounting This allows us to specify multiple possible placements again. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:15 -05:00
pding	8840a3878d	drm/amdgpu: retry init if it fails due to exclusive mode timeout (v3) The exclusive mode has real-time limitation in reality, such like being done in 300ms. It's easy observed if running many VF/VMs in single host with heavy CPU workload. If we find the init fails due to exclusive mode timeout, try it again. v2: - rewrite the condition for readable value. v3: - fix typo, add comments for sleep Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: pding <Pixel.Ding@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:14 -05:00
pding	b59142384e	drm/amdgpu/virt: implement wait_reset callbacks for vi/ai Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: pding <Pixel.Ding@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:14 -05:00
Evan Quan	7413d2faef	drm/amd/powerplay: describe the PCIE link speed in right GT/s Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:14 -05:00
pding	b636176efd	drm/amdgpu/virt: add wait_reset virt ops Driver can use this interface to check if there's a function level reset done in hypervisor. It's helpful when IRQ handler for reset is not ready, or special handling is required. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: pding <Pixel.Ding@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:13 -05:00

1 2 3 4 5 ...

3428 commits