bianbu-linux-6.6

mirror of https://gitee.com/bianbu-linux/linux-6.6 synced 2025-07-16 01:03:57 -04:00

Author	SHA1	Message	Date
Guchun Chen	ee10e06eb0	drm/amdgpu: add printing after executing page reservation to eeprom This will tell users if the faulty page has been written to external eeprom device in dmesg log. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:24:11 -04:00
John Clements	ebfbd1c2ca	drm/amdgpu: expand sienna chichlid reg access support Added dedicated 64bit reg read/write support Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:23:20 -04:00
Colin Ian King	2342ef4e60	drm/amdgpu: fix spelling mistake "Falied" -> "Failed" There is a spelling mistake in a DRM_ERROR error message. Fix it. This got lost in a merge, restore the fix. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `4afaa61db9`)	2020-08-06 16:16:57 -04:00
Pierre-Eric Pelloux-Prayer	16c642ec3f	drm/amdgpu: new ids flag for tmz (v2) Allows UMD to know if TMZ is supported and enabled. This commit also bumps KMS_DRIVER_MINOR because if we don't UMD can't tell if "ids_flags & AMDGPU_IDS_FLAGS_TMZ == 0" means "tmz is not enabled" or "tmz may be enabled but the kernel doesn't report it". v2: use amdgpu_is_tmz() and reworded commit message. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:45:52 -04:00
Evan Quan	25c933b1c4	drm/amd/powerplay: add new sysfs interface for retrieving gpu metrics(V2) A new interface for UMD to retrieve gpu metrics data. V2: rich the documentation Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:56 -04:00
Colin Ian King	c16ce56240	drm/amdgpu: fix spelling mistake "paramter" -> "parameter" There is a spelling mistake in a dev_warn message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:26 -04:00
Philip Yang	b80f050ff2	drm/amdkfd: option to disable system mem limit If multiple process share system memory through /dev/shm, KFD allocate memory should not fail if it reaches the system memory limit because one copy of physical system memory are shared by multiple process. Add module parameter no_system_mem_limit to provide user option to disable system memory limit check at runtime using sysfs or during driver module init using kernel boot argument. By default the system memory limit is on. Print out debug message to warn user if KFD allocate memory failed because system memory reaches limit. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:08 -04:00
Ingo Molnar	a703f3633f	Merge branch 'WIP.locking/seqlocks' into locking/urgent Pick up the full seqlock series PeterZ is working on. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-08-06 10:16:38 +02:00
Dave Airlie	2966141ad2	drm/ttm: rename ttm_mem_reg to ttm_resource. This name better reflects what the object does. I didn't rename all the pointers it seemed too messy. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-60-airlied@gmail.com	2020-08-06 13:19:21 +10:00
Dave Airlie	9de59bc201	drm/ttm: rename ttm_mem_type_manager -> ttm_resource_manager. This name makes a lot more sense, since these are about managing driver resources rather than just memory ranges. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-59-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	90a0489a71	drm/ttm: drop type manager has_type under driver control, this flag isn't needed anymore, remove the API that used to access it, and consoldiate with the used api. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-56-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	7541ce1a6f	drm/ttm: drop man->bdev link. This link isn't needed anymore, drop it from the init interface. Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-54-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	a29050c4cd	drm/amdgpu/ttm: remove man->bdev references. Just store the device in the private so the link can be removed from the manager Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-53-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	37205891d8	drm/ttm: make ttm_range_man_init/takedown take type + args This makes it easier to move these to a driver allocated system Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-47-airlied@gmail.com	2020-08-06 13:12:39 +10:00
Dave Airlie	0af135b892	drm/amdgpu/ttm: use bo manager subclassing for vram/gtt mgrs Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-46-airlied@gmail.com	2020-08-06 13:12:21 +10:00
Linus Torvalds	8186749621	drm next for 5.9-rc1 core: - add user def flag to cmd line modes - dma_fence_wait added might_sleep - dma-fence lockdep annotations - indefinite fences are bad documentation - gem CMA functions used in more drivers - struct mutex removal - more drm_ debug macro usage - set/drop master api fixes - fix for drm/mm hole size comparison - drm/mm remove invalid entry optimization - optimise drm/mm hole handling - VRR debugfs added - uncompressed AFBC modifier support - multiple display id blocks in EDID - multiple driver sg handling fixes - __drm_atomic_helper_crtc_reset in all drivers - managed vram helpers ttm: - ttm_mem_reg handling cleanup - remove bo offset field - drop CMA memtype flag - drop mappable flag xilinx: - New Xilinx ZynqMP DisplayPort Subsystem driver nouveau: - add CRC support - start using NVIDIA published class header files - convert all push buffer emission to new macros - Proper push buffer space management for EVO/NVD channels. - firmware loading fixes - 2MiB system memory pages support on Pascal and newer vkms: - larget cursor support i915: - Rocketlake platform enablement - Early DG1 enablement - Numerous GEM refactorings - DP MST fixes - FBC, PSR, Cursor, Color, Gamma fixes - TGL, RKL, EHL workaround updates - TGL 8K display support fixes - SDVO/HDMI/DVI fixes amdgpu: - Initial support for Sienna Cichlid GPU - Initial support for Navy Flounder GPU - SI UVD/VCE support - expose rotation property - Add support for unique id on Arcturus - Enable runtime PM on vega10 boards that support BACO - Skip BAR resizing if the bios already did id - Major swSMU code cleanup - Fixes for DCN bandwidth calculations amdkfd: - Track SDMA usage per process - SMI events interface radeon: - Default to on chip GART for AGP boards on all arches - Runtime PM reference count fixes msm: - headers regenerated causing churn - a650/a640 display and GPU enablement - dpu dither support for 6bpc panels - dpu cursor fix - dsi/mdp5 enablement for sdm630/sdm636/sdm66 tegra: - video capture prep support - reflection support mediatek: - convert mtk_dsi to bridge API meson: - FBC support sun4i: - iommu support rockchip: - register locking fix - per-pixel alpha support PX30 VOP - mgag200: - ported to simple and shmem helpers - device init cleanups - use managed pci functions - dropped hw cursor support ast: - use managed pci functions - use managed VRAM helpers - rework cursor support malidp: - dev_groups support hibmc: - refactor hibmc_drv_vdac: vc4: - create TXP CRTC imx: - error path fixes and cleanups etnaviv: - clock handling and error handling cleanups - use pin_user_pages -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfK1atAAoJEAx081l5xIa+vDkQAJvl/mjbEA7fDy8Ysa0cgPLI 8nI4Bo/MaxkyRfUcP8+f/n3QQrRME37C0xa/Mn6SG1oFAdlovPwDqmDr5kjhkrMI geo8oJb2Q+AsrJr+ejpuF+iq0FxWi64bLbwJFJ2nBet+lHTMzoPceWeq3gG1Vvfl h6PV4B/9TjrnbhcKLIQSEmJ0kZp9uMkDBF/iynVn4+AKAkG1rQNjigdTH48IFPoz 28KuqG0B4NWu648zYXhjsN0kD3Dxjv3YOH+FsoWQpQa9icCTySYbySsQ7l0/XvA3 4BPtP3rWMhU46FHTBkWF72WQR4F0B4wm5DJJKMeG4vb1mXakOqAKcAq7JAbka+wL PBIiU+AcAKRSiwHmYDuDWtDoSpvYncuo0p3IvNP5hhih+7hqCnLIULDWS+V8AUzW 39usS/DXsVKk/HGYIYC89cRwsqWYD4c7edzOBdPQxW4LNYCD2gXezLJ5TeeR2lih y9JIVnPiluWleOovs4W3BoZNRuLc1rHBO6COToXjlme/48Z+sRHBAoge6UZurqRN jr+e60cS7n/DOeJQuNf4UHZnK48Pc24+3kVfMHlX+OKn8VuKPGr+USkeHV/NYL/B USiKCAxkkZM0dxerSb1/Ra9kGnchf0QBpA6Fsem8kV61Z4GVc+K6xJWg7KXB6n/3 7ZyalUKLwlOCz9sYsCCe =Yvtd -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "New xilinx displayport driver, AMD support for two new GPUs (more header files), i915 initial support for RocketLake and some work on their DG1 (discrete chip). The core also grew some lockdep annotations to try and constrain what drivers do with dma-fences, and added some documentation on why the idea of indefinite fences doesn't work. The long list is below. I do have some fixes trees outstanding, but I'll follow up with those later. core: - add user def flag to cmd line modes - dma_fence_wait added might_sleep - dma-fence lockdep annotations - indefinite fences are bad documentation - gem CMA functions used in more drivers - struct mutex removal - more drm_ debug macro usage - set/drop master api fixes - fix for drm/mm hole size comparison - drm/mm remove invalid entry optimization - optimise drm/mm hole handling - VRR debugfs added - uncompressed AFBC modifier support - multiple display id blocks in EDID - multiple driver sg handling fixes - __drm_atomic_helper_crtc_reset in all drivers - managed vram helpers ttm: - ttm_mem_reg handling cleanup - remove bo offset field - drop CMA memtype flag - drop mappable flag xilinx: - New Xilinx ZynqMP DisplayPort Subsystem driver nouveau: - add CRC support - start using NVIDIA published class header files - convert all push buffer emission to new macros - Proper push buffer space management for EVO/NVD channels. - firmware loading fixes - 2MiB system memory pages support on Pascal and newer vkms: - larger cursor support i915: - Rocketlake platform enablement - Early DG1 enablement - Numerous GEM refactorings - DP MST fixes - FBC, PSR, Cursor, Color, Gamma fixes - TGL, RKL, EHL workaround updates - TGL 8K display support fixes - SDVO/HDMI/DVI fixes amdgpu: - Initial support for Sienna Cichlid GPU - Initial support for Navy Flounder GPU - SI UVD/VCE support - expose rotation property - Add support for unique id on Arcturus - Enable runtime PM on vega10 boards that support BACO - Skip BAR resizing if the bios already did id - Major swSMU code cleanup - Fixes for DCN bandwidth calculations amdkfd: - Track SDMA usage per process - SMI events interface radeon: - Default to on chip GART for AGP boards on all arches - Runtime PM reference count fixes msm: - headers regenerated causing churn - a650/a640 display and GPU enablement - dpu dither support for 6bpc panels - dpu cursor fix - dsi/mdp5 enablement for sdm630/sdm636/sdm66 tegra: - video capture prep support - reflection support mediatek: - convert mtk_dsi to bridge API meson: - FBC support sun4i: - iommu support rockchip: - register locking fix - per-pixel alpha support PX30 VOP mgag200: - ported to simple and shmem helpers - device init cleanups - use managed pci functions - dropped hw cursor support ast: - use managed pci functions - use managed VRAM helpers - rework cursor support malidp: - dev_groups support hibmc: - refactor hibmc_drv_vdac: vc4: - create TXP CRTC imx: - error path fixes and cleanups etnaviv: - clock handling and error handling cleanups - use pin_user_pages" * tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm: (1747 commits) drm/msm: use kthread_create_worker instead of kthread_run drm/msm/mdp5: Add MDP5 configuration for SDM636/660 drm/msm/dsi: Add DSI configuration for SDM660 drm/msm/mdp5: Add MDP5 configuration for SDM630 drm/msm/dsi: Add phy configuration for SDM630/636/660 drm/msm/a6xx: add A640/A650 hwcg drm/msm/a6xx: hwcg tables in gpulist drm/msm/dpu: add SM8250 to hw catalog drm/msm/dpu: add SM8150 to hw catalog drm/msm/dpu: intf timing path for displayport drm/msm/dpu: set missing flush bits for INTF_2 and INTF_3 drm/msm/dpu: don't use INTF_INPUT_CTRL feature on sdm845 drm/msm/dpu: move some sspp caps to dpu_caps drm/msm/dpu: update UBWC config for sm8150 and sm8250 drm/msm/dpu: use right setup_blend_config for sm8150 and sm8250 drm/msm/a6xx: set ubwc config for A640 and A650 drm/msm/adreno: un-open-code some packets drm/msm: sync generated headers drm/msm/a6xx: add build_bw_table for A640/A650 drm/msm/a6xx: fix crashstate capture for A650 ...	2020-08-05 19:50:06 -07:00
Dave Airlie	6c28aed6e5	drm/amdgfx/ttm: use wrapper to get ttm memory managers Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-38-airlied@gmail.com	2020-08-06 12:32:03 +10:00
Dave Airlie	6fe1c54353	drm/amdgpu/ttm: use new takedown path Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-28-airlied@gmail.com	2020-08-06 12:32:03 +10:00
Dave Airlie	158d20d185	drm/amdgpu/ttm: init managers from the driver side. Use new init calls to unwrap manager init Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-16-airlied@gmail.com	2020-08-06 12:31:36 +10:00
Dave Airlie	46bca88bbd	drm/ttm/amdgpu: consolidate ttm reserve paths Drop the WARN_ON and consolidate the two paths into one. Use the consolidate slowpath in the execbuf utils code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-6-airlied@gmail.com	2020-08-06 12:16:31 +10:00
Alex Deucher	87ded5caee	drm/amdgpu: move vram usage by vbios to mman (v2) It's related to the memory manager so move it there. v2: inline the structure Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	72de33f8f7	drm/amdgpu: move IP discovery data to mman It's related to the memory manager so move it there. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	cacbbe7c00	drm/amdgpu: move stolen memory from gmc to mman It's more related to memory management than memory controller. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	7438ae6e52	drm/amdgpu/gmc: disable keep_stolen_vga_memory on arcturus I suspect the only reason this was set was to avoid touching the display related registers on arcturus. Someone should double check this on arcturus with S3. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	14b18937cb	drm/amdgpu: drop the CPU pointers for the stolen vga bos We never use them. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	7348c20a4e	drm/amdgpu/gmc10: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	7b885f0eb4	drm/amdgpu/gmc9: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	3853626d2c	drm/amdgpu/gmc8: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	71755699b5	drm/amdgpu/gmc7: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	422fe8d27d	drm/amdgpu/gmc6: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	dd285c5df9	drm/amdgpu/gmc: add new helper to get the FB size used by pre-OS console This adds a new gmc callback to get the size reserved by the pre-OS console and provides a helper function for use by gmc IP drivers. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	0635019412	drm/amdgpu: add support for extended stolen vga memory This will allow us to split the allocation for systems where we have to keep the stolen memory around to avoid S3 issues. This way we don't waste as much memory and still avoid any screen artifacts during the bios to driver transition. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	5db62dc8d4	drm/amdgpu: move keep stolen memory check into gmc core Rather than leaving this as a gmc v9 specific hack. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	fcbc92e2e1	drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc Since that is where we store the other data related to the stolen vga memory. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	81b54fb7a2	drm/amdgpu: use a define for the memory size of the vga emulator Rather than open coding it everywhere. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	adb5be8122	drm/amdgpu: use create_at for the stolen pre-OS buffer Should be functionally the same since nothing else is allocated at that point, but let's be exact. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	37912e963d	drm/amdgpu: handle bo size 0 in amdgpu_bo_create_kernel_at (v2) Just return early to match other bo_create functions. v2: check if the bo_ptr is NULL rather than checking the size. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
John Clements	4bfb74282f	drm/amdgpu: added RAS EEPROM device support check updated RAS EEPROM init/threshold sequences to check for device support Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:18 -04:00
John Clements	0ad7a64d69	drm/amdgpu: enable RAS support for sienna cichlid enabled GECC error injection and query support Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:08 -04:00
Monk Liu	a300de40f6	drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5) what: the MQD's save and restore of KCQ (kernel compute queue) cost lots of clocks during world switch which impacts a lot to multi-VF performance how: introduce a paramter to control the number of KCQ to avoid performance drop if there is no kernel compute queue needed notes: this paramter only affects gfx 8/9/10 v2: refine namings v3: choose queues for each ring to that try best to cross pipes evenly. v4: fix indentation some cleanupsin the gfx_compute_queue_acquire() v5: further fix on indentations more cleanupsin gfx_compute_queue_acquire() TODO: in the future we will let hypervisor driver to set this paramter automatically thus no need for user to configure it through modprobe in virtual machine Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:29 -04:00
Guchun Chen	9b856defbe	drm/amdgpu: update eeprom once specifying one bigger threshold(v3) During driver's probe, when it hits bad gpu tag in eeprom i2c init calling(the tag was set when reported bad page reaches bad page threshold in last driver's working loop), there are some strategys to deal with the cases: 1. when the module parameter amdgpu_bad_page_threshold = 0, that means page retirement feature is disabled, so just resetting the eeprom is fine. 2. When amdgpu_bad_page_threshold is not 0, and moreover, user sets one bigger valid data in order to make current boot up succeeds, correct eeprom header tag and do not break booting. 3. For other cases, driver's probe will be broken. v2: Just update eeprom header tag instead of resetting the whole table header when user sets one bigger threshold data. v3: Use dev_info/dev_err to print PCI device information, which helps in mGPU case. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:25 -04:00
Guchun Chen	a219ecbb83	drm/amdgpu: disable page reservation when amdgpu_bad_page_threshold = 0 When amdgpu_bad_page_threshold = 0, bad page reservation stuffs are skipped in either UMC ECC irq or page retirement calling of sync flood isr. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:20 -04:00
Guchun Chen	f848159b57	drm/amdgpu: decouple sysfs creating of bad page node Bad page information should not be exposed by sysfs when bad page retirement is disabled, so decouple it from ras sysfs group creating, and add one guard before creating. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:16 -04:00
Guchun Chen	eb0c3cd48f	drm/amdgpu: add one definition for RAS's sysfs/debugfs name(v2) Add one definition for the RAS module's FS name. It's used in both debugfs and sysfs cases. v2: Use static variable instead of macro definition. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:08 -04:00
Guchun Chen	bf0b91b78f	drm/amdgpu: restore ras flags when user resets eeprom(v2) RAS flags needs to be cleaned as well when user requires one clean eeprom. v2: RAS flags shall be restored after eeprom reset succeeds. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:04 -04:00
Guchun Chen	e8fbaf0342	drm/amdgpu: break GPU recovery once it's in bad state(v4) When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:54 -04:00
Guchun Chen	9c06f91ff2	drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2) Once the bad page saved to eeprom reaches the configured threshold, ras recovery will be issued to notify user. v2: Fix spelling typo. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:46 -04:00
Guchun Chen	35cd2cdadb	drm/amdgpu: skip bad page reservation once issuing from eeprom write Once the ras recovery is issued from eeprom write itself, bad page reservation should be ignored, otherwise, recursive calling of writting to eeprom would happen. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:38 -04:00
Guchun Chen	b82e65a935	drm/amdgpu: break driver init process when it's bad GPU(v5) When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:31 -04:00
Guchun Chen	1d6a9d122d	drm/amdgpu: add bad gpu tag definition This tag will be hired for bad gpu detection in eeprom's access. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:16 -04:00

... 4 5 6 7 8 ...

8084 commits