bianbu-linux-6.6

mirror of https://gitee.com/bianbu-linux/linux-6.6 synced 2025-07-22 01:43:37 -04:00

Author	SHA1	Message	Date
Chris Wilson	6c067579e6	drm/i915: Split execlist priority queue into rbtree + linked list All the requests at the same priority are executed in FIFO order. They do not need to be stored in the rbtree themselves, as they are a simple list within a level. If we move the requests at one priority into a list, we can then reduce the rbtree to the set of priorities. This should keep the height of the rbtree small, as the number of active priorities can not exceed the number of active requests and should be typically only a few. Currently, we have ~2k possible different priority levels, that may increase to allow even more fine grained selection. Allocating those in advance seems a waste (and may be impossible), so we opt for allocating upon first use, and freeing after its requests are depleted. To avoid the possibility of an allocation failure causing us to lose a request, we preallocate the default priority (0) and bump any request to that priority if we fail to allocate it the appropriate plist. Having a request (that is ready to run, so not leading to corruption) execute out-of-order is better than leaking the request (and its dependency tree) entirely. There should be a benefit to reducing execlists_dequeue() to principally using a simple list (and reducing the frequency of both rbtree iteration and balancing on erase) but for typical workloads, request coalescing should be small enough that we don't notice any change. The main gain is from improving PI calls to schedule, and the explicit list within a level should make request unwinding simpler (we just need to insert at the head of the list rather than the tail and not have to make the rbtree search more complicated). v2: Avoid use-after-free when deleting a depleted priolist v3: Michał found the solution to handling the allocation failure gracefully. If we disable all priority scheduling following the allocation failure, those requests will be executed in fifo and we will ensure that this request and its dependencies are in strict fifo (even when it doesn't realise it is only a single list). Normal scheduling is restored once we know the device is idle, until the next failure! Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk	2017-05-17 13:38:09 +01:00
Chris Wilson	77f0d0e925	drm/i915/execlists: Pack the count into the low bits of the port.request add/remove: 1/1 grow/shrink: 5/4 up/down: 391/-578 (-187) function old new delta execlists_submit_ports 262 471 +209 port_assign.isra - 136 +136 capture 6344 6359 +15 reset_common_ring 438 452 +14 execlists_submit_request 228 238 +10 gen8_init_common_ring 334 341 +7 intel_engine_is_idle 106 105 -1 i915_engine_info 2314 2290 -24 __i915_gem_set_wedged_BKL 485 411 -74 intel_lrc_irq_handler 1789 1604 -185 execlists_update_context 294 - -294 The most important change there is the improve to the intel_lrc_irq_handler and excclist_submit_ports (net improvement since execlists_update_context is now inlined). v2: Use the port_api() for guc as well (even though currently we do not pack any counters in there, yet) and hide all port->request_count inside the helpers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-5-chris@chris-wilson.co.uk	2017-05-17 13:38:06 +01:00
Chris Wilson	0ce8178808	drm/i915: Redefine ptr_pack_bits() and friends Rebrand the current (pointer \| bits) pack/unpack utility macros as explicit bit twiddling for PAGE_SIZE so that we can use the more flexible underlying macros for different bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk	2017-05-17 13:38:04 +01:00
Chris Wilson	991bfc64db	drm/i915: Make ptr_unpack_bits() more function-like ptr_unpack_bits() is a function-like macro, as such it is meant to be replaceable by a function. In this case, we should be passing in the out-param as a pointer. Bizarrely this does affect code generation: function old new delta i915_gem_object_pin_map 409 389 -20 An improvement(?) in this case, but one can't help wonder what strict-aliasing optimisations we are preventing. The generated code looks identical in using ptr_unpack_bits (no extra motions to stack, the pointer and bits appear to be kept in registers), the difference appears to be code ordering and with a reorder it is able to use smaller forward jumps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-3-chris@chris-wilson.co.uk	2017-05-17 13:38:03 +01:00
Linus Torvalds	de4d195308	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes are: - Debloat RCU headers - Parallelize SRCU callback handling (plus overlapping patches) - Improve the performance of Tree SRCU on a CPU-hotplug stress test - Documentation updates - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits) rcu: Open-code the rcu_cblist_n_lazy_cbs() function rcu: Open-code the rcu_cblist_n_cbs() function rcu: Open-code the rcu_cblist_empty() function rcu: Separately compile large rcu_segcblist functions srcu: Debloat the <linux/rcu_segcblist.h> header srcu: Adjust default auto-expediting holdoff srcu: Specify auto-expedite holdoff time srcu: Expedite first synchronize_srcu() when idle srcu: Expedited grace periods with reduced memory contention srcu: Make rcutorture writer stalls print SRCU GP state srcu: Exact tracking of srcu_data structures containing callbacks srcu: Make SRCU be built by default srcu: Fix Kconfig botch when SRCU not selected rcu: Make non-preemptive schedule be Tasks RCU quiescent state srcu: Expedite srcu_schedule_cbs_snp() callback invocation srcu: Parallelize callback handling kvm: Move srcu_struct fields to end of struct kvm rcu: Fix typo in PER_RCU_NODE_PERIOD header comment rcu: Use true/false in assignment to bool rcu: Use bool value directly ...	2017-05-10 10:30:46 -07:00
Chris Wilson	4797948071	drm/i915: Squash repeated awaits on the same fence Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. v9: Include a small benchmark in the kselftests v10: Separate the idr implementation into its own compartment. (Tvrkto) v11: Refactor igt_sync kselftests to avoid deep nesting (Tvrkto) v12: __sync_leaf_idx() to assert that p->height is 0 when checking leaves v13: kselftests to investigate struct i915_syncmap itself (Tvrtko) v14: Foray into ascii art graphs v15: Take into account that the random lookup/insert does 2 prng calls, not 1, when benchmarking, and use for_each_set_bit() (Tvrtko) v16: Improved ascii art Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-4-chris@chris-wilson.co.uk	2017-05-03 11:08:48 +01:00
Chris Wilson	9431282832	drm/i915: Mark up clflushes as belonging to an unordered timeline 2 clflushes on two different objects are not ordered, and so do not belong to the same timeline (context). Either we use a unique context for each, or we reserve a special global context to mean unordered. Ideally, we would reserve 0 to mean unordered (DMA_FENCE_NO_CONTEXT) to have the same semantics everywhere. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-1-chris@chris-wilson.co.uk	2017-05-03 11:08:45 +01:00
Dave Airlie	73ba2d5c2b	Merge tag 'drm-intel-next-fixes-2017-04-27' of git://anongit.freedesktop.org/git/drm-intel into drm-next drm/i915 and gvt fixes for drm-next/v4.12 * tag 'drm-intel-next-fixes-2017-04-27' of git://anongit.freedesktop.org/git/drm-intel: drm/i915: Confirm the request is still active before adding it to the await drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio drm/i915/selftests: Allocate inode/file dynamically drm/i915: Fix system hang with EI UP masked on Haswell drm/i915: checking for NULL instead of IS_ERR() in mock selftests drm/i915: Perform link quality check unconditionally during long pulse drm/i915: Fix use after free in lpe_audio_platdev_destroy() drm/i915: Use the right mapping_gfp_mask for final shmem allocation drm/i915: Make legacy cursor updates more unsynced drm/i915: Apply a cond_resched() to the saturated signaler drm/i915: Park the signaler before sleeping drm/i915/gvt: fix a bounds check in ring_id_to_context_switch_event() drm/i915/gvt: Fix PTE write flush for taking runtime pm properly drm/i915/gvt: remove some debug messages in scheduler timer handler drm/i915/gvt: add mmio init for virtual display drm/i915/gvt: use directly assignment for structure copying drm/i915/gvt: remove redundant ring id check which cause significant CPU misprediction drm/i915/gvt: remove redundant platform check for mocs load/restore drm/i915/gvt: Align render mmio list to cacheline drm/i915/gvt: cleanup some too chatty scheduler message	2017-04-29 05:50:27 +10:00
Joonas Lahtinen	ea117b8ddc	drm/i915: Reset ILK during GEM sanitization ILK should survive a reset without display corruption. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-28 12:11:59 +03:00
Joonas Lahtinen	f2e4d76ec2	drm/i915: Eliminate HAS_HW_CONTEXTS HAS_HW_CONTEXTS is misleading condition for GPU reset and CCID, replace it with Gen specific (to be updated in next patches). HAS_HW_CONTEXTS in i915_l3_write is bogus because each HAS_L3_DPF match also has .has_hw_contexts = 1 set. This leads to us being able to get rid of the property completely. v2: - Keep the checks at Gen6 for no functional change (Ville) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-04-28 12:11:59 +03:00
Chris Wilson	bdb57b8dca	drm/i915: Use the right mapping_gfp_mask for final shmem allocation Many sightings report the greater prevalence of allocation failures. This is all due to the incorrect use of mapping_gfp_constraint(), so remove it in favour of just querying the mapping_gfp_mask() which are the exact gfp_t we wanted in the first place. We still do expect a higher chance of reporting ENOMEM, as that is the intention of using __GFP_NORETRY -- to fail rather than oom after having reclaimed from our bo caches, and having done a direct\|kswapd reclaim pass. Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100594 Fixes: `24f8e00a8a` ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170405221514.23251-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `b268d9fe0f`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-04-26 16:28:08 +03:00
Ingo Molnar	58d30c36d4	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - Parallelize SRCU callback handling (plus overlapping patches). Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-04-23 11:12:44 +02:00
Dave Airlie	856ee92e86	Linux 4.11-rc7 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJY881cAAoJEHm+PkMAQRiGG4UH+wa2z6Qet36Uc4nXFZuSMYrO ErUWs1QpTDDv4a+LE4fgyMvM3j9XqtpfQLy1n70jfD14IqPBhHe4gytasAf+8lg1 YvddFx0Yl3sygVu3dDBNigWeVDbfwepW59coN0vI5nrMo+wrei8aVIWcFKOxdMuO n72u9vuhrkEnLJuQk7SF+t4OQob9McXE3s7QgyRopmlKhKo7mh8On7K2BRI5uluL t0j5kZM0a43EUT5rq9xR8f5pgtyfTMG/FO2MuzZn43MJcZcyfmnOP/cTSIvAKA5U 1i12lxlokYhURNUe+S6jm8A47TrqSRSJxaQJZRlfGJksZ0LJa8eUaLDCviBQEoE= =6QWZ -----END PGP SIGNATURE----- Merge tag 'v4.11-rc7' into drm-next Backmerge Linux 4.11-rc7 from Linus tree, to fix some conflicts that were causing problems with the rerere cache in drm-tip.	2017-04-19 11:07:14 +10:00
Paul E. McKenney	5f0d5a3ae7	mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU A group of Linux kernel hackers reported chasing a bug that resulted from their assumption that SLAB_DESTROY_BY_RCU provided an existence guarantee, that is, that no block from such a slab would be reallocated during an RCU read-side critical section. Of course, that is not the case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire slab of blocks. However, there is a phrase for this, namely "type safety". This commit therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order to avoid future instances of this sort of confusion. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <linux-mm@kvack.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> [ paulmck: Add comments mentioning the old name, as requested by Eric Dumazet, in order to help people familiar with the old name find the new one. ] Acked-by: David Rientjes <rientjes@google.com>	2017-04-18 11:42:36 -07:00
Chris Wilson	e22d8e3c69	drm/i915: Treat WC a separate cache domain When discussing a new WC mmap, we based the interface upon the assumption that GTT was fully coherent. How naive! Commits `3b5724d702` ("drm/i915: Wait for writes through the GTT to land before reading back") and `ed4596ea99` ("drm/i915/guc: WA to address the Ringbuffer coherency issue") demonstrate that writes through the GTT are indeed delayed and may be overtaken by direct WC access. To be safe, if userspace is mixing WC mmaps with other potential GTT access (pwrite, GTT mmaps) it should use set_domain(WC). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96563 Testcase: igt/gem_pwrite/small-gtt* Testcase: igt/drv_selftest/coherency Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170412110111.26626-2-chris@chris-wilson.co.uk	2017-04-12 12:35:17 +01:00
Chris Wilson	ef74921bc6	drm/i915: Combine write_domain flushes to a single function In the next patch, we will introduce a new cache domain for differentiating between GTT access and direct WC access. This will require us to include WC in our write_domain flushes. Rather than duplicate a third function, combine the existing two into one and flushing WC writes will then be automatically handled as well. v2: Be smarter and clearer by passing in the write domains to flush (Joonas) v3: One missed ~ in v2 conversion Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170412110111.26626-1-chris@chris-wilson.co.uk	2017-04-12 12:35:16 +01:00
Chris Wilson	1d39f28170	drm/i915: Rename intel_engine_cs.exec_id to uabi_id We want to refer to the index of the engine consistently throughout the userspace ABI. We already have such an index through the execbuffer engine specifier, that needs to be able to refer to each engine specifically, so rename it the index to uabi_id to reflect its generality beyond execbuf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk	2017-04-11 14:31:39 +01:00
Sagar Arun Kamble	63987bfebd	drm/i915: Suspend GuC prior to GPU Reset during GEM suspend i915 is currently doing a full GPU reset at the end of i915_gem_suspend() followed by GuC suspend in i915_drm_suspend(). This GPU reset clobbers the GuC, causing the suspend request to then fail, leaving the GuC in an undefined state. We need to tell the GuC to suspend before we do the direct intel_gpu_reset(). v2: Commit message update. (Chris, Daniele) Fixes: `1c777c5d1d` ("drm/i915/hsw: Fix GPU hang during resume from S3-devices state") Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491387710-20553-1-git-send-email-sagar.a.kamble@intel.com Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit `fd08923384`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-04-11 13:25:12 +03:00
Chris Wilson	f2be9d6833	drm/i915: Insert cond_resched() into i915_gem_free_objects As we may have very many objects to free, check to see if the task needs to be rescheduled whilst freeing them. Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 13:37:41 +01:00
Chris Wilson	5ad08be7e3	drm/i915: Break up long runs of freeing objects Before freeing the next batch of objects from the worker, check if the worker's timeslice has expired and if so, defer the next batch to the next invocation of the worker. Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 13:37:41 +01:00
Joonas Lahtinen	e92075ff7d	drm/i915: Simplify shrinker locking By using the same structure for both interruptible and uninterruptible locking in shrinker code, combined with the information that mm.interruptible is only being written to, the code can be greatly simplified. Also removing the i915_gem_ prefix from the locking functions so that nobody in their wildest dreams considers exporting them. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1491562175-27680-1-git-send-email-joonas.lahtinen@linux.intel.com	2017-04-07 14:33:39 +03:00
Chris Wilson	17b93c40e7	drm/i915: Drain any freed objects prior to hibernation As we call into the shrinker during freeze, we may have freed more objects since we idled during i915_gem_suspend. Make sure we flush the i915_gem_free_objects worker prior to saving the unwanted pages into the hibernation image. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 12:26:00 +01:00
Chris Wilson	d0aa301ae5	drm/i915: The shrinker already acquires struct_mutex, so call it unlocked The shrinker is prepared to be called unlocked (and at other times with struct_mutex held for DIRECT_RECLAIM) so we can skip acquiring the struct_mutex prior to calling the shrinker during freeze. This improves our ability to shrink as we can be more aggressive when we know the caller isn't holding struct_mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 12:25:11 +01:00
Chris Wilson	b268d9fe0f	drm/i915: Use the right mapping_gfp_mask for final shmem allocation Many sightings report the greater prevalence of allocation failures. This is all due to the incorrect use of mapping_gfp_constraint(), so remove it in favour of just querying the mapping_gfp_mask() which are the exact gfp_t we wanted in the first place. We still do expect a higher chance of reporting ENOMEM, as that is the intention of using __GFP_NORETRY -- to fail rather than oom after having reclaimed from our bo caches, and having done a direct\|kswapd reclaim pass. Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100594 Fixes: `24f8e00a8a` ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170405221514.23251-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-06 13:02:02 +01:00
Sagar Arun Kamble	fd08923384	drm/i915: Suspend GuC prior to GPU Reset during GEM suspend i915 is currently doing a full GPU reset at the end of i915_gem_suspend() followed by GuC suspend in i915_drm_suspend(). This GPU reset clobbers the GuC, causing the suspend request to then fail, leaving the GuC in an undefined state. We need to tell the GuC to suspend before we do the direct intel_gpu_reset(). v2: Commit message update. (Chris, Daniele) Fixes: `1c777c5d1d` ("drm/i915/hsw: Fix GPU hang during resume from S3-devices state") Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491387710-20553-1-git-send-email-sagar.a.kamble@intel.com Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-06 11:10:01 +01:00
Tvrtko Ursulin	7a3ee5deb4	drm/i915: Remove user-triggerable WARN from i915_gem_object_create Since this can be triggered by simply attempting a huge object, a WARN_ON is not appropriate. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170330163130.24141-1-tvrtko.ursulin@linux.intel.com	2017-04-04 09:39:13 +01:00
Chris Wilson	25112b64b3	drm/i915: Wait for all engines to be idle as part of i915_gem_wait_for_idle() Make i915_gem_wait_for_idle() be a little heavier in order to try and guarantee that the GPU is indeed idle (by checking each engine individually is idle, i.e. all writes are complete and the rings stopped) after waiting for in-flight requests to be completed. v2: And return the final error. v3: Break the wait_for() out from under the WARN -- the macro expansion is hideous and unreadable in the warning message v4: If wait_for_engine() fails the result is catastrophic, mark the device as wedged and wait for the repair team. References: https://bugs.freedesktop.org/show_bug.cgi?id=98836 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-31 12:07:17 +01:00
Chris Wilson	72022a705e	drm/i915: Move retire-requests into i915_gem_wait_for_idle() As we now distinguish everywhere that can call i915_gem_retire_requests() following a successful wait_for_idle, we can remove the duplication by moving that call into i915_gem_wait_for_idle() itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-31 12:03:46 +01:00
Chris Wilson	8490ae207f	drm/i915: Suppress busy status for engines if wedged If the driver is wedged, HW state may be very inconsistent and report that it is still busy, even though we have stopped using it. This can lead to a double ERROR rather than a graceful cleanup after wedging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-2-chris@chris-wilson.co.uk	2017-03-30 17:58:44 +01:00
Chris Wilson	2c170af76c	drm/i915: Do request retirement before marking engines as wedged As we declare an engine as wedged, we mark all of its active requests as in error. However, we don't want to mark successfully completed requests as in error, which requires us to retire those requests first. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-1-chris@chris-wilson.co.uk	2017-03-30 17:56:21 +01:00
Oscar Mateo	b8991403ea	drm/i915/guc: Take enable_guc_loading check out of GEM core code The should happen as soon as possible, but always within the logic that depends on it (and not interrupting the top-level driver control flow). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1490720027-23234-1-git-send-email-oscar.mateo@intel.com	2017-03-30 13:11:40 +03:00
Chris Wilson	e2a2aa36a5	drm/i915: Check we have an wake device before flushing GTT writes We can assume that if the device is asleep then all pending GTT writes will have been posted, and so we can defer the flush from i915_gem_object_flush_gtt_write_domain() [ 1957.462568] WARNING: CPU: 0 PID: 6132 at drivers/gpu/drm/i915/intel_drv.h:1742 fwtable_read32+0x123/0x150 [i915] [ 1957.462582] RPM wakelock ref not held during HW access [ 1957.462583] Modules linked in: i915 intel_gtt drm_kms_helper prime_numbers [ 1957.462607] CPU: 0 PID: 6132 Comm: gem_concurrent_ Tainted: G U 4.11.0-rc1+ #464 [ 1957.462619] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 1957.462630] Call Trace: [ 1957.462646] dump_stack+0x4d/0x6f [ 1957.462657] __warn+0xc1/0xe0 [ 1957.462667] warn_slowpath_fmt+0x4a/0x50 [ 1957.462709] fwtable_read32+0x123/0x150 [i915] [ 1957.462750] i915_gem_object_flush_gtt_write_domain+0x43/0x70 [i915] [ 1957.462791] i915_gem_object_set_to_cpu_domain+0x46/0xa0 [i915] [ 1957.462831] i915_gem_set_domain_ioctl+0x15d/0x220 [i915] [ 1957.462843] drm_ioctl+0x1d7/0x440 [ 1957.462885] ? i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0 [i915] [ 1957.462896] ? pick_next_task_fair+0x436/0x440 [ 1957.462906] ? mntput+0x1f/0x30 [ 1957.462915] do_vfs_ioctl+0x8f/0x5c0 [ 1957.462925] ? __schedule+0x16f/0x5f0 [ 1957.462935] ? ____fput+0x9/0x10 [ 1957.462943] SyS_ioctl+0x3c/0x70 [ 1957.462952] entry_SYSCALL_64_fastpath+0x17/0x98 [ 1957.462961] RIP: 0033:0x7fc542179ca7 [ 1957.462968] RSP: 002b:00007ffeef12ff98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1957.462982] RAX: ffffffffffffffda RBX: 00007ffeef1301d0 RCX: 00007fc542179ca7 [ 1957.462990] RDX: 00007ffeef12ffd0 RSI: 00000000400c645f RDI: 0000000000000003 [ 1957.462999] RBP: 0000000000000003 R08: 000055f433bc7c40 R09: 000000000000002c [ 1957.463006] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000018 [ 1957.463015] R13: 000055f432c89d20 R14: 000055f432c87690 R15: 0000000000000000 Fixes: `3b5724d702` ("drm/i915: Wait for writes through the GTT to land before reading back") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170323150053.28582-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-27 12:48:44 +01:00
Oscar Mateo	3950bf3dbf	drm/i915/guc: Add onion teardown to the GuC setup Starting with intel_guc_loader, down to intel_guc_submission and finally to intel_guc_log. v2: - Null execbuf client outside guc_client_free (Daniele) - Assert if things try to get allocated twice (Daniele/Joonas) - Null guc->log.buf_addr when destroyed (Daniele) - Newline between returning success and error labels (Joonas) - Remove some unnecessary comments (Joonas) - Keep guc_log_create_extras naming convention (Joonas) - Helper function guc_log_has_extras (Joonas) - No need for separate relay_channel create/destroy. It's just another extra. - No need to nullify guc->log.flush_wq when destroyed (Joonas) - Hoist the check for has_extras out of guc_log_create_extras (Joonas) - Try to do i915_guc_log_register/unregister calls (kind of) symmetric (Daniele) - Make sure initel_guc_fini is not called before init is ever called (Daniele) v3: - Remove unnecessary parenthesis (Joonas) - Check for logs enabled on debugfs registration - Rebase on top of Tvrtko's "Fix request re-submission after reset" v4: - Rebased - Comment around enabling/disabling interrupts inside GuC logging (Joonas) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-23 14:57:36 +02:00
Chris Wilson	40149f00fb	drm/i915: Actually pass the reclaim gfp_t along to shmemfs! Words cannot describe the embarrassment at creating a new gfp_t relaim to only prevent the oomkiller but allow direct\|kswapd reclaim, and then not use it in the shmem_read_mapping_page_gfp(). Fixes: `24f8e00a8a` ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170322223447.7493-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-23 09:30:20 +00:00
Chris Wilson	24f8e00a8a	drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations Since gfx allocations tend to be large, unmovable and disposable, report the allocation failure back to userspace as an ENOMEM rather than incur the oomkiller. We have already tried to make room by purging our own cached gfx objects, and the oomkiller doesn't attribute ownership of gfx objects so will likely pick the wrong candidate. Instead, let userspace see the ENOMEM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170322110521.29930-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-03-22 19:26:46 +00:00
Chris Wilson	54ec12af2f	drm/i915: Skip force-wake for uncached mmio flush of GGTT writes The trick of using an uncached mmio read to ensure that the GGTT writes are flushed does not require us to do the forcewake dance, so avoid it in the hope of reducing the frequency that we do keep the device forced awake. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170318104257.694-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-03-20 10:45:49 +00:00
Chris Wilson	be062fa427	drm/i915: Initialise i915_gem_object_create_from_data() directly Use pagecache_write to avoid shmemfs clearing the pages prior to us immediately overwriting them with our data. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-2-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-03-17 22:55:56 +00:00
Chris Wilson	ce8ff099c4	drm/i915: i915_gem_object_create_from_data() doesn't require struct_mutex Both object creation and backing storage page allocation do not require struct_mutex, so do not require the caller to take it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-03-17 22:54:06 +00:00
Chris Wilson	2e8f9d3229	drm/i915: Restore engine->submit_request before unwedging When we wedge the device, we override engine->submit_request with a nop to ensure that all in-flight requests are marked in error. However, igt would like to unwedge the device to test -EIO handling. This requires us to flush those in-flight requests and restore the original engine->submit_request. v2: Use a vfunc to unify enabling request submission to engines v3: Split new vfunc to a separate patch. v4: Make the wait interruptible -- the third party fences we wait upon may be indefinitely broken, so allow the reset to be aborted. Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Testcase: igt/gem_eio Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v3 Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-3-chris@chris-wilson.co.uk	2017-03-16 17:17:14 +00:00
Chris Wilson	8c185ecaf4	drm/i915: Split I915_RESET_IN_PROGRESS into two flags I915_RESET_IN_PROGRESS is being used for both signaling the requirement to i915_mutex_lock_interruptible() to avoid taking the struct_mutex and to instruct a waiter (already holding the struct_mutex) to perform the reset. To allow for a little more coordination, split these two meaning into a couple of distinct flags. I915_RESET_BACKOFF tells i915_mutex_lock_interruptible() not to acquire the mutex and I915_RESET_HANDOFF tells the waiter to call i915_reset(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-1-chris@chris-wilson.co.uk	2017-03-16 17:17:10 +00:00
Arkadiusz Hiler	6cd5a72c35	drm/i915/guc: Simplify intel_guc_init_hw() Current version of intel_guc_init_hw() does a lot: - cares about submission - loads huc - implement WA This change offloads some of the logic to intel_uc_init_hw(), which now cares about the above. v2: rename guc_hw_reset and fix typo in define name (M. Wajdeczko) v3: rename once again v4: remove spurious comments and add some style (J. Lahtinen) v5: flow changes, got rid of dead checks (M. Wajdeczko) v6: rebase v7: rebase & onion teardown (J. Lahtinen) Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	882d1db09b	drm/i915/uc: Rename intel_?uc_{setup, load}() to _init_hw() GuC historically has two "startup" functions called _init() and _setup() Then HuC came with it's _init() and _load(). This commit renames intel_guc_setup() and intel_huc_load() to *uc_init_hw() as they called from the i915_gem_init_hw(). The aim is to be consistent in that entry points called during particular driver init phases (e.g. init_hw) are all suffixed by that phase. When reading the leaf functions, it should be clear at what stage during the driver load it is called and therefore what operations are legal at that point. Also, since the functions start with intel_guc and intel_huc they take appropiate structure. v2: commit message update (Chris Wilson) v3: change taken parameters to be more "semantic" (M. Wajdeczko) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Kenneth Graunke	0f5418e564	drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters. This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0 (indicating the optional feature is not supported), and makes execbuf always return -EINVAL if the flags are used. Apparently, no userspace ever shipped which used this optional feature: I checked the git history of Mesa, xf86-video-intel, libva, and Beignet, and there were zero commits showing a use of these flags. Kernel commit `72bfa19c8d` apparently introduced the feature prematurely. According to Chris, the intention was to use this in cairo-drm, but "the use was broken for gen6", so I don't think it ever happened. 'relative_constants_mode' has always been tracked per-device, but this has actually been wrong ever since hardware contexts were introduced, as the INSTPM register is saved (and automatically restored) as part of the render ring context. The software per-device value could therefore get out of sync with the hardware per-context value. This meant that using them is actually unsafe: a client which tried to use them could damage the state of other clients, causing the GPU to interpret their BO offsets as absolute pointers, leading to bogus memory reads. These flags were also never ported to execlist mode, making them no-ops on Gen9+ (which requires execlists), and Gen8 in the default mode. On Gen8+, userspace can write these registers directly, achieving the same effect. On Gen6-7.5, it likely makes sense to extend the command parser to support them. I don't think anyone wants this on Gen4-5. Based on a patch by Dave Gordon. v3: Return -ENODEV for the getparam, as this is what we do for other obsolete features. Suggested by Chris Wilson. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313170433.26843-1-chris@chris-wilson.co.uk (cherry picked from commit `ef0f411f51`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:28:25 +02:00
Chris Wilson	4565bf58d4	drm/i915: Disable engine->irq_tasklet around resets When we restart the engines, and we have active requests, a request on the first engine may complete and queue a request to the second engine before we try to restart the second engine. That queueing of the request may race with the engine to restart, and so may corrupt the current state. Disabling the engine->irq_tasklet prevents the two paths from writing into ELSP simultaneously (and modifyin the execlists_port[] at the same time). Include fixup `1d309634bc` ("drm/i915: Kill the tasklet then disable") Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Testcase: igt/gem_exec_fence/await-hang Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-3-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/20170313165958.13970-2-chris@chris-wilson.co.uk (cherry picked from commit `1f7b847d72`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:26:44 +02:00
Chris Wilson	da9a796f54	drm/i915: Split GEM resetting into 3 phases Currently we do a reset prepare/finish around the call to reset the GPU, but it looks like we need a later stage after the hw has been reinitialised to allow GEM to restart itself. Start by splitting the 2 GEM phases into 3: prepare - before the reset, check if GEM recovered, then stop GEM reset - after the reset, update GEM bookkeeping finish - after the re-initialisation following the reset, restart GEM Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-2-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/20170313165958.13970-1-chris@chris-wilson.co.uk (cherry picked from commit `d802709313`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:11:49 +02:00
Chris Wilson	7f5f95d8ac	drm/i915: Move whole object to CPU domain for coherent shmem access If the object is coherent, we can simply update the cache domain on the whole object rather than calculate the before/after clflushes. The advantage is that we then get correct tracking of ellided flushes when changing coherency later. Testcase: igt/gem_pwrite_snooped Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310000942.11661-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-13 11:16:09 +00:00
Chris Wilson	4e6fdafa7a	drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl Before we instantiate/pin the backing store for our use, we can prepopulate the shmemfs filp efficiently using a write into the pagecache. We avoid the penalty of instantiating all the pages, important if the user is just writing to a few and never uses the object on the GPU, and using a direct write into shmemfs allows it to avoid the cost of retrieving a page (mostly the clear-before-use, but in theory we could curtail swapin) before it is overwritten. This can be extended later to provide additional specialisation for other backends (other than shmemfs). For now it provides a defense against very large write-only allocations from exhausting all of system memory. v2: Smelling fixes. Fixes: `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") References: https://bugs.freedesktop.org/show_bug.cgi?id=99107 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170307120338.7277-2-chris@chris-wilson.co.uk (cherry picked from commit `7c55e2c577`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-09 10:46:07 +02:00
Chris Wilson	0d9dc306e1	drm/i915: Store a permanent error in obj->mm.pages Once the object has been truncated, it is unrecoverable. To facilitate detection of this state store the error in obj->mm.pages. This is required for the next patch which should be applied to v4.10 (via stable), so we also need to mark this patch for backporting. In that regard, let's consider this to be a fix/improvement too. v2: Avoid dereferencing the ERR_PTR when freeing the object. Fixes: `1233e2db19` ("drm/i915: Move object backing storage manipulation to its own locking") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Link: http://patchwork.freedesktop.org/patch/msgid/20170307132031.32461-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `4e5462ee84`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-09 10:45:58 +02:00
Chris Wilson	89cf83d4e0	drm/i915: Squelch any ktime/jiffie rounding errors for wait-ioctl We wait upon jiffies, but report the time elapsed using a high-resolution timer. This discrepancy can lead to us timing out the wait prior to us reporting the elapsed time as complete. This restores the squelching lost in commit `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers"). Fixes: `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170216125441.30923-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `c1d2061b28`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-09 10:42:44 +02:00
Chris Wilson	03d1cac6ee	drm/i915: Avoiding recursing on ww_mutex inside shrinker We have to avoid taking ww_mutex inside the shrinker as we use it as a plain mutex type and so need to avoid recursive deadlocks: [ 602.771969] ================================= [ 602.771970] [ INFO: inconsistent lock state ] [ 602.771973] 4.10.0gpudebug+ #122 Not tainted [ 602.771974] --------------------------------- [ 602.771975] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 602.771978] kswapd0/40 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 602.771979] (reservation_ww_class_mutex){+.+.?.}, at: [<ffffffffa054680a>] i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772020] {RECLAIM_FS-ON-W} state was registered at: [ 602.772024] mark_held_locks+0x76/0x90 [ 602.772026] lockdep_trace_alloc+0xb8/0xc0 [ 602.772028] __kmalloc_track_caller+0x5d/0x130 [ 602.772031] krealloc+0x89/0xb0 [ 602.772033] reservation_object_reserve_shared+0xaf/0xd0 [ 602.772055] i915_gem_do_execbuffer.isra.35+0x1413/0x18b0 [i915] [ 602.772075] i915_gem_execbuffer2+0x10e/0x1d0 [i915] [ 602.772078] drm_ioctl+0x291/0x480 [ 602.772079] do_vfs_ioctl+0x695/0x6f0 [ 602.772081] SyS_ioctl+0x3c/0x70 [ 602.772084] entry_SYSCALL_64_fastpath+0x18/0xad [ 602.772085] irq event stamp: 5197423 [ 602.772088] hardirqs last enabled at (5197423): [<ffffffff8116751d>] kfree+0xdd/0x170 [ 602.772091] hardirqs last disabled at (5197422): [<ffffffff811674f9>] kfree+0xb9/0x170 [ 602.772095] softirqs last enabled at (5190992): [<ffffffff8107bfe1>] __do_softirq+0x221/0x280 [ 602.772097] softirqs last disabled at (5190575): [<ffffffff8107c294>] irq_exit+0x64/0xc0 [ 602.772099] other info that might help us debug this: [ 602.772100] Possible unsafe locking scenario: [ 602.772101] CPU0 [ 602.772101] ---- [ 602.772102] lock(reservation_ww_class_mutex); [ 602.772104] <Interrupt> [ 602.772105] lock(reservation_ww_class_mutex); [ 602.772107] * DEADLOCK * [ 602.772109] 2 locks held by kswapd0/40: [ 602.772110] #0: (shrinker_rwsem){++++..}, at: [<ffffffff811337b5>] shrink_slab.constprop.62+0x35/0x280 [ 602.772116] #1: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0553957>] i915_gem_shrinker_lock+0x27/0x60 [i915] [ 602.772141] stack backtrace: [ 602.772144] CPU: 2 PID: 40 Comm: kswapd0 Not tainted 4.10.0gpudebug+ #122 [ 602.772145] Hardware name: LENOVO 42433ZG/42433ZG, BIOS 8AET64WW (1.44 ) 07/26/2013 [ 602.772147] Call Trace: [ 602.772151] dump_stack+0x68/0xa1 [ 602.772153] print_usage_bug+0x1d4/0x1f0 [ 602.772155] mark_lock+0x390/0x530 [ 602.772157] ? print_irq_inversion_bug+0x200/0x200 [ 602.772159] __lock_acquire+0x405/0x1260 [ 602.772181] ? i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772183] lock_acquire+0x60/0x80 [ 602.772205] ? i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772207] mutex_lock_nested+0x69/0x760 [ 602.772229] ? i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772231] ? kfree+0xdd/0x170 [ 602.772253] ? i915_gem_object_wait+0x163/0x410 [i915] [ 602.772255] ? trace_hardirqs_on_caller+0x18d/0x1c0 [ 602.772256] ? trace_hardirqs_on+0xd/0x10 [ 602.772278] i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772300] i915_gem_object_unbind+0x5e/0x130 [i915] [ 602.772323] i915_gem_shrink+0x22d/0x3d0 [i915] [ 602.772347] i915_gem_shrinker_scan+0x3f/0x80 [i915] [ 602.772349] shrink_slab.constprop.62+0x1ad/0x280 [ 602.772352] shrink_node+0x52/0x80 [ 602.772355] kswapd+0x427/0x5c0 [ 602.772358] kthread+0x122/0x130 [ 602.772360] ? try_to_free_pages+0x270/0x270 [ 602.772362] ? kthread_stop+0x70/0x70 [ 602.772365] ret_from_fork+0x2e/0x40 v2: Add commentary about the pruning being opportunistic Reported-by: Jan Nordholz <jckn@gmx.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99977#c10 Fixes: `e54ca97747` ("drm/i915: Remove completed fences after a wait") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170308132629.7987-1-chris@chris-wilson.co.uk	2017-03-08 20:42:17 +00:00

... 4 5 6 7 8 ...

2021 commits