REGCACHE_MAPLE needs to allocate memory for regmap operations.
This results in lockdep splats if used with fast_io since fast_io uses
spinlocks for locking.
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 167, name: kunit_try_catch
preempt_count: 1, expected: 0
1 lock held by kunit_try_catch/167:
#0: 838e9c10 (regmap_kunit:86:(config)->lock){....}-{2:2}, at: regmap_lock_spinlock+0x14/0x1c
irq event stamp: 146
hardirqs last enabled at (145): [<8078bfa8>] crng_make_state+0x1a0/0x294
hardirqs last disabled at (146): [<80c5f62c>] _raw_spin_lock_irqsave+0x7c/0x80
softirqs last enabled at (0): [<80110cc4>] copy_process+0x810/0x216c
softirqs last disabled at (0): [<00000000>] 0x0
CPU: 0 PID: 167 Comm: kunit_try_catch Tainted: G N 6.5.0-rc1-00028-gc4be22597a36-dirty #6
Hardware name: Generic DT based system
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x38/0x5c
dump_stack_lvl from __might_resched+0x188/0x2d0
__might_resched from __kmem_cache_alloc_node+0x1f4/0x258
__kmem_cache_alloc_node from __kmalloc+0x48/0x170
__kmalloc from regcache_maple_write+0x194/0x248
regcache_maple_write from _regmap_write+0x88/0x140
_regmap_write from regmap_write+0x44/0x68
regmap_write from basic_read_write+0x8c/0x27c
basic_read_write from kunit_generic_run_threadfn_adapter+0x1c/0x28
kunit_generic_run_threadfn_adapter from kthread+0xf8/0x120
kthread from ret_from_fork+0x14/0x3c
Exception stack(0x881a5fb0 to 0x881a5ff8)
5fa0: 00000000 00000000 00000000 00000000
5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Use map->alloc_flags instead of GFP_KERNEL for memory allocations to fix
the problem.
Fixes: f033c26de5 ("regmap: Add maple tree based register cache")
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230720172021.2617326-1-linux@roeck-us.net
Signed-off-by: Mark Brown <broonie@kernel.org>
For register maps where we can write multiple values in a single bus
operation it is generally much faster to do so. Improve the performance of
maple tree cache syncs on such devices by identifying blocks of adjacent
registers that need to be written out and combining them into a single
operation.
Combining writes does mean that we need to allocate a scratch buffer and
format the data into it but it is expected that for most cases where caches
are in use the cost of I/O will be much greater than the cost of doing the
allocation and format.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230609-regcache-maple-sync-raw-v1-1-8ddeb4e2b9ab@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmSGPiIeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGxWcH/AoUClfYYbYZYaoI
spQiFQiZEdniE3W/36GKtfJok4ur1Aogb/q99KvfQxFGkH+aLbTgXxJlTqrlmfWk
Kj4uVUecQl2OHOuEM6AVFyK6S4xdTjWS7TKMGIVe4LDQCAsFidgcYbcVrAhQ+s1s
eAOhlfrKMzVgRQ0KsJkn60SoSbVP6RyRfu+QQu24GfNtD8SzN8y0adB1PYXGVWmr
WW8+MqfZ1WIkn+NWW5bn3AEz8AYjQC66EvcWhlAWmyyBuUjIoNhpIgPHp6Vm7s+a
oPvw/VXwDX8NzpN0XYI+eVpsgClWtJ+HUcCSeuExTxrWu7Bewp6MMLeCjvi/lPpe
zKRDgnw=
=hx4k
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmSHIsUACgkQJNaLcl1U
h9AbDQf+OnspdI5uW+X7ObBdB7KvPAXzDd7rMsXOTdp13bciSmaOzl74f0jPNWYU
GVo3a84eIU0YsdBP7M9+42vyUDQctgoBa7yyxZ+hW8gZQbXwgOZGmvWMncankm9n
hTlUbuLiowPzzjLERMDS0T2iJF0aFG7eB8Xghk/Su38HOCJI6fBx7sQlvt+aFK6J
/WzUuAqHng+6T7FHqxerGz/XIfvnRdS8qC9FZ7ySurmopn2AGY4YlifSGmSlPuFt
MM+nGNTOYn8scBAPRQlVCOwv7XOg6XqEvP5UNe8KpKN1rHfgP/B0X3ZbyE58bpPI
lqZzTXmY/dWD27FUc+3779EGLZ+L8w==
=PyzM
-----END PGP SIGNATURE-----
regmap: Merge up v6.4-rc6
The fix for maple tree RCU locking on sync is a dependency for the
block sync code for the maple tree.
Currently we use the normal single register write function to load the
default values into the cache, resulting in a large number of reallocations
when there are blocks of registers as we extend the memory region we are
using to store the values. Instead scan through the list of defaults for
blocks of adjacent registers and do a single allocation and insert for each
such block. No functional change.
We do not take advantage of the maple tree preallocation, this is purely at
the regcache level. It is not clear to me yet if the maple tree level would
help much here or if we'd have more overhead from overallocating and then
freeing maple tree data.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230523-regcache-maple-load-defaults-v1-1-0c04336f005d@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Unfortunately the maple tree requires us to explicitly lock it so we need
to take the RCU read lock while iterating. When syncing this means that we
end up trying to write out register values while holding the RCU read lock
which triggers lockdep issues since that is an atomic context but most
buses can't be used in atomic context. Pause the iteration and drop the
lock for each register we check to avoid this.
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230523-regcache-maple-sync-lock-v1-1-530e4d68dfab@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Liam recommends using mas_walk() instead of mas_find() for our use case so
let's do that, it avoids some minor overhead associated with being able to
restart the operation which we don't need since we do a simple search.
Suggested-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230403-regmap-maple-walk-fine-v2-1-c07371c8a867@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Doing the dance to drop the maple tree's internal spinlock means we need
multiple exit paths in our error handling.
Reported-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230403-regmap-maple-unlock-v1-1-89998991b16c@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The current state of the art for sparse register maps is the
rbtree cache. This works well for most applications but isn't
always ideal for sparser register maps since the rbtree can get
deep, requiring a lot of walking. Fortunately the kernel has a
data structure intended to address this very problem, the maple
tree. Provide an initial implementation of a register cache
based on the maple tree to start taking advantage of it.
The entries stored in the maple tree are arrays of register
values, with the maple tree keys holding the register addresses.
We store data in host native format rather than device native
format as we do for rbtree, this will be a benefit for devices
where we don't marshal data within regmap and simplifies the code
but will result in additional CPU overhead when syncing the cache
on devices where we do marshal data in regmap.
This should work well for a lot of devices, though there's some
additional areas that could be looked at such as caching the
last accessed entry like we do for rbtree and trying to minimise
the maple tree level locking. We should also use bulk writes
rather than single register writes when resyncing the cache where
possible, even if we don't store in device native format.
Very small register maps may continue to to better with rbtree
longer term.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230325-regcache-maple-v3-2-23e271f93dc7@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>