Commit graph

83 commits

Author SHA1 Message Date
Mark Rutland
dda5f312bb locking/atomic: arm: fix sync ops
The sync_*() ops on arch/arm are defined in terms of the regular bitops
with no special handling. This is not correct, as UP kernels elide
barriers for the fully-ordered operations, and so the required ordering
is lost when such UP kernels are run under a hypervisor on an SMP
system.

Fix this by defining sync ops with the required barriers.

Note: On 32-bit arm, the sync_*() ops are currently only used by Xen,
which requires ARMv7, but the semantics can be implemented for ARMv6+.
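
As an illustration of the idea only (the function name below is hypothetical
and the kernel's actual fix may be structured differently), a fully-ordered
sync op can be built by bracketing the underlying bitop with explicit DMBs
instead of smp_mb(), which compiles away on UP builds:

	/* Hedged sketch; assumes ARMv7+ for "dmb ish" and kernel context
	 * for test_and_set_bit() (linux/bitops.h). */
	static inline int sync_test_and_set_bit_sketch(int nr,
						       volatile unsigned long *p)
	{
		int old;

		__asm__ __volatile__("dmb ish" ::: "memory"); /* order prior accesses */
		old = test_and_set_bit(nr, p);                /* underlying bitop */
		__asm__ __volatile__("dmb ish" ::: "memory"); /* order later accesses */
		return old;
	}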

Fixes: e54d2f6152 ("xen/arm: sync_bitops")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230605070124.3741859-2-mark.rutland@arm.com
2023-06-05 09:57:13 +02:00
Ard Biesheuvel
c76c6c4ecb ARM: 9294/2: vfp: Fix broken softirq handling with instrumentation enabled
Commit 62b95a7b44 ("ARM: 9282/1: vfp: Manipulate task VFP state with
softirqs disabled") replaced the en/disable preemption calls inside the
VFP state handling code with en/disabling of soft IRQs, which is
necessary to allow kernel use of the VFP/SIMD unit when handling a soft
IRQ.

Unfortunately, when lockdep is enabled (or other instrumentation that
enables TRACE_IRQFLAGS), the disable path implemented in asm fails to
perform the lockdep and RCU related bookkeeping, resulting in spurious
warnings and other badness.

So let's rework the VFP entry code a little bit so we can make the
local_bh_disable() call from C, with all the instrumentation that
happens to have been configured. Calling local_bh_enable() can be done
from asm, as it is a simple wrapper around __local_bh_enable_ip(), which
is always a callable function.
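
Roughly, the reworked shape could look like the sketch below; the names and
signature are purely illustrative, not the kernel's actual entry code. A thin
C wrapper performs the instrumented local_bh_disable() and then hands off to
the asm VFP handler, which re-enables softirqs itself via
__local_bh_enable_ip().

	/* Illustrative only: softirqs are disabled from C so lockdep/RCU
	 * bookkeeping runs; the asm side re-enables them before returning. */
	asmlinkage void vfp_entry_sketch(u32 trigger, struct pt_regs *regs)
	{
		local_bh_disable();
		vfp_state_handler_asm(trigger, regs);	/* hypothetical asm routine */
	}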

Link: https://lore.kernel.org/all/ZBBYCSZUJOWBg1s8@localhost.localdomain/

Fixes: 62b95a7b44 ("ARM: 9282/1: vfp: Manipulate task VFP state with softirqs disabled")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-04-12 10:04:56 +01:00
Ard Biesheuvel
62b95a7b44 ARM: 9282/1: vfp: Manipulate task VFP state with softirqs disabled
In a subsequent patch, we will relax the kernel mode NEON policy, and
permit kernel mode NEON to be used not only from task context, as is
permitted today, but also from softirq context.

Given that softirqs may trigger over the back of any IRQ unless they are
explicitly disabled, we need to address the resulting races in the VFP
state handling, by disabling softirq processing in two distinct but
related cases:
- kernel mode NEON will leave the FPU disabled after it completes, so
  any kernel code sequence that enables the FPU and subsequently accesses
  its registers needs to disable softirqs until it completes;
- kernel_neon_begin() will preserve the userland VFP state in memory,
  and if it interrupts the ordinary VFP state preserve sequence, the
  latter will resume execution with the VFP registers corrupted, and
  happily continue saving them to memory.

Given that disabling softirqs also disables preemption, we can replace
the existing preempt_disable/enable occurrences in the VFP state
handling asm code with new macros that dis/enable softirqs instead.
In the VFP state handling C code, add local_bh_disable/enable() calls
in those places where the VFP state is preserved.

One thing to keep in mind is that, once we allow NEON use in softirq
context, the result of any such interruption is that the FPEXC_EN bit in
the FPEXC register will be cleared, and vfp_current_hw_state[cpu] will
be NULL. This means that any sequence that [conditionally] clears
FPEXC_EN and/or sets vfp_current_hw_state[cpu] to NULL does not need to
run with softirqs disabled, as the result will be the same. Furthermore,
the handling of THREAD_NOTIFY_SWITCH is guaranteed to run with IRQs
disabled, and so it does not need protection from softirq interruptions
either.
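
In the C code, a guarded section ends up looking roughly like the sketch
below (simplified; the helper names are borrowed from the VFP code but the
body is not the exact kernel routine): the lazy state save is bracketed by
local_bh_disable()/local_bh_enable() so a softirq using kernel-mode NEON
cannot corrupt it halfway through.

	/* Simplified sketch of a softirq-safe VFP state save. */
	static void vfp_sync_hwstate_sketch(struct thread_info *thread)
	{
		local_bh_disable();		/* keep softirq NEON users out */
		if (vfp_state_in_hw(raw_smp_processor_id(), thread))
			vfp_save_state(&thread->vfpstate, fmrx(FPEXC));
		local_bh_enable();
	}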

Tested-by: Martin Willi <martin@strongswan.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-01-11 16:21:20 +00:00
Russell King (Oracle)
2511d032f0 ARM: findbit: operate by words
Convert the implementations to operate on words rather than bytes
which makes bitmap searching faster.
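
A generic C illustration of the word-at-a-time idea (not the ARM assembly
itself): scan whole machine words and only compute a bit index once a
non-zero word is found.

	#include <limits.h>

	#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

	/* Return the index of the first set bit, or nbits if none is set. */
	static unsigned long find_first_set_bit(const unsigned long *map,
						unsigned long nbits)
	{
		unsigned long i, pos;

		for (i = 0; i * BITS_PER_LONG < nbits; i++) {
			if (map[i]) {	/* skip 32/64 bits at a time */
				pos = i * BITS_PER_LONG + __builtin_ctzl(map[i]);
				return pos < nbits ? pos : nbits;
			}
		}
		return nbits;
	}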

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-14 12:00:58 +00:00
Ard Biesheuvel
508074607c ARM: 9195/1: entry: avoid explicit literal loads
ARMv7 has MOVW/MOVT instruction pairs to load symbol addresses into
registers without having to rely on literal loads that go via the
D-cache.  For older cores, we now support a similar arrangement, based
on PC-relative group relocations.

This means we can elide most literal loads entirely from the entry path,
by switching to the ldr_va macro to emit the appropriate sequence
depending on the target architecture revision.

While at it, switch to the bl_r macro for invoking the right PABT/DABT
helpers instead of setting the LR register explicitly, which does not
play well with cores that speculate across function returns.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-20 12:32:32 +01:00
Ard Biesheuvel
952f033163 ARM: 9194/1: assembler: simplify ldr_this_cpu for !SMP builds
When CONFIG_SMP is not defined, the CPU offset is always zero, and so
we can simplify the sequence to load a per-CPU variable.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-20 12:32:32 +01:00
Linus Torvalds
9c0e6a89b5 ARM development updates for 5.18:
Updates for IRQ stacks and virtually mapped stack support for ARM from
 the following pull requests, etc:
 
 1) ARM: support for IRQ and vmap'ed stacks
 
 This PR covers all the work related to implementing IRQ stacks and
 vmap'ed stacks for all 32-bit ARM systems that are currently supported
 by the Linux kernel, including RiscPC and Footbridge. It has been
 submitted for review in three different waves:
 - IRQ stacks support for v7 SMP systems [0],
 - vmap'ed stacks support for v7 SMP systems [1],
 - extending support for both IRQ stacks and vmap'ed stacks for all
   remaining configurations, including v6/v7 SMP multiplatform kernels
   and uniprocessor configurations including v7-M [2]
 
 [0] https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/
 [1] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-1-ardb@kernel.org/
 [2] https://lore.kernel.org/linux-arm-kernel/20211206164659.1495084-1-ardb@kernel.org/
 
 2) ARM: support for IRQ and vmap'ed stacks [v6]
 
 This tag covers the changes between the version of vmap'ed + IRQ stacks
 support pulled into rmk/devel-stable [0] (which was dropped from v5.17
 due to issues discovered too late in the cycle), and my v5 proposed for
 the v5.18 cycle [1].
 
 [0] git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git arm-irq-and-vmap-stacks-for-rmk
 [1] https://lore.kernel.org/linux-arm-kernel/20220124174744.1054712-1-ardb@kernel.org/
 
 3) ARM: ftrace fixes and cleanups
 
 Make all flavors of ftrace available on all builds, regardless of ISA
 choice, unwinder choice or compiler:
 - use ADD not POP where possible
 - fix a couple of Thumb2 related issues
 - enable HAVE_FUNCTION_GRAPH_FP_TEST for robustness
 - enable the graph tracer with the EABI unwinder
 - avoid clobbering frame pointer registers to make Clang happy
 
 Link: https://lore.kernel.org/linux-arm-kernel/20220203082204.1176734-1-ardb@kernel.org/
 
 4) Fixes for the above.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmI7U9IACgkQ9OeQG+St
 rGQghg/+PmgLJ9zmJrMGOarNLmmGzCbkPi6SrlbaDxriE7ofqo76qrQhGAsWxvDx
 OEYBNdWmOxTi7GP6sozFaTWrpD2ZbuFuKUUpusnjU2sMD/BwYHZZ/lKfZpn7WoE0
 48e2FCFYsJ3sYpROhVgaFWk+64eVwHfZ7pr9pad1gAEB4SAaT+CiuXBsJCl4DBi7
 eobYzLqETtCBkXFUo46n6r0xESdzQfgfZMsh5IpPRyswSPhzqdYrSLXJRmFGBqvT
 FS2gcMgd7IpcVsmd4pgFKe0Y9rBSuMEqsqYvzwSWI4GAgFppZO1R5RvHdS89US4P
 9F6hgxYnJdc8hVhoAZNNi5cCcJp9th3Io97YzTUIm0xgK3nXyhsSGWIk3ahx76mX
 mnCcflUoOP9YVHUuoi1/N7iSe6xwtH+dg0Mn69aM4rNcZh5J59jV2rrNhdnr1Pjb
 XE8iQHJZATHZrxyAtj7PzlnNzJsfVcJyT/WieT0My7tZaZC0cICdKEJ6yurTlCvE
 v7P3EHUYFaQGkQijHFJdstkouY7SHpN0iH18xKErciWOwDmRsgVaoxw18iNIvuY/
 TvSNXJBDgh8is8eV/mmN0iVkK0mYTxhy0G5CHavrgy8STWNC6CdqFtrxZnInoCAz
 wq25QvQtPZcxz1dS9bTuWUfrPATaIeQeCzUsAIiE7u9aP/olL5M=
 =AVCL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:
 "Updates for IRQ stacks and virtually mapped stack support, and ftrace:

   - Support for IRQ and vmap'ed stacks

     This covers all the work related to implementing IRQ stacks and
     vmap'ed stacks for all 32-bit ARM systems that are currently
     supported by the Linux kernel, including RiscPC and Footbridge. It
     has been submitted for review in four different waves:

      - IRQ stacks support for v7 SMP systems [0]

       - vmap'ed stacks support for v7 SMP systems [1]

      - extending support for both IRQ stacks and vmap'ed stacks for all
        remaining configurations, including v6/v7 SMP multiplatform
        kernels and uniprocessor configurations including v7-M [2]

      - fixes and updates in [3]

   - ftrace fixes and cleanups

     Make all flavors of ftrace available on all builds, regardless of
     ISA choice, unwinder choice or compiler [4]:

      - use ADD not POP where possible

      - fix a couple of Thumb2 related issues

      - enable HAVE_FUNCTION_GRAPH_FP_TEST for robustness

      - enable the graph tracer with the EABI unwinder

      - avoid clobbering frame pointer registers to make Clang happy

   - Fixes for the above"

[0] https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/
[1] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-1-ardb@kernel.org/
[2] https://lore.kernel.org/linux-arm-kernel/20211206164659.1495084-1-ardb@kernel.org/
[3] https://lore.kernel.org/linux-arm-kernel/20220124174744.1054712-1-ardb@kernel.org/
[4] https://lore.kernel.org/linux-arm-kernel/20220203082204.1176734-1-ardb@kernel.org/

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (62 commits)
  ARM: fix building NOMMU ARMv4/v5 kernels
  ARM: unwind: only permit stack switch when unwinding call_with_stack()
  ARM: Revert "unwind: dump exception stack from calling frame"
  ARM: entry: fix unwinder problems caused by IRQ stacks
  ARM: unwind: set frame.pc correctly for current-thread unwinding
  ARM: 9184/1: return_address: disable again for CONFIG_ARM_UNWIND=y
  ARM: 9183/1: unwind: avoid spurious warnings on bogus code addresses
  Revert "ARM: 9144/1: forbid ftrace with clang and thumb2_kernel"
  ARM: mach-bcm: disable ftrace in SMC invocation routines
  ARM: cacheflush: avoid clobbering the frame pointer
  ARM: kprobes: treat R7 as the frame pointer register in Thumb2 builds
  ARM: ftrace: enable the graph tracer with the EABI unwinder
  ARM: unwind: track location of LR value in stack frame
  ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST
  ARM: ftrace: avoid unnecessary literal loads
  ARM: ftrace: avoid redundant loads or clobbering IP
  ARM: ftrace: use trampolines to keep .init.text in branching range
  ARM: ftrace: use ADD not POP to counter PUSH at entry
  ARM: ftrace: ensure that ADR takes the Thumb bit into account
  ARM: make get_current() and __my_cpu_offset() __always_inline
  ...
2022-03-23 17:35:57 -07:00
Russell King (Oracle)
33970b031d ARM: fix co-processor register typo
In the recent Spectre BHB patches, there was a typo that is only
exposed in certain configurations: mcr p15,0,XX,c7,r5,4 should have
been mcr p15,0,XX,c7,c5,4

Reported-by: kernel test robot <lkp@intel.com>
Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-09 12:23:28 -08:00
Linus Torvalds
fc55c23a73 ARM Spectre BHB mitigations
These patches add Spectre BHB mitigations for the following Arm CPUs to
 the 32-bit ARM kernels:
 
 Cortex-A15
 Cortex-A57
 Cortex-A72
 Cortex-A73
 Cortex-A75
 
 Brahma B15
 
 for CVE-2022-23960.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmInch4ACgkQ9OeQG+St
 rGT62Q//Xve9O5C6d3I+7hwzVUGgRmYszrLRqLDG2qFP3Vw7hx1VygFRovKiFPD5
 jvVHWMIC6Yev4D7N2DjXpmfULOrL7277EX31QFpdtkvNR5WrSAV7ku0mJm4UmE6+
 WWo3l7d7WfxnbN7ZhRpISYc6aPm0/oYhH6Oux0FXe9eKWVr+hnNjVqBVaoSbnomy
 AcYhh1yy3p680zKvarUndLkYPgCPiCci7+IozxD4MfBV/M5IlIDawW9P0lxMgMZR
 ZbUe6t2k1Tis2EH2gKtj7KB0sDxAUnMD8tWYQylYsBM8wIINLDifuMSBrgU4ZcML
 3stf7IBynn7oA8U+4jrIwc1OEBj64UYqQEPTqg8jaogAB+JfPINNxp7Byq1LnuJm
 iwnmgeapQLRR3sh2jx8C4Boexv9KyIYAhIc2MkciyUlLbBWABLPNxp5cycz5znUr
 mSBPeSj2F0A10LdPT8NauHJj8m2j1U67tyBcRFO6z+T6+krR6zk+Aiqb/XyWOQbN
 Fe3D0SqOw5bd8hDenO5wGqQAuPpKhQhIo+XsbxckQ3jMtFKAABGkCW08gTTmfeDg
 kg56GCvedrzGdZs7xkAzJ/o/AtNxYNdYjWnfc+zJmkLMPbt2qunL7yUkwOuiru29
 biCMyw6j0afPpt7ScJAASTKyuaUgE3HxxWTnk1rgCsl3Ho8MeLU=
 =VHyX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-bhb' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM spectre fixes from Russell King:
 "ARM Spectre BHB mitigations.

  These patches add Spectre BHB mitigations for the following Arm CPUs
  to the 32-bit ARM kernels:
   - Cortex A15
   - Cortex A57
   - Cortex A72
   - Cortex A73
   - Cortex A75
   - Brahma B15
  for CVE-2022-23960"

* tag 'for-linus-bhb' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: include unprivileged BPF status in Spectre V2 reporting
  ARM: Spectre-BHB workaround
  ARM: use LOADADDR() to get load address of sections
  ARM: early traps initialisation
  ARM: report Spectre v2 status through sysfs
2022-03-08 09:08:06 -08:00
Russell King (Oracle)
b9baf5c8c5 ARM: Spectre-BHB workaround
Work around the Spectre BHB issues for Cortex-A15, Cortex-A57,
Cortex-A72, Cortex-A73 and Cortex-A75. We also include Brahma B15 to be
safe, as it is affected by Spectre V2 in the same ways as Cortex-A15.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05 10:42:07 +00:00
Ard Biesheuvel
d6905849f8 ARM: assembler: define a Kconfig symbol for group relocation support
Nathan reports the group relocations go out of range in pathological
cases such as allyesconfig kernels, which have little chance of actually
booting but are still used in validation.

So add a Kconfig symbol for this feature, and make it depend on
!COMPILE_TEST.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-01-24 21:02:34 +01:00
Ard Biesheuvel
9f80ccda53 ARM: 9180/1: Thumb2: align ALT_UP() sections in modules sufficiently
When building for Thumb2, the .alt.smp.init sections that are emitted by
the ALT_UP() patching code may not be 32-bit aligned, even though the
fixup_smp_on_up() routine expects that. This results in alignment faults
at module load time, which need to be fixed up by the fault handler.

So let's align those sections explicitly, and prevent this from occurring.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-01-19 11:10:54 +00:00
Ard Biesheuvel
9c46929e79 ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
On UP systems, only a single task can be 'current' at the same time,
which means we can use a global variable to track it. This means we can
also enable THREAD_INFO_IN_TASK for those systems, as in that case,
thread_info is accessed via current rather than the other way around,
removing the need to store thread_info at the base of the task stack.
This, in turn, permits us to enable IRQ stacks and vmap'ed stacks on UP
systems as well.

To partially mitigate the performance overhead of this arrangement, use
an ADD/ADD/LDR sequence with the appropriate PC-relative group
relocations to load the value of current when needed. This means that
accessing current will still only require a single load as before,
avoiding the need for a literal to carry the address of the global
variable in each function. However, accessing thread_info will now
require this load as well.
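
In C terms, the UP arrangement amounts to something like the sketch below
(the variable and helper names are illustrative); the ADD/ADD/LDR
group-relocation sequence is what the asm-level accessor uses to reach the
global without a literal pool entry.

	/* Illustrative UP-only accessor; with THREAD_INFO_IN_TASK,
	 * thread_info is the first member of task_struct, so it is
	 * reachable directly from the task pointer. */
	extern struct task_struct *__current_task_sketch;

	static inline struct task_struct *get_current_sketch(void)
	{
		return __current_task_sketch;
	}

	#define current_thread_info_sketch() \
		((struct thread_info *)get_current_sketch())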

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
2021-12-06 12:49:17 +01:00
Ard Biesheuvel
7b9896c352 ARM: percpu: add SMP_ON_UP support
Permit the use of the TPIDRPRW system register for carrying the per-CPU
offset in generic SMP configurations that also target non-SMP capable
ARMv6 cores. This uses the SMP_ON_UP code patching framework to turn all
TPIDRPRW accesses into reads/writes of entry #0 in the __per_cpu_offset
array.

While at it, switch over some existing direct TPIDRPRW accesses in asm
code to invocations of a new helper that is patched in the same way when
necessary.

Note that CPU_V6+SMP without SMP_ON_UP results in a kernel that does not
boot on v6 CPUs without SMP extensions, so add this dependency to
Kconfig as well.
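
For illustration, reading the per-CPU offset from TPIDRPRW looks like the
sketch below (function name made up); on an SMP_ON_UP kernel running on a UP
core, the patching framework rewrites such an access to load
__per_cpu_offset[0] instead.

	static inline unsigned long per_cpu_offset_sketch(void)
	{
		unsigned long off;

		/* TPIDRPRW: the privileged-only software thread ID register */
		asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off));
		return off;
	}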

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
2021-12-06 12:49:17 +01:00
Ard Biesheuvel
4e918ab13e ARM: assembler: add optimized ldr/str macros to load variables from memory
We will be adding variable loads to various hot paths, so it makes sense
to add a helper macro that can load variables from asm code without the
use of literal pool entries. On v7 or later, we can simply use MOVW/MOVT
pairs, but on earlier cores, this requires a bit of hackery to emit an
instruction sequence that implements this using a sequence of ADD/LDR
instructions.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
2021-12-06 12:49:16 +01:00
Ard Biesheuvel
d4664b6c98 ARM: implement IRQ stacks
Now that we no longer rely on the stack pointer to access the current
task struct or thread info, we can implement support for IRQ stacks
cleanly as well.

Define a per-CPU IRQ stack and switch to this stack when taking an IRQ,
provided that we were not already using that stack in the interrupted
context. This is never the case for IRQs taken from user space, but ones
taken while running in the kernel could fire while one taken from user
space has not completed yet.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Keith Packard <keithpac@amazon.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
2021-12-03 15:11:31 +01:00
Ard Biesheuvel
b3ab60b179 ARM: assembler: introduce bl_r macro
Add a bl_r macro that abstracts the difference between the ways indirect
calls are performed on older and newer ARM architecture revisions.

The main difference is to prefer blx instructions over explicit LR
assignments when possible, as these tend to confuse the prediction logic
in out-of-order cores when speculating across a function return.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Keith Packard <keithpac@amazon.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
2021-12-03 15:11:31 +01:00
Ard Biesheuvel
18ed1c01a7 ARM: smp: Enable THREAD_INFO_IN_TASK
Now that we no longer rely on thread_info living at the base of the task
stack to be able to access the 'current' pointer, we can wire up the
generic support for moving thread_info into the task struct itself.

Note that this requires us to update the cpu field in thread_info
explicitly, now that the core code no longer does so. Ideally, we would
switch the percpu code to access the cpu field in task_struct instead,
but this unleashes #include circular dependency hell.

Co-developed-by: Keith Packard <keithpac@amazon.com>
Signed-off-by: Keith Packard <keithpac@amazon.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
2021-09-27 16:54:02 +02:00
Ard Biesheuvel
50596b7559 ARM: smp: Store current pointer in TPIDRURO register if available
Now that the user space TLS register is assigned on every return to user
space, we can use it to keep the 'current' pointer while running in the
kernel. This removes the need to access it via thread_info, which is
located at the base of the stack, but will be moved out of there in a
subsequent patch.

Use the __builtin_thread_pointer() helper when available - this will
help GCC understand that reloading the value within the same function is
not necessary, even when using the per-task stack protector (which also
generates accesses via the TLS register). For example, the generated
code below loads TPIDRURO only once, and uses it to access both the
stack canary and the preempt_count fields.

<do_one_initcall>:
       e92d 41f0       stmdb   sp!, {r4, r5, r6, r7, r8, lr}
       ee1d 4f70       mrc     15, 0, r4, cr13, cr0, {3}
       4606            mov     r6, r0
       b094            sub     sp, #80 ; 0x50
       f8d4 34e8       ldr.w   r3, [r4, #1256] ; 0x4e8  <- stack canary
       9313            str     r3, [sp, #76]   ; 0x4c
       f8d4 8004       ldr.w   r8, [r4, #4]             <- preempt count
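
A hedged sketch of the corresponding C accessor (simplified; the kernel's
real version has extra conditionals and can use __builtin_thread_pointer()
when the compiler offers it):

	static inline struct task_struct *get_current_via_tls_sketch(void)
	{
		struct task_struct *cur;

		/* TPIDRURO: read-only-for-user thread ID register, holding
		 * 'current' while in the kernel. Where available,
		 * __builtin_thread_pointer() emits the same access and lets
		 * GCC reuse the loaded value. */
		asm("mrc p15, 0, %0, c13, c0, 3" : "=r" (cur));
		return cur;
	}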

Co-developed-by: Keith Packard <keithpac@amazon.com>
Signed-off-by: Keith Packard <keithpac@amazon.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
2021-09-27 16:54:02 +02:00
Ard Biesheuvel
6468e898c6 ARM: 9039/1: assembler: generalize byte swapping macro into rev_l
Take the 4 instruction byte swapping sequence from the decompressor's
head.S, and turn it into a rev_l GAS macro for general use. While
at it, make it use the 'rev' instruction when compiling for v6 or
later.

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-02-01 19:41:30 +00:00
Ard Biesheuvel
450abd38fe ARM: kernel: use relative references for UP/SMP alternatives
Currently, the .alt.smp.init section contains the virtual addresses
of the patch sites. Since patching may occur both before and after
switching into virtual mode, this requires some manual handling of
the address when applying the UP alternative.

Let's simplify this by using relative offsets in the table entries:
this allows us to simply add each entry's address to its contents,
regardless of whether we are running in virtual mode or not.
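
The resolution step then reduces to the sketch below (plain C, for
illustration): each entry stores 'patch site minus entry address', so adding
the entry's own address yields the site regardless of whether the MMU is on
yet.

	/* Resolve one self-relative .alt.smp.init entry to its patch site. */
	static inline unsigned long resolve_alt_entry(const int *entry)
	{
		return (unsigned long)entry + (long)*entry;
	}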

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-10-28 17:05:39 +01:00
Ard Biesheuvel
0b1674638a ARM: assembler: introduce adr_l, ldr_l and str_l macros
Like arm64, ARM supports position independent code sequences that
produce symbol references with a greater reach than the ordinary
adr/ldr instructions. Since on ARM, the adrl pseudo-instruction is
only supported in ARM mode (and not at all when using Clang), having
an adr_l macro like we do on arm64 is useful, and increases symmetry
as well.

Currently, we use open coded instruction sequences involving literals
and arithmetic operations. Instead, we can use movw/movt pairs on v7
CPUs, circumventing the D-cache entirely.

E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:

       movw         <reg>, #:lower16:<sym> - (1f + 8)
       movt         <reg>, #:upper16:<sym> - (1f + 8)
  1:   add          <reg>, <reg>, pc

For older CPUs, we can emit the literal into a subsection, allowing it
to be emitted out of line while retaining the ability to perform
arithmetic on label offsets.

E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:

       ldr          <reg>, 2f
  1:   add          <reg>, <reg>, pc
       .subsection  1
  2:   .long        <sym> - (1b + 8)
       .previous

This is allowed by the assembler because, unlike ordinary sections,
subsections are combined into a single section in the object file, and
so the label references are not true cross-section references that are
visible as relocations. (Subsections have been available in binutils
since 2004 at least, so they should not cause any issues with older
toolchains.)

So use the above to implement the macros mov_l, adr_l, ldr_l and str_l,
all of which will use movw/movt pairs on v7 and later CPUs, and use
PC-relative literals otherwise.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-10-28 16:59:43 +01:00
Linus Torvalds
c2b0fc847f ARM updates for 5.8-rc1:
- remove a now unnecessary usage of the KERNEL_DS for
   sys_oabi_epoll_ctl()
 - update my email address in a number of drivers
 - decompressor EFI updates from Ard Biesheuvel
 - module unwind section handling updates
 - sparsemem Kconfig cleanups
 - make act_mm macro respect THREAD_SIZE
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAl7VacAACgkQ9OeQG+St
 rGRHaA//Z8m+8LnSd+sqwrlEZYVj6IdPoihOhVZRsEhp4Fb09ZENOxL06W4lynHy
 tPcs4YAqLyp3Xmn+pk5NuH3SBoFGPOuUxseMpQKyT2ZnA126LB/3sW+xFPSbCayY
 gBHT7QhX6MxJEwxCxgvp2McOs6F55rYENcjozQ+DQMiNY5MTm0fKgGgbn1kpzglz
 7N2U7MR9ulXTCof3hZolQWBMOKa6LRldG7C3ajPITeOtk+vjyAPobqrkbzRDsIPV
 09j6BruFQoUbuyxtycNC0x+BDotrS/NN5OyhR07eJR5R0QNDW+qn8iqrkkVQUQsr
 mZpTR8CelzLL2+/1CDY2KrweY13eFbDoxiTVJl9aqCdlOsJKxwk1yv4HrEcpbBoK
 vtKwPDxPIKxFeJSCJX3xFjg9g6mRrBJ5CItPOThVgEqNt/dsbogqXlX4UhIjXzPs
 DBbeQ+EEZgNg7Ws/EwXIwtM8ZPc+bZZY8fskJd0gRCjbiCtstXXNjsHRd1vZ16KM
 yytpDxEIB7A+6lxcnV80VSCjD++A//kVThZ5kBl+ec1HOxRSyYOGIMGUMZhuyfE8
 f4xE3KVVsbqHGyh94C6tDLx73XgkmjfNx8YAgGRss+fQBoJbmwkJ0fDy4MhKlznD
 UnVcOXSjs7Iqih7R+icAtbIkbo1EUF5Mwu2I3SEZ/FOJmzEbbCY=
 =vMHE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:

 - remove a now unnecessary usage of the KERNEL_DS for
   sys_oabi_epoll_ctl()

 - update my email address in a number of drivers

 - decompressor EFI updates from Ard Biesheuvel

 - module unwind section handling updates

 - sparsemem Kconfig cleanups

 - make act_mm macro respect THREAD_SIZE

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8980/1: Allow either FLATMEM or SPARSEMEM on the multiplatform build
  ARM: 8979/1: Remove redundant ARCH_SPARSEMEM_DEFAULT setting
  ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE
  ARM: decompressor: run decompressor in place if loaded via UEFI
  ARM: decompressor: move GOT into .data for EFI enabled builds
  ARM: decompressor: defer loading of the contents of the LC0 structure
  ARM: decompressor: split off _edata and stack base into separate object
  ARM: decompressor: move headroom variable out of LC0
  ARM: 8976/1: module: allow arch overrides for .init section names
  ARM: 8975/1: module: fix handling of unwind init sections
  ARM: 8974/1: use SPARSMEM_STATIC when SPARSEMEM is enabled
  ARM: 8971/1: replace the sole use of a symbol with its definition
  ARM: 8969/1: decompressor: simplify libfdt builds
  Update rmk's email address in various drivers
  ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()
2020-06-01 15:36:32 -07:00
Russell King
747ffc2fcf ARM: uaccess: consolidate uaccess asm to asm/uaccess-asm.h
Consolidate the user access assembly code to asm/uaccess-asm.h.  This
moves the csdb, check_uaccess, uaccess_mask_range_ptr, uaccess_enable,
uaccess_disable, uaccess_save, uaccess_restore macros, and creates two
new ones for exception entry and exit - uaccess_entry and uaccess_exit.

This makes the uaccess_save and uaccess_restore macros private to
asm/uaccess-asm.h.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-05-03 17:30:24 +01:00
Jian Cai
a780e485b5 ARM: 8971/1: replace the sole use of a symbol with its definition
ALT_UP_B macro sets symbol up_b_offset via .equ to an expression
involving another symbol. The macro gets expanded twice when
arch/arm/kernel/sleep.S is assembled, creating a scenario where
up_b_offset is set to another expression involving symbols while its
current value is based on symbols. LLVM integrated assembler does not
allow such cases, and based on the documentation of binutils, "Values
that are based on expressions involving other symbols are allowed, but
some targets may restrict this to only being done once per assembly", so
it may be better to avoid such cases as it is not clearly stated which
targets should support or disallow them. The fix in this case is simple,
as up_b_offset has only one use, so we can replace the use with the
definition and get rid of up_b_offset.

Link: https://github.com/ClangBuiltLinux/linux/issues/920
Reviewed-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Jian Cai <caij2003@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-04-29 13:30:20 +01:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Stefan Agner
c001899a5d ARM: 8843/1: use unified assembler in headers
Use unified assembler syntax (UAL) in headers. Divided syntax is
considered deprecated. This will also allow building the kernel
using LLVM's integrated assembler.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:26:06 +00:00
Vincent Whitchurch
f441882a52 ARM: 8812/1: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS
ARMv6+ processors do not use CONFIG_CPU_USE_DOMAINS and use privileged
ldr/str instructions in copy_{from/to}_user.  They are currently
unnecessarily using single ldr/str instructions and can use ldm/stm
instructions instead like memcpy does (but with appropriate fixup
tables).

This speeds up a "dd if=foo of=bar bs=32k" on a tmpfs filesystem by
about 4% on my Cortex-A9.

before:134217728 bytes (128.0MB) copied, 0.543848 seconds, 235.4MB/s
before:134217728 bytes (128.0MB) copied, 0.538610 seconds, 237.6MB/s
before:134217728 bytes (128.0MB) copied, 0.544356 seconds, 235.1MB/s
before:134217728 bytes (128.0MB) copied, 0.544364 seconds, 235.1MB/s
before:134217728 bytes (128.0MB) copied, 0.537130 seconds, 238.3MB/s
before:134217728 bytes (128.0MB) copied, 0.533443 seconds, 240.0MB/s
before:134217728 bytes (128.0MB) copied, 0.545691 seconds, 234.6MB/s
before:134217728 bytes (128.0MB) copied, 0.534695 seconds, 239.4MB/s
before:134217728 bytes (128.0MB) copied, 0.540561 seconds, 236.8MB/s
before:134217728 bytes (128.0MB) copied, 0.541025 seconds, 236.6MB/s

 after:134217728 bytes (128.0MB) copied, 0.520445 seconds, 245.9MB/s
 after:134217728 bytes (128.0MB) copied, 0.527846 seconds, 242.5MB/s
 after:134217728 bytes (128.0MB) copied, 0.519510 seconds, 246.4MB/s
 after:134217728 bytes (128.0MB) copied, 0.527231 seconds, 242.8MB/s
 after:134217728 bytes (128.0MB) copied, 0.525030 seconds, 243.8MB/s
 after:134217728 bytes (128.0MB) copied, 0.524236 seconds, 244.2MB/s
 after:134217728 bytes (128.0MB) copied, 0.523659 seconds, 244.4MB/s
 after:134217728 bytes (128.0MB) copied, 0.525018 seconds, 243.8MB/s
 after:134217728 bytes (128.0MB) copied, 0.519249 seconds, 246.5MB/s
 after:134217728 bytes (128.0MB) copied, 0.518527 seconds, 246.9MB/s

Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-11-12 10:51:59 +00:00
Russell King
3e98d24098 Merge branches 'fixes', 'misc' and 'spectre' into for-next 2018-10-10 13:53:33 +01:00
Julien Thierry
afaf6838f4 ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
Introduce C and asm helpers to sanitize user address, taking the
address range they target into account.

Use asm helper for existing sanitization in __copy_from_user().

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-10-05 10:51:15 +01:00
Russell King
c61b466d4f Merge branches 'fixes', 'misc' and 'spectre' into for-linus
Conflicts:
	arch/arm/include/asm/uaccess.h

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-08-13 16:28:50 +01:00
Russell King
a3c0f84765 ARM: spectre-v1: mitigate user accesses
Spectre variant 1 attacks are about this sequence of pseudo-code:

	index = load(user-manipulated pointer);
	access(base + index * stride);

In order for the cache side-channel to work, the access() must be made
to memory for which userspace can detect whether cache lines have been
loaded.  On 32-bit ARM, this must be either user accessible memory, or
a kernel mapping of that same user accessible memory.

The problem occurs when the load() speculatively loads privileged data,
and the subsequent access() is made to user accessible memory.

Any load() which makes use of a user-manipulated pointer is a potential
problem if the data it has loaded is used in a subsequent access.  This
also applies for the access() if the data loaded by that access is used
by a subsequent access.

Harden the get_user() accessors against Spectre attacks by forcing out
of bounds addresses to a NULL pointer.  This prevents get_user() being
used as the load() step above.  As a side effect, put_user() will also
be affected even though it isn't implicated.

Also harden copy_from_user() by redoing the bounds check within the
arm_copy_from_user() code, and NULLing the pointer if out of bounds.
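
The clamping idea, sketched in plain C for illustration only; the kernel does
this in assembly so the compare-and-select is guaranteed to stay branchless,
something plain C cannot promise.

	/* Clamp an out-of-range user address to 0 (NULL) using arithmetic
	 * rather than an if/else, so a speculated dereference of a
	 * user-supplied pointer cannot reach privileged data. */
	static inline unsigned long mask_user_addr(unsigned long addr,
						   unsigned long limit)
	{
		unsigned long out_of_range = (addr >= limit);	/* 0 or 1 */

		return addr & (out_of_range - 1UL);	/* in range: & ~0UL, else & 0 */
	}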

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-08-02 17:41:38 +01:00
Russell King
0ac000e867 Merge branches 'fixes', 'misc' and 'spectre' into for-linus 2018-06-05 10:03:27 +01:00
Russell King
a78d156587 ARM: spectre-v1: add speculation barrier (csdb) macros
Add assembly and C macros for the new CSDB instruction.
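
For illustration, the macros can emit the instruction as a raw opcode so the
code assembles even with toolchains that predate the mnemonic (encoding from
the Arm ARM hint space; the kernel's own macros may differ in detail):

	/* CSDB: consumption-of-speculative-data barrier, A32 hint #20. */
	#define CSDB_INSN_SKETCH	".inst 0xe320f014"
	#define csdb_sketch()		__asm__ __volatile__(CSDB_INSN_SKETCH : : : "memory")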

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Boot-tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
2018-05-31 23:27:16 +01:00
Masami Hiramatsu
0d73c3f8e7 ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
Since do_undefinstr() uses get_user to get the undefined
instruction, it can be called before the kprobes recursion
check runs. This can cause an infinite recursive
exception.
Prohibit probing on get_user functions.

Fixes: 24ba613c9d ("ARM kprobes: core code")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-05-19 11:35:56 +01:00
Russell King
8bafae202c ARM: BUG if jumping to usermode address in kernel mode
Detect if we are returning to usermode via the normal kernel exit paths
but the saved PSR value indicates that we are in kernel mode.  This
could occur due to corrupted stack state, which has been observed with
"ftracetest".

This ensures that we catch the problem case before we get to user code.
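
The check itself boils down to inspecting the mode bits of the saved PSR,
roughly as in this standalone sketch (constants per the ARM PSR layout; the
kernel performs the test in the exit assembly):

	#include <stdbool.h>

	#define PSR_MODE_MASK	0x0000001fu	/* mode field of the PSR */
	#define PSR_USR_MODE	0x00000010u

	/* The return-to-user path must only ever see a user-mode saved PSR. */
	static bool saved_psr_is_user(unsigned int saved_psr)
	{
		return (saved_psr & PSR_MODE_MASK) == PSR_USR_MODE;
	}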

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-26 15:41:39 +00:00
Arnd Bergmann
ffa47aa678 ARM: Prepare for randomized task_struct
With the new task struct randomization, we can run into a build
failure for certain random seeds, which will place fields beyond
the allowed immediate size in the assembly:

arch/arm/kernel/entry-armv.S: Assembler messages:
arch/arm/kernel/entry-armv.S:803: Error: bad immediate value for offset (4096)

Only two constants in asm-offsets.h are affected, and I'm changing
both of them here to work correctly in all configurations.

One more macro has the problem, but is currently unused, so this
removes it instead of adding complexity.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[kees: Adjust commit log slightly]
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-06-30 12:00:50 -07:00
Vladimir Murzin
b2bf482a50 ARM: 8605/1: V7M: fix notrace variant of save_and_disable_irqs
Commit 8e43a905 "ARM: 7325/1: fix v7 boot with lockdep enabled"
introduced a notrace variant of save_and_disable_irqs to balance the
notrace variant of restore_irqs; however, the V7M case was missed. It
was not noticed because cache-v7.S is the only place where the notrace
variant is used. So fix it, since we are going to extend the V7 cache
routines to handle the V7M case too.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06 15:51:07 +01:00
Russell King
e6a9dc6129 ARM: introduce svc_pt_regs structure
Since the privileged mode pt_regs are an extended version of the saved
userland pt_regs, introduce a new svc_pt_regs structure to describe this
layout.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-06-22 19:54:52 +01:00
Russell King
5745eef6b8 ARM: rename S_FRAME_SIZE to PT_REGS_SIZE
S_FRAME_SIZE is no longer the size of the kernel stack frame, so this
name is misleading.  It is the size of the kernel pt_regs structure.
Name it so.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-06-22 19:54:28 +01:00
Linus Torvalds
57e6bbcb4b Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "A number of fixes for the merge window, fixing a number of cases
  missed when testing the uaccess code, particularly cases which only
  show up with certain compiler versions"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8431/1: fix alignement of __bug_table section entries
  arm/xen: Enable user access to the kernel before issuing a privcmd call
  ARM: domains: add memory dependencies to get_domain/set_domain
  ARM: domains: thread_info.h no longer needs asm/domains.h
  ARM: uaccess: fix undefined instruction on ARMv7M/noMMU
  ARM: uaccess: remove unneeded uaccess_save_and_disable macro
  ARM: swpan: fix nwfpe for uaccess changes
  ARM: 8429/1: disable GCC SRA optimization
2015-09-14 12:24:10 -07:00
Russell King
296254f322 ARM: uaccess: remove unneeded uaccess_save_and_disable macro
This macro is never referenced, remove it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-09 23:26:40 +01:00
Russell King
40d3f02851 Merge branches 'cleanup', 'fixes', 'misc', 'omap-barrier' and 'uaccess' into for-linus 2015-09-03 15:28:37 +01:00
Russell King
a5e090acbf ARM: software-based priviledged-no-access support
Provide a software-based implementation of the privileged no access
support found in ARMv8.1.

Userspace pages are mapped using a different domain number from the
kernel and IO mappings.  If we switch the user domain to "no access"
when we enter the kernel, we can prevent the kernel from touching
userspace.

However, the kernel needs to be able to access userspace via the
various user accessor functions.  With the wrapping in the previous
patch, we can temporarily enable access when the kernel needs user
access, and re-disable it afterwards.

This allows us to trap non-intended accesses to userspace, eg, caused
by an inadvertent dereference of the LIST_POISON* values, which, with
appropriate user mappings setup, can be made to succeed.  This in turn
can allow use-after-free bugs to be further exploited than would
otherwise be possible.
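
The underlying mechanism is the DACR's two-bit field per domain; the sketch
below manipulates it in plain C for illustration (the kernel flips the user
domain between 'client' and 'no access' from its uaccess enable/disable
paths):

	/* DACR holds sixteen 2-bit fields, one per domain. */
	#define DACR_TYPE_NOACCESS	0u	/* every access faults */
	#define DACR_TYPE_CLIENT	1u	/* accesses checked against page tables */

	static inline unsigned int dacr_set_domain(unsigned int dacr,
						   unsigned int domain,
						   unsigned int type)
	{
		dacr &= ~(3u << (2 * domain));		/* clear the field */
		return dacr | (type << (2 * domain));	/* install the new type */
	}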

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-26 20:34:24 +01:00
Russell King
2190fed67b ARM: entry: provide uaccess assembly macro hooks
Provide hooks into the kernel entry and exit paths to permit control
of userspace visibility to the kernel.  The intended use is:

- on entry to kernel from user, uaccess_disable will be called to
  disable userspace visibility
- on exit from kernel to user, uaccess_enable will be called to
  enable userspace visibility
- on entry from a kernel exception, uaccess_save_and_disable will be
  called to save the current userspace visibility setting, and disable
  access
- on exit from a kernel exception, uaccess_restore will be called to
  restore the userspace visibility as it was before the exception
  occurred.

These hooks allow us to keep userspace visibility disabled for the
vast majority of the kernel, except for localised regions where we
want to explicitly access userspace.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-26 20:27:02 +01:00
Russell King
3302caddf1 ARM: entry: efficiency cleanups
Make the "fast" syscall return path fast again.  The addition of IRQ
tracing and context tracking has made this path grossly inefficient.
When these options are enabled, we can do much better by saving the
syscall return code on the stack - we then don't need to save a bunch
of registers around every single callout to C code.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-25 10:32:48 +01:00
Russell King
01e09a2816 ARM: entry: get rid of asm_trace_hardirqs_on_cond
There's no need for this macro, it can use a default for the
condition argument.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-25 10:32:46 +01:00
Russell King
14327c6628 ARM: replace BSYM() with badr assembly macro
BSYM() was invented to allow us to work around a problem with the
assembler, where local symbols resolved by the assembler for the 'adr'
instruction did not take account of their ISA.

Since we don't want BSYM() used elsewhere, replace BSYM() with a new
macro 'badr', which is like the 'adr' pseudo-op, but with the BSYM()
mechanics integrated into it.  This ensures that the BSYM()-ification
is only used in conjunction with 'adr'.

Acked-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-05-08 17:33:50 +01:00
Russell King
89c6bc5884 ARM: allow 16-bit instructions in ALT_UP()
Allow ALT_UP() to cope with a 16-bit Thumb instruction by automatically
inserting a following nop instruction.  This allows us to care less
about getting the assembler to emit a 32-bit thumb instruction.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14 22:26:51 +01:00
Russell King
6ebbf2ce43 ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+
ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls.  Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code.  This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon <will.deacon@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S
Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood
Tested-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385
Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M
Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-07-18 12:29:04 +01:00