From: andre.paquette at nokia dot com
Subject: [Bug gold/28314] New: [AArch64] Insufficient veneer stub alignment (gold)
Date: Tue, 07 Sep 2021 17:19:46 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=28314

            Bug ID: 28314
           Summary: [AArch64] Insufficient veneer stub alignment (gold)
           Product: binutils
           Version: 2.35.1
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P2
         Component: gold
          Assignee: ccoutant at gmail dot com
          Reporter: andre.paquette at nokia dot com
                CC: ian at airs dot com, jeremip11 at gmail dot com,
                    nickc at redhat dot com,
                    pexu at sourceware dot mail.kapsi.fi,
                    unassigned at sourceware dot org, wilson at gcc dot gnu.org
  Target Milestone: ---

+++ This bug was initially created as a clone of Bug #22903 +++

Hi there.  It looks like bug 22903 was only fixed for ld and is still an issue
in gold.  As an experiment I increased STUB_ADDR_ALIGN to 8 for Reloc_stub, and
that does resolve the issue.  I'm not sure whether increasing the alignment is
appropriate for all reloc stub types, so this was just a proof of concept.

diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index 07abe44931..7626fcad4a 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -1317,7 +1317,7 @@ class Reloc_stub : public Stub_base<size, big_endian>
 };  // End of Reloc_stub

 template<int size, bool big_endian>
-const int Reloc_stub<size, big_endian>::STUB_ADDR_ALIGN = 4;
+const int Reloc_stub<size, big_endian>::STUB_ADDR_ALIGN = 8;

 // Write data to output file.

Original description from bug #22903 follows.


Hi.

It is not currently possible to specify an alignment requirement for the
generated veneer stubs (i.e. the far calls emitted for -fpic, -fpie, etc.
builds).

Currently, the alignment for the stubs is 4 bytes. While this works just fine
on the majority of systems, it works only because certain prerequisites have
been met beforehand (and with a hint of luck, too).

The problematic veneer template (aarch64_long_branch_stub at
bfd/elfnn-aarch64.c) uses LDR to load the far address. The address itself is
stored after the veneer code block, which does the address loading (via
LDR/ADD) and branching. The template looks like this:

  ldr ip0, 1f # <-- ip0, i.e. X16, i.e. 64-bit register
  adr ip1, #0
  add ip0, ip0, ip1
  br  ip0
  1: .xword <address>

While the address is 8-byte aligned within the stub itself, it will be
misaligned unless the veneer lands on an 8-byte (or stricter) aligned address.
The ARMv8-A ARM clearly states that unless an address is aligned to the size
of the data element being accessed (i.e. N-bit accesses must be N-bit
aligned), either an Alignment fault is generated or an unaligned access is
performed.

It is possible to disable the alignment check, and thus perform an unaligned
access, via the system register SCTLR_ELx.A (as is the case for Linux).
However, there's a small catch-22 buried in the fine print of the ARM ARM: if
stage 1 address translation is disabled (e.g. the MMU is off), the
Device-nGnRnE memory type is assigned to all data accesses (or the address may
simply happen to lie in some type of Device memory, which is nothing unusual
with SoCs). Unlike with the Normal memory type, all accesses to any type of
Device memory *must* be aligned, period.

So, if the code has to deal with a large memory area and cannot use the MMU
(say, it is unavailable or still being set up), so that no address translation
is enabled, or for whatever reason targets a Device memory type, ld's current
approach will generate code that is highly prone to intermittent failures.
These can be difficult to track down (without proper JTAG tools), because no
matter how carefully the user does their part, the generated code itself is
the source of the failure. It should also be understood that trying to recover
from this sort of exception would be overkill and highly complex (the handler
must decode the faulting instruction, perform aligned access(es), maybe patch
the code, etc.), while the proper thing to do is simply not to perform
unaligned accesses where such accesses are not possible.

Obviously, one can always generate the long branches by hand, or perhaps use
static linking where possible, so this is by no means a roadblock. But since
the subject is rather undocumented and there's apparently a patch readily
available, this should be fixed. Perhaps there is no need to change the
default alignment (without further study), but it should be possible to change
the alignment nevertheless.
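As a sketch of the hand-written workaround, a veneer can force its own
alignment with an explicit directive. This uses GNU assembler syntax with an
absolute-address literal rather than the linker's PC-relative template, and
"target" is a placeholder symbol:

```asm
    .balign 8            // align the veneer so the literal below
                         // (at offset 8) is 8-byte aligned
my_veneer:
    ldr  x16, 1f         // literal load of the 64-bit target address
    br   x16             // branch to it
1:  .xword target        // 8-byte literal, guaranteed aligned
```

Since ldr and br together occupy 8 bytes, aligning the veneer start to 8 bytes
places the literal on an 8-byte boundary with no padding needed.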

I hope I provided enough background information for this rare, but indeed
curious case!


