bug-binutils



From: sourceware-bugzilla at mhxnet dot de
Subject: [Bug ld/30333] New: [avr-ld] NOPs not removed after rcall for devices with >8k of flash even with -mrelax
Date: Tue, 11 Apr 2023 11:33:55 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=30333

            Bug ID: 30333
           Summary: [avr-ld] NOPs not removed after rcall for devices with
                    >8k of flash even with -mrelax
           Product: binutils
           Version: 2.40
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: ld
          Assignee: unassigned at sourceware dot org
          Reporter: sourceware-bugzilla at mhxnet dot de
  Target Milestone: ---

Created attachment 14811
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14811&action=edit
Reproduction code & build script

I've been working on a bootloader for xmega3 cores recently and noticed that as
soon as I compile the code for devices with more than 8k of flash, the size of
the binary increases by more than 20 bytes (almost 5% of the bootloader binary
size). The issue isn't limited to xmega3, though, and I've used an older core
in the examples further down.

My assumption from reading various pieces of documentation is that `-mrelax` is
supposed to replace long calls with short calls and shrink the resulting holes
in the binary accordingly. If it doesn't shrink the binary, the only remaining
benefit is that a short call executes in one cycle less. From glancing at the
code, it seems that shrinking of some sort is implemented, but it's not clear
to me whether it's doing what it's supposed to.
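
To make the expectation concrete, here is a minimal sketch of the size and range arithmetic behind it. It is not taken from the binutils sources; the helper names `rcall_reaches` and `relax_savings` are made up for illustration. It relies only on the instruction encodings: `call` occupies 4 bytes, `rcall` occupies 2 bytes and carries a signed 12-bit word offset relative to the following instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte sizes of the two AVR call encodings. */
enum { CALL_SIZE = 4, RCALL_SIZE = 2 };

/* rcall offset range in 16-bit words, relative to the next instruction. */
enum { RCALL_MIN_WORDS = -2048, RCALL_MAX_WORDS = 2047 };

/* Hypothetical helper: can an rcall placed at byte address `site`
 * reach byte address `target`? AVR instructions are word aligned,
 * so both addresses must be even. */
bool rcall_reaches(uint32_t site, uint32_t target)
{
    assert((site & 1u) == 0 && (target & 1u) == 0);
    int32_t words = ((int32_t)target - ((int32_t)site + RCALL_SIZE)) / 2;
    return words >= RCALL_MIN_WORDS && words <= RCALL_MAX_WORDS;
}

/* Hypothetical helper: bytes recovered if `sites` in-range calls each
 * shrink from the 4-byte form to the 2-byte form. */
unsigned relax_savings(unsigned sites)
{
    return sites * (CALL_SIZE - RCALL_SIZE);
}
```

With these definitions, `rcall_reaches(0, 6)` is true and `relax_savings(10)` is 20, so roughly ten call sites that each keep 2 leftover bytes would account for a 20-byte difference. Whether that matches the bootloader in question is only a guess.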

Here's an example to reproduce the behaviour:

    static void __attribute__((__noinline__)) f(void)
    {
      *((volatile char *) 0x0140) = 42;
    }

    __attribute__((naked, section(".vectors"), noreturn)) void start(void)
    {
      f();
      for(;;){}
      __builtin_unreachable();
    }

Compiling this with

    avr-gcc -mmcu=atmega88 -Os -mrelax -nostartfiles -nostdlib -o x.elf x.c

yields:

    00000000 <start>:
       0:       01 d0           rcall   .+2             ; 0x4 <f>

    00000002 <.L3>:
       2:       ff cf           rjmp    .-2             ; 0x2 <.L3>

    00000004 <f>:
       4:       8a e2           ldi     r24, 0x2A       ; 42
       6:       80 93 40 01     sts     0x0140, r24     ; 0x800140 <_end+0x40>
       a:       08 95           ret

Compiling it instead for `atmega168`:

    00000000 <start>:
       0:       02 d0           rcall   .+4             ; 0x6 <f>
       2:       00 00           nop

    00000004 <.L3>:
       4:       ff cf           rjmp    .-2             ; 0x4 <.L3>

    00000006 <f>:
       6:       8a e2           ldi     r24, 0x2A       ; 42
       8:       80 93 40 01     sts     0x0140, r24     ; 0x800140 <_end+0x40>
       c:       08 95           ret

Dropping `-mrelax` produces a `call` instead of the `rcall` + `nop` pair.
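
The two listings differ only in the `rcall` offset and the extra `nop`. As a cross-check, the following sketch decodes the `rcall` opcodes shown above (`01 d0` and `02 d0` are the little-endian bytes of 0xD001 and 0xD002). The function name `rcall_target` is hypothetical and the code is illustrative only, not part of binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Decode an rcall opcode (bit pattern 1101 kkkk kkkk kkkk) located at
 * byte address `addr` and return the byte address it calls.
 * `rcall_target` is an illustrative name, not a binutils function. */
uint32_t rcall_target(uint32_t addr, uint16_t opcode)
{
    assert((opcode & 0xF000u) == 0xD000u);  /* must be an rcall */

    int32_t k = opcode & 0x0FFF;            /* 12-bit word offset */
    if (k & 0x0800)                         /* sign-extend bit 11 */
        k -= 0x1000;

    /* The offset is relative to the instruction after the rcall
     * (addr + 2) and counts 16-bit words. */
    return (uint32_t)((int32_t)addr + 2 + 2 * k);
}
```

Here `rcall_target(0, 0xD001)` gives 0x4 and `rcall_target(0, 0xD002)` gives 0x6, matching the disassembly: in the `atmega168` build the target moves by exactly the 2 bytes taken up by the `nop`.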

My expectation would be that, at least with `-mrelax`, I get an `rcall` without
a `nop` regardless of the flash size of the MCU.

If this isn't a bug, I'd like to understand why, as I haven't found any
documentation that would explain this behaviour.

I'm using `crossdev`-based builds of gcc/binutils on Gentoo Linux.

    avr-gcc (Gentoo 13.0.1_pre20230305 p8) 13.0.1 20230305 (experimental)
    GNU ld (Gentoo 2.40 p4) 2.40.0

The behaviour is the same with an older version of gcc.

I'm attaching a tarball with the reproduction code and a script to build ELF
and disassembly files for two MCUs.

I'm more than happy to provide more information if needed.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

