
[Bug binutils/30444] New: Implementation of COFF/PE format lacks base64


From: sven.koehler at gmail dot com
Subject: [Bug binutils/30444] New: Implementation of COFF/PE format lacks base64 support (Extended COFF Object)
Date: Sun, 14 May 2023 10:24:24 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=30444

            Bug ID: 30444
           Summary: Implementation of COFF/PE format lacks base64 support
                    (Extended COFF Object)
           Product: binutils
           Version: 2.38
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: binutils
          Assignee: unassigned at sourceware dot org
          Reporter: sven.koehler at gmail dot com
  Target Milestone: ---

Created attachment 14879
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14879&action=edit
7z archive with all the files needed

Find attached a *.c file that contains the declarations of 80000 variables.
Each variable has a name with more than 128 characters, so the size of the
string table is larger than 80000 * 128 = 10,240,000 bytes.
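For readers without the attachment, a file of this shape can be generated with
a few lines of Python. This is only a sketch: the real test.c was not inspected,
so the `int` type, the padding with "a" and the numeric suffix are assumptions,
and the function names are made up for illustration.

```python
def make_source(count=80_000, pad=128):
    """Return C source declaring `count` globals (assumed layout, not the real test.c).

    Each name is `pad` letters plus a numeric suffix, so it is longer than `pad`.
    """
    prefix = "a" * pad
    return "".join(f"int {prefix}{i};\n" for i in range(count))


def string_table_lower_bound(count=80_000, pad=128):
    """Lower bound in bytes: every name is stored at least once."""
    return count * pad


if __name__ == "__main__":
    with open("test.c", "w") as fh:
        fh.write(make_source())
    print(string_table_lower_bound())
```

The lower bound already exceeds 9,999,999, which is the point of the test case.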

It turns out that section names longer than 8 bytes are encoded as a slash
followed by an offset into the string table, written in decimal. The buffer
for the section name is only 8 bytes long, so it holds a slash followed by at
most 7 decimal digits. Offsets larger than 9,999,999 therefore cannot be
encoded.
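The limit can be illustrated with a small Python sketch of the decimal scheme
described above. It is not the binutils code; the function name is
hypothetical, and the NUL padding of short names is an assumption about the
8-byte field.

```python
def encode_decimal_name(offset):
    """Encode a string table offset as '/<decimal>' in an 8-byte name field."""
    text = f"/{offset}"
    if len(text) > 8:
        # Slash plus more than 7 digits does not fit into the field.
        raise ValueError(f"offset {offset} does not fit into 8 bytes")
    # Assumption: unused bytes of the field are padded with NULs.
    return text.encode("ascii").ljust(8, b"\0")
```

With this sketch, encode_decimal_name(9_999_999) still fits exactly, while
encode_decimal_name(10_000_000) raises the ValueError.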

LLVM encodes larger offsets as two slashes followed by a base64 encoding of a
32-bit integer. binutils does not understand this encoding: ld, objdump, and
possibly other tools do not decode such section names. Instead they print
lots of warnings, and linking eventually fails with a "multiple definition
of ..." error because COMDAT sections are not properly eliminated. An example
of such a warning is:
> x86_64-w64-mingw32-objdump: test-llvm.o: warning: COMDAT symbol 
> '.bss$aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa10'
>  does not match section name '//AAph7S'
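As an illustration, the section name from the warning above can be decoded by
hand. The sketch below assumes the standard base64 alphabet (A-Z, a-z, 0-9, +,
/) with the most significant digit first; I could not find documentation on
this, so treat the digit order as an assumption. The function name is
hypothetical.

```python
import string

# Assumed alphabet: standard base64 digit order.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_section_name(raw):
    """Return the string table offset stored in an 8-byte name field, or None."""
    name = raw.rstrip(b"\0").decode("ascii")
    if name.startswith("//"):
        # Base64 form: six digits, most significant first (assumption).
        value = 0
        for ch in name[2:]:
            value = value * 64 + ALPHABET.index(ch)
        return value
    if name.startswith("/"):
        # Decimal form: slash followed by up to 7 digits.
        return int(name[1:])
    return None  # short name stored inline, no string table lookup


print(decode_section_name(b"//AAph7S"))  # 10886866
```

Under these assumptions '//AAph7S' decodes to offset 10886866, which is just
beyond what the 7-digit decimal form can express.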

I have confirmed that Microsoft's dumpbin.exe (the equivalent of objdump)
understands base64-encoded section names. So base64 encoding seems to be
supported by Microsoft toolchains. However, I was not able to find
documentation on this. dumpbin calls such files "EXTENDED COFF OBJECT", while
files without base64 encoding are called just "COFF OBJECT".

LLVM's tools (such as lld and llvm-objdump) properly handle base64 section
names as well.

Not only does binutils fail to read such files, the assembler also fails to
write them. In particular, compiling the attached *.c file with gcc just
results in an error message "File too big".

Find attached the archive test.7z with the following files:
> test.c
> test-llvm.o 
> test-llvm.o.dumpbin
> test-llvm.o.objdump

Steps to reproduce:
> clang -fdata-sections -target x86_64-w64-mingw32 -c test.c -o test-llvm.o
> x86_64-w64-mingw32-objdump -x test-llvm.o >test-llvm.o.objdump 2>&1
> dumpbin.exe /out:test-llvm.o.dumpbin /headers test-llvm.o
> x86_64-w64-mingw32-gcc -fdata-sections -c test.c -o test-gcc.o
See test-llvm.o.objdump for the warnings printed by objdump. The gcc command
gives you the "file too big" error.
