bug-binutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Bug ld/23254] New: ld.bfd mishandles file pointers while scanning archi


From: zenith432 at users dot sourceforge.net
Subject: [Bug ld/23254] New: ld.bfd mishandles file pointers while scanning archive with LTO members, causing "malformed archive" errors for a well-formed archive
Date: Sat, 02 Jun 2018 10:39:04 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=23254

            Bug ID: 23254
           Summary: ld.bfd mishandles file pointers while scanning archive
                    with LTO members, causing "malformed archive" errors
                    for a well-formed archive
           Product: binutils
           Version: 2.31 (HEAD)
            Status: UNCONFIRMED
          Severity: critical
          Priority: P2
         Component: ld
          Assignee: unassigned at sourceware dot org
          Reporter: zenith432 at users dot sourceforge.net
  Target Milestone: ---

When processing an archive with LTO object members, ld.bfd mismanages the file
pointer, occasionally causing "malformed archive" errors even though the
archive is well-formed.

The cause of this problem is as follows:
An archive is processed by the function elf_link_add_archive_symbols
(bfd/elflink.c), which calls for each member the functions
_bfd_get_elt_at_filepos
bfd_check_format
add_archive_element

The first two functions (_bfd_get_elt_at_filepos, bfd_check_format) process the
archive element using the I/O functions found in bfd/bfdio.c and bfd/cache.c. 
These I/O functions use stdio FILE*.  They store the FILE* in abfd->iostream
for the parent archive, and use fseek/fread to scan the archive file.

After this is done, if an archive element is loaded, add_archive_element (in
ld/ldmain.c) is called, which eventually calls plugin_maybe_claim (in
ld/plugin.c) to handle LTO elements.  This calls plugin_object_p (in
ld/plugin.c) which eventually calls the plugin's claim_file_handler method (see
plugin_call_claim_file).  The plugin's claim_file_handler function processes
the archive file using system calls lseek()/read() using a file descriptor set
up in plugin_object_p.

Description of The Problem:
Prior to commit 7d0b9ebc1e0079271a7c7737b53bc026525eab64 (by Alan Modra),
plugin_object_p would obtain a file descriptor to the archive file based on its
name using system call open(), then pass the file descriptor downward to the
plugin's claim_file_handler for it to process the file.

In the commit mentioned above, plugin_object_p was changed so that instead of
opening the archive by name, it obtains the file descriptor from (FILE*)
abfd->iostream using fileno(), then duplicates it with systemcall dup(), then
passes it down to the plugin's claim_file_handler.

*The documentation for system call dup() very clearly explains that the dup'd
file descriptor shares a file pointer with the original descriptor*.

As a result of this change, the plugin's claim_file_handler clobbers the file
pointer using calls to lseek()/read().  HOWEVER, the archive is later continued
to be accessed using fseek()/fread() on abfd->iostream when processing the next
archive element !!

This is WRONG.  Because fseek/fread *do not know* that the file pointer was
changed outside the stdio FILE* framework.  The FILE structure caches a buffer
for the file, and if fseek() gets a position that is within the loaded buffer,
it may not call lseek().  Moreover, if FILE struct caches a position for the
file, this potentially causes fseek() and fread() to believe the file pointer
is in a different place than it really is and they read the wrong part of the
file and eventually cause a "malformed archive" error.

A later commit ace667e59aede65c400381f1cff704b61e8ccb0b by Andrew Burgess added
a small fix that wraps the call to plugin's claim_file_handler in code saving
and restoring the file pointer.  However, restoration of the file pointer is
only done if the plugin does not "claim" the archive element.  It seems this
was done to prevent bugs if the archive consists of non-LTO elements.  However,
if the archive has LTO elements and the element is "claimed", the file pointer
remains clobbered and the problem persists.

This problem started with 2.28 official release of binutils (in line with dates
on above commits), and continues to this very day.
I am able to consistently reproduce the "malformed archive" errors on large
EDK2-based build on macOS.  This is with self-built gcc 8.1 and binutils 2.30
or binutils git master.  I've tried tweaking configure options, but none of
them made any difference.
For some reason I cannot reproduce this problem with EDK2 build on Fedora 28
with GCC 8.1.1.  I've examined the patches Redhat makes to gcc and binutils
from their source RPMs, but there doesn't seem to be anything there that fixes
this.  Another possibility is that fseek() on Fedora works differently than on
macOS, so it always calls lseek() and fixes the file pointer that's been
clobbered, so the issue never surfaces.  I can't be sure because the
development tools are all pre-built binaries and difficult to debug and see
what's going on.

Due to the complexity of this problem I cannot attach a simple example to
reproduce it, and like I said - I'm not sure it reproduces on Fedora at all. 
However, I spent time debugging it and the cause is what is explained above and
can be checked in source code.  At any rate, mixing use of fseek()/fread() with
out-of-line use of lseek()/read() on the same (or dup) file descriptor is
obviously a programming error - as stdio was not meant to be used this way.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]