bug-gnulib


From: Eric Blake
Subject: Re: [Qemu-devel] [Bug 1776920] Re: qemu-img convert on Mac OSX creates corrupt images
Date: Fri, 7 Sep 2018 17:05:25 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 09/07/2018 04:12 PM, Bruno Haible wrote:
> Why would that be important? As far as I understand the SEEK_DATA facility
> from the man page [1], the most immediate way to make reasonable use of it is
> to call
>     offset = lseek (fd, offset, SEEK_DATA);
> and
>     offset = lseek (fd, offset, SEEK_HOLE);
> alternately. On Linux, you may start with SEEK_DATA or SEEK_HOLE; on
> macOS, you would need to start with SEEK_HOLE because starting with
> SEEK_DATA won't work. Is it _that_ big a problem?

If you are doing a single pass over a file starting from offset 0, then
yes, alternating between SEEK_DATA first and SEEK_HOLE second will visit 
the entire file, with every seek starting from an extent boundary and 
thus not triggering the bug at hand (and yes, that order is important, 
because of Solaris - read on to see why starting with SEEK_HOLE at 
offset 0 is a bad idea).  And on macOS, SEEK_DATA on offset 0 returns 0,
if there is no leading hole - the bug at hand is only triggered when you 
query an offset that does not start an extent, but 0 always starts an 
extent.  But if you are doing random-access reads of portions of the 
file, and want to know whether a given offset lies within data or a hole 
(and the file is not being modified by another parallel process), and do 
not already know if your offset lies on an extent boundary, then this 
bug is nasty.  Let's consider your options.
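
To make the single-pass case concrete, here is a minimal sketch of the alternating walk, SEEK_DATA first and SEEK_HOLE second, starting from offset 0. The helper name count_data_bytes is made up for illustration, and the code assumes a platform where SEEK_DATA fails with ENXIO once no data remains (the documented Linux and Solaris behaviour). Every seek starts on an extent boundary, so this walk should not hit the bug under discussion.

```c
#define _GNU_SOURCE   /* glibc needs this to expose SEEK_DATA / SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: walk the file extent by extent and return the
   total number of bytes that lie in data extents, or -1 on an
   unexpected error.  Note that lseek moves the file offset of fd. */
off_t count_data_bytes(int fd)
{
    off_t total = 0;
    off_t pos = 0;

    assert(fd >= 0);
    for (;;) {
        /* Start of the next data extent at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0)
            return errno == ENXIO ? total : -1;  /* ENXIO: no more data */

        /* End of that data extent (EOF counts as a hole). */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole <= data)
            return -1;  /* error, or an answer that makes no progress */

        total += hole - data;
        pos = hole;
    }
}
```

For a file with no holes this should simply return the file size; for a sparse file it returns something smaller.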

On Linux, if you call both SEEK_DATA and SEEK_HOLE on an offset that is
in bounds, then you will always have one of the two calls return the 
same offset back.

On Solaris, if you call both, one of the two calls will return the same
offset, except in the special case that a file ends in a hole and
your offset lies in that final hole (then, DATA fails with ENXIO, while 
HOLE returns the end of the file instead of the current offset).  And 
that's why starting with SEEK_HOLE at offset 0 is insufficient - if the 
answer is larger than 0, you still don't know if the file starts with 
data, or is composed of a single hole, without making a second syscall.

If you want to know where the current data/hole ends, then making both
calls gets you that answer every time.  But if all you care about is 
whether you are in data or a hole, and not where it ends, the fact that 
one of the two answers should return the same offset means you can 
optimize and make a single SEEK_DATA query to learn where you are in the 
file (if it is the same offset, you are in data; if it returns a 
different offset or ENXIO you are in a hole).  True, you often need to 
know where the current extent ends, but if DATA returned a different 
answer, then you already know you are in a hole and where the hole ends 
without having to check HOLE.  Also, there are some cases where if you 
know the file system has 64k extents as its minimum hole size, but you 
don't need to read 64k of data at your starting offset, then you don't 
need to query for the end (since you won't hit an extent flip in the 
meantime).
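
The single-query optimization might look roughly like the sketch below. The name offset_is_data and its return convention are invented for this example, and it relies entirely on the Linux/Solaris guarantee just described (SEEK_DATA on an offset inside data returns that same offset).

```c
#define _GNU_SOURCE   /* glibc needs this to expose SEEK_DATA */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: classify offset with a single SEEK_DATA query.
   Returns 1 if offset lies in data, 0 if it lies in a hole (or at/after
   EOF), -1 on an unexpected error.  When the answer is "hole",
   *hole_end is set to the start of the next data extent, or to -1 if
   no data follows.  Assumes the Linux/Solaris behaviour; it is not
   safe on a platform where SEEK_DATA inside data can return a
   different offset. */
int offset_is_data(int fd, off_t offset, off_t *hole_end)
{
    assert(offset >= 0);

    off_t data = lseek(fd, offset, SEEK_DATA);
    if (data == offset)
        return 1;                      /* same offset back: in data */
    if (data > offset) {
        if (hole_end)
            *hole_end = data;          /* hole ends where data resumes */
        return 0;
    }
    if (data < 0 && errno == ENXIO) {
        if (hole_end)
            *hole_end = -1;            /* trailing hole, or at/after EOF */
        return 0;
    }
    return -1;
}
```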

That is, until macOS comes along, and now both queries return a
different offset than your input, but neither fails.  If you optimized 
by calling SEEK_DATA first, you end up treating the current offset as a 
hole (data loss). And if you make both calls looking for the 
POSIX-specified patterns, your logic can be thrown off (at which point 
the only sane response is to treat SEEK_HOLE as broken, and read the 
entire file rather than benefitting from skipping reads of holes).  And 
if you swap things to call SEEK_HOLE instead of SEEK_DATA first, you run 
into the issue with Solaris behavior on trailing holes.
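
If you do make both calls, the patterns above can be collected into one check. This is only a sketch: the enum and the name classify_pair are hypothetical, and the caller is assumed to pass -1 for a query that failed with ENXIO. Anything that matches neither the Linux nor the Solaris pattern is reported as unreliable, at which point the caller would stop trusting SEEK_HOLE and read everything.

```c
#include <assert.h>
#include <sys/types.h>

enum where { WHERE_DATA, WHERE_HOLE, WHERE_UNRELIABLE };

/* Hypothetical helper: interpret the results of SEEK_DATA and SEEK_HOLE
   issued at the same in-bounds offset.  data and hole are the lseek
   return values, with -1 standing for an ENXIO failure. */
enum where classify_pair(off_t offset, off_t data, off_t hole)
{
    assert(offset >= 0);

    /* SEEK_DATA returned the same offset: we are in data. */
    if (data == offset && hole > offset)
        return WHERE_DATA;

    /* SEEK_HOLE returned the same offset: we are in a hole
       (data may be later in the file, or ENXIO in a trailing hole). */
    if (hole == offset && data != offset)
        return WHERE_HOLE;

    /* Solaris trailing hole: DATA fails with ENXIO while HOLE reports
       the end of the file rather than the current offset. */
    if (data < 0 && hole > offset)
        return WHERE_HOLE;

    /* Both answers differ from offset and neither failed (the pattern
       described above), or something else unexpected. */
    return WHERE_UNRELIABLE;
}
```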

As for why random-access determination of data/hole is even useful, it
helps to understand what qemu is doing.  It uses the qcow2 format which 
remaps a sparse guest view into a compact host file; reading sequential 
guest addresses can indeed read out-of-order on the host file, and more 
importantly, you tend to start reading at guest offset 0, but host 
offset 0 is always the qcow2 header, so the very first read of guest 
data will occur at a host offset larger than 0 - which makes it very 
likely that the first address for a SEEK_DATA query is indeed not 
aligned to an extent boundary.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


