Re: [Qemu-devel] [Bug 1776920] Re: qemu-img convert on Mac OSX creates corrupt images

bug-gnulib

From:	Eric Blake
Subject:	Re: [Qemu-devel] [Bug 1776920] Re: qemu-img convert on Mac OSX creates corrupt images
Date:	Fri, 7 Sep 2018 17:05:25 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 09/07/2018 04:12 PM, Bruno Haible wrote:

> Why would that be important? As far as I understand the SEEK_DATA facility
> from the man page [1], the most immediate way to make reasonable use of it is
> to call
>     offset = lseek (fd, offset, SEEK_DATA);
> and
>     offset = lseek (fd, offset, SEEK_HOLE);
> alternatingly. On Linux, you may start with SEEK_DATA or SEEK_HOLE; on
> macOS, you would need to start with SEEK_HOLE because starting with
> SEEK_DATA won't work. Is it a _that_ big problem?

If you are doing a single pass over a file starting from offset 0, then yes, alternating between SEEK_DATA first and SEEK_HOLE second will visit the entire file, with every seek starting from an extent boundary and thus not triggering the bug at hand (and yes, that order is important, because of Solaris - read on to see why starting with SEEK_HOLE at offset 0 is a bad idea). And on MacOS, SEEK_DATA on offset 0 returns 0, if there is no leading hole - the bug at hand is only triggered when you query an offset that does not start an extent, but 0 always starts an extent. But if you are doing random-access reads of portions of the file, and want to know whether a given offset lies within data or a hole (and the file is not being modified by another parallel process), and do not already know if your offset lies on an extent boundary, then this bug is nasty. Let's consider your options.
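
For reference, the single-pass case might look roughly like the sketch below. This is only an illustration of the SEEK_DATA-then-SEEK_HOLE ordering, not code from qemu or gnulib; the helper name count_data_bytes and the "treat everything as data" fallback for platforms without SEEK_DATA are my own assumptions.

```c
#define _GNU_SOURCE /* glibc only exposes SEEK_DATA/SEEK_HOLE with this */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: number of bytes of `path` that lie in data extents,
 * found in one pass from offset 0. Returns -1 on any unexpected error. */
static off_t count_data_bytes(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t total = 0;
#if defined(SEEK_DATA) && defined(SEEK_HOLE)
    off_t pos = 0;
    for (;;) {
        /* SEEK_DATA first: start of the next data extent at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            /* ENXIO means no data remains (trailing hole or EOF). */
            if (errno != ENXIO)
                total = -1;
            break;
        }
        /* SEEK_HOLE second: end of that extent; EOF counts as a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole <= data) {
            /* Error, or no forward progress: give up rather than loop. */
            total = -1;
            break;
        }
        total += hole - data;
        pos = hole; /* the next query again starts on an extent boundary */
    }
#else
    /* Assumption: without SEEK_DATA, treat the whole file as data. */
    total = lseek(fd, 0, SEEK_END);
#endif
    close(fd);
    return total;
}
```

Every query after the first starts at an offset that a previous call returned, which is why this pattern never asks about the middle of an extent.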

On Linux, if you call both SEEK_DATA and SEEK_HOLE on an offset that is in bounds, then you will always have one of the two calls return the same offset back.

On Solaris, if you call both, one of the two calls will return the same offset, except in the special case that a file ends in a hole and your offset lies in that final hole (then, DATA fails with ENXIO, while HOLE returns the end of the file instead of the current offset). And that's why starting with SEEK_HOLE at offset 0 is insufficient - if the answer is larger than 0, you still don't know if the file starts with data, or is composed of a single hole, without making a second syscall.

If you want to know where the current data/hole ends, then making both calls gets you that answer every time. But if all you care about is whether you are in data or a hole, and not where it ends, the fact that one of the two answers should return the same offset means you can optimize and make a single SEEK_DATA query to learn where you are in the file (if it is the same offset, you are in data; if it returns a different offset or ENXIO you are in a hole). True, you often need to know where the current extent ends, but if DATA returned a different answer, then you already know you are in a hole and where the hole ends without having to check HOLE. Also, there are some cases where if you know the file system has 64k extents as its minimum hole size, but you don't need to read 64k of data at your starting offset, then you don't need to query for the end (since you won't hit an extent flip in the meantime).
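
To make the Linux and Solaris patterns above concrete, here is a small sketch that classifies an offset purely from the values the two lseek() calls returned. The names (struct probe, classify_both, classify_data_only, the enum values) are invented for this illustration; the caller is assumed to have made the calls and captured errno itself.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

enum region { IN_DATA, IN_HOLE, UNDETERMINED };

/* Outcome of one lseek() query: ret is the return value, err is the
 * errno captured by the caller when ret == -1 (otherwise ignored). */
struct probe {
    off_t ret;
    int err;
};

/* Both queries made at `offset`: one of them should echo the offset. */
static enum region classify_both(off_t offset, struct probe data,
                                 struct probe hole)
{
    assert(offset >= 0);
    if (data.ret == offset)
        return IN_DATA; /* hole.ret is where the data ends */
    if (hole.ret == offset)
        return IN_HOLE; /* data.ret (or ENXIO) is where the hole ends */
    /* Solaris trailing hole: DATA fails with ENXIO, HOLE reports EOF. */
    if (data.ret == -1 && data.err == ENXIO && hole.ret > offset)
        return IN_HOLE;
    return UNDETERMINED; /* matches neither documented pattern */
}

/* The single-syscall shortcut: only the SEEK_DATA result is consulted. */
static enum region classify_data_only(off_t offset, struct probe data)
{
    assert(offset >= 0);
    if (data.ret == offset)
        return IN_DATA;
    if (data.ret > offset)
        return IN_HOLE; /* the hole ends at data.ret */
    if (data.ret == -1 && data.err == ENXIO)
        return IN_HOLE; /* trailing hole: no data follows */
    return UNDETERMINED;
}
```

The shortcut is only as good as the guarantee that SEEK_DATA echoes the offset whenever the offset is in data; it has no way to notice when that guarantee does not hold.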

That is, until MacOS comes along, and now both queries return a different offset than your input, but neither fails. If you optimized by calling SEEK_DATA first, you end up treating the current offset as a hole (data loss). And if you make both calls looking for the POSIX-specified patterns, your logic can be thrown off (at which point the only sane response is to treat SEEK_HOLE as broken, and read the entire file rather than benefiting from skipping reads of holes). And if you swap things to call SEEK_HOLE instead of SEEK_DATA first, you run into the issue with Solaris behavior on trailing holes.

As for why random access determination of data/hole is even useful, it helps to understand what qemu is doing. It uses the qcow2 format which remaps a sparse guest view into a compact host file; reading sequential guest addresses can indeed read out-of-order on the host file, and more importantly, you tend to start reading at guest offset 0, but host offset 0 is always the qcow2 header, so the very first read of guest data will occur at a host offset larger than 0 - which makes it very likely that the first address for a SEEK_DATA query is indeed not aligned to an extent boundary.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

