
Re: [Bug-tar] segfault getting cwd with --listed-incremental


From: Paul Eggert
Subject: Re: [Bug-tar] segfault getting cwd with --listed-incremental
Date: Fri, 16 Jul 2010 09:48:21 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.10) Gecko/20100527 Thunderbird/3.0.5

On 07/16/10 05:04, Sergey Poznyakoff wrote:
> Paul Eggert <address@hidden> wrote:
>> OK, but aren't symlinks an argument for the patch, not an argument
>> against it?
> 
> I am not sure at all.

It should be easy to construct a set of symlinks for which tar-1.23.90
does the wrong thing.  Please see below for a first cut at an example.

>> The patched code doesn't take symlinks into account either.  The previous
>> version rewrote A/B/../C as A/C, and surely this wasn't safe when B was
>> a symlink to some other directory.
> 
> That would not be safe if caname had been used as a real directory name
> (e.g. by chdir'ing to it), which is not the case.

Even if caname is used only as a magic token to decide whether a file
name refers to a new file, it's still incorrect to rewrite A/B/../C as A/C.
For example:

mkdir a
echo foo >a/b
echo barx >b
ln -s . a/dot
ls -li a/b a/dot/../b
tar -g x.list -cf x.tar a/b a/dot/../b

The "ls" outputs this:

10783652 -rw-r--r-- 1 eggert stapdev 4 Jul 16 09:18 a/b
10783653 -rw-r--r-- 1 eggert stapdev 5 Jul 16 09:18 a/dot/../b

so the two names a/b and a/dot/../b refer to different files.
But with tar 1.23.90, tar incorrectly canonicalizes a/dot/../b
to a/b, and decides that only one of the two files needs to be
archived.
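
The mismatch can be reproduced outside tar. The following is only a sketch, not tar's code: it rebuilds the layout from the commands above in a scratch directory and compares a purely lexical test (os.path.normpath, standing in here for the A/B/../C rewrite) with what the kernel reports via stat. The helper names build_example and compare are made up for this illustration.

```python
import os
import tempfile


def build_example(root):
    """Recreate the layout from the shell commands above under root."""
    os.mkdir(os.path.join(root, "a"))
    with open(os.path.join(root, "a", "b"), "w") as f:
        f.write("foo\n")
    with open(os.path.join(root, "b"), "w") as f:
        f.write("barx\n")
    os.symlink(".", os.path.join(root, "a", "dot"))


def compare(root, name1, name2):
    """Return (lexically_equal, same_file) for two names relative to root.

    lexically_equal: the names match after textual removal of ".." and ".".
    same_file: the kernel resolves both names to the same (st_dev, st_ino).
    """
    lexically_equal = os.path.normpath(name1) == os.path.normpath(name2)
    st1 = os.stat(os.path.join(root, name1))
    st2 = os.stat(os.path.join(root, name2))
    return lexically_equal, os.path.samestat(st1, st2)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_example(root)
        # Textually "the same" name, but two different files.
        print(compare(root, "a/b", "a/dot/../b"))  # (True, False)
        # Textually different names, but one file.
        print(compare(root, "a/b", "a/dot/b"))     # (False, True)
```

The textual comparison is wrong in both directions here: it merges two distinct files and fails to merge two names for one file.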

I am not saying that my patch fixes tar in this area; I expect
that many issues remain.  However, it is problematic to rely on files
having canonical names, and to try to resolve ".."
and the like in file names, because the results are often
incorrect.  It is all too common to have two different names
for the same file, because of hard links, or symbolic links, or
multiply-mounted file systems, or loopback mounts, or God knows
what else.  It would be better to have a solution that does not
rely on using getcwd to create canonical names.

>> Why must normalize_filename_x produce an absolute file name?
> 
> It is used to ensure the order of directories in the resulting archive
> is the same, whatever their order on the command line and to avoid
> dumping the same directory twice, if it was referenced by different
> relative names.

Can we attack this problem by recording the dev,ino pair for each name
instead?  We can sort by dev,ino pair, and detect duplicates
by comparing dev,ino pairs as usual.
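
As a rough illustration of that idea (a sketch under assumptions, not a proposed patch to tar): the hypothetical helper sort_and_dedupe below keys each name by its (st_dev, st_ino) pair, sorts on that key so the order no longer depends on command-line order, and keeps only the first name seen for each file. Using lstat for the final component is an assumption of this sketch.

```python
import os


def sort_and_dedupe(names):
    """Order names by (st_dev, st_ino) and drop repeat names for one file.

    Hypothetical helper for illustration only. The sort is stable, so when
    several names refer to the same file, the one given first is kept.
    """
    keyed = []
    for name in names:
        st = os.lstat(name)  # assumption: do not follow a final symlink
        keyed.append(((st.st_dev, st.st_ino), name))

    keyed.sort(key=lambda item: item[0])

    seen = set()
    unique = []
    for key, name in keyed:
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique
```

With the layout from the example earlier in this message, passing a/b, a/dot/../b and a/dot/b would yield two entries: a/b and a/dot/../b are different files, while a/dot/b is a second name for a/b and is dropped. No getcwd call or textual rewriting of ".." is involved.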


