[bug-diffutils] bug#21942: bug#21942: Files with incorrect file sizes


From: Jim Meyering
Subject: [bug-diffutils] bug#21942: bug#21942: Files with incorrect file sizes
Date: Sun, 6 Dec 2015 11:27:20 -0800

On Fri, Nov 20, 2015 at 2:33 PM, Stephan Müller <address@hidden> wrote:
> On Fri, 20 Nov 2015 23:12:40 +0100,
> Jim Meyering <address@hidden> wrote:
>
>> On Fri, Nov 20, 2015 at 9:06 PM, Stephan Müller <address@hidden>
>> wrote:
>> > On Fri, 20 Nov 2015 18:28:37 +0100,
>> > Jim Meyering <address@hidden> wrote:
>> >
>> >> On Tue, Nov 17, 2015 at 12:44 PM, Stephan Müller
>> >> <address@hidden> wrote:
>> >> > Recently I had to debug a weird problem. Finally I figured it out.
>> >> >
>> >> > Virtual file systems like /sys or /proc usually don't care about
>> >> > file sizes. Files there often report a size of 0. This leads to
>> >> > difficulties, as diff sometimes looks at file sizes.
>> >> >
>> >> > Say you do:
>> >> >
>> >> >> $ cp /proc/cmdline my_cmdline
>> >> >> $ diff /proc/cmdline my_cmdline ; echo $?
>> >> >> 0      // ok, files don't differ
>> >> >> $ diff --brief /proc/cmdline my_cmdline
>> >> >> Files /proc/cmdline and my_cmdline differ
>> >> >
>> >> > The --brief option triggers a binary compare, which makes sense
>> >> > because we aren't interested in the actual differences. As a first
>> >> > step, file sizes are compared (0 vs ~150) and the files are
>> >> > reported as different.
>> >>
>> >> thanks for the report.
>> >> What version of diffutils are you using?
>> >> I think this has been fixed for some time.
>> >> I was unable to reproduce with either 2.8.1 or the latest built from
>> >> git. I.e., I created an empty file and used diff-2.8.1 to compare
>> >> it with the nominally-zero-length /proc/cmdline file, and diff did
>> >> the right thing.
>> >> Also, I ran stat to show st_size of each file is indeed 0:
>> >>
>> >>   $ : > /tmp/k; /p/p/diffutils-2.8.1/bin/diff /proc/cmdline /tmp/k; \
>> >>     stat --format %s /proc/cmdline /tmp/k
>> >>   1d0
>> >>   < ro root=LABEL=...
>> >>   0
>> >>   0
>> >>
>> >> In fact, I went ahead and built all available versions and tested
>> >> them like this:
>> >>
>> >>   $ for i in /p/p/*/bin/diff; do p=diffutils-$i; echo $i; \
>> >>       $i /proc/cmdline /tmp/k > /dev/null && echo bad; done
>> >>   /p/p/diffutils-2.7/bin/diff
>> >>   /p/p/diffutils-2.8.1/bin/diff
>> >>   /p/p/diffutils-2.8/bin/diff
>> >>   /p/p/diffutils-2.9/bin/diff
>> >>   /p/p/diffutils-3.0/bin/diff
>> >>   /p/p/diffutils-3.1/bin/diff
>> >>   /p/p/diffutils-3.2/bin/diff
>> >>   /p/p/diffutils-3.3/bin/diff
>> >
>> > Hi,
>> >
>> > I am using v.3.3 of diffutils
>> >
>> > $ diff -v
>> > diff (GNU diffutils) 3.3
>> >
>> > but I think you misunderstood the problem. Sorry for being
>> > ambiguous. I am not diffing against an empty file. That works well.
>> > The point is procfs doesn't care about size, but 'normal' file
>> > systems do. So for example on my system I have (after
>> > cp /proc/cmdline mycmdline)
>> >
>> > $ stat --format %s /proc/cmdline mycmdline
>> > 0
>> > 140
>> >
>> > The result of diffing /proc/cmdline against mycmdline depends on the
>> > --brief flag.
>> >
>> > STEPS TO REPRODUCE:
>> >
>> > cp /proc/cmdline mycmdline
>> > diff --brief /proc/cmdline mycmdline > /dev/null ; echo $?
>> > 1
>> > diff /proc/cmdline mycmdline ; echo $?
>> > 0
>> >
>> > EXPECTED RESULT:
>> >
>> > cp /proc/cmdline mycmdline
>> > diff --brief /proc/cmdline mycmdline > /dev/null ; echo $?
>> > 0
>> > diff /proc/cmdline mycmdline ; echo $?
>> > 0
>>
>> Oh, indeed. Thank you for clarifying. That feels like a bug.
>> Here's a knee-jerk patch that refrains from using the
>> st_size-comparing heuristic when either of the sizes is zero. This
>> may well be wrong. I have only barely tested the diff.c code path.
>
> Thanks, that makes the problem at least (even) less likely. But if we
> can't trust file sizes we're doomed. What do you think about a flag
> controlling comparison by size and a notice if files differ by size?
>
> I can craft a patch for this.

Thank you, but I don't want to have to specify some new option to
avoid this misbehavior, so will push the attached patch shortly.
If someone finds a system for which a falsely reported stat.st_size
is nonzero, we can revisit this.
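
For readers following along, the heuristic discussed above can be sketched
in shell. This is only an illustration and is not the attached patch: the
function names sizes_prove_different and brief_differ are hypothetical, and
the sketch assumes GNU stat (as used earlier in this thread) plus cmp for
the byte-by-byte fallback.

```shell
# Hypothetical helpers for illustration; not taken from the attached patch.
# Assumes GNU stat (--format %s), as used earlier in this thread.

# Succeeds (exit 0) only when the reported sizes alone prove that the two
# files differ: both sizes are nonzero and they are unequal.
sizes_prove_different() {
    size_a=$(stat --format %s -- "$1") || return 2
    size_b=$(stat --format %s -- "$2") || return 2
    # A reported size of 0 may be bogus (e.g. on procfs), so it proves nothing.
    [ "$size_a" -ne 0 ] && [ "$size_b" -ne 0 ] && [ "$size_a" -ne "$size_b" ]
}

# Succeeds (exit 0) when the files differ. The size shortcut is used only
# when it is conclusive; otherwise the contents are compared with cmp.
brief_differ() {
    sizes_prove_different "$1" "$2" && return 0
    ! cmp -s -- "$1" "$2"
}
```

Under these assumptions, a call such as brief_differ /proc/cmdline mycmdline
would be expected to skip the size shortcut (because procfs reports a size of
0) and fall through to the content comparison.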

Attachment: 0001-diff-brief-no-longer-mistakenly-reports-diff.-with-0.patch
Description: Text Data

