From: Robert Boyer
Subject: [bug-diffutils] bug#69748: bug#69748: Does diff not work on big enough files?
Date: Tue, 12 Mar 2024 15:22:12 -0500

Are you trying to be funny? Or are you simply stupid?  You are much too brilliant and famous
to be stupid, so I am assuming you were trying to be funny, a parody of the overworked bug fixer.

In an almost immediate follow-up message, I already solved the problem, and it
worked perfectly for me trying to compare an old file of the primes below a billion with a new
file of the primes below ten billion.  Fortunately, this little gem of a program helped me
believe that I had computed at least the primes below a billion correctly.  What a relief!

> there isn't enough room for 'diff' to do its job with its current algorithm 

Probably very sadly true, so you must improve your algorithm, and here is how. It won't hurt, I promise.

From my previous message:

Here is a better version of diff, better only in the sense that it works on all files.  But what do I know?  Nothing.

This is Common Lisp.  I was running in SBCL.

(defun my-diff (file1 file2)
  ;; Compare two files byte by byte and report the first difference, if any.
  ;; with-open-file closes both streams on every exit path.
  (with-open-file (s1 file1 :element-type '(integer 0 255))
    (with-open-file (s2 file2 :element-type '(integer 0 255))
      (loop
        ;; read-byte returns 256 at end of file, a value no real byte can take.
        (let ((c1 (read-byte s1 nil 256))
              (c2 (read-byte s2 nil 256)))
          (declare (fixnum c1 c2))
          (cond ((and (eql c1 256) (eql c2 256)) (return "no difference"))
                ((eql c1 256) (return "file1 hit eof first"))
                ((eql c2 256) (return "file2 hit eof first"))
                ((eql c1 c2))
                ;; file-position counts the bytes read so far, so the
                ;; reported position is 1-based.
                (t (return (format nil
                                   "difference at position ~s; c1 = ~s, c2 = ~s."
                                   (file-position s1) c1 c2)))))))))
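
The same byte-for-byte comparison can also be done in fixed-size chunks, which keeps memory use constant while avoiding one read-byte call per byte. What follows is only a sketch under stated assumptions, not code from this thread: the name my-diff-buffered and the default buffer size of 65536 bytes are hypothetical, and it assumes both arguments name regular files, so that read-sequence returns a short count only at end of file.

```lisp
;; Hypothetical buffered variant of my-diff. The function name and the
;; default buffer size are assumptions made for this sketch.
(defun my-diff-buffered (file1 file2 &optional (buffer-size 65536))
  (with-open-file (s1 file1 :element-type '(unsigned-byte 8))
    (with-open-file (s2 file2 :element-type '(unsigned-byte 8))
      (let ((b1 (make-array buffer-size :element-type '(unsigned-byte 8)))
            (b2 (make-array buffer-size :element-type '(unsigned-byte 8)))
            (offset 0))                 ; bytes already found to be equal
        (loop
          ;; read-sequence returns the index of the first element it did not
          ;; fill, i.e. the number of bytes actually read into the buffer.
          (let* ((n1 (read-sequence b1 s1))
                 (n2 (read-sequence b2 s2))
                 (n (min n1 n2))
                 ;; Index of the first differing byte in the common part,
                 ;; or nil if the common part is identical.
                 (m (mismatch b1 b2 :end1 n :end2 n)))
            (cond (m (return (format nil
                                     "difference at position ~s; c1 = ~s, c2 = ~s."
                                     (+ offset m 1) (aref b1 m) (aref b2 m))))
                  ((< n1 n2) (return "file1 hit eof first"))
                  ((< n2 n1) (return "file2 hit eof first"))
                  ((zerop n) (return "no difference"))
                  (t (incf offset n)))))))))
```

Like my-diff, this reports a 1-based position and only finds the first differing byte. It does not compute the line-oriented edit script that 'diff' produces.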

On Tue, Mar 12, 2024 at 2:58 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 3/12/24 08:17, Robert Boyer wrote:
>
> > It is simply incredible to me that diff might not work!
>
> Like any other program, 'diff' needs enough resources to run. You're
> trying to compare a 5 GiB file on a Chromebook that has (let me guess) 4
> GiB of RAM and 32 GB of flash, most of which is occupied by ChromeOS and
> other stuff. If so, there isn't enough room for 'diff' to do its job
> with its current algorithm and you'll have to either use a bigger
> machine or solve a smaller problem.
>
> It's possible to imagine a different 'diff' algorithm that would take
> less RAM but a lot more time, presumably because it would do more I/O to
> a temporary file. But if the available flash is small enough, even that
> wouldn't work. I doubt whether it'd be worth the time to develop the
> code for this alternative approach.
