emacs-bug-tracker

bug#46048: closed (split -n K/N loses data, sum of output files is smaller than input file.)


From: GNU bug Tracking System
Subject: bug#46048: closed (split -n K/N loses data, sum of output files is smaller than input file.)
Date: Mon, 25 Jan 2021 14:22:02 +0000

Your message dated Mon, 25 Jan 2021 14:21:35 +0000
with message-id <4f858cd0-19e4-d159-c2e7-51b3aad0b3b0@draigBrady.com>
and subject line Re: bug#46048: split -n K/N loses data, sum of output files is 
smaller than input file.
has caused the debbugs.gnu.org bug report #46048,
regarding split -n K/N loses data, sum of output files is smaller than input 
file.
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
46048: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=46048
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message ---
Subject: split -n K/N loses data, sum of output files is smaller than input file.
Date: Fri, 22 Jan 2021 18:58:03 -1000

split --number K/N appears to lose data: the sizes of the output files add up to 131072 bytes less than the size of the original input file.

$ split --version
split (GNU coreutils) 8.30
...

$ head -c 1000000 < /dev/urandom > test.dat
$ split --number=1/4 test.dat > t1
$ split --number=2/4 test.dat > t2
$ split --number=3/4 test.dat > t3
$ split --number=4/4 test.dat > t4

$ ls -l
-rw-r--r-- 1 user user  250000 Jan 22 18:36 t1
-rw-r--r-- 1 user user  250000 Jan 22 18:36 t2
-rw-r--r-- 1 user user  250000 Jan 22 18:36 t3
-rw-r--r-- 1 user user  118928 Jan 22 18:36 t4
-rw-r--r-- 1 user user 1000000 Jan 22 18:33 test.dat

Surely this should not be the case?

Paul

--- End Message ---
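
The shortfall reported above can be checked mechanically rather than by reading the `ls -l` listing. The following is a minimal sketch, not part of the original report: `check_chunks` is a hypothetical helper name, and it assumes a POSIX shell with `cat`, `wc`, `tr` and `cmp` available. It prints the combined chunk size next to the original size, and succeeds only if the concatenated chunks are byte-identical to the original.

```shell
#!/bin/sh
# check_chunks ORIGINAL CHUNK...   (hypothetical helper, for illustration only)
# Prints the combined size of the chunks and the size of ORIGINAL, then
# returns success only if the chunks, concatenated in the order given,
# are byte-for-byte identical to ORIGINAL.
check_chunks() {
    orig=$1
    shift
    # Some wc implementations pad their output, so strip any blanks.
    total=$(cat -- "$@" | wc -c | tr -d ' ')
    expected=$(wc -c < "$orig" | tr -d ' ')
    echo "chunks: $total bytes, original: $expected bytes"
    # cmp -s is silent; its exit status is the function's result.
    cat -- "$@" | cmp -s - "$orig"
}
```

With the files from the report this would be invoked as `check_chunks test.dat t1 t2 t3 t4`. The sizes in the listing add up to 250000 + 250000 + 250000 + 118928 = 868928 bytes, which is 131072 bytes short of 1000000, so the comparison would fail there. Comparing contents as well as sizes also catches a chunk that has the right length but was taken from the wrong part of the file.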
--- Begin Message ---
Subject: Re: bug#46048: split -n K/N loses data, sum of output files is smaller than input file.
Date: Mon, 25 Jan 2021 14:21:35 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 24/01/2021 19:55, Paul Eggert wrote:
> On 1/24/21 8:52 AM, Pádraig Brady wrote:
>> -      if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
>> +      if (lseek (STDIN_FILENO, start, SEEK_SET) < 0)
>
> Dumb question: will this handle the case where you're splitting from
> stdin and stdin is a seekable file and its initial file offset is nonzero?

Right. Following on the logic from input_file_size(),
I'm going with the attached, which I'll push later.
Marking this as done.

thanks,
Pádraig

Attachment: split-k_of_n.patch
Description: Text Data


--- End Message ---
