coreutils

split: Allow splitting by line count (instead of byte size)


From: John
Subject: split: Allow splitting by line count (instead of byte size)
Date: Mon, 11 Jan 2021 18:53:03 -0600 (CST)

  I would like to be able to split a file by line count, instead of by 
(partial) file size.  For context, I had a 50M file with one record per line, 
to be processed by a script that makes one API call per line in the file.  I 
used split to break the file up into two files, and wound up with two 25M files 
with vastly different line counts (one had about 6K and the other had about 
11K).

  Now, this wasn't split's "fault"; it operated exactly as designed.  The cause 
of the unexpected result was that the lines in different parts of the original 
file were of vastly different lengths.

  What I would like is to be able to split a file such that the resulting 
chunks have an equal number of lines (or as close to equal as possible), 
regardless of how many bytes each line contains.  I checked the 
documentation and the coreutils/textutils list 
archives, but this doesn't seem to a) be a current feature, or b) have been 
brought up before.  I also checked the rejected features just in case, but I 
didn't see it there either (whew!).

  I know I could just do the math myself (and that's what I did, for the next 
iteration of the above job that I had to process) and pass the pre-divided line 
count to split with the "-l" option, but...you could make that same argument 
with byte counts and split handles that, so I figure it's worth asking.  :)
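
  For reference, the "do the math myself" workaround can be sketched roughly 
as below.  This is only an illustration, not anything split does on its own: 
split_even_lines is a hypothetical helper name, and its three arguments 
(input file, number of chunks, output prefix) are assumptions made for the 
example.  It counts lines with wc, rounds the per-chunk count up, and hands 
that to the existing "-l" option.

```shell
#!/bin/sh
# Hypothetical helper (not part of coreutils):
#   split_even_lines FILE CHUNKS PREFIX
# Splits FILE into pieces of roughly equal line count using "split -l".
split_even_lines() {
    file=$1
    chunks=$2
    prefix=$3

    # Total number of newline-terminated lines in the input.
    total=$(wc -l < "$file")

    # Nothing to split; "split -l 0" would be rejected anyway.
    if [ "$total" -eq 0 ]; then
        return 0
    fi

    # Ceiling division, so no more than CHUNKS files are produced.
    per_chunk=$(( (total + chunks - 1) / chunks ))

    split -l "$per_chunk" "$file" "$prefix"
}
```

  Called as "split_even_lines records.txt 2 part_" (file name made up), this 
would write part_aa and part_ab.  Because the per-chunk count is rounded up, 
the last file can be shorter than the others, and for some combinations fewer 
than the requested number of files will come out.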



