
Re: split: Allow splitting by line count (instead of byte size)


From: Pádraig Brady
Subject: Re: split: Allow splitting by line count (instead of byte size)
Date: Tue, 12 Jan 2021 21:11:16 +0000

On 12/01/2021 18:16, John wrote:
* On Tuesday, January 12, 2021 07:31, "Pádraig Brady" said:
The disadvantage is that we'd be pulling some wc logic into split,
but it wouldn't be providing any efficiency advantages.
Achieving this in shell is also simple enough and portable:

    lines=$(( $(wc -l < "file") / 2 ))   # half the line count, rounded down
    split -l "$lines" file
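
One caveat with the snippet above: the integer division rounds down, so
an odd line count leaves a remainder and split writes a third, short
file. A minimal sketch that rounds up instead, generalised to N pieces.
The function name split_lines_into and the ".part." output prefix are
made up for illustration and are not part of coreutils:

```shell
#!/bin/sh
# split_lines_into FILE N  (hypothetical helper, not a coreutils feature)
# Splits FILE into at most N pieces by line count; N must be a positive
# integer. The last piece may be shorter than the others.
split_lines_into() {
    file=$1
    n=$2
    # Counting pass: wc has to read the whole file.
    total=$(wc -l < "$file")
    # Ceiling division, so a remainder does not spill into an extra file.
    per=$(( (total + n - 1) / n ))
    # split rejects a line count of 0 (happens for empty input).
    [ "$per" -gt 0 ] || per=1
    split -l "$per" "$file" "$file.part."
}
```

For example, "split_lines_into file 2" on a five-line file should give
file.part.aa with three lines and file.part.ab with two. If the file
has fewer than N lines, fewer than N pieces are written.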

You're not wrong that it's simple enough and portable (that's what I
did myself when I had to do the next data set in the project)... but
it's also equally applicable to bytes, so why have split count the
number of bytes in the input when another tool could do that?

    bytes=$(( $(wc -c < "file") / 2 ))
    split -b "$bytes" file

I know this example is a little pedantic and/or facetious, and I'm not
trying to be snarky here (well, maybe a little snarky?  :)); the goal
here is just to illustrate that while I agree that "do one thing and do
it well" is a very noble goal, sometimes there's such a thing as being
_too_ modular.

Well, doing this for bytes is arguably more useful.
Also, doing this for bytes is computationally inexpensive to implement.
I.e., --number=N doesn't hide much processing,
while --number=L/N would.
It's generally best to be explicit with expensive operations.
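
To make the distinction concrete: GNU split already has -n l/N, which
balances the chunks by byte size while cutting only at line boundaries.
Balancing by line count, which is how I read the proposed L/N, can give
a different result and needs a counting pass over the input first. A
rough sketch contrasting the two, assuming GNU split (for -n); the name
compare_splits and the output prefixes are hypothetical and used only
here:

```shell
#!/bin/sh
# compare_splits FILE  (hypothetical helper, for illustration only)
# Writes FILE.bytes.* using the existing -n l/2 mode and FILE.lines.*
# using an explicit line count computed with wc.
compare_splits() {
    file=$1
    # Existing GNU option: two chunks of roughly equal byte size,
    # never cutting a line in the middle.
    split -n l/2 "$file" "$file.bytes."
    # Line-count halves: the expensive step (reading every line to
    # count them) is spelled out rather than hidden behind an option.
    total=$(wc -l < "$file")
    per=$(( (total + 1) / 2 ))
    [ "$per" -gt 0 ] || per=1   # split rejects a line count of 0
    split -l "$per" "$file" "$file.lines."
}
```

Running "wc -l file.bytes.* file.lines.*" afterwards shows how the two
modes distribute lines; they tend to differ when line lengths vary a
lot.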

On the other hand, its not being worth the effort _is_ a valid argument
in my book; it's up to the people actually doing the work whether they
think their effort is justified.  I know _I_ would like it, but I
assume I'm an edge case since split has existed this long without this
feature and I can only speak for myself.

cheers,
Pádraig
