coreutils
Re: [PATCH V2] Add support for cksum --algorithm [sm3]


From: Pádraig Brady
Subject: Re: [PATCH V2] Add support for cksum --algorithm [sm3]
Date: Tue, 14 Sep 2021 13:29:41 +0100

On 12/09/2021 23:01, Pádraig Brady wrote:
On 12/09/2021 19:13, Pádraig Brady wrote:
This patch set refactors all digest implementations
to their own modules, all interfaced through digest.c.
All file operations and diagnostics are done in digest.c.
All digests are made available through `cksum -a`.
Also we add support for SM3 through `cksum -a sm3` only.

V2 changes:

- Various small fixes to previous patch set.
- Simplify b2sum specific code.
- Add support for `cksum -c` to infer the algorithm from tagged checksums.
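
As a rough illustration of inferring the algorithm from a tagged line, here is a minimal Python sketch. It assumes the BSD-style layout "ALGO (file) = hexdigest"; the name parse_tagged and the regular expression are hypothetical and are not taken from digest.c.

```python
import re

# Assumed layout of a tagged checksum line: "ALGO (file) = hexdigest".
# The tag may contain a hyphen, e.g. for length-adjusted digests.
TAGGED = re.compile(r"^([A-Za-z0-9-]+) \((.*)\) = ([0-9a-fA-F]+)$")

def parse_tagged(line):
    """Return (algorithm, filename, digest), or None if the line is not tagged."""
    m = TAGGED.match(line.rstrip("\n"))
    if m is None:
        return None
    algo, name, digest = m.groups()
    # Normalise case so the algorithm can be looked up by name.
    return algo.lower(), name, digest.lower()
```

Because the algorithm name travels with each line, a checker built this way would not need -a to be given when verifying.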

This is pretty much ready to land now I think.
I hope to land it tomorrow.

There is a question though re the default format to use for cksum -a.
I.e. should we use --tag format by default for cksum -a, as that
is now directly consumable by cksum -c?
If we did that though, we'd have to have the opposite option
for cksum, so something like `cksum --untagged` to produce
the traditional coreutils output format of "$hash  $file".
I'm undecided.

I thought about it a bit more, I'm going to change the default format
for cksum to tagged, and add an --untagged option. Reasons:

- It's a now-or-never change: since cksum -a is new, it has no backward
  compatibility issues.
- It's simpler, as one doesn't need to specify --tag when generating or -a
  when checking.
- It's a more general format, supporting mixed and length-adjusted digests.
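
To make the two formats concrete, the following Python sketch emits either form for in-memory data. It is only an illustration built on hashlib; format_line and its parameters are hypothetical names, and the exact tag spelling used by cksum is assumed here to be the upper-cased algorithm name.

```python
import hashlib

def format_line(algo, name, data, tagged=True):
    """Format a checksum line in tagged or traditional (untagged) form."""
    digest = hashlib.new(algo, data).hexdigest()
    if tagged:
        # Tagged form names the algorithm, so mixed files are possible.
        return f"{algo.upper()} ({name}) = {digest}"
    # Traditional coreutils form: "$hash  $file" (two spaces).
    return f"{digest}  {name}"

# Example: one file's content in both formats.
print(format_line("sha256", "file.txt", b"hello\n"))
print(format_line("sha256", "file.txt", b"hello\n", tagged=False))
```

Only the tagged line carries enough information for a checker to pick the algorithm by itself.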

Now the tagged format doesn't support encoding --binary or --text mode,
but that got me investigating whether cksum should support these at all.
My conclusion is that it should not, and just use binary everywhere.
I.e. cksum should not support --binary or --text. Reasons:

 - cygwin is the main consideration here, and it seems to default to
   binary these days
   - i.e. text/binary is a confusing distraction for the vast majority of cases
 - The cygwin model seems to be being subsumed by the WSL model anyway
 - Shared checksum files for text files stored in system native format seem
   quite an edge case
   - Even then, one can always convert to system native after verification
 - The functionality is retained in the standalone utils if needed

Interestingly, one place where we might care about processing in text mode
is for the checksum files themselves, but we don't actually handle that at all :/
Looking back, I see many users having issues with \r chars messing up --check.
Eric Blake had a good suggestion to encode \r in file names and then ignore
real \r chars in checksum files. I'll implement that now while we're working on
this.
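
A minimal Python sketch of that idea follows, under stated assumptions: file names are escaped with backslash sequences (\\, \n, \r) when written, and any literal CR in a checksum line is dropped when read. The helper names are hypothetical, and a real implementation might strip only a trailing CR rather than every one.

```python
def escape_name(name):
    """Encode backslash, newline and CR so a name fits on one line."""
    # Escape the backslash first so later escapes are not doubled.
    return (name.replace("\\", "\\\\")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def unescape_name(escaped):
    """Reverse escape_name()."""
    table = {"n": "\n", "r": "\r", "\\": "\\"}
    out = []
    i = 0
    while i < len(escaped):
        c = escaped[i]
        if c == "\\" and i + 1 < len(escaped):
            nxt = escaped[i + 1]
            # Unknown escapes are kept verbatim.
            out.append(table.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(c)
            i += 1
    return "".join(out)

def clean_line(line):
    """Drop the newline and any literal CR, e.g. from CRLF line endings."""
    return line.rstrip("\n").replace("\r", "")
```

Since a CR inside a file name is always written in its escaped form, discarding literal CR characters on input cannot change which file a line refers to.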

Nearly there...
Pádraig


