bug-fileutils

Re: [ANNOUNCE] GNU fileutils 4.1


From: cyrus
Subject: Re: [ANNOUNCE] GNU fileutils 4.1
Date: Thu, 24 May 2001 11:08:36 -0700
User-agent: Mutt/1.2.5i

On Thu, May 24, 2001 at 10:12:45AM -0700, Paul Eggert wrote:
] > Date: Thu, 24 May 2001 08:33:00 -0700
] > From: cyrus <address@hidden>
] >
] > Defending a one-off shell script in court is going to be much
] > trickier than defending a utility that ships with every GNU-based
] > operating system.
] 
] In that case, you may need a utility of your own.  "dd" is not really
] designed to be defended in court.  It contains several features that
] are not relevant to this application, e.g. conversion from ASCII to
] EBCDIC.  You might be better off adding the few features you need to
] "md5sum", or designing a minimal utility of your own and using that.

I think you're missing the point.  The point isn't that dd is such
a wonderful piece of code that writing our own would be foolish;
the point is that dd is and has been a standard utility for moving
bytes from point A to point B for longer than I've been alive,
probably.  By incorporating checksumming functionality into it, we
get to piggyback some useful forensics functionality on the coattails
of a known, trusted utility.  It's as simple as that.
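
As a rough illustration of what a dd-style copy with integrated
checksumming amounts to, here is a minimal Python sketch.  It is not
GNU dd code and not the proposed patch; the function name
copy_with_md5 and the 512-byte default block size are assumptions
made up for this example.  Each block is written out and fed to the
digest in the same loop, so the source is read exactly once.

```python
import hashlib
import io


def copy_with_md5(src, dst, block_size=512):
    """Copy src to dst block by block; return (bytes_copied, md5_hex).

    src and dst are binary file-like objects.  The name and the default
    block size are hypothetical, chosen only for illustration.
    """
    digest = hashlib.md5()
    total = 0
    while True:
        block = src.read(block_size)
        if not block:  # empty read means end of input
            break
        dst.write(block)      # move the bytes from A to B
        digest.update(block)  # checksum the very same bytes
        total += len(block)
    return total, digest.hexdigest()


if __name__ == "__main__":
    # In-memory demonstration; real use would open a device or image file.
    data = b"example evidence bytes" * 100
    source, target = io.BytesIO(data), io.BytesIO()
    copied, checksum = copy_with_md5(source, target)
    print(copied, checksum)
```

Because the digest is computed over the blocks actually written, the
reported checksum describes the copy itself rather than a separate
second read of the source.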

] 
] By the way, do you know that the name "dd" is a joke?  It's a pun,
] taken from OS/360 JCL command language.  That explains why it uses its
] own weird option syntax, rather than the standard option syntax used
] in other utilities.  (Do you really want to defend a weird joke
] program in court?  I'm sure the other side would love to tell the
] judge about the joke.  :-)

I don't think that's really relevant...  The name "Unix" itself is
a joke.  I don't think anyone is going to care how or why a program
got its name.

] 
] Also, if your goal is to defend the code in court, I don't know why
] you're using MD5 checksums.  My impression was that Hans Dobbertin's
] work on MD5 leaves it pretty close to toast.  And even aside from
] Dobbertin's work, these days one could even brute-force MD5 if one
] were determined enough: van Oorschot and Wiener estimated in 1994 that
] for only $10 million they could build an MD5-cracker based on
] exhaustive search, and the cost would be a small fraction of that
] these days.

There's no reason to defend code in court.  Like I said earlier,
no judge or jury is going to sit down and examine code.  They'll
rely on the testimony of "expert witnesses" who describe how dd
works, and how the integrated checksumming function works.  It is
my supposition that judge and jury will care about the code only
insofar as it affects the validity of the chain of custody of the
bytes being presented as evidence.

Van Oorschot and Wiener's proposed collision attack is infeasible
in practice, particularly for format-constrained binary data.  The
abstract of the paper you're referencing specifically mentions
"Using collision search adapted for hashing collisions, one can
find slightly altered versions of these messages such that the two
new messages give the same hash result."  Slightly altering, say,
an inode or file metadata would effectively render the filesystem
broken.  MD5 is still a widely-respected algorithm, and it is the
basis for several accepted industry-standard security mechanisms
(SSL and IPSEC, for instance, both make extensive use of MD5 as a
message digest algorithm).

] 
] The OpenBSD 2.8 man page for MD5 says "MD2 and MD5 are recommended
] only for compatibility with existing applications. In new
] applications, SHA-1 or RIPEMD-160 should be preferred."  This seems
] like a reasonable recommendation to me.

Perhaps.  Regardless, longevity and widespread deployment recommend
MD5 over these (possibly stronger) newer (and less widely-used)
algorithms for the purpose of defense in court.  I still think
you're missing the point; courts won't care about theoretical attacks
on one-way hashing algorithms.  There is nothing absolute about any
court case - there's a reason why juries simply need to be convinced
"beyond a reasonable doubt" of a party's guilt to convict.  MD5
provides data integrity in most cases "beyond a reasonable doubt".
It is by no means the weakest link in a chain-of-custody argument.
Defending the shell script I wrote at 3AM to duplicate and checksum
data would be significantly more onerous than defending the use of
a particular hashing algorithm in a well-known and understood utility
that has existed on every Unix box on the planet for decades.

I also haven't quite heard a good argument for why *not* to put
this functionality into dd - care to expand on this at all?

-- 
cyrus.
<address@hidden>


