bug-gnu-emacs

bug#58472: [PATCH] Make `message-unique-id' less prone to collisions


From: Stefan Kangas
Subject: bug#58472: [PATCH] Make `message-unique-id' less prone to collisions
Date: Thu, 13 Oct 2022 12:10:39 +0000

Matt Armstrong <matt@rfc20.org> writes:

> Most email I get today seems to use a UUID or UUID-like Message-ID, like this:
>
> Message-ID: <736d10a6-001f-4a29-a1d4-554f58733b69@dfw1s10mta1086.xt.local>
> Message-ID: <1815053947.8446619.1665544925708@lor1-app45123.prod.linkedin.com>
> Message-ID: <01000183b9eaa6f8-411d1f4c-b573-472d-b45f-47b0c4eb6ace-000000@email.amazonses.com>
> Message-ID: <CABqZ1wa8MxrieVKZ11adZUV2qB_CnpMJoFEn-U3d5CQ7z7smWw@mail.gmail.com>

Those are 30-51 characters in length.  I also note that Gmail uses both
lower case and upper case characters.

>> If we limit the length of the time string to 12 characters, and the
>> total length to 25 characters (including the ".gnu" part), we still have
>> a guaranteed 9 characters of random data, or 46 bits of entropy.
>
> I suspect that most mailers use more randomness than that.

So I guess we might as well bump this up to 30 characters in total,
which gives us 72 bits.  The Message-IDs would look like:

    cnkrs75yamag1k7x8rnt3y50za.gnu@stefankangas.se
    cnkrifkirauwuwfkzs3rcit8cq.gnu@stefankangas.se

We could go longer, but it's also nice to have something which is not an
absolute abomination to look at.

If we add in upper case characters too, we can encode the time with one
fewer character.  So we end up with 89 bits of randomness and this:

    1Z2KnqE1t2bSgUWkcu53M34Y4y.gnu@stefankangas.se
    1Z2KbUgleGoe0WRJ3jbiM0mE7W.gnu@stefankangas.se
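
The bit counts in this thread can be checked with a few lines of
arithmetic.  The following is only a sketch in Python (not the Emacs Lisp
from the patch); the helper name `random_bits' is made up for
illustration, and it assumes the random characters are drawn uniformly
from an alphabet of 36 (digits and lower case) or 62 (plus upper case)
symbols, with the ".gnu" suffix counted in the total length:

```python
import math

def random_bits(n_chars, alphabet_size):
    """Whole bits of entropy in n_chars drawn uniformly from the alphabet."""
    return math.floor(n_chars * math.log2(alphabet_size))

SUFFIX = len(".gnu")  # 4 characters, counted in the total length

# 25 characters total, 12 for the time, base 36 (0-9, a-z): 9 random chars
print(random_bits(25 - SUFFIX - 12, 36))  # 46
# 30 characters total, 12 for the time, base 36: 14 random chars
print(random_bits(30 - SUFFIX - 12, 36))  # 72
# 30 characters total, 11 for the time, base 62 (0-9, a-z, A-Z): 15 random chars
print(random_bits(30 - SUFFIX - 11, 62))  # 89
```

Each base-36 character carries about 5.17 bits and each base-62 character
about 5.95 bits, which is where most of the gain from adding upper case
comes from.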

If we don't want the Message-ID to always start with the same characters
(varying the start makes the IDs more distinct at a glance), we could
just reverse the time string:

    QlRXPpmK2Z1kUklxIpMNZpChOu.gnu@stefankangas.se
    Z59YikmK2Z1FSmYj172SAdPpuX.gnu@stefankangas.se
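
For illustration, the scheme could be sketched roughly as follows.  This
is Python rather than the Emacs Lisp of the actual patch, and several
details are assumptions made for the sketch: the nanosecond timestamp,
the ordering of the alphabet, and the names `encode' and `unique_id' are
all hypothetical, so the output will not match the examples above
character for character.

```python
import secrets
import time

# Assumed alphabet ordering: digits, lower case, upper case (62 symbols).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode(n, alphabet=ALPHABET):
    """Encode a non-negative integer in base len(alphabet)."""
    if n == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(alphabet[r])
    return "".join(reversed(digits))

def unique_id(total=30, suffix=".gnu", now_ns=None):
    """Return the left-hand side of a Message-ID: time, random data, suffix."""
    if now_ns is None:
        now_ns = time.time_ns()  # assumption: nanosecond resolution
    # Reversing puts the fastest-changing digit first, so consecutive
    # IDs do not share a common prefix.
    stamp = encode(now_ns)[::-1]
    n_random = total - len(suffix) - len(stamp)
    rand = "".join(secrets.choice(ALPHABET) for _ in range(n_random))
    return stamp + rand + suffix

print(unique_id() + "@example.org")
```

With a nanosecond clock the time takes 11 base-62 characters, which
leaves 15 characters of random data in a 30 character ID.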

> Some of the SHA hash algorithms are in the public domain.  Could they be
> added to Emacs and used for UUID generation here?

We have `secure-hash'.  Is that what you mean?  Or do you mean to use a
proper RFC 4122 UUID?

All we need is for the Message-ID to be unique though, so some ad hoc
solution is probably fine.
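
For comparison, here is roughly what the two alternatives mentioned
above could look like.  Again this is only a Python sketch, using
Python's `uuid' and `hashlib' modules rather than Emacs's `secure-hash';
the function names, the seed argument, and the example.org domain are
placeholders.

```python
import hashlib
import uuid

def uuid_message_id(domain):
    """Message-ID built from a random (version 4) RFC 4122 UUID."""
    # A version 4 UUID is 36 characters long and carries 122 random bits.
    return f"<{uuid.uuid4()}@{domain}>"

def hashed_message_id(seed, domain, length=26):
    """Message-ID built from a truncated SHA-256 hash of SEED (bytes)."""
    # Hex digits carry only 4 bits each, so 26 of them give 104 bits,
    # and the result is only as unique as the seed that goes in.
    digest = hashlib.sha256(seed).hexdigest()
    return f"<{digest[:length]}@{domain}>"

print(uuid_message_id("example.org"))
print(hashed_message_id(b"some unique input", "example.org"))
```

Either approach gives IDs that are longer or less dense than the ad hoc
base-62 scheme for the same amount of randomness.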
