bug-gnu-emacs


bug#51733: 27.1; Detect impossible email addresses better


From: Eli Zaretskii
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Wed, 19 Jan 2022 16:57:38 +0200

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: 51733@debbugs.gnu.org,  jidanni@jidanni.org
> Date: Wed, 19 Jan 2022 15:28:51 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Why? .ru is a top-level domain, it doesn't affect what should be
> > before the dot, I think?
> >
> > If you replace "Сгсе.ru" with "Cгсе.ru", you do get a warning.
> 
> Yes.  But "Сгсе.ru" is a whole-script confusable with "Crce.ru", and is
> therefore suspicious.

OK, but why do you think "Сгсе.ru" is confusable?  The SLD part is
entirely made of single-script characters, and UTS#39 explicitly
allows that:

  [...] it can be perfectly legitimate to have scripts in a SLD
  (second level domain) not be the same as scripts in a TLD (top-level
  domain), such as:

    Cyrillic labels in a domain name with a TLD of .ru or .рф 

That's your case, isn't it?
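
To illustrate the single-script point, here is a minimal Python sketch (not Emacs code, and not what the proposed patch does). It approximates a character's script by the first word of its Unicode name, which is an assumption made for brevity; a real implementation would consult the Unicode Script property. The names `label_scripts` and `is_single_script` are hypothetical.

```python
import unicodedata


def label_scripts(label):
    """Return the set of approximate script names used by the letters in LABEL."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Approximation: take the first word of the character name,
            # e.g. "CYRILLIC" or "LATIN".  This is NOT the real Script property.
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def is_single_script(label):
    """True if all letters of LABEL come from at most one (approximate) script."""
    return len(label_scripts(label)) <= 1


sld_cyrillic = "\u0421\u0433\u0441\u0435"  # the SLD from this thread, all Cyrillic
sld_mixed = "C\u0433\u0441\u0435"          # Latin "C" followed by Cyrillic letters

print(is_single_script(sld_cyrillic))  # SLD label taken on its own
print(is_single_script(sld_mixed))     # mixed Latin and Cyrillic label
```

Checking each label separately, rather than the whole domain, is what lets a Cyrillic SLD coexist with the Latin-script TLD "ru".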

> >> Is that what they mean here?
> >
> > I'm not sure I understand the purpose of finding which scripts
> > "contain a whole-script confusable with a string X".  What are we
> > supposed to do with the resulting list?
> 
> I think this standard was written by somebody with a PhD in Philosophy,
> and not a programmer, so the language is very high falutin'.
> 
> So they're not actually suggesting that a list should be made, but the
> result should be mathematically equivalent with the result of the
> mathematical algorithm described.  I just don't understand what he's
> saying here.

Regardless of what they are saying, I don't think the above is
suitable for production.  I think it should be enough to see whether
there could be confusion with the corresponding ASCII characters from
confusables.txt.
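
A rough sketch of that simpler check, in Python for illustration only. The mapping below is a small hand-picked table that I assume mirrors the corresponding entries in confusables.txt; a real implementation would load the full data file. The names `TO_ASCII` and `ascii_lookalike` are hypothetical.

```python
# Tiny illustrative subset of non-ASCII -> ASCII lookalikes.
# Assumption: these pairs match entries in confusables.txt; the real
# file contains thousands of mappings and should be parsed instead.
TO_ASCII = {
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0433": "r",  # CYRILLIC SMALL LETTER GHE
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
}


def ascii_lookalike(label):
    """Return the ASCII string LABEL could be mistaken for, or None.

    None means either LABEL is already pure ASCII, or at least one of
    its characters has no ASCII lookalike in the table.
    """
    if label.isascii():
        return None
    out = []
    for ch in label:
        if ch.isascii():
            out.append(ch)
        elif ch in TO_ASCII:
            out.append(TO_ASCII[ch])
        else:
            return None
    return "".join(out)


print(ascii_lookalike("\u0421\u0433\u0441\u0435"))  # the SLD discussed above
print(ascii_lookalike("\u0436\u0443\u043a"))        # a label with no full ASCII lookalike
```

Whether a non-None result should produce a warning when the TLD is .ru is the policy question discussed above; the sketch only computes the lookalike.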




