help-libidn
Re: two (bugs? misfeatures?) in libidn


From: Simon Josefsson
Subject: Re: two (bugs? misfeatures?) in libidn
Date: Thu, 16 Aug 2012 22:04:03 +0200
User-agent: Gnus/5.130006 (Ma Gnus v0.6) Emacs/23.3 (gnu/linux)

Jon Nelson <address@hidden> writes:

> On Thu, Aug 2, 2012 at 3:21 PM, Simon Josefsson <address@hidden> wrote:
>> Jon Nelson <address@hidden> writes:
>>
>>> I've encountered two bugs or misfeatures in libidn:
>>
>> Hi!  Thanks for your report.
>>
>>> 1. given an idna-encoded input, it is possible to generate invalid
>>> UTF-8 output (as defined by RFC3629). The UTF-8 is invalid because
>>> codepoints above 0x10FFFF are used.
>>>
>>> See http://tools.ietf.org/html/rfc3629
>>
>> Can you be more concrete, what inputs does this happen for and what
>> output would you expect?  An example would help illustrate the problem.
>
> Example:   echo xn--1234xxxxxxxxxx | idn -u --debug

Thank you.  Interestingly, the punycode code from RFC 3492 happily
decodes the string to Unicode code points > U+10FFFF.  I can't see
anything in RFC 3492 (punycode) or RFC 3490 (IDNA ToUnicode) that
requires checking for code points > U+10FFFF, or that says where such
a check would be done.  Arguably, the final conversion from UCS-4 to
UTF-8 should trigger an error in libidn, but by then the damage is
already done: ToUnicode has returned a sequence of illegal code points.
So it seems ToUnicode should perform this check somewhere, but from
reading RFC 3492 and RFC 3490 I can't find a suitable place for it.
Thoughts?
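
To make the question concrete, here is a rough sketch of the kind of
range check under discussion, applied to the UCS-4 output of a punycode
decoder before anything is converted to UTF-8.  The function names are
hypothetical and are not part of libidn; the surrogate test is an extra
assumption based on RFC 3629, which excludes U+D800..U+DFFF as well.
Where such a check belongs in ToUnicode is exactly the open question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libidn function.
   Returns 1 if cp can be encoded as UTF-8 under RFC 3629, else 0. */
int
ucs4_is_encodable (uint32_t cp)
{
  if (cp > 0x10FFFF)
    return 0;                   /* beyond the Unicode code space */
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;                   /* surrogates are not allowed in UTF-8 */
  return 1;
}

/* Hypothetical helper, not a libidn function.
   Scans len code points; returns 1 if all are encodable.  Otherwise
   returns 0 and stores the index of the first offender in *bad_pos
   (when bad_pos is not NULL). */
int
ucs4_all_encodable (const uint32_t *cps, size_t len, size_t *bad_pos)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (!ucs4_is_encodable (cps[i]))
      {
        if (bad_pos)
          *bad_pos = i;
        return 0;
      }
  return 1;
}
```

A caller could run ucs4_all_encodable on the decoder's output and treat
a zero return as a decoding failure, so that no illegal code points
reach the UCS-4 to UTF-8 conversion.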

/Simon