help-libidn

Libidn2 technology preview


From: Simon Josefsson
Subject: Libidn2 technology preview
Date: Wed, 09 Mar 2011 22:22:42 +0100
User-agent: Gnus/5.110014 (No Gnus v0.14) Emacs/23.2 (gnu/linux)

Hi everyone.

I have an IDNA2008 implementation that is starting to work (it passes
some test vectors for the lookup algorithm), and I want to publish an
early version at this point to get public review of the API.

First I want to explain why this is a separate library rather than an
extension of libidn.so: libidn.so is an IDNA2003 implementation plus
some other stuff.  It is quite big, around 200kb if you optimize it for
size.  That is already a size problem for embedded devices today.  By
having libidn2 be a separate project, libidn won't get larger.  People
who don't need IDNA2003 do not need the libidn baggage, and people who
don't need IDNA2008 do not need the libidn2 baggage.  Further,
IDNA2003 may eventually just go away, replaced by IDNA2008, and at that
point it would be very painful (impossible) to remove the IDNA2003 stuff
from Libidn.  Taken together, this makes me believe that IDNA2008 should
be in a separate shared library.  However, there is no reason the
libidn2.so shared library couldn't be part of the 'GNU Libidn' project
umbrella eventually.  Adding it today would just slow down development
though, since it is still changing significantly both internally and
externally.

The webpage for this project will be:

http://josefsson.org/libidn2/

I have uploaded a GTK-DOC generated API manual at:

http://josefsson.org/libidn2/reference/idn2-idn2.html

In particular the essential API looks like this:

/* IDNA2008 with UTF-8 input. */
extern IDN2_API int
idn2_lookup_u8 (const uint8_t *src, uint8_t **lookupname, int flags);
extern IDN2_API int
idn2_register_u8 (const uint8_t *ulabel, const uint8_t *alabel,
                  uint8_t **insertname, int flags);

/* IDNA2008 with locale encoded inputs. */
extern IDN2_API int
idn2_lookup_ul (const char *src, char **lookupname, int flags);
extern IDN2_API int
idn2_register_ul (const char *ulabel, const char *alabel,
                  char **insertname, int flags);

I want to stress that these interfaces are not final and I want your
input on how to make them better.  There are no ABI guarantees for the
shared library right now.

As you can see, there is one interface for passing in UTF-8 strings and
one for passing in locale encoded strings.  The locale interface will
convert the string to UTF-8 and NFC normalize it.

I'm not sure how useful the idn2_register_ul interface is -- accepting
non-UTF8 and non-NFC inputs to the register process is error prone.  For
the lookup process it is natural.

Note that the "register" interface takes only one label, not an entire
domain name.  This is per the suggested interface in RFC 5891.  I'm not
sure how useful this is -- maybe it should accept an entire domain name.
Thoughts?
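
To make the trade-off concrete: with a per-label register interface, a
caller holding an entire domain name has to split it into labels first.
Below is a minimal sketch of that splitting step, not part of libidn2.
All names in it (sketch_for_each_label, sketch_label_fn, sketch_stats,
sketch_count_label) are hypothetical, and it assumes only the ASCII '.'
separates labels.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one label (not NUL-terminated)
   and its length; returns 0 to continue or non-zero to stop. */
typedef int (*sketch_label_fn) (const char *label, size_t len, void *ctx);

/* Hypothetical helper, not part of libidn2.  Splits NAME at each '.'
   and passes every label to FN in order.  Returns 0 if all callbacks
   returned 0, otherwise the first non-zero callback result. */
int
sketch_for_each_label (const char *name, sketch_label_fn fn, void *ctx)
{
  const char *start = name;

  assert (name != NULL && fn != NULL);

  for (;;)
    {
      const char *dot = strchr (start, '.');
      size_t len = dot ? (size_t) (dot - start) : strlen (start);
      int rc = fn (start, len, ctx);

      if (rc != 0)
        return rc;
      if (dot == NULL)
        return 0;
      start = dot + 1;
    }
}

/* Example callback: counts labels and remembers the longest length. */
struct sketch_stats
{
  size_t count;
  size_t longest;
};

int
sketch_count_label (const char *label, size_t len, void *ctx)
{
  struct sketch_stats *s = ctx;

  (void) label;
  s->count++;
  if (len > s->longest)
    s->longest = len;
  return 0;
}
```

A real caller would replace the counting callback with one that hands
each label to the register function and joins the results.  Note that in
this sketch a trailing dot produces an empty final label, which the
callback would have to handle.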

Possibly there should be a way to pass a pre-allocated buffer and let
the function populate the buffer with the output domain name, instead of
forcing callers to copy the newly allocated name into its proper place.
My proposal on how to achieve this is to let the code inspect the
*lookupname or *insertname value and if that is non-NULL, then the
output is copied into that buffer location rather than allocating a new
buffer.  Of course, the buffer must have room for 256 bytes (255
characters + 1 NUL), which is the size of the largest possible domain
name.  I'm not sure a size parameter is needed: 256 bytes is such a
small buffer anyway that the caller could be required to always
allocate a 256-byte buffer.
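
The convention proposed above could look roughly like the following
sketch.  It is not libidn2 code: sketch_deliver_name and
SKETCH_NAME_BUFSIZE are hypothetical names, and the function only shows
the output step (copy into the caller's buffer if one was supplied,
otherwise allocate), not any IDNA processing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant: 255 characters plus the terminating NUL. */
#define SKETCH_NAME_BUFSIZE 256

/* Hypothetical helper, not part of libidn2.  Delivers RESULT to the
   caller following the proposed convention:
     - if *OUT is NULL, allocate a new buffer and store it in *OUT;
     - if *OUT is non-NULL, copy into the caller's buffer, which is
       assumed to hold at least SKETCH_NAME_BUFSIZE bytes.
   Returns 0 on success, -1 if RESULT is too long or malloc fails. */
int
sketch_deliver_name (const char *result, char **out)
{
  size_t len;

  assert (result != NULL && out != NULL);

  len = strlen (result);
  if (len >= SKETCH_NAME_BUFSIZE)
    return -1;

  if (*out == NULL)
    {
      char *fresh = malloc (len + 1);

      if (fresh == NULL)
        return -1;
      *out = fresh;
    }

  /* Copy the name including its terminating NUL. */
  memcpy (*out, result, len + 1);
  return 0;
}
```

One consequence of this design is that callers who want a freshly
allocated result must remember to initialize the pointer to NULL before
the call; an uninitialized pointer would be treated as a caller-supplied
buffer.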

Archives of the actual implementation will be available at:
http://josefsson.org/libidn2/releases/

This work is sponsored by DENIC.  If you know others who are interested
in supporting this effort, please let me know!

/Simon


