Re: [bug-gnulib] iconv made easy


From: Paul Eggert
Subject: Re: [bug-gnulib] iconv made easy
Date: Mon, 13 Dec 2004 10:46:28 -0800
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (gnu/linux)

Bruno Haible <address@hidden> writes:

>> I know the function doesn't handle embedded ASCII #0
>
> iconv() handles NUL bytes correctly; you don't need to handle them specially.

I think he was aiming for convenience at the expense of generality.
But personally I'm not sure it's worth it in this case; the caller can
simply specify a length of strlen(string)+1.

One approach that I've used in other APIs is for the caller to pass a
length of (size_t) -1, i.e. SIZE_MAX, to denote the "length" of an
argument that is actually a null-terminated string.  That way, the API
allows for embedded nulls, but it's still convenient/cheap for the
user to pass a null-terminated string of unknown length.  This
approach is a win if the callee is already computing the length
somehow -- it saves a strlen.

Here, the strlen cannot be avoided, but perhaps the length=-1 approach
is still convenient enough to achieve Simon's goal of convenience.
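
For concreteness, here is a minimal sketch of the SIZE_MAX convention.
The function count_byte is hypothetical and is not part of any
proposed gnulib API; it only shows a callee that scans its input
anyway, so treating SIZE_MAX as "stop at the first NUL" costs no
separate strlen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example function: count occurrences of C in S.
   If LEN is SIZE_MAX, S is treated as a null-terminated string and
   scanning stops at the first NUL.  Otherwise exactly LEN bytes are
   examined, so embedded NULs are allowed.  */
size_t
count_byte (const char *s, size_t len, char c)
{
  assert (s != NULL);
  size_t count = 0;
  for (size_t i = 0; i < len; i++)
    {
      /* The terminator check happens inside the scan the callee is
         doing anyway, so no separate strlen is needed.  */
      if (len == SIZE_MAX && s[i] == '\0')
        break;
      if (s[i] == c)
        count++;
    }
  return count;
}
```

A caller with an ordinary C string would write
count_byte (str, SIZE_MAX, ','), while a caller holding a buffer with
embedded NULs passes the real byte count.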


> You will notice that there are two approaches to converting a string:
> a) allocate an initial buffer and extend it as needed, stopping and
>    restarting iconv() each time a realloc is needed,
> b) call iconv() once to determine the length and then once again for
>    filling the result string.

Good point.  Generally speaking, in GNU code we are willing to trade
space for speed (within reason, of course) so I guess (a) would be
preferable, typically.
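
As a rough illustration of approach (a), the sketch below grows the
output buffer whenever iconv() reports E2BIG and then resumes the
conversion where it stopped.  The name convert_growing, the 16-byte
starting size, and the doubling policy are all assumptions made for
the example, not anything taken from the code under discussion.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: convert SRCLEN bytes at SRC using the already
   opened descriptor CD.  Returns a malloc'd buffer and stores its
   length in *DSTLEN, or returns NULL with errno set on failure.  */
char *
convert_growing (iconv_t cd, const char *src, size_t srclen, size_t *dstlen)
{
  assert (src != NULL && dstlen != NULL);
  size_t bufsize = 16;          /* arbitrary initial guess */
  char *buf = malloc (bufsize);
  if (buf == NULL)
    return NULL;

  char *in = (char *) src;      /* iconv() does not modify the input */
  size_t inleft = srclen;
  size_t used = 0;
  int flushing = 0;             /* nonzero once all input is consumed */

  for (;;)
    {
      char *out = buf + used;
      size_t outleft = bufsize - used;
      size_t r = (flushing
                  ? iconv (cd, NULL, NULL, &out, &outleft)
                  : iconv (cd, &in, &inleft, &out, &outleft));
      used = out - buf;         /* keep whatever progress was made */

      if (r != (size_t) -1)
        {
          if (flushing)
            break;
          flushing = 1;         /* now emit any final shift sequence */
          continue;
        }
      if (errno != E2BIG)
        {
          int saved = errno;    /* invalid or incomplete input */
          free (buf);
          errno = saved;
          return NULL;
        }
      if (bufsize > SIZE_MAX / 2)
        {
          free (buf);
          errno = ENOMEM;
          return NULL;
        }
      /* Output buffer full: double it and restart iconv().  */
      char *bigger = realloc (buf, bufsize * 2);
      if (bigger == NULL)
        {
          free (buf);
          errno = ENOMEM;
          return NULL;
        }
      buf = bigger;
      bufsize *= 2;
    }

  *dstlen = used;
  return buf;
}
```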

But in this case isn't there a third option that is even faster?
Something like this:

  c) Use MB_LEN_MAX to calculate an upper bound for the size of the
     output buffer (from the input buffer size).  Allocate a buffer of
     that size, invoke iconv(), and then realloc the buffer once
     iconv() finishes and you know the correct size.

This is nearly as simple as (b).  Overall, I'd expect it to be faster
than either (a) or (b), assuming a decent memory allocator.
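
To make (c) concrete, here is a minimal sketch.  The name convert_once
and its signature are hypothetical.  It also assumes that
srclen * MB_LEN_MAX really is an upper bound on the output size; that
may not hold for every target encoding (stateful ones with long shift
sequences, say), so the sketch simply fails if iconv() reports E2BIG
rather than trying to grow the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: convert SRCLEN bytes at SRC from encoding FROM
   to encoding TO with a single allocation and a final shrink.
   Returns a malloc'd buffer and stores its length in *DSTLEN, or
   returns NULL with errno set on failure.  */
char *
convert_once (const char *from, const char *to,
              const char *src, size_t srclen, size_t *dstlen)
{
  assert (src != NULL && dstlen != NULL);
  if (srclen > SIZE_MAX / MB_LEN_MAX)
    {
      errno = ENOMEM;           /* the size computation would overflow */
      return NULL;
    }
  /* Assumed upper bound: at most MB_LEN_MAX output bytes per input byte.  */
  size_t bufsize = srclen * MB_LEN_MAX;

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return NULL;

  char *buf = malloc (bufsize ? bufsize : 1);
  if (buf == NULL)
    {
      iconv_close (cd);
      errno = ENOMEM;
      return NULL;
    }

  char *in = (char *) src;      /* iconv() does not modify the input */
  size_t inleft = srclen;
  char *out = buf;
  size_t outleft = bufsize;

  size_t r = iconv (cd, &in, &inleft, &out, &outleft);
  if (r != (size_t) -1)
    r = iconv (cd, NULL, NULL, &out, &outleft);   /* flush shift state */
  int saved = errno;
  iconv_close (cd);

  if (r == (size_t) -1)
    {
      free (buf);
      errno = saved;            /* includes E2BIG if the bound was wrong */
      return NULL;
    }

  /* Now the correct size is known: shrink the buffer once.  */
  size_t used = out - buf;
  char *shrunk = realloc (buf, used ? used : 1);
  *dstlen = used;
  return shrunk ? shrunk : buf;
}
```

The trade is a temporarily oversized allocation in exchange for never
restarting iconv() and never running the conversion twice.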



