bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#8545: issues with recent doprnt-related changes


From: Eli Zaretskii
Subject: bug#8545: issues with recent doprnt-related changes
Date: Mon, 25 Apr 2011 12:00:44 +0300

> Date: Sun, 24 Apr 2011 22:46:33 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: Eli Zaretskii <eliz@gnu.org>
> 
> This is a followup to Bug#8435.  Eli invited me to review the recent
> doprnt-related changes, so here's a quick review:

Thanks for the review and the comments.

> * doprnt returns size_t.  But Stefan wrote that he prefers sizes to be
>   signed values, and doprnt always returns a value that can fit in
>   EMACS_INT.  So shouldn't doprnt return EMACS_INT, as it did before?

I made it return size_t because all the related variables in verror
are size_t, and I didn't want to mix signed with unsigned.  AFAIU, the
preference to use signed is for those values that come from Lisp or go
back to the Lisp level, which is not the case here.

But I will let Stefan comment on this.  Changing doprnt to return a
signed value, and making the respective changes in verror, would be
trivial, and I won't mind doing that, if that's the verdict.

> * doprnt supports only a small subset of the standard printf formats,
>   but this subset is not documented.  It's unclear what the subset is.
>   Or it's a superset of the subset, with %S and %l?  Anyway, this
>   should be documented clearly in the lead comment.

I added such a documentation.

> * I suggest that every feature in doprnt be a feature that is actually
>   needed and used; this will simplify maintainance.

I agree, but I didn't add any features, except the support for %ld,
which is surely needed for error messages that show EMACS_INT values.
All the rest was already there in the original code of doprnt.  I see
no reason to remove that code just because no error message currently
uses some of those features.  According to "bzr annotate", most of the
related code in doprnt survived largely untouched since the 1990s,
except some cleanup.  This is no guarantee of being bug-free, of
course, but it does have _some_ weight in my eyes.

> * Format strings never include embedded null bytes, so there's
>   no need for doprnt to support that.

Potentially, someone could call `error' with its first argument taken
from a Lisp string, which could include null characters.  But again,
this feature was there to begin with, and I see no particular need to
remove it.

> * If the format string is too long, the alloca inside doprnt will
>   crash Emacs on some hosts.

You are right.  I modified doprnt to use SAFE_ALLOCA instead.

>                               I suggest removing the alloca,
>   instituting a fixed size limit on format specifiers, and documenting
>   that limit.  Since user code cannot ever supply one of these
>   formats, that should be good enough.

GNU coding standards frown on arbitrary limits, so I didn't want to
take that route, what with SAFE_ALLOCA readily available and easy to
use.

> * The width features of doprnt (e.g., %25s) are never used.

Again, an old feature that I see no reasons to remove.  And, since
doprnt produces error messages meant to be displayed, I find that
this feature actually makes sense.

>   - doprnt mishandles format specifications such as %0.0.0d.
>     It passes them off to printf, and this results in undefined
>     behavior, near as I can tell.

Since both error and verror are now marked as ATTRIBUTE_FORMAT_PRINTF,
the compiler will detect such invalid formats and flag them.  If the
warning is disregarded, the result of such a format is just a somewhat
illegible message.  In any case, vsnprintf would do the same, right?

>   - doprnt uses atoi (&fmtcpy[1]), but surely this isn't right if
>     there are flags such as '-'.

Why not?  In that case, atoi will produce a negative value for
`width', which is already handled by the code.  If I'm missing
something, please point out the specific problems with that.

>   - Quite possibly there are other problems in this area, but I
>     didn't want to spend further time reviewing a never-used feature.

I did read that code.  It looked solid to me, but if you or someone
else see specific problems, please point them out.

> * In this code, in verror:
> 
>       used = doprnt (buffer, size, m, m + mlen, ap);
> 
>       /* Note: the -1 below is because `doprnt' returns the number of bytes   
>   
>          excluding the terminating null byte, and it always terminates with a 
>   
>          null byte, even when producing a truncated message.  */
>       if (used < size - 1)
>         break;
> 
>   I don't see the reason for the "- 1".  If you replace this with:
> 
>       used = doprnt (buffer, size, m, m + mlen, ap);
> 
>       if (used < size)
>         break;
> 
>   the code should still work, because, when used < size, the buffer
>   should be properly null-terminated.  If it isn't then there's something
>   wrong with doprnt, no?

As the comment says, doprnt always null-terminates the result, even if
the result is truncated, and it never returns a value larger than the
buffer size it was given.  (In that, it differs from vsnprintf, which
can return larger values.)  When doprnt does truncate the output
string, it returns `size - 1'; if we compare against `size', we will
happily bail out of the loop, and never try to enlarge the buffer.

I saw no reason to enhance doprnt to continue processing the format
string and the arguments once the buffer is exhausted.  So I modified
verror instead to DTRT.

> * In this code, in verror:
> 
>       else if (size < size_max - 1)
>         size = size_max - 1;
> 
>   there's no need for the "- 1"s.  Just use this:
> 
>       else if (size < size_max)
>         size = size_max;

I made that change, thanks.

The reason I originally limited to `size_max - 1' is that the games
you play with the maximum size, viz.:

  size_t size_max =
    min (MOST_POSITIVE_FIXNUM, min (INT_MAX, SIZE_MAX - 1)) + 1;

are neither clear nor commented.  E.g., why the second `min'? could
INT_MAX be ever larger than SIZE_MAX-1? if so, what does that mean in
terms of relation between `int' and `size_t' on such a platform?

I'm only familiar with a very small number of architectures, where all
these tricks are unnecessary.  When I see such code, it makes me dizzy
and unsure of what I may be missing.  So I opted for a safer way out
(since error messages as long as SIZE_MAX are only theoretically
possible), that would not risk overflowing signed values into the sign
bit.  Perhaps in the future you could comment such obscure code to
make it understandable by mere mortals such as myself.

> * This code in verror:
> 
>       if (buffer == buf)
>         buffer = (char *) xmalloc (size);
>       else
>         buffer = (char *) xrealloc (buffer, size);
> 
>   uses xrealloc, which is unnecessarily expensive, as it may copy the
>   buffer's contents even though they are entirely garbage here.  Use
>   this instead, to avoid the useless copy:
> 
>       if (buffer != buf)
>         xfree (buffer);
>       buffer = (char *) xmalloc (size);

You are right, I made that change.

I believe this takes care of all the imminent problems you discovered
in your review.  Nevertheless, I will leave this bug report open for a
while, to allow you and others to come up with more problems and
suggestions for improvements.

Thanks again for taking the time to review and comment.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]