Re: GSFromUnicode stack trashing bug
From: Richard Frith-Macdonald
Subject: Re: GSFromUnicode stack trashing bug
Date: Wed, 18 Jan 2006 15:17:28 +0000
On 18 Jan 2006, at 11:16, Wim Oudshoorn wrote:
There is a bug in the function GSFromUnicode in the file Unicode.m.
This bug can corrupt the stack. It is a little tricky to explain,
so bear with me :-)
First look at the code fragment
    while (dpos + sl >= bsize)
      {
        GROW ();
      }
    if (sl == 1)
      {
        ptr [dpos++] = u & 0x7f;
      }
This occurs around Unicode.m:1836.
Here
    dpos  = the current index into the destination buffer
    ptr   = the base pointer of the destination buffer
    sl    = the length of the UTF-8 encoded character that needs to be
            written to the destination buffer
    bsize = the length of the destination buffer
The check in the while condition is off by one, look at the
following example:
dpos = 0,
sl = 1,
bsize = 1,
So we have a buffer of length 1, and the character will be written
at index 0.
So there is enough space. However, we will still grow the buffer.
(No, this is not needed to accommodate the terminating null character.)
Of course this is not very serious, you just grow the buffer too soon.
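The boundary case above can be checked in isolation. The following is
only a sketch: the first function restates the quoted while condition,
and the second is a hypothetical stricter comparison for contrast (it is
not taken from Unicode.m, and both function names are made up here).

```c
#include <assert.h>
#include <stddef.h>

/* The check as quoted from Unicode.m: nonzero means GROW () would run. */
int grows_as_written(size_t dpos, size_t sl, size_t bsize)
{
    return dpos + sl >= bsize;
}

/* Hypothetical stricter check: grow only when the last byte of the
 * character, at index dpos + sl - 1, would land past the end. */
int grows_if_strict(size_t dpos, size_t sl, size_t bsize)
{
    assert(sl > 0);   /* an encoded character occupies at least one byte */
    return dpos + sl > bsize;
}
```

With dpos = 0, sl = 1, bsize = 1 the quoted form asks for a grow even
though index 0 is inside the buffer, while the stricter form does not.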
That would be true if the GROW macro always did what you expect.
Let's look at the GROW macro:
    if (dst == 0) \
      { \
        /* \
         * Data is just being discarded anyway, so we can \
         * adjust the offset into the local buffer on the \
         * stack and pretend the buffer has grown. \
         */ \
        if (extra == 0) \
          { \
            ptr -= BUFSIZ; \
            bsize += BUFSIZ; \
          } \
        else \
          { \
            ptr -= BUFSIZ-1; \
            bsize += BUFSIZ-1; \
          } \
      } \
    else if (zone == 0) \
    ....
Here you see that if dst == 0 the result is discarded. Instead of
skipping the conversion, the function simply reuses a fixed buffer and
cycles through it.
So in our case above, assume in addition that
extra = 0
BUFSIZ = 1
I assumed you meant bsize rather than BUFSIZ, since BUFSIZ is a
constant (generally a multiple of 1024)
Now you will see that GROW has the disastrous effect of
FIRST subtracting 1 from ptr before the unicode character is written
to that address. So you write BEFORE the beginning of the buffer
allocated on the stack.
This already happens when the encoded length of the character is 1.
If the encoded length is longer, the problem is not as easily fixed
as just fixing the check in the while loop.
Well, at first I accepted this analysis, but now I'm not so sure...
You say:
The check in the while condition is off by one, look at the
following example:
dpos = 0,
sl = 1,
bsize = 1,
However, it looks like that is an impossible example, since the code
says
    if (dst == 0 || *size == 0)
      {
        ptr = buf;
        bsize = (extra != 0) ? BUFSIZ - 1 : BUFSIZ;
      }
So for the case we are looking at (where dst is 0) it would appear
that bsize cannot be less than BUFSIZ-1 (at least 1023 on all systems
I know of).
This would seem to imply that for the problem to occur the character
length (sl) must be at least 1023, which should never be the case.
Am I missing something here?
Do you have any code to demonstrate the problem actually happening?