GSFromUnicode stack trashing bug
From: Wim Oudshoorn
Subject: GSFromUnicode stack trashing bug
Date: Wed, 18 Jan 2006 12:16:49 +0100
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/22.0.50 (darwin)
There is a bug in the function GSFromUnicode in the file Unicode.m.
This bug can corrupt the stack. It is a little tricky to explain,
so bear with me :-)
First look at the code fragment
    while (dpos + sl >= bsize)
      {
        GROW ();
      }
    if (sl == 1)
      {
        ptr [dpos++] = u & 0x7f;
      }
This occurs around Unicode.m:1836.
Here
    dpos  = the index into the destination buffer
    ptr   = the base pointer of the destination buffer
    sl    = the length of the UTF-8 encoded character that needs to be
            written to the destination buffer
    bsize = the length of the destination buffer
The check in the while condition is off by one. Consider the following example:
    dpos = 0,
    sl = 1,
    bsize = 1,
So we have a buffer of length 1, and the character will be written at index 0.
There is enough space, yet we will still grow the buffer.
(No, this is not needed to accommodate the terminating null character.)
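
To make the off-by-one concrete, here is a minimal sketch of the two
comparisons. The helper names are hypothetical and do not exist in Unicode.m;
they only restate the condition from the fragment above next to a stricter
variant.

```c
/* Hypothetical helpers, not part of Unicode.m. */

/* The check as it stands in the fragment: dpos + sl >= bsize. */
int needs_grow_current(unsigned dpos, unsigned sl, unsigned bsize)
{
    return dpos + sl >= bsize;
}

/* Stricter variant: grow only if the sl bytes really do not fit. */
int needs_grow_strict(unsigned dpos, unsigned sl, unsigned bsize)
{
    return dpos + sl > bsize;
}
```

With dpos = 0, sl = 1, bsize = 1 the first function asks for a grow although
the byte fits, while the second does not.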
Of course this is not very serious, you just grow the buffer too soon.
That would be true if the GROW macro always did what you expect. Let's look
at the GROW macro:
    if (dst == 0)                                                 \
      {                                                           \
        /*                                                        \
         * Data is just being discarded anyway, so we can         \
         * adjust the offset into the local buffer on the         \
         * stack and pretend the buffer has grown.                \
         */                                                       \
        if (extra == 0)                                           \
          {                                                       \
            ptr -= BUFSIZ;                                        \
            bsize += BUFSIZ;                                      \
          }                                                       \
        else                                                      \
          {                                                       \
            ptr -= BUFSIZ-1;                                      \
            bsize += BUFSIZ-1;                                    \
          }                                                       \
      }                                                           \
    else if (zone == 0)                                           \
    ....
Here you see that if dst == 0 the result is discarded. Instead of
skipping the conversion, the function in question just reuses a fixed
buffer and cycles through it.
So in our case above, assume in addition that
    extra = 0
    BUFSIZ = 1
Now you will see that GROW has the disastrous effect of FIRST subtracting 1
from ptr, after which the Unicode character is written at ptr [dpos], which is
now one byte lower than intended. So you write BEFORE the beginning of the
buffer allocated on the stack.
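
The arithmetic can be checked with a small model. This is only a sketch under
assumptions: it covers just the dst == 0, extra == 0 branch of GROW, it tracks
ptr as a signed offset from the start of the stack buffer instead of using a
real pointer, and the function name and the `strict` flag are made up for
illustration.

```c
/*
 * Hypothetical model of the dst == 0, extra == 0 branch of GROW.
 * Returns the offset, relative to the start of the stack buffer,
 * at which the first byte would be written. A negative result
 * means the write lands before the buffer.
 * strict == 0 uses the existing check (>=), strict != 0 uses (>).
 * bufsiz must be greater than 0, otherwise the loop never ends.
 */
long first_write_offset(long dpos, long sl, long bsize, long bufsiz, int strict)
{
    long ptr_off = 0;   /* models ptr minus the buffer start */

    while (strict ? (dpos + sl > bsize) : (dpos + sl >= bsize))
      {
        ptr_off -= bufsiz;   /* ptr -= BUFSIZ;   */
        bsize += bufsiz;     /* bsize += BUFSIZ; */
      }
    return ptr_off + dpos;   /* where ptr [dpos] ends up */
}
```

For the example above, first_write_offset (0, 1, 1, 1, 0) gives -1. The same
happens with a larger buffer whenever the character would exactly fill it, for
instance first_write_offset (7, 1, 8, 8, 0).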
This already happens when the encoding length of the character is 1. If the
encoding length is longer, the problem is not as easily fixed as just
correcting the check in the while loop.
I guess that the best way of fixing this is to let go of the GROW dst == 0
hack and just write macros that append a byte to the destination buffer.
But if someone has other opinions, please let me know.
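
As a rough illustration of the append-a-byte idea, here is a sketch written as
a function rather than a macro. Everything in it (the sink_t type, its fields,
sink_append) is hypothetical and not taken from Unicode.m; it only shows that
a per-byte check needs no pointer adjustment when the output is discarded.

```c
#include <stddef.h>

/* Hypothetical output sink; names are illustrative only. */
typedef struct {
    unsigned char *buf;   /* NULL means: discard the output, only count it */
    size_t         cap;   /* capacity of buf in bytes */
    size_t         len;   /* number of bytes produced so far */
} sink_t;

/*
 * Append one byte. Returns 1 on success, or 0 if a real buffer is full,
 * in which case the caller has to grow the buffer and try again.
 */
int sink_append(sink_t *s, unsigned char b)
{
    if (s->buf != NULL)
      {
        if (s->len >= s->cap)
          {
            return 0;
          }
        s->buf[s->len] = b;
      }
    s->len++;
    return 1;
}
```

Because the bounds check is made per byte, a multi-byte encoding would be
handled by calling sink_append once for each byte, and in the discard case
nothing is written at all.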
Wim Oudshoorn.