bug-gnu-emacs

bug#58042: 29.0.50; ASAN use-after-free in re_match_2_internal


From: Eli Zaretskii
Subject: bug#58042: 29.0.50; ASAN use-after-free in re_match_2_internal
Date: Wed, 05 Oct 2022 10:22:34 +0300

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 58042@debbugs.gnu.org
> Date: Wed, 05 Oct 2022 08:58:51 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > The question that we should try answering is this: what variable holds
> > the C pointer to the data of a Lisp string that is being relocated
> > and/or compacted by GC between the time the C pointer is assigned and
> > the time its value is dereferenced?
> 
> I think we can answer that question, at least with a good probability.
> If you look at what the offending (I think) pointer points to:
> 
> frame #5: 0x0000000100582044 emacs`re_match_2_internal(bufp=0x000000010111ace8, string1=0x0000000000000000, size1=0, string2="/Users/gerd/.config/emacs.d.default/elpa/magit-section-20220901.331/puny.dylib", size2=78, pos=0, regs=0x0000000000000000, stop=78) at regex-emacs.c:4328:15
>    4325                   DEBUG_PRINT ("EXECUTING anychar.\n");
>    4326       
>    4327                   PREFETCH ();
> -> 4328                   buf_ch = RE_STRING_CHAR_AND_LENGTH (d, buf_charlen,
>    4329                                                       target_multibyte);
>    4330                   buf_ch = TRANSLATE (buf_ch);
>    4331                   if (buf_ch == '\n')
> (lldb) p d
> (re_char *) $285 = 0x000000011f90d0a1 "magit-section-20220901.331/puny.dylib"
> 
> That looks like part of the filename here:
> 
> frame #10: 0x0000000100503cf4 emacs`Ffind_file_name_handler(filename=(struct Lisp_String *) $318 = 0x000000011f6ec4c0, operation=(struct Lisp_Symbol *) $321 = 0x00000001010ec310) at fileio.c:324:24
>    321                    operations = Fget (handler, Qoperations);
>    322        
>    323                  if (STRINGP (string)
> -> 324                      && (match_pos = fast_string_match (string, filename)) > pos
>    325                      && (NILP (operations) || ! NILP (Fmemq (operation, operations))))
>    326                    {
>    327                      Lisp_Object tem;
> (lldb) p filename
> (Lisp_Object) $322 = 0x000000011f6ec4c4 (struct Lisp_String *) $324 = 0x000000011f6ec4c0
> (lldb) p *$324
> (struct Lisp_String) $325 = {
>   u = {
>     s = {
>       size = 78
>       size_byte = -1
>       intervals = NULL
>       data = 0x000000011f5d2f38 "/Users/gerd/.config/emacs.d.default/elpa/magit-section-20220901.331/puny.dylib"
>     }
>     next = 0x000000000000004e
>     gcaligned = 'N'
>   }
> }
> 
> So, I'd say that the filename string data has been moved somewhere else
> during compaction.  Which would mean GC somehow ran between the point
> where "d" in frame#5 was initially set up from the filename, and line
> 4328 where the problem is detected.
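
The address arithmetic behind that conclusion can be checked with a small
sketch.  The two addresses and the two strings are copied from the lldb
output above; everything else is an assumption made for illustration.  In
particular, the helper names are hypothetical, and the idea that "d" once
pointed into this same filename is only inferred from the matching suffix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the lldb output quoted above.  */
#define D_ADDR        UINT64_C(0x000000011f90d0a1)  /* `d' in frame #5 */
#define CUR_DATA_ADDR UINT64_C(0x000000011f5d2f38)  /* filename data in frame #10 */

static const char full_name[] =
  "/Users/gerd/.config/emacs.d.default/elpa/magit-section-20220901.331/puny.dylib";
static const char d_text[] = "magit-section-20220901.331/puny.dylib";

/* Offset of d's text within the full filename, or -1 if it is not a suffix.  */
static long
suffix_offset (void)
{
  size_t full = strlen (full_name), tail = strlen (d_text);
  if (tail > full || strcmp (full_name + (full - tail), d_text) != 0)
    return -1;
  return (long) (full - tail);
}

/* ASSUMPTION: d pointed into a copy of the same string, so the base of
   that copy would be d minus the suffix offset.  */
static uint64_t
inferred_old_base (void)
{
  return D_ADDR - (uint64_t) suffix_offset ();
}

/* Nonzero if d lies inside the string data as frame #10 currently sees it.  */
static int
d_in_current_data (void)
{
  return D_ADDR >= CUR_DATA_ADDR
	 && D_ADDR <= CUR_DATA_ADDR + strlen (full_name);
}
```

Under that assumption, "d" sits 41 bytes into a copy of the filename based
at 0x000000011f90d078, which is outside the 78 bytes starting at the
current data address 0x000000011f5d2f38.  That is consistent with the data
having been moved, but it does not say what moved it.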

That part is clear, but the "GC somehow ran" part is not, and that is
the part which we must understand to fix the problem.  The filename's
SSDATA is passed to re_search as a C string, under the assumption that
GC cannot happen while re_search runs.  If that assumption is false,
we need to understand exactly how and in what cases, because without
that there's nothing we can do -- regex-emacs.c code deals explicitly
only with C strings.

IOW, this isn't a case like

  char *ptr = SSDATA (lisp_string);
  ...
  dereference (ptr);

where GC can happen as part of "...".  Those cases are easy to fix.
But this is not that case.
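
For reference, the "easy" case above can be modeled outside Emacs.  The
following is only a toy sketch: toy_heap, struct toy_string, toy_make and
toy_compact are hypothetical names, and the "compaction" is a plain
memmove inside a static array, not Emacs's allocator.  It shows a cached
raw pointer going stale across a relocation, and the usual remedy of
keeping an offset and re-deriving the pointer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: NOT Emacs's string allocator.  */
static char toy_heap[64];

struct toy_string { char *data; size_t size; };

static void
toy_make (struct toy_string *s, const char *text, size_t offset)
{
  s->size = strlen (text);
  assert (offset + s->size + 1 <= sizeof toy_heap);
  s->data = toy_heap + offset;
  memcpy (s->data, text, s->size + 1);
}

/* Stand-in for compaction: move the data to NEW_OFFSET and poison the
   old bytes.  The old and new regions must not overlap.  */
static void
toy_compact (struct toy_string *s, size_t new_offset)
{
  char *old = s->data;
  assert (new_offset + s->size + 1 <= sizeof toy_heap);
  memmove (toy_heap + new_offset, old, s->size + 1);
  memset (old, '#', s->size);	/* the old NUL terminator stays in place */
  s->data = toy_heap + new_offset;
}

/* Returns 1 if the cached pointer went stale while the offset-based
   pointer, re-derived after the move, still sees the right text.  */
static int
toy_demo (void)
{
  struct toy_string s;
  toy_make (&s, "puny.dylib", 32);

  char *ptr = s.data + 5;		/* like ptr = SSDATA (lisp_string) + 5 */
  ptrdiff_t off = ptr - s.data;		/* relocation-safe alternative */

  toy_compact (&s, 0);			/* the "..." where GC may run */

  int stale = strcmp (ptr, "dylib") != 0;
  int fresh_ok = strcmp (s.data + off, "dylib") == 0;
  return stale && fresh_ok;
}
```

The toy keeps the stale pointer inside valid memory so the read is well
defined; in the real failure the old block has been freed, which is what
ASAN reports.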

> > I don't see how to answer
> > that question without understanding how redisplay was called in the
> > middle of what seems to be loading of a Lisp package, because none of
> > the items 1 and 3 show anything that could call redisplay.
> 
> What I can see is that, apparently, redisplay got called because Emacs
> received a macOS event, and did a prepare_menu_bars etc etc.

You mean, a macOS event can be received asynchronously, and will
interrupt some processing in C, like inside regex-emacs.c?  If that
can happen, no code in Emacs is safe, ever.  I don't believe this is
possible: we no longer process window-system events asynchronously,
AFAIK, and for this very reason.  But maybe macOS is different?  In
that case, either we should change the macOS code to avoid doing that,
or we should have some means of blocking such "interrupts" around
specific code fragments, akin to block_input.
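
To make the second alternative concrete, here is a toy sketch of deferring
event handling around a critical section.  It is only loosely modeled on
the idea of block_input; all names are hypothetical, and it makes no claim
about how Emacs or the macOS port actually implement this.

```c
#include <assert.h>

/* Toy model only; hypothetical names, not Emacs's implementation.  */
static int toy_block_depth;	/* nesting level of blocked sections */
static int toy_pending_events;	/* events queued while blocked */
static int toy_handled_events;	/* events actually processed */

static void
toy_handle_pending (void)
{
  toy_handled_events += toy_pending_events;
  toy_pending_events = 0;
}

static void
toy_block_events (void)
{
  toy_block_depth++;
}

static void
toy_unblock_events (void)
{
  assert (toy_block_depth > 0);
  if (--toy_block_depth == 0)
    toy_handle_pending ();	/* run whatever was deferred */
}

/* Called when the window system delivers an event.  */
static void
toy_event_arrived (void)
{
  toy_pending_events++;
  if (toy_block_depth == 0)
    toy_handle_pending ();
}

/* Returns 1 if an event arriving inside the blocked section was
   deferred until the section ended.  */
static int
toy_critical_section_demo (void)
{
  int before = toy_handled_events;

  toy_block_events ();
  toy_event_arrived ();		/* arrives while raw pointers are live */
  int handled_inside = toy_handled_events - before;
  toy_unblock_events ();

  return handled_inside == 0 && toy_handled_events == before + 1;
}
```

The point of the sketch is only the ordering: nothing that could trigger
redisplay or GC runs between block and unblock, so C pointers held inside
the section stay valid.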

> How that's possible, if it is, while Emacs is in between frame#10 and
> frame#5 I have not the slightest idea.  And please note that this is all
> happening in the same thread T0, according to ASAN.

Yes, I've seen that it's the same thread.  Having redisplay run from
another thread would be a larger disaster.

> Maybe someone knowing the Mac port has an idea if this can happen?

I hope so.




