bug-gnu-emacs

bug#41520: 28.0.50; Crash in character.h due to assertion error


From: Eli Zaretskii
Subject: bug#41520: 28.0.50; Crash in character.h due to assertion error
Date: Tue, 26 May 2020 19:17:15 +0300

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 25 May 2020 20:39:06 +0000
> Cc: stefan@marxist.se, 41520@debbugs.gnu.org
> 
> The plan is to introduce additional struct-valued macros for things
> like PT/PT_BYTE:
> 
> #define PT_POS POS (PT, PT_BYTE)
> 
> In particular, it's not an lvalue. That's important to me, since
> assigning to PT_POS would be a severe bug.

So all the places where we now access PT and PT_BYTE separately will
now dereference a struct?

(Btw, PT and PT_BYTE are already non-lvalues, so we have this
covered.)
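
For illustration, a minimal sketch of what a struct-valued, non-lvalue
PT_POS could look like.  The pos_t type, the POS helper, and the
file-level pt/pt_byte variables are stand-ins invented for this
example, not the definitions in the Emacs sources; the "+ 0" is just
one way to make a macro expand to a non-lvalue.

```c
#include <stddef.h>

/* Hypothetical position type: a character position paired with the
   byte position that corresponds to it.  */
typedef struct
{
  ptrdiff_t charpos;
  ptrdiff_t bytepos;
} pos_t;

/* Stand-ins for the current buffer's point (assumed for this sketch).  */
static ptrdiff_t pt = 5;
static ptrdiff_t pt_byte = 7;

/* Adding 0 yields an rvalue, so "PT = 3;" does not compile.  */
#define PT (pt + 0)
#define PT_BYTE (pt_byte + 0)

/* Build a pos_t from its two components.  */
static inline pos_t
POS (ptrdiff_t charpos, ptrdiff_t bytepos)
{
  pos_t p = { charpos, bytepos };
  return p;
}

/* The result of a function call is not an lvalue in C, so neither
   "PT_POS = x;" nor "PT_POS.charpos = 1;" compiles.  */
#define PT_POS POS (PT, PT_BYTE)

/* Reading a member of the temporary is fine.  */
static ptrdiff_t
point_bytepos (void)
{
  return PT_POS.bytepos;
}
```

With this shape, code that only needs one component still reads a
member of a temporary struct, which is the access pattern asked about
above.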

> > Like GPT, for example?
> 
> That's difficult. GPT is, of course, very special.

How so?  It's just like any other buffer position.

> > What about BEGV and ZV?
> 
> BEG, BEGV, ZV, and Z would all have _POS equivalents, and very often
> using them results in more readable code.
> 
> > IOW, I don't understand the goal here.
> 
> There are multiple goals: I think this significantly aids readability,
> and I think there might still be some minor bugs to catch, and future
> bugs to avoid.
> 
> For debug builds only, it might make sense to include the object that
> the bytepos-charpos relation is valid for, to catch cases where one
> object's correspondence is used for another object.
> 
> > I think I did understand when we were talking about accessing
> > characters by buffer positions, and the bugs related to incorrect
> > usage there, but now it sounds like the plot thickens?
> 
> I hope that most code will follow a basic structure of being passed a
> Lisp_Object or two (charpos/marker and object), converting that to a
> pos_t, handing that to internal APIs, potentially receiving a pos_t
> back and converting it back to a Lisp_Object, with only a few lines of
> code deep down the call stacks actually unwrapping the pos_t and
> manipulating it directly. That means there are a few more cases than
> accessing buffer text: comparing two positions, for example, walking a
> buffer or string by incrementing or decrementing them, adding two
> positions or subtracting them.
> 
> (It's true that all kinds of crazy experiments would be easier with
> code that follows this structure, but that's a side effect: the goal
> really is to increase readability a little in a lot of places.)

I cannot judge readability: I'm too accustomed to the current
variables.  But it's clear that this will hurt speed, and in the
innermost loops of our code.  Having to maintain two values, recompute
one from the other, and move them into and out of a structure each adds
overhead, some small, some large.  They will add up.  I don't think I
see how we can justify that, as the current code is not horribly
unreadable.
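
To make the operations under discussion concrete, here is a rough
sketch of comparing, advancing, and subtracting paired positions.
pos_t and every helper name below are hypothetical, and the walk
assumes plain, well-formed UTF-8 rather than Emacs's internal
representation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical paired position: character index plus byte index.  */
typedef struct
{
  ptrdiff_t charpos;
  ptrdiff_t bytepos;
} pos_t;

/* Compare two positions that refer to the same text.  */
static inline bool
pos_lt (pos_t a, pos_t b)
{
  return a.charpos < b.charpos;
}

/* Advance past one character that occupies NBYTES bytes.  */
static inline pos_t
pos_inc (pos_t p, int nbytes)
{
  p.charpos += 1;
  p.bytepos += nbytes;
  return p;
}

/* Distance from B to A, in characters and in bytes.  */
static inline pos_t
pos_diff (pos_t a, pos_t b)
{
  pos_t d = { a.charpos - b.charpos, a.bytepos - b.bytepos };
  return d;
}

/* Length of a UTF-8 sequence, judged from its lead byte
   (assumes well-formed input).  */
static int
utf8_len (unsigned char lead)
{
  return lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
}

/* Walk a NUL-terminated UTF-8 string and return the position of its
   end, keeping both components in step.  */
static pos_t
pos_end_of_string (const char *s)
{
  pos_t p = { 0, 0 };
  while (s[p.bytepos] != '\0')
    p = pos_inc (p, utf8_len ((unsigned char) s[p.bytepos]));
  return p;
}
```

Each step of the walk updates both fields and copies the struct, which
is the kind of per-character work whose cost is in question here.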

Let's see what others think.

>     ch = bidi_fetch_char (cpos += nc, bpos += clen, &disp_pos, &dpp, &bs,
>                   bidi_it->w, fwp, &clen, &nc)
> 
> "nc" and "clen" belong together, and so do cpos and bpos. I find the
> names don't make that very obvious, and simply reducing the number of
> arguments bidi_fetch_char takes by two helps a little.

We can use more descriptive names, that's easy and has zero overhead.
Converting all of our sources to using a struct as positions is
something different.
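
As a rough sketch of the grouping being proposed, a fetch function
could take one paired position and report one paired length instead of
four separate arguments.  This is not the signature of
bidi_fetch_char: fetch_char, pos_t, poslen_t, and pos_add are
hypothetical names, and the decoder assumes well-formed UTF-8.

```c
#include <stddef.h>

/* Hypothetical paired position (cpos and bpos together).  */
typedef struct
{
  ptrdiff_t charpos;
  ptrdiff_t bytepos;
} pos_t;

/* Hypothetical paired length (nc and clen together).  */
typedef struct
{
  ptrdiff_t nchars;
  ptrdiff_t nbytes;
} poslen_t;

/* Advance a position by a paired length.  */
static inline pos_t
pos_add (pos_t p, poslen_t len)
{
  p.charpos += len.nchars;
  p.bytepos += len.nbytes;
  return p;
}

/* Decode the character at POS in TEXT and store how much text it
   covers in *LEN.  Assumes well-formed UTF-8.  */
static int
fetch_char (const unsigned char *text, pos_t pos, poslen_t *len)
{
  unsigned char b = text[pos.bytepos];
  int n = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;

  /* Keep only the payload bits of the lead byte.  */
  int ch = n == 1 ? b : b & (0xFF >> (n + 1));

  /* Each continuation byte contributes six more bits.  */
  for (int i = 1; i < n; i++)
    ch = (ch << 6) | (text[pos.bytepos + i] & 0x3F);

  len->nchars = 1;
  len->nbytes = n;
  return ch;
}
```

A caller would then write something like
"ch = fetch_char (text, pos = pos_add (pos, len), &len);", which pairs
the values by type rather than by naming convention.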
