bug-gnu-emacs

bug#44173: 28.0.50; gdb-mi mangles strings with octal escapes


From: Mattias Engdegård
Subject: bug#44173: 28.0.50; gdb-mi mangles strings with octal escapes
Date: Sat, 24 Oct 2020 20:27:13 +0200

On 24 Oct 2020, at 19:23, Eli Zaretskii <eliz@gnu.org> wrote:

>> If gdb-mi-decode-strings is non-nil, then file names, string contents etc 
>> are properly decoded as UTF-8 as expected
> 
> Not UTF-8, but the value of gdb-mi-decode-strings, if it's a
> coding-system, right?

Right.

> I hoped/thought you intended to solve this issue as well, but if the
> situation is no worse than it was before, it's fine to leave it at
> that.  However, please retain at least part of the comment regarding
> gdb-mi-decode-strings and the ambiguity related to its use, I think
> it's important that people know that.

Yes, the valid parts of the comment will be kept.
I'm not sure what a solution to the remaining problems would look like, but it
would probably involve splitting gdb-mi-decode-strings into separate variables
for file names and program values. On the other hand, given that the world is
converging on UTF-8, it may be a disappearing problem?

In any case, should we want to decode strings differently depending on their 
structural position in the answer, I believe that it would be better done in 
the field accessors instead of the parser. For example,

  (bindat-get-field breakpoint 'fullname)

might become something like

  (gdb-mi--get-string-field breakpoint 'fullname 'filename)

which would tell the accessor how to decode the field.
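
As a rough illustration of that idea, here is a minimal sketch of such an
accessor. Everything in it is an assumption made for the example: the name
`my-gdb-mi-get-string-field', the KIND symbols `filename' and `value', the
choice of coding systems, and the premise that the parser leaves string
fields as raw (unibyte) bytes in an alist. It is not code from gdb-mi.el.

```elisp
;; Hypothetical accessor; not part of gdb-mi.el.
(defun my-gdb-mi-get-string-field (record field kind)
  "Return the string FIELD of alist RECORD, decoded according to KIND.
KIND is the symbol `filename' or `value' (names assumed for this sketch).
Return nil if FIELD is absent.  The stored value is assumed to be a
unibyte string holding the raw bytes sent by the debugger."
  (let ((raw (cdr (assq field record))))
    (and raw
         (decode-coding-string
          raw
          (if (eq kind 'filename)
              ;; Assumption: file names follow the file-name coding system.
              (or file-name-coding-system
                  default-file-name-coding-system
                  'utf-8)
            ;; Assumption: program values are treated as UTF-8.
            'utf-8)))))
```

A call would then look like
(my-gdb-mi-get-string-field breakpoint 'fullname 'filename), so the choice of
decoding stays with the caller, which knows what the field means, rather than
with the parser.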

In the short term I suggest changing the default value of gdb-mi-decode-strings 
to 't' as this gives the behaviour most commonly expected by the user. However, 
it is not critical, and in any case orthogonal to the issue at hand. What do 
you think?

> And I hope you've verified that this does still fix the problem in
> bug#21572, which this variable and the related code tries to fix?

Yes -- I tried debugging programs whose source file names contain Unicode chars 
and they were shown correctly (with gdb-mi-decode-strings = t).

>> +       (t
>> +        (error "Unrecognised escape char: %c" (following-char))))
> 
> How about leaving the text unchanged instead of signaling an error
> (and thus preventing the entire data from getting to the higher
> levels)?

Maybe, but I really dislike hiding bugs by being overly tolerant. It is
precisely this tolerant nature of 'json-read' that caused this bug in the first
place. (I'm not sure whether that leniency is compliant with RFC 8259, by the way.)
I think it's fine to signal errors if the syntax isn't what we expect; after 
all, that is what the JSON parser does in other cases.
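
To make the strict approach concrete, here is a minimal sketch of a decoder
that turns C-style escapes into raw bytes and signals an error on anything it
does not recognise. The function name `my-gdb-mi-unescape', the set of
accepted escapes, and the assumption that the input is plain ASCII are all
choices made for this example; it is not the code from the patch.

```elisp
;; Hypothetical helper; not the code from the patch under discussion.
(defun my-gdb-mi-unescape (str)
  "Return a unibyte string with the escapes in STR replaced by bytes.
STR is assumed to contain only ASCII characters.  Octal escapes of one
to three digits, \\n, \\t, \\\\ and \\\" are accepted; any other escape
signals an error instead of being passed through unchanged."
  (let ((i 0) (len (length str)) (bytes nil))
    (while (< i len)
      (let ((c (aref str i)))
        (if (/= c ?\\)
            (progn (push c bytes) (setq i (1+ i)))
          (setq i (1+ i))               ; skip the backslash
          (when (>= i len) (error "Trailing backslash"))
          (let ((e (aref str i)))
            (cond
             ((<= ?0 e ?7)
              ;; Accumulate up to three octal digits into one byte.
              (let ((n 0) (k 0))
                (while (and (< k 3) (< i len) (<= ?0 (aref str i) ?7))
                  (setq n (+ (* n 8) (- (aref str i) ?0))
                        i (1+ i)
                        k (1+ k)))
                (push (logand n 255) bytes)))
             ((eq e ?n) (push ?\n bytes) (setq i (1+ i)))
             ((eq e ?t) (push ?\t bytes) (setq i (1+ i)))
             ((memq e '(?\\ ?\")) (push e bytes) (setq i (1+ i)))
             (t (error "Unrecognised escape char: %c" e)))))))
    (apply #'unibyte-string (nreverse bytes))))
```

The result is still undecoded bytes; passing it to decode-coding-string with
the appropriate coding system is a separate step, which keeps the escape
handling independent of the decoding question discussed above.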

Thanks for the helpful comments. I'll prepare a proper patch.





