emacs-devel


From: Dmitry Gutov
Subject: Re: [elpa] 02/04: company-clang: handle multibyte chars between bol and point
Date: Fri, 21 Mar 2014 05:47:11 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0

On 20.03.2014 18:11, Eli Zaretskii wrote:

> I needed to look in their sources, but the information there isn't
> clear-cut, either (or maybe I didn't understand the code ;-).  Some
> functions that convert file offsets to columns count bytes from the
> beginning of the line, others count characters, assuming a UTF-8
> encoding.  But since you say the attempt to count characters in
> non-UTF-8 encoding failed, I guess clang needs byte counts of UTF-8
> encoding.

Yes. And from what I've read (http://stackoverflow.com/a/8259610/615245), non-ANSI encoding support was added piecewise, so maybe the relevant code still hasn't settled.

> In any case, please note that UTF-8 and the internal encoding used by
> Emacs are not exactly identical, so IMO you should encode into UTF-8
> and then use 'length' to compute the "column".

This makes sense. I don't think anyone's likely to encounter a source file with characters that are encoded differently between utf-8 and utf-8-emacs, but I guess the latter is unspecced, so it could change in the future.
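
For illustration, here is a minimal sketch of that suggestion: encode the text between the beginning of the line and point as UTF-8, then take 'length' of the result. The name my/utf8-byte-column is hypothetical and not taken from company-clang, and the 1-based column is an assumption about what clang expects.

```elisp
;; Hypothetical helper, for illustration only; not company-clang code.
(defun my/utf8-byte-column ()
  "Return the column of point, counted in UTF-8 bytes from bol.
The result is 1-based, which is an assumption about what clang
expects.  The text is explicitly encoded as UTF-8 first, so the
count does not depend on the internal utf-8-emacs representation."
  (let ((prefix (buffer-substring-no-properties
                 (line-beginning-position) (point))))
    ;; `encode-coding-string' returns a unibyte string, so `length'
    ;; counts bytes here rather than characters.
    (1+ (length (encode-coding-string prefix 'utf-8)))))
```

With point after "int é", the prefix is 5 characters but 6 bytes in UTF-8, so a character-based count and this byte-based count differ by one.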


