From: Eli Zaretskii
Subject: Re: master 6011d39b6a: Fix drag-and-drop of files with multibyte filenames
Date: Sun, 05 Jun 2022 13:31:10 +0300

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Sun, 05 Jun 2022 18:00:10 +0800
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I don't think I understand this change. raw-text basically doesn't do
> > any conversion, except if the text includes raw bytes. Is that the
> > problem here, and if so, how come a file name can include raw bytes in
> > its name?
>
> Encoding it as `raw-text-unix' is to satisfy the requirement in
> xselect.c that strings returned by selection converters must be
> unibyte. IOW, it's the same as
>
> (string-as-unibyte (expand-file-name value))
>
> except that we can't use `string-as-unibyte', because it's obsolete.
Then why not encode in UTF-8, for example?
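
To make the question concrete, here is a small Lisp sketch that encodes the same expanded file name both ways. The function name `my-compare-encodings' is hypothetical and exists only for illustration; it is not part of Emacs. Both calls should yield unibyte strings, which is the requirement in xselect.c mentioned above.

```elisp
;; Sketch only: `my-compare-encodings' is a made-up helper, not Emacs code.
(defun my-compare-encodings (file)
  "Encode FILE with `raw-text-unix' and with `utf-8-unix'.
Return a list (RAW UTF8) of the two encoded strings."
  (let ((name (expand-file-name file)))
    (list
     ;; What the commit under discussion does.
     (encode-coding-string name 'raw-text-unix)
     ;; The alternative asked about here.
     (encode-coding-string name 'utf-8-unix))))

;; Example call with a non-ASCII (Cyrillic) character in the name.
;; Each element of the result can be checked with `multibyte-string-p'.
(my-compare-encodings "/tmp/\u0444.txt")
```
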
> > And what does "Motif expects this to be STRING, but it treats the data
> > as a sequence of bytes instead of a Latin-1 string" mean in this
> > context? The difference between raw bytes and Latin-1 strings is only
> > meaningful to Emacs; how does Motif distinguish between them?
>
> The selection property type STRING means a Latin-1 string, with some
> minor extensions. See this paragraph under "TEXT Properties" in the
> ICCCM:
>
> STRING as a type or a target specifies the ISO Latin-1 character set
> plus the control characters TAB (octal 11) and NEWLINE (octal
> 12). The spacing interpretation of TAB is context dependent. Other
> ASCII control characters are explicitly not included in STRING at the
> present time.
>
> But Motif doesn't comply with the ICCCM meaning of STRING or use the
> generic TEXT type when converting a drag-and-drop selection to
> FILE_NAME. It instead expects the type of the selection property to be
> STRING, but the data is treated as raw bytes.
If some program other than Emacs is the target of the drop, raw bytes
produced from raw-text will not be meaningful for it.

I actually don't understand why you don't use ENCODE_FILE for files
and ENCODE_SYSTEM for everything else -- these are the only encodings
that we know to be generally suitable for any operation that calls
low-level C APIs whose implementation is not in Emacs. Bonus points
for adhering to selection-coding-system when that is non-nil.

Are there any known problems with using these two system encodings in
this case?
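
A rough Lisp-level sketch of this suggestion follows. ENCODE_FILE is a C macro, so the sketch approximates it with `file-name-coding-system' and `default-file-name-coding-system'. The function name `my-dnd-encode-file-name' is hypothetical, and both the order of preference and the final fallback to `utf-8-unix' are assumptions made for illustration, not what xselect.c or select.el actually does.

```elisp
;; Sketch only: `my-dnd-encode-file-name' is a made-up helper, not Emacs code.
(defun my-dnd-encode-file-name (file)
  "Return FILE expanded and encoded as a unibyte string.
Prefer `selection-coding-system' when it is bound and non-nil,
otherwise use the file-name encoding."
  (let ((coding (or
                 ;; The variable may be unbound in builds without
                 ;; window-system support, hence `bound-and-true-p'.
                 (bound-and-true-p selection-coding-system)
                 file-name-coding-system
                 default-file-name-coding-system
                 ;; Assumed last-resort fallback so that the result
                 ;; is always encoded.
                 'utf-8-unix)))
    (encode-coding-string (expand-file-name file) coding)))
```

A selection converter could then return (my-dnd-encode-file-name value) where the commit currently encodes with `raw-text-unix'.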