From: Sergey Vnotchenko <sergey@shine.inf>
To: bug-gnu-emacs@gnu.org
Subject: Russian copy & paste failed under X11 (from emacs to kde/gnome)
In GNU Emacs 21.3.1 (i586-suse-linux, X toolkit, Xaw3d scroll bars)
of 2004-04-06 on gray
configured using `configure '--with-gcc' '--with-pop' '--with-system-malloc' '--prefix=/usr' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--with-x' '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--with-x-toolkit=lucid' '--x-includes=/usr/X11R6/include' '--x-libraries=/usr/X11R6/lib' 'i586-suse-linux' 'CC=gcc' 'CFLAGS=-O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -pipe -DSYSTEM_PURESIZE_EXTRA=25000 -DSITELOAD_PURESIZE_EXTRA=10000 -D_GNU_SOURCE ' 'LDFLAGS=-s' 'build_alias=i586-suse-linux' 'host_alias=i586-suse-linux' 'target_alias=i586-suse-linux''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: C
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: C
value of $LANG: ru_RU.KOI8-R
locale-coding-system: cyrillic-koi8
default-enable-multibyte-characters: t
HOW TO REPRODUCE:
1) In Emacs, select text containing Russian characters.
2) Paste into the Mozilla browser. Result: no Russian characters, only ASCII characters appear.
3) Paste into any KDE application. Result: incorrect Russian characters appear.
ANALYSIS:
I traced the problem to the 'ctext-pre-write-conversion' defun, which fails
to produce a correct conversion. I hacked it to encode the selection directly
as KOI8-R and wrap it in the necessary escape sequences. It now seems to work
for KOI8-R, but is surely broken for other encodings.
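To make the length arithmetic in the escape sequence concrete, here is a
minimal sketch of how the six header bytes of one extended segment are
derived, following the ctext specification cited in the workaround. The
function name my-ctext-extended-segment-header is made up for illustration
and is not part of Emacs. It assumes one octet per character (final byte
0x31) and that the encoding name is plain ASCII.

```elisp
;; Hypothetical helper, for illustration only (not part of Emacs).
(defun my-ctext-extended-segment-header (encoding-name data-length)
  "Return the header bytes of a Compound Text extended segment as a list.
ENCODING-NAME is an ASCII string such as \"koi8-r\".  DATA-LENGTH is the
number of encoded octets that will follow the STX byte."
  ;; The length field counts the encoding name, the STX byte and the data.
  (let ((len (+ (length encoding-name) 1 data-length)))
    ;; M and L carry 7 bits each, so the field cannot exceed 16383.
    (if (> len 16383)
        (error "Extended segment too long: %d octets" len))
    (list 27 37 47 49                 ; ESC % / 1
          (+ (/ len 128) 128)         ; M: high 7 bits, top bit set
          (+ (% len 128) 128))))      ; L: low 7 bits, top bit set

;; Three KOI8-R octets: length field is 6 + 1 + 3 = 10.
(my-ctext-extended-segment-header "koi8-r" 3)
;; => (27 37 47 49 128 138)
```

With "koi8-r" as the encoding name, 7 octets of the length field are taken
by the name and STX, so a single segment can carry at most 16376 octets of
text.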
With best regards, Sergey.
;; WORKAROUND (FOR KOI8 AND ASCII ONLY):
(defun ctext-pre-write-conversion (from to)
  "Encode characters between FROM and TO as Compound Text w/Extended Segments.
If FROM is a string, or if the current buffer is not the one set up for us
by run_pre_post_conversion_on_str, generate a new temp buffer, insert the
text, and convert it in the temporary buffer.  Otherwise, convert in-place."
  (cond ((and (string= (buffer-name) " *code-converting-work*")
              (not (stringp from)))
         ;; Minimize consing due to subsequent insertions and deletions.
         (buffer-disable-undo)
         (narrow-to-region from to))
        (t
         (let ((buf (current-buffer)))
           (set-buffer (generate-new-buffer " *temp"))
           (buffer-disable-undo)
           (if (stringp from)
               (insert from)
             (insert-buffer-substring buf from to))
           (setq from (point-min) to (point-max)))))
  (encode-coding-region from to 'cyrillic-koi8-unix)
  ;; Wrap each run of non-ASCII (koi8-r) bytes in an extended segment,
  ;; according to http://www.xfree86.org/current/ctext.pdf
  ;; (Non-Standard Character Set Encodings).
  ;; M and L carry 7 bits of the segment length each, so the length is at
  ;; most 16383; "koi8-r" plus STX take 7 of those, leaving 16376 for text.
  (save-match-data
    (goto-char (point-min))
    ;; Bind locally instead of clobbering case-fold-search and globals.
    (let ((case-fold-search nil)
          text len m-byte l-byte ext-segment)
      (while (re-search-forward "[^[:ascii:]]\\{1,16376\\}" nil 'move)
        (setq text (match-string 0)
              len (+ (length text) 7)
              m-byte (+ (/ len 128) 128)
              l-byte (+ (% len 128) 128)
              ext-segment (concat "\x1b\x25\x2f\x31" (vector m-byte l-byte)
                                  "koi8-r\x2" text))
        ;; FIXEDCASE and LITERAL: insert the segment verbatim.  Point ends
        ;; up after the replacement, so no extra forward-char is needed
        ;; (it would signal an error at the end of the buffer).
        (replace-match ext-segment t t))))
  nil)