Re: [bug-gettext] broken handling of unicode code point escapes in Tcl

From:	Guido Berhoerster
Subject:	Re: [bug-gettext] broken handling of unicode code point escapes in Tcl
Date:	Wed, 26 Jun 2013 11:27:22 +0200
User-agent:	Mutt/1.5.21 (2010-09-15)

* Daiki Ueno <address@hidden> [2013-06-26 04:22]:
> Guido Berhoerster <address@hidden> writes:
> 
> > I still wonder why you're substituting \u escapes with unicode
> > characters at all, as that potentially allows unescaped control
> > sequences which make the .po file quite fragile?
> 
> I agree that interpreting \u escapes might cause confusing output for
> Unicode control characters, but I don't think it is totally unuseful.
> 
> I can think of at least a couple of benefits of the current behavior:
> 
> 1. translators are provided with decoded (human-readable) strings
> 2. strings escaped in different escaping schemes (e.g. \U in Python) can
>    be unified
> 
> Perhaps an idea might be to introduce gettext-specific Unicode escaping
> scheme (which may only escape control characters) and add an option to
> xgettext to use it.

It can be a bit more complicated than just control characters:
certain space characters such as U+00A0, U+202F or U+2001 are
also non-obvious, yet they are not control characters. Maybe a
better option would be to substitute only alphanumeric and
punctuation characters rather than all non-control characters.
Or you could simply add an option to not substitute \u escapes
at all, which is the behavior of the various native Tcl
.msg-format extractors that float around (e.g. those included
in tkabber or coccinella) and what I'd personally prefer.
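
To illustrate the "alphanumeric and punctuation only" idea, here is a
rough Python sketch. It is not how xgettext works; the function name
decode_visible_escapes is made up for this example, and it assumes
\u is followed by one to four hex digits and ignores escaped
backslashes. An escape is substituted only if the resulting character
falls in a Unicode letter, number or punctuation category; everything
else (control characters, the space characters mentioned above,
surrogates) stays escaped.

```python
import re
import unicodedata

# Assumption: a \u escape is a backslash, "u", and 1 to 4 hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{1,4})")


def decode_visible_escapes(text):
    """Substitute \\u escapes only for letters, numbers and punctuation.

    Hypothetical helper: escapes that decode to anything else (control
    characters, separators such as U+00A0, surrogates, unassigned code
    points) are returned unchanged.
    """
    def replace(match):
        char = chr(int(match.group(1), 16))
        # General categories start with L (letter), N (number),
        # P (punctuation), Z (separator), C (control/other), ...
        if unicodedata.category(char)[0] in "LNP":
            return char
        return match.group(0)

    return _U_ESCAPE.sub(replace, text)


if __name__ == "__main__":
    # The letter escape is decoded, the no-break space stays escaped.
    print(decode_visible_escapes(r"caf\u00e9\u00a0au lait"))
```

With this rule U+00A0, U+202F and U+2001 would all remain visible as
escapes in the .po file, while ordinary accented letters would still
be readable for translators.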
-- 
Guido Berhoerster
