Building intermediate Chinese language romanization alists

From: Eric Abrahamsen
Subject: Building intermediate Chinese language romanization alists
Date: Tue, 15 Jan 2019 13:19:23 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)


I would often like to have access to the correspondences between
romanized Chinese and Chinese characters. E.g., in the pinyin
romanization method, the string "zhong" can map to any of the characters
"中种重众终钟忠衷肿仲锺踵盅冢忪舯螽". This is useful for creating
language utilities, and other people have put together their own
correspondences for their own purposes[1].

The Emacs source tree has several of these mappings (though I understand
they are not included in the distribution), which are used to build the relevant
input methods. In the case of pinyin, the text
file ./leim/MISC-DIC/pinyin.map is converted with `titdic-convert' into
the file ./lisp/leim/quail/PY.el.

PY.el is automatically generated (by the function `py-converter' in
titdic-cnv.el): the mapping in pinyin.map is directly inserted into the
generated file, then wrapped in quotes and parens, to construct a call
to `quail-define-rules'.
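
To make the conversion step concrete, here is a minimal sketch of turning
map lines into rule forms. It assumes each line of pinyin.map is a
lowercase key, some whitespace, then the run of characters, with no
header or comment lines; the function names are hypothetical and this is
not the code in titdic-cnv.el.

```elisp
;; Sketch only: `my-pinyin-line-to-rule' and `my-pinyin-text-to-alist'
;; are hypothetical helpers, not part of Emacs.

(defun my-pinyin-line-to-rule (line)
  "Turn LINE, assumed to look like \"KEY<whitespace>CHARS\", into (KEY CHARS).
Return nil if LINE does not have exactly two fields."
  (let ((parts (split-string line "[ \t]+" t)))
    (when (= (length parts) 2)
      (list (car parts) (cadr parts)))))

(defun my-pinyin-text-to-alist (text)
  "Convert TEXT, one map entry per line, into a list of (KEY CHARS) rules."
  (delq nil
        (mapcar #'my-pinyin-line-to-rule
                (split-string text "\n" t))))
```

Each resulting element has the ("key" "characters") shape that ends up
inside the generated `quail-define-rules' form.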

I might be able to get the map back out of quail somehow, but since this
seems to be something that more than a few people would like access to,
I wonder if it would be acceptable to add an intermediary step, creating
(for instance) a defconst called `pinyin-map-alist' that holds the
contents of pinyin.map, and then changing the `quail-define-rules' call to:

(apply #'quail-define-rules pinyin-map-alist)

The input method wouldn't be affected, but we'd have access to the
mapping via the constant, which would be very useful.
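
As a rough illustration of what the constant would allow, here is a
sketch with a two-entry sample. The "zhong" entry is the one quoted
above; the "ma" entry is made up for illustration and not copied from
pinyin.map. The lookup helpers are hypothetical names, and the real
constant would hold the full contents of the map.

```elisp
;; Sketch only: a tiny sample standing in for the full map.
(defconst pinyin-map-alist
  '(("zhong" "中种重众终钟忠衷肿仲锺踵盅冢忪舯螽")
    ("ma" "妈马吗"))                    ; illustrative entry
  "Sample alist of (PINYIN CHARACTERS) entries.")

(defun my-pinyin-to-chars (pinyin)
  "Return the characters for PINYIN as a list of one-character strings."
  (let ((entry (assoc pinyin pinyin-map-alist)))
    (when entry
      (mapcar #'char-to-string (cadr entry)))))

(defun my-char-to-pinyin (char)
  "Return the pinyin keys whose entry contains CHAR, a one-character string."
  (let (result)
    (dolist (entry pinyin-map-alist (nreverse result))
      (when (string-match-p (regexp-quote char) (cadr entry))
        (push (car entry) result)))))
```

With this, (my-char-to-pinyin "中") gives ("zhong"), and
(my-pinyin-to-chars "ma") gives ("妈" "马" "吗"). One caveat I haven't
fully checked: as far as I can tell `quail-define-rules' is a macro, so
passing it to `apply' may not work as written; feeding each entry to the
function `quail-defrule' could be an alternative.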

Pinyin would be the most useful romanization method to do this for, but
it looks like the CTLau and possibly ziranma methods might benefit from
similar treatment.

(Another issue is that if the constant is written into PY.el, which
isn't a library, it might be a bit difficult to get out again, but
perhaps the defconst could be appended to one
of ./lisp/language/{chinese.el,china-util.el}. Or PY.el could be made a
library.)

I'm not entirely familiar with the language-related build process, but I
hope there might be an appropriate stage at which to hang the alist on a
variable name.


[1]: https://github.com/tumashu/pyim/blob/master/pyim-pymap.el
