[Texmacs-dev] utf-8 support update
From: Felix Breuer
Subject: [Texmacs-dev] utf-8 support update
Date: 23 Nov 2002 21:24:17 +0100
Hello!
I have begun work on the TeXmacs universal character set -> Unicode
mapping. It can be found at my site www.fbreuer.de/texmacs.
> It may not be really necessary to put the comments (a lot of extra work).
Those comments are from http://www.lut.fi/man/tex/dcfonts/node7.html; I
started working from there.
> You have to be careful with the number of bytes for each character.
> In the Cork encoding, each character only takes one byte,
> so you should write #41 for "A" rather than "#0041".
I changed the mapping accordingly.
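
To illustrate why the key length matters, here is a minimal sketch. It assumes that a key is "#" followed by pairs of hex digits, one pair per byte; that reading of the format, and the helper names hex_value and key_bytes, are my assumptions and are not taken from the TeXmacs sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Value of one hexadecimal digit, or -1 if c is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: expand a key such as "#41" into raw bytes,
// assuming two hex digits per byte. Returns an empty vector if the
// key is malformed.
std::vector<std::uint8_t> key_bytes(const std::string& key) {
    std::vector<std::uint8_t> bytes;
    // A valid key is '#' plus an even number of hex digits, so its
    // total length is odd and at least 3.
    if (key.size() < 3 || key[0] != '#' || key.size() % 2 == 0) return bytes;
    for (std::size_t i = 1; i < key.size(); i += 2) {
        int hi = hex_value(key[i]);
        int lo = hex_value(key[i + 1]);
        if (hi < 0 || lo < 0) return {};
        bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }
    return bytes;
}
```

Under that assumption "#41" is the single byte 0x41, while "#0041" would be the two bytes 0x00 0x41 and would never match a one-byte Cork character.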
> We will rather write these conversion routines in C++ (they must
> be really fast) in src/Resources/Translators
I do not get how this translator works. It seems to never return a
translated string. And instead of building a table associating indices
into the string-to-be-translated, it associates strings with indices.
Are TeXmacs hashmaps multimaps? I am lost. Could somebody explain it to
me?
Since we are talking about converting the encoding of a string, not
translating its contents, wouldn't it be better to add a function
as_utf8 to string.cc? This would lend itself better to supporting
other encodings via iconv.h. However, I don't want to argue; I just
need someone to enlighten me :)
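
Roughly what I have in mind for as_utf8, as a sketch only: it uses std::string rather than the TeXmacs string class, and the signature, the 256-entry table parameter, and the helper append_utf8 are all assumptions for illustration. The table maps each byte of the one-byte source encoding to a Unicode code point; filling it with the real Cork values is left to the mapping file.

```cpp
#include <array>
#include <string>

// Append the UTF-8 encoding of one Unicode code point to out.
void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical as_utf8: look up every input byte in the table and
// emit the UTF-8 form of the resulting code point.
std::string as_utf8(const std::string& in,
                    const std::array<char32_t, 256>& table) {
    std::string out;
    for (unsigned char byte : in) append_utf8(out, table[byte]);
    return out;
}
```

This is a plain table lookup per byte, so it should be fast enough, and it only covers one-byte characters; multi-character names like <big|cap> would need separate handling.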
Regarding the universal characters: <big|cap> is a different character
than \<cap\>, so <big|...> nodes would have to be converted as well. Why
isn't <big|cap> encoded as <big|\<cap\>>? The latter seems more
consistent to me. How about <left|...>, <right|...>, <mid|...>?
Felix