[Texmacs-dev] utf-8 support update

From:	Felix Breuer
Subject:	[Texmacs-dev] utf-8 support update
Date:	23 Nov 2002 21:24:17 +0100

Hello!

I have begun work on the mapping from the TeXmacs universal character
set to Unicode. It can be found on my site, www.fbreuer.de/texmacs.

> It may not be really necessary to put the comments (a lot of extra work).

Those comments come from http://www.lut.fi/man/tex/dcfonts/node7.html,
which is where I started working from.

> You have to be careful with the number of bytes for each character.
> In the Cork encoding, each character only takes one byte,
> so you should write #41 for "A" rather than "#0041".

I changed the mapping accordingly.
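
To make the byte-count point concrete, here is a minimal sketch. It
assumes that a "#..." token in the mapping stands for raw bytes, two hex
digits per byte; that reading of the notation and the helper names
hex_value and bytes_from_hex_token are my assumptions, not TeXmacs code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: value of a single hex digit, or throw.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: decode a "#hh..." token into raw bytes.
// Assumption: every pair of hex digits after '#' is one byte.
std::string bytes_from_hex_token(const std::string& token) {
    if (token.size() < 3 || token[0] != '#' || (token.size() - 1) % 2 != 0)
        throw std::invalid_argument("expected '#' plus an even number of hex digits");
    std::string out;
    for (std::size_t i = 1; i < token.size(); i += 2) {
        int hi = hex_value(token[i]);
        int lo = hex_value(token[i + 1]);
        out.push_back(static_cast<char>(hi * 16 + lo));
    }
    return out;
}
```

Under that assumption "#41" decodes to the single byte for "A", while
"#0041" decodes to two bytes (a zero byte followed by "A"), which would
no longer match a one-byte Cork character.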

> We will rather write these conversion routines in C++ (they must
> be really fast) in src/Resources/Translators

I do not understand how this translator works. It never seems to return
a translated string. And instead of building a table associating indices
into the string-to-be-translated, it associates strings with indices.
Are TeXmacs hashmaps multimaps? I am lost. Could somebody explain it to
me?

Since we are talking about converting the encoding of a string and not
about translating its contents, wouldn't it be better to add a function
as_utf8 to string.cc? This would lend itself more to the inclusion of
other encodings using iconv.h. However, I don't want to argue, I just
need someone to enlighten me :)
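
To illustrate what I mean by as_utf8, here is a rough sketch. It uses
std::string rather than the TeXmacs string class, takes the byte ->
code point table as a parameter, and only handles the one-byte part of
the encoding (not symbols like <cap>). The names ByteTable,
make_demo_table and append_utf8 are made up for this example, and the
demo table holds only two entries (the 0xE9 entry should be checked
against the real Cork table); every other slot is a placeholder.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Append the UTF-8 encoding of one Unicode code point to `out`.
void append_utf8(std::string& out, std::uint32_t cp) {
    // Reject values outside Unicode and the surrogate range.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {             // 3 bytes
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {                               // 4 bytes
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// One Unicode code point per input byte (hypothetical layout).
using ByteTable = std::array<std::uint32_t, 256>;

// Demo table only: unmapped slots get U+FFFD as a placeholder.
ByteTable make_demo_table() {
    ByteTable t{};
    t.fill(0xFFFD);
    t[0x41] = 0x0041;  // "A", as discussed above
    t[0xE9] = 0x00E9;  // e-acute; verify against the real Cork table
    return t;
}

// Convert a one-byte-per-character string to UTF-8 via the table.
std::string as_utf8(const std::string& in, const ByteTable& table) {
    std::string out;
    for (unsigned char byte : in) append_utf8(out, table[byte]);
    return out;
}
```

With a lookup table like this the conversion is a single pass over the
input, so it should be fast enough; whether it belongs in string.cc or
in src/Resources/Translators is exactly what I am asking about.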


Regarding the universal characters: <big|cap> is a different character
than \<cap\>, so <big|...> nodes would have to be converted as well. Why
isn't <big|cap> encoded as <big|\<cap\>>? The latter seems more
consistent to me. What about <left|...>, <right|...>, <mid|...>?


Felix
