Re: Ispell and unibyte characters

From: Eli Zaretskii
Subject: Re: Ispell and unibyte characters
Date: Thu, 12 Apr 2012 22:01:30 +0300

> Date: Thu, 12 Apr 2012 16:36:57 +0200
> From: Agustin Martin <address@hidden>
> I am still dealing with an open issue here. Some languages have non 7bit
> wordchars, like Catalan middledot, and it should be converted to UTF-8 if
> default communication language is changed to UTF-8.

Sorry, I don't understand: do you mean "non 8-bit wordchars"?  I don't
think 7 bits is assumed anywhere.

Assuming you did mean 8-bit, then why not use UTF-8 for Catalan from
the get-go?  Only some languages can use single-byte encodings, and
evidently Catalan is not one of them.  For that matter, why shouldn't
aspell and hunspell use UTF-8 by default (something I already asked)?

> I have looked at the encoding stuff and I am currently trying something
> like
> (if ispell-encoding8-command
>     ;; Convert non 7bit otherchars to utf-8 if needed
>     (encode-coding-string
>      (decode-coding-string (nth 3 adict) (nth 7 adict))
>      'utf-8)
>   (nth 3 adict)) ; otherchars
> to get new UTF-8 string where
> (nth 7 adict) -> dict-coding-system
> (nth 3 adict) -> Original otherchars
> but get a sgml-lexical-context error. Need to look more carefully, so this
> will take longer. I am far from expert in handling encodings, so comments
> are welcome.

I don't understand what you are trying to accomplish by encoding
OTHERCHARS in UTF-8.  What exactly is the problem with them being
encoded in some 8-bit encoding?  Please explain.
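
For reference, the conversion the quoted snippet attempts can be sketched
in isolation as below. This is only an illustration: the helper name
`my-otherchars-to-utf-8' is hypothetical and not part of ispell.el, and
ISO-8859-1 is assumed as the dictionary coding system purely for the
example. It decodes a unibyte OTHERCHARS string from the dictionary's
coding system and re-encodes the result as UTF-8.

```elisp
;; Sketch only: `my-otherchars-to-utf-8' is a hypothetical helper,
;; not a function provided by ispell.el.
(defun my-otherchars-to-utf-8 (otherchars coding)
  "Return OTHERCHARS, a unibyte string encoded in CODING, re-encoded as UTF-8."
  (encode-coding-string
   ;; First turn the raw bytes into characters using the dictionary's
   ;; coding system, then encode those characters as UTF-8 bytes.
   (decode-coding-string otherchars coding)
   'utf-8))

;; The middle dot is the single byte #xB7 in ISO-8859-1 (assumed here)
;; and the two bytes #xC2 #xB7 in UTF-8.
(my-otherchars-to-utf-8 (unibyte-string #xB7) 'iso-8859-1)
```

Pure-ASCII characters such as the apostrophe pass through unchanged, so
only the non-ASCII part of OTHERCHARS differs between the two encodings.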