Re: library for unicode collation in C for texi2any?
From: Eli Zaretskii
Subject: Re: library for unicode collation in C for texi2any?
Date: Sat, 14 Oct 2023 22:20:40 +0300
> From: Gavin Smith <gavinsmith0123@gmail.com>
> Date: Sat, 14 Oct 2023 19:57:22 +0100
>
> It's all in the future, but what I am slightly concerned about is
> duplicating existing system facilities in Texinfo. One example is avoiding
> wcwidth, which as we use it depends on setting a UTF-8 locale and on the
> wchar_t type. Is every program that uses wcwidth supposed to supply
> its own implementation instead, and isn't this wasteful?
What other locale-specific functions do we need in addition to
wcwidth?
If the list of those functions is short enough, we could replace them
all by the corresponding Gnulib/libunistring functions, and then we
could stop setting specific locales and relying on locale-specific
libc functions. That will give us locale-independent code which will
work on all systems.
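To illustrate what "locale-independent" means here, the following is a minimal sketch of a width computation that works directly on UTF-8 bytes and never calls setlocale or touches wchar_t. All function names are hypothetical, and the width table is a deliberately tiny subset (combining diacritical marks, CJK unified ideographs, Hangul syllables, fullwidth forms); it is not the full Unicode width data. In real code, libunistring's u8_strwidth from <uniwidth.h> would likely be the function to use instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one UTF-8 sequence from s (n bytes available) into *cp.
   Returns the number of bytes consumed, or 0 on malformed or truncated
   input.  Validation is minimal: overlong forms and surrogates are not
   rejected.  No locale is consulted.  (Hypothetical helper.)  */
size_t
decode_utf8 (const unsigned char *s, size_t n, uint32_t *cp)
{
  size_t len, i;
  uint32_t c;

  if (n == 0)
    return 0;
  if (s[0] < 0x80)
    {
      *cp = s[0];
      return 1;
    }
  if ((s[0] & 0xE0) == 0xC0)      { len = 2; c = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
  else
    return 0;
  if (n < len)
    return 0;
  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;
      c = (c << 6) | (s[i] & 0x3F);
    }
  *cp = c;
  return len;
}

/* Column width of one code point, from an illustrative subset of
   ranges only.  Everything not listed (including control characters)
   is treated as width 1, which a real implementation would not do.  */
int
sketch_width (uint32_t cp)
{
  if (cp >= 0x0300 && cp <= 0x036F)   /* combining diacritical marks */
    return 0;
  if ((cp >= 0x4E00 && cp <= 0x9FFF)      /* CJK unified ideographs */
      || (cp >= 0xAC00 && cp <= 0xD7A3)   /* Hangul syllables */
      || (cp >= 0xFF01 && cp <= 0xFF60))  /* fullwidth forms */
    return 2;
  return 1;
}

/* Total width of a NUL-terminated UTF-8 string, or -1 if the
   string is not decodable by decode_utf8.  */
int
sketch_strwidth (const char *str)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t n = strlen (str);
  int total = 0;

  while (n > 0)
    {
      uint32_t cp;
      size_t len = decode_utf8 (s, n, &cp);
      if (len == 0)
        return -1;
      total += sketch_width (cp);
      s += len;
      n -= len;
    }
  return total;
}
```

Because the input is treated as UTF-8 by construction, the result is the same whatever locale the process happens to run in.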
> I don't know if libunistring aspires to become a standard system library
> for handling UTF-8 data but if we use it for other UTF-8 processing it
> would make sense to use it for collation.
>
> I suggest writing to Bruno Haible to ask if he has plans to include
> collation functionality in libunistring in the future. I am currently
> reading through "Unicode Technical Standard #10" and although I don't
> understand a lot of it yet, it seems feasible that we could implement it
> in C.
It is feasible, but implementing it from scratch is a lot of work, and
needs a large database (which we could take from the CLDR). But note
that CLDR is AFAIK locale-dependent; the only part of it that doesn't
depend on the locale is collation by Unicode codepoints.
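For comparison, collation by code points needs no database at all. A minimal sketch, assuming the index entries are valid UTF-8 strings: byte-wise comparison of valid UTF-8 gives the same order as comparing code points, so plain strcmp (not strcoll) is enough. The function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort callback: order two UTF-8 strings by code point.  strcmp
   compares bytes as unsigned char, and for valid UTF-8 that byte
   order coincides with code point order.  No locale is involved.  */
int
compare_by_codepoint (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Sort an array of UTF-8 strings in place by code point.  */
void
sort_by_codepoint (const char **entries, size_t count)
{
  qsort (entries, count, sizeof *entries, compare_by_codepoint);
}
```

The resulting order is stable across systems but crude: all uppercase ASCII letters sort before all lowercase ones, and accented letters sort after every ASCII letter. Closing that gap is what the weight tables of UTS #10 are for.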