Re: Customization variable to disable Unicode collation?
From: Patrice Dumas
Subject: Re: Customization variable to disable Unicode collation?
Date: Wed, 31 Jan 2024 23:25:02 +0100
On Wed, Jan 31, 2024 at 08:19:21PM +0000, Gavin Smith wrote:
> On Wed, Jan 31, 2024 at 10:38:36AM +0100, Patrice Dumas wrote:
> > With collation also possible with XS/C, but with a different result than
> > in perl, I think that there should be a way to use perl unicode
> > collation from C too, in addition to using a unicode collation or not.
> >
> > Should it be a separate customization variable, or should
> > USE_UNICODE_COLLATION be replaced by a variable with a textual value
> > taking more possibilities, for example USE_COLLATION with possible
> > values:
> > unicode
> > unicodeperl
> > basic
>
> Hi, I think we should wait until we decide exactly what we are doing
> with collation in C first. If we are aiming to get exactly the same
> collation in C as in Perl, then I don't think we should have a separate
> option for this.
I do not think that we can do that; there will be differences.
> If collation with C allows something different, then
> we could have an option. There is the issue of language-specific tailoring
> which could be potentially achieved with locale-based sorting in C.
It can also be achieved in Perl, I think, with Unicode::Collate::Locale.
It could be possible to get the same as in C, the current locale, or
a specific locale. We will have portability issues, but we can do
something similar to what you did with Unicode::Collate, which falls back
to Texinfo::CollateStub: fall back from Unicode::Collate::Locale to
Unicode::Collate (or Texinfo::CollateStub).
# TODO Unicode::Collate has been in perl core long enough, but
# Unicode::Collate::Locale is present since perl major version 5.14 only,
# released in 2011. So probably better to use Unicode::Collate until 2031
# (and if documentlanguage is not set) and switch to Unicode::Collate::Locale
# at this date.
#my $collator = Unicode::Collate::Locale->new('locale' => $documentlanguage,
# %collate_options);
Also, based on your other mail, it looks like we could have 5 possibilities:
* basic collation
* 'default' (en_US) unicode collation: Unicode::Collate in Perl and
  strxfrm_l with en_US in C
* current locale unicode collation: Unicode::Collate::Locale with
  current locale in Perl and strxfrm in C
* @documentlanguage unicode collation: Unicode::Collate::Locale with
  @documentlanguage in Perl and strxfrm_l with documentlanguage.utf_8
  in C
* user specified language collation: Unicode::Collate::Locale with
  a customization variable value in Perl and strxfrm_l with
  customization variable value.utf_8 in C
It seems to me to be a bit too many options, but I do not see any
special difficulty with any of these possibilities either. If there is a
specified locale it will require some new code, but it should not be too
complex.
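
To give an idea of the C side, here is a rough sketch of how the five
possibilities could be mapped to a locale name and a strxfrm_l sort key.
It is not existing Texinfo code: the enum, the function names and the
".UTF-8" spelling of the codeset suffix are all assumptions made for
illustration, and it assumes a POSIX.1-2008 system where newlocale and
strxfrm_l are declared in <locale.h> and <string.h>. If the locale cannot
be created, it falls back to basic collation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for the five possibilities. */
enum collation_mode {
  COLLATION_BASIC,
  COLLATION_DEFAULT,            /* en_US */
  COLLATION_CURRENT_LOCALE,
  COLLATION_DOCUMENTLANGUAGE,
  COLLATION_USER_LANGUAGE
};

/* Write the locale name to try into BUF.  Return 1 if a locale applies,
   0 for basic collation or if no language is given.  An empty name
   means "take the locale from the environment". */
int
collation_locale_name (enum collation_mode mode, const char *language,
                       char *buf, size_t size)
{
  assert (size > 0);
  switch (mode)
    {
    case COLLATION_DEFAULT:
      snprintf (buf, size, "%s", "en_US.UTF-8");
      return 1;
    case COLLATION_CURRENT_LOCALE:
      buf[0] = '\0';
      return 1;
    case COLLATION_DOCUMENTLANGUAGE:
    case COLLATION_USER_LANGUAGE:
      if (!language || !*language)
        return 0;
      /* Assumed spelling of the codeset suffix. */
      snprintf (buf, size, "%s.UTF-8", language);
      return 1;
    default:
      return 0;
    }
}

/* Return a malloc'ed sort key for TEXT, to be compared with strcmp.
   Fall back to a plain copy of TEXT (basic collation) if no locale
   applies or if the locale is not available on the system. */
char *
collation_sort_key (enum collation_mode mode, const char *language,
                    const char *text)
{
  char name[64];
  char *key;

  if (collation_locale_name (mode, language, name, sizeof name))
    {
      locale_t loc = newlocale (LC_COLLATE_MASK | LC_CTYPE_MASK, name,
                                (locale_t) 0);
      if (loc != (locale_t) 0)
        {
          /* First call only computes the needed length. */
          size_t len = strxfrm_l (NULL, text, 0, loc);
          key = malloc (len + 1);
          if (key)
            strxfrm_l (key, text, len + 1, loc);
          freelocale (loc);
          return key;
        }
    }

  key = malloc (strlen (text) + 1);
  if (key)
    strcpy (key, text);
  return key;
}
```

Index entries would then be sorted by comparing, with strcmp, the keys
returned for example by
collation_sort_key (COLLATION_DOCUMENTLANGUAGE, "fr_FR", entry_text).
Here the current locale case also goes through newlocale with an empty
name instead of plain strxfrm, to avoid depending on a previous
setlocale call; that is a choice of the sketch only.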
--
Pat