Re: [PATCH] normalization tweaks for macOS
From: alex xmb ratchev
Subject: Re: [PATCH] normalization tweaks for macOS
Date: Tue, 18 Jul 2023 11:29:08 +0200
I don't know much about this topic, so just briefly: I found that uconv,
from icu-devtools, has more options than iconv, including some
transliteration options. I'm mentioning it in case you don't know about it.
I'm no pro, though; I still couldn't achieve what I needed to do with it.
On Tue, Jul 18, 2023, 12:13 AM Grisha Levit <grishalevit@gmail.com> wrote:
> On Mon, Jul 17, 2023 at 3:29 PM Chet Ramey <chet.ramey@case.edu> wrote:
> >
> > On 7/7/23 5:05 PM, Grisha Levit wrote:
> > > A few small tweaks for the macOS-specific normalization handling to
> > > handle the issues below:
> >
> > The issue is that the behavior has to be different between cases where
> > the shell is reading input from the terminal and gets NFC characters
> > that need to be converted to NFD (which is how HFS+ and APFS store them)
> > and when the shell is reading input from a file and doesn't need to (and
> > should not) do anything with NFD characters.
>
> NB: while HFS+ stores NFD names, APFS preserves normalization, so we
> can get either NFC or NFD text back from readdir. Both are
> normalization-insensitive: "Being normalization-insensitive ensures
> that normalization variants of a filename cannot be created in the
> same directory, and that a filename can be found with any of its
> normalization variants." [1]
>
> Currently, Bash never actually converts to NFD. The fnx_tofs()
> function is there but it is never used. Instead, Bash converts
> filenames to NFC with fnx_fromfs() before comparing with either the
> glob pattern or the completion hint text (which is never converted).
>
> Since access is normalization-insensitive, we just need to normalize
> to _some_ form, so going to NFC is fine, but if we're going to do that
> we should normalize both the filesystem name and the text being
> compared.
>
> If there's a match, globs expand to the filenames (NFC or NFD) as
> returned by readdir(), and Readline completes with NFC-normalized
> versions of the names. I think this makes sense.
>
> What doesn't work quite right currently though is that glob patterns
> with NFD text never match anything, and completion prefixes with NFD
> text never expand to anything.
>
> [1]:
> https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html
>
> > Does iconv work when taking NFD input that came from the file system and
> > trying to convert it to NFD (UTF-8-MAC)? I've honestly never checked.
>
> Converting to UTF-8-MAC always normalizes to NFD:
>
> $ printf '\303\251\0\145\314\201' | iconv -f UTF-8-MAC -t UTF-8-MAC | od -b -An
> 145 314 201 000 145 314 201
>
> $ printf '\303\251\0\145\314\201' | iconv -f UTF-8 -t UTF-8-MAC | od -b -An
> 145 314 201 000 145 314 201
>
> But Bash only converts from UTF-8-MAC to UTF-8, which always normalizes
> to NFC:
>
> $ printf '\303\251\0\145\314\201' | iconv -f UTF-8-MAC -t UTF-8 | od -b -An
> 303 251 000 303 251
>
>
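The comparison problem described in the quoted message can be sketched in a few lines of Python. This is only an illustration, not Bash's code: `unicodedata.normalize` stands in for iconv with UTF-8-MAC (Apple's decomposition may differ from standard NFD for some characters), and the helper names `octal_bytes`, `one_sided_match`, and `two_sided_match` are hypothetical. It assumes the current behavior is as described above: only the name returned by readdir() is normalized to NFC, while the pattern or completion text is compared as typed.

```python
import unicodedata

NFC_E = "\u00e9"    # precomposed e-acute: octal bytes 303 251
NFD_E = "e\u0301"   # 'e' + combining acute accent: octal bytes 145 314 201


def octal_bytes(text):
    """Render the UTF-8 encoding of text the way `od -b` prints it."""
    return " ".join(f"{b:03o}" for b in text.encode("utf-8"))


def one_sided_match(fs_name, typed):
    """Hypothetical model of the described behavior: normalize only the
    filesystem name to NFC and compare it with the text as typed."""
    return unicodedata.normalize("NFC", fs_name) == typed


def two_sided_match(fs_name, typed):
    """Hypothetical model of the suggested fix: normalize both operands
    to the same form (NFC here) before comparing."""
    return (unicodedata.normalize("NFC", fs_name)
            == unicodedata.normalize("NFC", typed))


if __name__ == "__main__":
    # Normalizing to a given form yields the same bytes for either input.
    for form in ("NFC", "NFD"):
        for text in (NFC_E, NFD_E):
            print(form, octal_bytes(unicodedata.normalize(form, text)))

    # Text typed in NFD never matches one-sided, whichever form the
    # filesystem returned; it always matches once both sides are normalized.
    for fs_name in (NFC_E, NFD_E):
        print(one_sided_match(fs_name, NFD_E), two_sided_match(fs_name, NFD_E))
```

With NFD text on the typed side, the one-sided comparison fails for both stored forms, which is consistent with the report that NFD glob patterns and completion prefixes never match; normalizing both sides removes the asymmetry.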