Re: regexps and locales

From: Olivier Wittenberg
Subject: Re: regexps and locales
Date: Fri, 4 Feb 2005 23:41:52 +0100
User-agent: Mutt/

On Tue, Feb 01, 2005 at 12:33:11PM -0500, Chet Ramey wrote:
> Isn't a one-byte-long filename still a valid filename?


> As far as I know, as long as you can get a one-byte filename
> created, readdir will return it, and `?' should match it.

readdir will return it, but if I understand POSIX correctly, '?'
should not match it.

> I don't think Posix says that; it says that `?' matches `a character'.  If
> you have a filename returned by readdir, and you're in a multibyte locale,
> `?' will match a character, wide or not.

Here's how POSIX defines the word 'character' (in Base Definitions):

3.87 Character

A sequence of one or more bytes representing a single graphic symbol
or control code.

    This term corresponds to the ISO C standard term multi-byte
    character, where a single-byte character is a special case of a
    multi-byte character. Unlike the usage in the ISO C standard,
    character here has no necessary relationship with storage space,
    and byte is used when storage space is discussed.

According to this definition, I understand that "\351" (the lone byte
0xE9), for example, is a byte, but is not a character in a UTF-8
locale, since on its own it is not a valid UTF-8 sequence; hence it
should not be matched by '?'.
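
To make the distinction concrete, here is a small Python sketch. It
is only an illustration of the byte/character difference under
UTF-8, not a description of what bash or any glob implementation
actually does; the helper name is_single_utf8_character is made up
for this example.

```python
def is_single_utf8_character(data: bytes) -> bool:
    """Return True if data decodes, as UTF-8, to exactly one character.

    Hypothetical helper for illustration only; it does not model
    any shell's pattern-matching code.
    """
    try:
        return len(data.decode("utf-8")) == 1
    except UnicodeDecodeError:
        # The bytes are not a valid UTF-8 sequence at all.
        return False


lone_byte = b"\351"                 # the single byte 0xE9
e_acute = "\u00e9".encode("utf-8")  # the two bytes 0xC3 0xA9

print(is_single_utf8_character(lone_byte))  # False: a byte, not a character
print(is_single_utf8_character(e_acute))    # True: two bytes, one character
print(is_single_utf8_character(b"a"))       # True: one byte, one character
```

Under this reading, a filename consisting of the lone byte 0xE9
contains no character for '?' to match in a UTF-8 locale, whereas
the two-byte sequence 0xC3 0xA9 is exactly one character.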

That's how I interpret the POSIX specification, though I'd certainly
prefer to be proven wrong.

