bug-bash

Re: String substitution bug


From: Andreas Kähäri
Subject: Re: String substitution bug
Date: Mon, 25 Nov 2024 08:26:56 +0100

On Sun, Nov 24, 2024 at 10:51:43PM +1000, Martin D Kealey wrote:
> On Sun, 24 Nov 2024 at 18:05, Andreas Kähäri <andreas.kahari@abc.se> wrote:
> 
> > I think the manual is quite clear:
> >
> >         Within [ and ], character classes can be specified
> >         using the syntax [:class:], where class is one of the
> >         following classes defined in the POSIX standard:
> >         alnum alpha ascii blank cntrl digit graph lower print
> >         punct space upper word xdigit
> >
> > It says that the syntax "[:class:]" may be used within "[" and "]".
> >
> 
> When one already knows how it works, that's obvious, and it's hard to see
> how it could mean anything else.
> 
> When one *doesn't* already know how it works, “using the syntax *[:class:]*”
> could just as easily mean using *:class:* inside *[…]*.
> 
> (This ambiguity is largely because it does not specify whether *[:class:]*
> is the syntax for a character class, or the syntax for a bracket expression
> containing a character class. It isn't helped by the phrase “*within [ and
> ]*”, which is pretty silly; its most literal reading in English means
> “putting something in [ and putting something in ]”, both of which are
> impossible. The intention is of course “*between [ and ]*” but that's
> slightly ambiguous; I would suggest “*within a bracket expression*”, since
> that's what it's called in POSIX §9.3.5 RE Bracket Expression
> <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_03_05>
> and in POSIX §2.14.1 Patterns Matching a Single Character
> <https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V3_chap02.html#tag_19_14_01>
> 2.14x
> <https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V3_chap02.html#tag_19_14_02>.
> Or if that's not clear enough, “*within a […] bracket expression*”.)
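
The two readings discussed above can be compared directly in the shell. The following is only a minimal sketch: the sample string and the variable names are made up for illustration, and it assumes bash, since it uses the ${var//pattern/replacement} expansion.

```shell
#!/usr/bin/env bash
# Hypothetical sample string, chosen so that the two patterns differ.
sample='a1:d2'

# A bracket expression containing the digit class: only digits match.
with_class=${sample//[[:digit:]]/_}

# Without the outer brackets this is an ordinary bracket expression
# listing the characters ':', 'd', 'i', 'g', and 't'.
without_outer=${sample//[:digit:]/_}

printf '%s\n' "$with_class" "$without_outer"
```

With this sample, the first expansion should give a_:d_ and the second a1__2, since the second pattern matches the colon and the letter d rather than the digits.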
> 
> This REALLY needs to be driven home both in the explanation and with
> examples, preferably with at least one that illustrates using more than one
> character class inside one match group.

A technical manual about a shell is not the place for that.  Such
material could be distributed in a separate tutorial.

> 
> How about this:

Is this different enough from how regular expressions usually work
to warrant duplicating the text in multiple manuals on the user's
system (e.g. re_format)?  I don't think so.

> 
> > Within a […] bracket expression, character classes can be specified using
> > the class syntax *[:class:]* (giving a bracket expression syntax of *[*…
> > *[:class:]*…*]*), where *class* is one of the following classes defined
> > in the POSIX standard:
> 
> 
> >    - *alnum* *alpha* *ascii* *blank* *cntrl* *digit* *graph* *lower*
> >    *print* *punct* *space* *upper* *word* *xdigit*
> >
> > There is no limit on the number or order of symbols, range expressions,
> > character classes, and equivalence classes that can be used in one […]
> > bracket expression. For example *[@[:digit:]#[=c=]$F-L]* will match any
> > one of:
> >
> >    - the symbols ‘@’, ‘#’, & ‘$’; or
> >    - the decimal digits; or
> >    - the symbols that are “equivalent” to ‘c’ according to the current
> >    locale
> >    (in addition to 'c'  itself, this typically includes 'C', but may also
> >    include accented variants such as 'ç', 'Ç', 'č', & 'Č');
> >    or
> >    - the (upper-case) letters ‘F’, ‘G’, ‘H’, ‘I’, ‘J’, ‘K’, & ‘L’.
> >
> >  -Martin
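
The bracket expression from the proposed text can be tried out as follows. This is a sketch under stated assumptions, not part of the proposal: the test characters and variable names are invented, and the locale is pinned to C so that the F-L range and the [=c=] equivalence class behave predictably (in C, [=c=] should match only 'c' itself; other locales may add 'C', 'ç', and so on). Note that the pattern is stored in single quotes, because otherwise the shell would expand $F inside it.

```shell
#!/usr/bin/env bash
# Assumption: pin the locale so ranges and equivalence classes are predictable.
LC_ALL=C

# Single quotes keep the shell from expanding $F inside the pattern.
pattern='[@[:digit:]#[=c=]$F-L]'

# Collect every test character that the bracket expression matches.
matched=''
for ch in @ '#' '$' 7 c F H L a E M; do
    case $ch in
        $pattern) matched+=$ch ;;   # unquoted, so it is used as a pattern
    esac
done

printf '%s\n' "$matched"
```

In the C locale this should print @#$7cFHL: the characters a, E, and M fall outside every element of the bracket expression.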

-- 
Andreas (Kusalananda) Kähäri
Uppsala, Sweden
