bug-bash

Re: [DOC] Incomplete explanation about the regex =~ operator


From: Chet Ramey
Subject: Re: [DOC] Incomplete explanation about the regex =~ operator
Date: Sat, 12 Jan 2019 17:27:38 -0500

On 1/12/19 1:14 AM, kevin wrote:

>>> Moreover, the explanation in the Bash FAQ is unclear; it lacks examples to
>>> know when "an interference" occurred.
>> What is "an interference"?
>>
>>
>>> Look at the following answer to get an overview of the issue:
>>> https://stackoverflow.com/a/12696899
>> That answer is correct: bash uses the C library's regexp library and
>> only guarantees that POSIX EREs work.
>>
> I do not speak English very well.

Your English is fine.

> The Bash FAQ indicates that the shell works differently in a conditional
> expression formed using
> a regular expression. Nonetheless, the Bash FAQ does not give examples to
> get a concrete idea.

I think Greg Wooledge's site has some examples along these lines.

> "In versions of bash prior to bash-3.2, the effect of quoting the regular
> expression argument to the [[ command's =~ operator was not specified. *The
> practical effect* was that double-quoting the pattern argument required
> backslashes to quote special pattern characters, *which interfered with*
> the backslash processing performed by double-quoted word expansion and was
> inconsistent with how the == shell pattern matching operator treated quoted
> characters."
> 
> I do not see the practical effect because I do not find concrete cases (or
> examples). In other words, I do not understand the justification.

The ambiguity is that the backslash is special to both the shell and the
regular expression matching engine. Since double-quoting the pattern
enables backslash processing as part of word expansion, what should a
string like "abc\$" match? That gets passed to the regular expression
engine as "abc$" after being processed by the shell's word expansions.
Since the unquoted $ in the pattern means to anchor the pattern at the
end of the string, it's ambiguous what the user meant. If you use a literal
pattern, you can use single quotes to make your intent clear ('abc\$'),
but if you want some expansion to be performed, you have to experiment
with the correct number of backslashes to use to get the right pattern
passed through to the regexp engine.
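
A minimal sketch of that backslash processing, using only quoting and
printf so it does not depend on any particular bash version. The variable
names dq and sq are just illustrative; the point is what string is left
over after word expansion, since that is what a regexp engine would be
handed.

```shell
#!/usr/bin/env bash
# Double quotes: the shell's word expansion consumes the backslash
# in front of '$', so only "abc$" is left.
dq="abc\$"

# Single quotes: no backslash processing, the backslash survives.
sq='abc\$'

printf 'double-quoted: %s\n' "$dq"   # prints: double-quoted: abc$
printf 'single-quoted: %s\n' "$sq"   # prints: single-quoted: abc\$
```

With the backslash gone, a regexp engine reading abc$ sees an end-of-string
anchor, while abc\$ asks for a literal dollar sign.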

Beginning with bash-3.2, the behavior of =~ is documented to be the same
as ==: quoting any part of the pattern forces it to be matched as a string,
which means characters special to regular expressions have to be quoted
before they are passed to the regexp matching engine. The shell does this
by processing the quoted portions of the pattern and inserting backslashes
to quote special pattern characters.
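
As a rough illustration of the bash-3.2 and later behavior, here is a
sketch that assumes default settings (the compat31 shell option is not
enabled). The variable pat and the test strings are made up for the
example.

```shell
#!/usr/bin/env bash
# Assumes bash 3.2 or later with compat31 not enabled.
pat='abc$'   # hypothetical pattern held in a variable

# Unquoted expansion: '$' reaches the regexp engine as an end anchor.
[[ 'xabc' =~ $pat ]] && anchored=match || anchored=nomatch

# Quoted expansion: the whole value is matched as a string, so '$'
# is a literal dollar sign.
[[ 'xabc' =~ "$pat" ]] && as_string=match || as_string=nomatch
[[ 'abc$' =~ "$pat" ]] && literal=match || literal=nomatch

# Mixed: the quoted part is literal, the unquoted '^' and trailing '$'
# are still regexp anchors.
[[ 'abc$' =~ ^"$pat"$ ]] && mixed=match || mixed=nomatch

printf '%s %s %s %s\n' "$anchored" "$as_string" "$literal" "$mixed"
# prints: match nomatch match match
```

Quoting only the part that should be literal and leaving the regexp
operators unquoted avoids counting backslashes.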

> Finally, the fact that the shell works differently in the mentioned case
> should be indicated in the man page and Texinfo source.

It is. That is the difference. The effect of quoting characters in the
pattern is now specified where it was not in bash-3.1 and earlier versions.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/


