Re: built-in regex matches wrong character

bug-bash

From:	Eric Blake
Subject:	Re: built-in regex matches wrong character
Date:	Thu, 6 Sep 2018 09:23:33 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 09/06/2018 09:17 AM, Chet Ramey wrote:
> On 9/5/18 4:39 PM, Eric Blake wrote:
>
>> Or, you can use bash's 'shopt -s globasciiranges' which is
>> supposed to enable Rational Range Interpretation, where even in non-C
>> locales, a character range bounded by two ASCII characters takes on the C
>> locale definition of only the ASCII characters in that range, rather than
>> the locale's definition of whatever other characters might also be
>> equivalent (actually, while I know that shopt affects globbing, I don't
>> know if it also affects regex matching - but if it doesn't, that's probably
>> a bug that should be fixed).
>
> Since bash uses the C library's regexp engine, and most C libraries don't
> implement RRI, much less expose it as a flags option available via
> regcomp(), there's no reason to expect that globasciiranges would have
> any effect on regular expression matching.

But bash could be taught to convert any regex that contains a range with
both endpoints ASCII into a different bracket expression before handing
things over to regcomp(). That is, if the user is matching against
[a-d], bash hands [abcd] to regcomp() instead. You don't need a flag in
regcomp() to get RRI, just some pre-processing (and often memory
allocation, as the expansion of a range into a non-range tends to
require more characters).
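
To make the pre-processing idea concrete, here is a rough sketch of it
written as a shell function rather than as a patch to bash itself. The
names ascii_class and expand_ascii_ranges are made up for this example and
exist nowhere in bash. As a simplifying assumption, it only expands ranges
whose endpoints are both ASCII digits, both uppercase, or both lowercase
letters, so that the expansion can never produce a character that is
special inside a bracket expression; anything else (such as [A-z]) is
passed through untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of bash.

# Set REPLY to d, u or l when $1 is an ASCII digit, uppercase or lowercase
# letter, else to the empty string. Compares code points, never ranges.
ascii_class() {
    local code
    REPLY=''
    [[ -n $1 ]] || return 0
    printf -v code '%d' "'$1"
    if   (( code >= 48 && code <= 57 ));  then REPLY=d
    elif (( code >= 65 && code <= 90 ));  then REPLY=u
    elif (( code >= 97 && code <= 122 )); then REPLY=l
    fi
}

# Print regex $1 with simple ASCII ranges inside bracket expressions
# replaced by an explicit list of characters, e.g. [a-d] becomes [abcd].
expand_ascii_ranges() {
    local re=$1 out='' c lo hi cls ch delim rest body tok
    local -i i=0 n=${#re} k from to
    while (( i < n )); do
        c=${re:i:1}
        if [[ $c == '\' ]]; then          # escaped character outside brackets
            out+=${re:i:2}; (( i += 2 )); continue
        fi
        out+=$c; (( i += 1 ))
        [[ $c == '[' ]] || continue
        # Now inside a bracket expression: copy a leading ^ and a literal ].
        if [[ ${re:i:1} == '^' ]]; then out+='^'; (( i += 1 )); fi
        if [[ ${re:i:1} == ']' ]]; then out+=']'; (( i += 1 )); fi
        while (( i < n )) && [[ ${re:i:1} != ']' ]]; do
            c=${re:i:1}
            # Copy [:class:], [.coll.] and [=equiv=] through unchanged.
            if [[ $c == '[' && ${re:i+1:1} == [.:=] ]]; then
                delim=${re:i+1:1}; rest=${re:i+2}
                if [[ $rest == *"$delim]"* ]]; then
                    body=${rest%%"$delim]"*}
                    tok="[$delim$body$delim]"
                    out+=$tok; (( i += ${#tok} )); continue
                fi
            fi
            lo=$c; hi=${re:i+2:1}
            ascii_class "$lo"; cls=$REPLY
            ascii_class "$hi"
            printf -v from '%d' "'$lo"
            printf -v to '%d' "'$hi"
            if [[ ${re:i+1:1} == '-' && -n $cls && $cls == "$REPLY" ]] &&
               (( from <= to )); then
                # Emit every character from lo to hi by code point.
                for (( k = from; k <= to; k++ )); do
                    printf -v ch "\\$(printf '%03o' "$k")"
                    out+=$ch
                done
                (( i += 3 ))
            else
                out+=$c; (( i += 1 ))     # not a range we handle: copy as-is
            fi
        done
    done
    printf '%s\n' "$out"
}
```

A caller would then match against the rewritten pattern, for example
re=$(expand_ascii_ranges '^[a-d]+$') followed by [[ $string =~ $re ]].
Note how the output is longer than the input, which is the memory
allocation cost mentioned above. A real implementation inside bash would
also need to cover ranges that span punctuation, which this sketch
deliberately leaves alone.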


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
