Re: Accepting [xyz---abc] - three minus signs to mean one
From: Paul Eggert
Subject: Re: Accepting [xyz---abc] - three minus signs to mean one
Date: Thu, 21 Apr 2022 19:08:55 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0
On 4/21/22 00:57, Arnold Robbins wrote:
> As far as my testing indicates, dfa.c doesn't need a patch, it seems
> to accept "---" inside brackets for a single minus.

Yes, a brief perusal of the dfa.c source code suggests you're right.
Thanks for looking into this. I tend to agree with you that POSIX is not
likely to outlaw this extension.

> If there are no objections, can we get this into Gnulib?

Although the basic idea looks good, I see a few places where the patch
can be improved.
* The two calls to re_string_peek_byte might go past the end of the
pattern (a subscript violation). This is possible because the pattern is
not necessarily null-terminated.
* The two calls to re_string_fetch_byte can be simplified into a single
call to re_string_skip_bytes.
* No need to assign to token->opr.c, as it already has the correct value.
* Can fall through to the default case to save a bit of duplicate code.
* glibc still uses comments /* like this */ for style reasons, and we
should stick to that.
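
To illustrate the first two points, here is a minimal sketch of a
length-checked lookahead followed by a single skip. It is not the
installed patch: struct pattern_cursor, bytes_remain and
collapse_triple_minus are hypothetical names made up for this example,
standing in for the real re_string_t machinery, and the sketch only
assumes the pattern is a byte buffer with an explicit length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a length-delimited pattern cursor; this is
   not the real re_string_t from the regex sources.  */
struct pattern_cursor
{
  const char *buf;   /* pattern bytes, not necessarily null-terminated */
  size_t len;        /* number of valid bytes in buf */
  size_t idx;        /* current read position, idx <= len */
};

/* Return true if at least N bytes remain at or after the cursor.  */
static bool
bytes_remain (const struct pattern_cursor *p, size_t n)
{
  assert (p->idx <= p->len);
  return n <= p->len - p->idx;
}

/* If the cursor is at "---", consume the first two bytes so that the
   caller sees a single '-', and return true.  Otherwise leave the
   cursor alone and return false.  */
static bool
collapse_triple_minus (struct pattern_cursor *p)
{
  /* Check the length before examining any byte, so the lookahead
     never reads past the end of the buffer.  */
  if (bytes_remain (p, 3)
      && p->buf[p->idx] == '-'
      && p->buf[p->idx + 1] == '-'
      && p->buf[p->idx + 2] == '-')
    {
      p->idx += 2;   /* one skip instead of two single-byte fetches */
      return true;
    }
  return false;
}
```

The length test comes first in the && chain, so the byte comparisons are
evaluated only when all three subscripts are known to be in range.
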
I wrote a patch with these improvements in mind and installed it into
Gnulib (see attached); hope it works for Gawk too.
Attachment: 0001-regex-match-.-.-like-V7-grep.patch
Description: Text Data