Re: Case-insensitive regular expressions
From: Shawn Wagner
Subject: Re: Case-insensitive regular expressions
Date: Tue, 6 Jul 2021 19:09:57 -0700
Yeah, that's definitely the easiest to add while still being flexible.
Patch for anyone else who cares:
diff --git a/doc/ed.texi b/doc/ed.texi
index f654616..7f0ba6c 100644
--- a/doc/ed.texi
+++ b/doc/ed.texi
@@ -699,6 +699,8 @@ Matches any character not in a word.
@end table
+In addition, as a GNU @command{ed} extension, if the regular
+expression starts with @samp{(?i)}, it is matched case-insensitively.
@node Commands
@chapter Commands
diff --git a/regex.c b/regex.c
index 3f88966..1c2d498 100644
--- a/regex.c
+++ b/regex.c
@@ -114,6 +114,7 @@ static regex_t * get_compiled_regex( const char ** const ibufpp,
const char * pat;
const char delimiter = **ibufpp;
int n;
+ int re_flags = 0;
if( delimiter == ' ' ) { set_error_msg( inv_pat_del ); return 0; }
if( delimiter == '\n' || *++*ibufpp == delimiter ||
@@ -129,7 +130,13 @@ static regex_t * get_compiled_regex( const char ** const ibufpp,
/* exp compiled && not copied */
if( exp && exp != subst_regex_ ) regfree( exp );
else exp = ( &store[0] != subst_regex_ ) ? &store[0] : &store[1];
- n = regcomp( exp, pat, extended_regexp() ? REG_EXTENDED : 0 );
+ if( extended_regexp() ) re_flags |= REG_EXTENDED;
+ if( strncmp(pat, "(?i)", 4) == 0 )
+ {
+ re_flags |= REG_ICASE;
+ pat += 4;
+ }
+ n = regcomp( exp, pat, re_flags );
if( n )
{
char buf[80];
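
For anyone who wants to try the prefix check outside of ed, here is a minimal standalone sketch of the same idea using only POSIX <regex.h>. The helper names compile_with_icase_prefix() and matches() are made up for illustration and are not part of GNU ed; the flag handling mirrors the patch above, with REG_NOSUB added because the sketch only reports match or no match.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not in GNU ed): compile 'pat', treating a leading
   "(?i)" as a request for case-insensitive matching, as the patch does.
   Returns the regcomp() result, which is 0 on success. */
int compile_with_icase_prefix( regex_t * const exp, const char * pat,
                               const int extended )
  {
  int re_flags = REG_NOSUB;     /* sketch only needs match / no match */

  if( extended ) re_flags |= REG_EXTENDED;
  if( strncmp( pat, "(?i)", 4 ) == 0 )
    {
    re_flags |= REG_ICASE;
    pat += 4;                   /* skip the prefix before calling regcomp */
    }
  return regcomp( exp, pat, re_flags );
  }

/* Hypothetical helper: 1 if 'text' matches 'pat', 0 if it does not,
   -1 if the pattern fails to compile. */
int matches( const char * const pat, const char * const text,
             const int extended )
  {
  regex_t re;
  int r;

  if( compile_with_icase_prefix( &re, pat, extended ) != 0 ) return -1;
  r = ( regexec( &re, text, 0, 0, 0 ) == 0 );
  regfree( &re );
  return r;
  }
```

Because the prefix is only recognized at offset 0, a "(?i)" later in the pattern is passed to regcomp() unchanged and is interpreted under the usual basic or extended RE rules.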
On Tue, Jul 6, 2021 at 6:33 PM Shawn Wagner <shawnw.mobile@gmail.com> wrote:
>
> 4. Adopt the vim convention that \c anywhere in a regex makes it
> case-insensitive. Clean and fully isolated from the rest.
>
> Going that route, I'd rather lift a page from perl & co and indicate it by
> looking for (?i) at the very beginning of a regular expression, and then just
> skipping over those characters when passing the RE to regcomp().
>
> Looking through the GNU ed source, that might be the easiest approach.
> Checking for a flag after the RE before it's compiled is looking a bit
> complicated to add.
>
>
> On Tue, Jul 6, 2021 at 4:12 PM John Cowan <cowan@ccil.org> wrote:
>>
>>
>>
>> On Tue, Jul 6, 2021 at 4:38 PM Shawn Wagner <shawnw.mobile@gmail.com> wrote:
>>
>>> 1. A command line switch to match all REs that way. Easiest to add, but
>>> then what if you want some patterns to be sensitive and some not in the
>>> same session? It's just the reverse of the current form.
>>
>>
>> I think that would be Very Bad.
>>>
>>> 2. Adding an I flag to REs in address ranges, s, g, etc. (I instead of i to
>>> match GNU sed and for the same reason - to avoid ambiguity with the
>>> existing i command).
>>
>>
>> A possibility.
>>>
>>> 3. Take a page from GNU grep and add support for using PCRE2 regular
>>> expression engine via ed -P,
>>
>>
>> Certainly provides the most function, but may end up with an editor whose
>> regex package is larger than the rest of it!
>>
>> 4. Adopt the vim convention that \c anywhere in a regex makes it
>> case-insensitive. Clean and fully isolated from the rest.
>>
>> Implementation notes: (a) You need to foldcase both the regex (before it is
>> compiled) and the text being matched; (b) Unicode case folding has a bunch
>> of special cases.
>>
>>
>>
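
For comparison, option 4 (the vim \c convention) could be sketched roughly as below. This is an illustration only, not code from GNU ed or vim: strip_vim_icase() and matches_vim() are hypothetical names, the sketch leaves the case folding itself to regcomp() via REG_ICASE rather than folding the regex and text by hand, and it does not treat bracket expressions specially, so a \c inside [...] would also be stripped.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of 'pat' with every "\c"
   marker removed, and set *icase to 1 if a marker was seen. Other
   backslash pairs are copied as a unit, so "\\c" (an escaped backslash
   followed by 'c') is left alone. Bracket expressions are NOT handled. */
char * strip_vim_icase( const char * pat, int * const icase )
  {
  char * const out = malloc( strlen( pat ) + 1 );
  char * o = out;

  *icase = 0;
  if( !out ) return 0;
  while( *pat )
    {
    if( pat[0] == '\\' && pat[1] == 'c' )
      { *icase = 1; pat += 2; continue; }
    if( pat[0] == '\\' && pat[1] ) *o++ = *pat++;  /* keep escape pair whole */
    *o++ = *pat++;
    }
  *o = 0;
  return out;
  }

/* Hypothetical helper: 1 on match, 0 on no match, -1 on error.
   Uses basic regular expressions only. */
int matches_vim( const char * const pat, const char * const text )
  {
  regex_t re;
  int icase, r;
  char * const clean = strip_vim_icase( pat, &icase );

  if( !clean ) return -1;
  r = regcomp( &re, clean, REG_NOSUB | ( icase ? REG_ICASE : 0 ) );
  free( clean );
  if( r != 0 ) return -1;
  r = ( regexec( &re, text, 0, 0, 0 ) == 0 );
  regfree( &re );
  return r;
  }
```

Compared with the (?i) prefix check, this needs a full scan and a copy of the pattern, which is part of why the prefix approach is the smaller change.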
Attachment: case_insensitive_re.patch