bug-ed

Re: Case-insensitive regular expressions


From: Shawn Wagner
Subject: Re: Case-insensitive regular expressions
Date: Tue, 6 Jul 2021 19:09:57 -0700

Yeah, that's definitely the easiest to add while still being flexible.
Patch for anyone else who cares:

diff --git a/doc/ed.texi b/doc/ed.texi
index f654616..7f0ba6c 100644
--- a/doc/ed.texi
+++ b/doc/ed.texi
@@ -699,6 +699,8 @@ Matches any character not in a word.

 @end table

+In addition, as a GNU @command{ed} extension, if the regular
+expression starts with @samp{(?i)}, it is matched case-insensitively.

 @node Commands
 @chapter Commands
diff --git a/regex.c b/regex.c
index 3f88966..1c2d498 100644
--- a/regex.c
+++ b/regex.c
@@ -114,6 +114,7 @@ static regex_t * get_compiled_regex( const char ** const ibufpp,
   const char * pat;
   const char delimiter = **ibufpp;
   int n;
+  int re_flags = 0;

   if( delimiter == ' ' ) { set_error_msg( inv_pat_del ); return 0; }
   if( delimiter == '\n' || *++*ibufpp == delimiter ||
@@ -129,7 +130,13 @@ static regex_t * get_compiled_regex( const char ** const ibufpp,
   /* exp compiled && not copied */
   if( exp && exp != subst_regex_ ) regfree( exp );
   else exp = ( &store[0] != subst_regex_ ) ? &store[0] : &store[1];
-  n = regcomp( exp, pat, extended_regexp() ? REG_EXTENDED : 0 );
+  if( extended_regexp() ) re_flags |= REG_EXTENDED;
+  if( strncmp(pat, "(?i)", 4) == 0 )
+    {
+      re_flags |= REG_ICASE;
+      pat += 4;
+    }
+  n = regcomp( exp, pat, re_flags );
   if( n )
     {
     char buf[80];
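
For anyone who wants to try the idea outside of ed, here is a minimal standalone sketch of the same prefix check. The helper names compile_with_icase_prefix() and matches() are made up for illustration and are not part of GNU ed; only regcomp(), regexec(), regfree() and the REG_* flags come from POSIX <regex.h>.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of GNU ed: compile 'pat', treating a
   leading "(?i)" as a request for case-insensitive matching.
   Returns the regcomp() result (0 on success). */
static int compile_with_icase_prefix( regex_t * const exp, const char * pat,
                                      const int extended )
  {
  int re_flags = REG_NOSUB;		/* only match/no-match is needed here */

  if( extended ) re_flags |= REG_EXTENDED;
  if( strncmp( pat, "(?i)", 4 ) == 0 )
    {
    re_flags |= REG_ICASE;
    pat += 4;				/* hide the prefix from regcomp() */
    }
  return regcomp( exp, pat, re_flags );
  }

/* Returns 1 if 'text' matches 'pat', 0 if it does not, and -1 if 'pat'
   fails to compile. Basic (non-extended) syntax is assumed. */
static int matches( const char * const pat, const char * const text )
  {
  regex_t re;
  int found;

  if( compile_with_icase_prefix( &re, pat, 0 ) != 0 ) return -1;
  found = ( regexec( &re, text, 0, 0, 0 ) == 0 );
  regfree( &re );
  return found;
  }
```

With this, matches( "(?i)hello", "Say HELLO" ) should report a match while matches( "hello", "Say HELLO" ) should not. The prefix is only recognized at the very start of the pattern, as in the patch.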

On Tue, Jul 6, 2021 at 6:33 PM Shawn Wagner <shawnw.mobile@gmail.com> wrote:
>
>> 4. Adopt the vim convention that \c anywhere in a regex makes it
>> case-insensitive.  Clean and fully isolated from the rest.
>
> Going that route, I'd rather lift a page from perl & co and indicate it by 
> looking for (?i) at the very beginning of a regular expression, and then just 
> skipping over those characters when passing the RE to regcomp().
>
> Looking through the GNU ed source, that might be the easiest approach. 
> Checking for a flag after the RE before it's compiled is looking a bit 
> complicated to add.
>
>
> On Tue, Jul 6, 2021 at 4:12 PM John Cowan <cowan@ccil.org> wrote:
>>
>>
>>
>> On Tue, Jul 6, 2021 at 4:38 PM Shawn Wagner <shawnw.mobile@gmail.com> wrote:
>>
>>> 1. A command line switch to match all REs that way. Easiest to add, but
>>> then what if you want some patterns to be sensitive and some not in the
>>> same session? It's just the reverse of the current form.
>>
>>
>> I think that would be Very Bad.
>>>
>>> 2. Adding an I flag to REs in address ranges, s, g, etc. (I instead of i to
>>> match GNU sed and for the same reason - to avoid ambiguity with the
>>> existing i command).
>>
>>
>> A possibility.
>>>
>>> 3. Take a page from GNU grep and add support for using PCRE2 regular
>>> expression engine via ed -P,
>>
>>
>> Certainly provides the most function, but may end up with an editor whose 
>> regex package is larger than the rest of it!
>>
>> 4. Adopt the vim convention that \c anywhere in a regex makes it 
>> case-insensitive.  Clean and fully isolated from the rest.
>>
>> Implementation notes:  (a) You need to foldcase both the regex (before it is 
>> compiled) and the text being matched; (b) Unicode case folding has a bunch 
>> of special cases.
>>
>>
>>
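
For comparison, option 4 quoted above (vim's \c anywhere in the pattern) could be sketched roughly as follows. This is only an illustration under a few assumptions: strip_vim_icase() and compile_vim_icase() are hypothetical names, not ed or vim functions, and the sketch relies on REG_ICASE instead of the manual case folding mentioned in the implementation notes.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy 'pat' into a new buffer with every "\c"
   removed. Sets *icase to 1 if at least one "\c" was found, else 0.
   Other backslash escapes (including "\\") are copied as pairs so that
   "\\c" is not misread as "\c".
   Limitation: a backslash inside a bracket expression is handled the
   same way, although POSIX takes it literally there.
   Returns a malloc'ed string the caller must free, or 0 on failure. */
static char * strip_vim_icase( const char * pat, int * const icase )
  {
  char * const out = malloc( strlen( pat ) + 1 );
  char * p = out;

  *icase = 0;
  if( !out ) return 0;
  while( *pat )
    {
    if( pat[0] == '\\' && pat[1] == 'c' ) { *icase = 1; pat += 2; }
    else if( pat[0] == '\\' && pat[1] ) { *p++ = *pat++; *p++ = *pat++; }
    else *p++ = *pat++;
    }
  *p = 0;
  return out;
  }

/* Hypothetical helper: compile 'pat' after removing any "\c" markers.
   Returns the regcomp() result, or REG_ESPACE if out of memory. */
static int compile_vim_icase( regex_t * const exp, const char * const pat,
                              const int extended )
  {
  int icase, n;
  char * const clean = strip_vim_icase( pat, &icase );

  if( !clean ) return REG_ESPACE;
  n = regcomp( exp, clean,
               ( extended ? REG_EXTENDED : 0 ) | ( icase ? REG_ICASE : 0 ) );
  free( clean );
  return n;
  }
```

Unlike the (?i) prefix check, this needs a scan over the whole pattern and a scratch copy, which is part of why the prefix approach is less code to add.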

Attachment: case_insensitive_re.patch
Description: Binary data

