From: John W. Eaton
Subject: [Octave-bug-tracker] [bug #59992] regexp: behaviour of \> (end of a word) inconsistent with MATLAB
Date: Tue, 2 Feb 2021 13:33:44 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Follow-up Comment #1, bug #59992 (project octave):

It looks like Octave is doing what Emacs does:


‘\>’
     matches the empty string, but only at the end of a word.
     ‘\>’ matches at the end of the buffer only if the contents
     end with a word-constituent character.

‘\w’
     matches any word-constituent character.
     The syntax table determines which characters these are.


While Matlab says


expr\>
      Matches:  The end of a word.
      Example:  '\w*e\>' matches any words ending with e.


The set of word-constituent characters in both Octave and Matlab appears to be
[a-zA-Z_0-9], but I guess Matlab allows an arbitrary character to be treated
as the final character of the word?  Can there be more than one such
character?  For example, what do the following expressions return?


[b, e] = regexp ('foo!+bar', '\w+\>')
[b, e] = regexp ('foo?!+bar', 'foo\?!\>')  % '?' escaped so it is matched literally
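
For comparison, the Emacs definition quoted above can be written out with
explicit lookarounds: the previous character must be a word constituent and
the next one must not be (or the subject must end there).  The following is
only a sketch of that reading, assuming word constituents are exactly what \w
matches; it is not a statement about how Octave implements \> internally, and
the name end_of_word is purely illustrative.  The '?' in the second pattern is
escaped so that it is matched literally.

```octave
% Emacs-style end of word spelled out with lookarounds (illustrative only).
end_of_word = '(?<=\w)(?!\w)';

% Both words end where a word character is followed by a non-word
% character or by the end of the subject.
[b1, e1] = regexp ('foo!+bar', ['\w+' end_of_word])      % b1 = 1 6, e1 = 3 8

% Here the character before the anchor is '!', which is not a word
% constituent, so under this reading there is no match.
[b2, e2] = regexp ('foo?!+bar', ['foo\?!' end_of_word])  % b2 and e2 are empty
```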


Is there an easy way to get PCRE to work differently here and allow any
character(s) to be treated specially as the end of a word?
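
One way to experiment with a looser rule through PCRE is to drop the
requirement on the preceding character and keep only a negative lookahead.
This is a guess at what a more permissive \> could mean, not a description of
what Matlab actually does; loose_end is an illustrative name.

```octave
% Looser end-of-word test: only require that the next character is not a
% word constituent (or that the subject ends).  Hypothetical semantics.
loose_end = '(?!\w)';

[b1, e1] = regexp ('foo!+bar', ['\w+' loose_end])      % b1 = 1 6, e1 = 3 8
[b2, e2] = regexp ('foo?!+bar', ['foo\?!' loose_end])  % b2 = 1, e2 = 5
```

With this variant the second expression matches all of 'foo?!', because the
character that follows it, '+', is not a word constituent.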

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59992>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



