[Octave-bug-tracker] [bug #59992] regexp: behaviour of \> (end of a word

From:	John W. Eaton
Subject:	[Octave-bug-tracker] [bug #59992] regexp: behaviour of \> (end of a word) inconsistent with MATLAB
Date:	Tue, 2 Feb 2021 13:33:44 -0500 (EST)
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Follow-up Comment #1, bug #59992 (project octave):

It looks like Octave is doing what Emacs does:


‘\>’
     matches the empty string, but only at the end of a word.
     ‘\>’ matches at the end of the buffer only if the contents
     end with a word-constituent character.

‘\w’
     matches any word-constituent character.
     The syntax table determines which characters these are.


Matlab, on the other hand, says:


expr\>
      Matches:  The end of a word.
      Example:  '\w*e\>' matches any words ending with e.


The set of word-constituent characters in both Octave and Matlab appears to be
[a-zA-Z_0-9], but I guess Matlab allows an arbitrary character to be treated
as the final character of the word?  Can it be more than one character?  For
example, what do the following expressions do?


[b, e] = regexp ('foo!+bar', '\w+\>')
[b, e] = regexp ('foo?!+bar', 'foo\?!\>')


Is there an easy way to get PCRE to work differently here and allow any
character(s) to be treated specially as the end of a word?
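
To make the two readings concrete, here is a minimal sketch that spells each
one out with explicit PCRE lookarounds instead of relying on \> itself.  The
names emacs_eow and loose_eow are hypothetical, and neither pattern is claimed
to be what Octave or Matlab uses internally; they only illustrate the
difference between "the previous character must be a word character" and "the
next character must not be a word character".  The '?' is escaped so that it
is matched literally.

```octave
% Emacs-style end of word (assumption): the previous character is a word
% character and the next one is not (or the string ends).
emacs_eow = '(?<=\w)(?!\w)';

% Looser reading (assumption): only require that the next character is not
% a word character; the preceding character may be anything.
loose_eow = '(?!\w)';

% Words followed by an end-of-word position: 'foo' and 'bar'.
[s1, e1] = regexp ('foo!+bar', ['\w+' emacs_eow]);   % s1 = [1 6], e1 = [3 8]

% Pattern ending in the non-word characters '?!': the strict reading cannot
% match, because '!' is not a word character.
[s2, e2] = regexp ('foo?!+bar', ['foo\?!' emacs_eow]);   % s2 and e2 are empty

% The looser reading accepts it, because the following '+' is not a word
% character.
[s3, e3] = regexp ('foo?!+bar', ['foo\?!' loose_eow]);   % s3 = 1, e3 = 5
```

If Matlab matches the second expression, its \> would seem to behave more
like the looser form.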

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59992>

