emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: :alnum: broken?


From: Stefan Monnier
Subject: Re: :alnum: broken?
Date: Wed, 26 Feb 2020 10:48:36 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

> Patch attached. It was written for master, but I would suggest it go in 
> emacs-27.

FWIW, you have a +1 from me, tho I don't see any urgency so I'd keep it
for `master`.


        Stefan


Mattias Engdegård [2020-02-26 15:10:46] wrote:

> I just made this very mistake while adding a new regexp-error checking
> feature to xr. Needless to say I now am strongly in favour of turning it
> into a hard error.
>
>
> The error message could be improved. For the benefit of
> isearch-forward-regexp, it's probably a good idea if it doesn't start or end
> in a square bracket.
>
> From 014a7a7dce5ae23b8a47dd68eaaef0a5cb985b46 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <address@hidden>
> Date: Wed, 26 Feb 2020 14:46:01 +0100
> Subject: [PATCH] Signal an error for the regexp "[:alnum:]"
>
> Omitting the extra brackets is a common mistake; see discussion at
> https://lists.gnu.org/archive/html/emacs-devel/2020-02/msg00215.html
>
> * src/regex-emacs.c (reg_errcode_t, re_error_msgid): Add REG_ECLASSBR.
> (regex_compile): Check for the mistake.
> * test/src/regex-emacs-tests.el (regexp-invalid): Test.
> * etc/NEWS: Announce.
> ---
>  etc/NEWS                      |  5 +++++
>  src/regex-emacs.c             | 21 ++++++++++++++++++++-
>  test/src/regex-emacs-tests.el |  4 ++++
>  3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/etc/NEWS b/etc/NEWS
> index 54aab1a5b6..404b4b9ebd 100644
> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -190,6 +190,11 @@ Emacs now supports bignums so this old glitch is no 
> longer needed.
>  'previous-system-time-locale' have been removed, as they were created
>  by mistake and were not useful to Lisp code.
>  
> +** The regexp mistake '[:digit:]' is now an error.
> +The correct syntax is '[[:digit:]]'.  Previously, forgetting the extra
> +brackets silently resulted in a regexp that did not at all work as
> +intended.
> +
>
>  * Lisp Changes in Emacs 28.1
>  
> diff --git a/src/regex-emacs.c b/src/regex-emacs.c
> index 694431c95e..2648e1d6ae 100644
> --- a/src/regex-emacs.c
> +++ b/src/regex-emacs.c
> @@ -818,7 +818,8 @@ print_double_string (re_char *where, re_char *string1, 
> ptrdiff_t size1,
>    REG_ESIZE,         /* Compiled pattern bigger than 2^16 bytes.  */
>    REG_ERPAREN,               /* Unmatched ) or \); not returned from 
> regcomp.  */
>    REG_ERANGEX,               /* Range striding over charsets.  */
> -  REG_ESIZEBR           /* n or m too big in \{n,m\} */
> +  REG_ESIZEBR,          /* n or m too big in \{n,m\} */
> +  REG_ECLASSBR,         /* Missing [] around [:class:].  */
>  } reg_errcode_t;
>  
>  static const char *re_error_msgid[] =
> @@ -842,6 +843,7 @@ print_double_string (re_char *where, re_char *string1, 
> ptrdiff_t size1,
>     [REG_ERPAREN] = "Unmatched ) or \\)",
>     [REG_ERANGEX ] = "Range striding over charsets",
>     [REG_ESIZEBR ] = "Invalid content of \\{\\}",
> +   [REG_ECLASSBR] = "Class syntax is [[:digit:]], not [:digit:]",
>    };
>  
>  /* For 'regs_allocated'.  */
> @@ -2000,6 +2002,23 @@ regex_compile (re_char *pattern, ptrdiff_t size,
>  
>           laststart = b;
>  
> +            /* Check for the mistake of forgetting the extra square brackets,
> +               as in "[:alpha:]".  */
> +            if (*p == ':')
> +              {
> +                re_char *q = p + 1;
> +                while (q != pend && *q != ']')
> +                  {
> +                    if (*q == ':')
> +                      {
> +                        if (q + 1 != pend && q[1] == ']' && q > p + 1)
> +                          FREE_STACK_RETURN (REG_ECLASSBR);
> +                        break;
> +                      }
> +                    q++;
> +                  }
> +              }
> +
>           /* Test '*p == '^' twice, instead of using an if
>              statement, so we need only one BUF_PUSH.  */
>           BUF_PUSH (*p == '^' ? charset_not : charset);
> diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
> index f9372e37b1..d268b97080 100644
> --- a/test/src/regex-emacs-tests.el
> +++ b/test/src/regex-emacs-tests.el
> @@ -803,4 +803,8 @@ regexp-multibyte-unibyte
>    (should-not (string-match "å" "\xe5"))
>    (should-not (string-match "[å]" "\xe5")))
>  
> +(ert-deftest regexp-invalid ()
> +  (should-error (string-match "[:space:]" "")
> +                :type 'invalid-regexp))
> +
>  ;;; regex-emacs-tests.el ends here




reply via email to

[Prev in Thread] Current Thread [Next in Thread]