bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#54470: 29.0.50; [PATCH] Add documentation/tests for Eshell argument


From: Jim Porter
Subject: bug#54470: 29.0.50; [PATCH] Add documentation/tests for Eshell argument expansion
Date: Sun, 20 Mar 2022 13:57:26 -0700

On 3/20/2022 12:05 AM, Eli Zaretskii wrote:
Thank you for working on this.  See some minor comments below.

Thanks for the thorough review.

+Eshell's globbing syntax is very similar to that of Zsh
+(@pxref{Filename Generation, , , zsh, The Z Shell Manual}).  Users
+coming from Bash can still use Bash-style globbing, as there are no
+incompatibilities.  To customize the syntax and behavior of globbing
+in Eshell see the Customize@footnote{@xref{Easy Customization, , ,
+emacs, The GNU Emacs Manual}.} group ``eshell-glob''.

Let's not have hyper-links in footnotes, it makes little sense to me.
(Yes, I know this was in the original text.)

Ok, fixed.

+@table @code
+
+@item *
+Matches any string (including the empty string).  For example,
+@samp{*.el} matches any file with the @file{.el} extension.

You use @code in the @table, but @samp in the body, which will look
inconsistent in the printed version of the manual.  Please use one of
them (I think @samp is better).

Done. I only did this for the glob section though. Should I change the items in the predicates/modifiers to use @samp too? They're different enough that I'm not quite sure.

Or would @kbd be better to use here? These are things meant to be typed by the user into an interactive prompt, after all...

+@item **
+Matches any number of subdirectories in a file name, including zero.

The "including zero" part is confusing: it isn't immediately clear
whether it refers to "file name" or "any number".  I'd use "Matches
zero or more subdirectories..." instead.

Fixed.

+For example, @samp{**/foo.el} matches @file{foo.el},
+@file{bar/foo.el}, @file{bar/baz/foo.el}, etc.

The fact that the "zero" case removes the slash as well (if it does)
should be mentioned explicitly in the text, I think.  It is importand
in cases like foo/**/bar (which perhaps should have an example to make
the feature more clear).

I've updated the item to be '**/', which is more accurate, since the trailing '/' is really a part of the token (the Zsh manual also documents it this way). I also added a short note that the '**/' must be the entirety of a file name segment (so something like 'foo**/bar.el' is nonsense).

+@item ***
+As @code{**}, but follows symlinks as well.
    ^^
"Like" is better, I think.

Fixed.

+@item [ @dots{} ]
+Defines a @dfn{character set} (@pxref{Regexps, , , emacs, The GNU
+Emacs Manual}).

Every @dfn should have a @cindex entry for the term.  In this case,
the term should be qualified, since "character set" is used in many
different contexts.  Something like

   @cindex character set, in Eshell glob patterns

Fixed.

                     A character set matches any character between
+@samp{[} and @samp{]}

This is ambiguous: "between [ and ]" could be interpreted as
characters that are between those in the alphabetical order.  I'd
follow the description in the Emacs manual, which says "characters
between the two brackets".

Fixed.

             You can also include ranges of characters in the set by
+separating the start and end with @samp{-}.  Thus, @samp{[a-z]}
+matches any lower-case @acronym{ASCII} letter.

It might be a good idea to mention here that, unlike with Zsh,
character ranges are interpreted in the Unicode codepoint order, not
in the locale-dependent collation order.  This affects stuff like
[a-z].  Also, does case-fold-search have any effect on these matches?

Done. case-fold-search doesn't have any effect, but eshell-glob-case-insensitive does. I've added a bit of documentation about that to the manual.

+Additionally, you can include @dfn{character classes} in a character

Another @dfn without an index entry..

Fixed.

+@item [^ @dots{} ]
+Defines a @dfn{complemented character set}.  This behaves just like a

And another.

Fixed here (and for @dfn{group} just below this).

+(ert-deftest em-glob-test/match-recursive ()
+  "Test that \"**\" recursively matches directories."
+  (with-fake-files '("a.el" "b.el" "ccc.el" "d.txt" "dir/a.el" "dir/sub/a.el"
+                     "dir/symlink/a.el" "symlink/a.el" "symlink/sub/a.el")
+    (should (equal (eshell-extended-glob "**/a.el")
+                   '("a.el" "dir/a.el" "dir/sub/a.el")))))
+
+(ert-deftest em-glob-test/match-recursive-follow-symlinks ()
+  "Test that \"***\" recursively matches directories, following symlinks."
+  (with-fake-files '("a.el" "b.el" "ccc.el" "d.txt" "dir/a.el" "dir/sub/a.el"
+                     "dir/symlink/a.el" "symlink/a.el" "symlink/sub/a.el")
+    (should (equal (eshell-extended-glob "***/a.el")
+                   '("a.el" "dir/a.el" "dir/sub/a.el" "dir/symlink/a.el"
+                     "symlink/a.el" "symlink/sub/a.el")))))

I think symlink tests should be skipped on MS-Windows, at least by
default (with perhaps some Make variable to activate them?).  Creating
symlinks on Windows requires elevation of privileges, and causes the
system to stop and pop up the elevation confirmation dialog; for some
users it will fail even when enabled.

These tests (and the ones in em-pred-tests.el) don't actually touch the filesystem. I just provided mock implementations of the relevant Elisp functions so that the tests can pretend that the files corresponding to these names really exist. That's a bit less realistic, but it should work on all systems without errors. It should work fine on MS-Windows, unless there are other bugs present.

That said, if you'd prefer a different implementation (e.g. one that examines real files) or just more documentation about what I'm doing here, let me know.

+@node Argument Predication
+@section Argument Predication and Modification
+Eshell supports @dfn{argument predication}, to filter elements of a
+glob, and @dfn{argument modification}, to manipulate argument values.
+These are similar to glob qualifiers in Zsh (@pxref{Glob Qualifiers, ,
+, zsh, The Z Shell Manual}).

Another place where index entries are needed.

Done.

+
+Predicates and modifiers are introduced with @code{( @dots{} )} after
+any list argument, where @code{@dots{}} is a list of predicates or
+modifiers.

Instead of using @dots{], which lacks any semantics here, I would
suggest to use @code{(@var{filters})}.

Done.

+To customize the syntax and behavior of predicates and modifiers in
+Eshell see the Customize@footnote{@xref{Easy Customization, , , emacs,
+The GNU Emacs Manual}.} group ``eshell-pred''.

Again, please move this cross-reference from the footnote to the body.

Done.

+@subsection Argument Predicates

I'd prefer not to have @subsections without nodes.  If you think a
@node is not appropriate for some reason, please use @subheading
instead.

Done, added nodes for these.

+The @code{^} and @code{-} operators are not argument predicates
+themselves, but they modify the behavior of all subsequent predicates.
+@code{^} inverts the meaning of subsequent predicates, so
+@samp{*(^RWX)} expands to all files whose permissions disallow the
+world from accessing them in any way (i.e., reading, writing to, or
+modifying them).  When examining a symbolic link, @code{-} applies the
+subsequent predicates to the link's target instead of the link itself.

This is better moved to after the table of predicates.

Done.

+@table @asis

All the @items use @code, so "@table @code" is better, and then you
can drop @code in the @items.

This is because there's a single item that doesn't just use @code:

  @item @code{.} (Period)

I just lifted that convention from the "Syntax of Regular Expressions" section in the Emacs manual. I think it helps disambiguate what that character is, since a lone "." on a line is a bit confusing.

Note: I changed this slightly in the latest patch to add '@r{}' around '(Period)', matching the regexp section.

+If @var{file} is specified instead, compare against the modification
+time of @file{file}.  Thus, @samp{a-'hello.txt'} matches all files
+accessed after @file{hello.txt} was last accessed.

The use of quotes 'like this', here and elsewhere in a similar
context, begs the question: how to specify names that have embedded
single-quote characters in them?

"Very carefully." :)

Seriously though, this is an area I don't fully understand yet, but in which I've found several bugs (or at least I think they're bugs). As such, I intentionally avoided documenting this since it's pretty confusing. To answer your specific question, you could type:

  *(a-"hello'there.txt")

However, the quoting rules are inconsistent. For example, this is how you normally insert a single quote inside a single-quoted string in Eshell:

  ~ $ echo 'hi''there'
  hi'there

That doesn't work inside predicates though, and it would treat the file name as "hi''there".

Relatedly, the allowed delimiters aren't consistent. a/m/c/u/g all allow *any* non-digit character as a delimiter, so 'a-XfooX' compares against the file 'foo'. They also use "balanced" bracket pairs, so 'a-<foo>' means the same thing.

:s/:gs/:i/:x allow any character as the delimiter (including digits), but don't use "balanced" bracket pairs, so you could type ':s<pattern<repl<' or ':i<pattern<'.

:j/:S only allow ' and / as the delimiters.

I think this should be fixed, but it'll take a fair bit of work, and part of the reason for my filing this bug was to lay the foundation of unit tests so that I could adjust the escaping logic without regressing anything.

+@item e
+Treating the value as a file name, gets the file extension.

What is considered the extension in 'foo.bar.baz'?

It's 'baz'. I've expanded on this, explaining that it returns the final extension sans dot. I also added examples for this and all the other file name modifiers ('h'/head, 't'/tail, and 'r'/root).

+@item q
+Marks that the value should be interpreted by Eshell literally.

What does "literally" mean here?

Added an explanation for this (it means that any special characters like '$' lose their meaning and are treated literally). That said, I'm not 100% sure when this would be useful, since I couldn't figure out a time where this would change how a command is executed. I'm probably just missing something though.

+@item S
+@item S/@var{pattern}/
+Splits the value by the regular expression @var{pattern}.  If
+@var{pattern} is omitted, split on spaces.

"Split the value by regexp" doesn't explain itself.  How about "split
the value using the regular expression @var{pattern} as delimiters"?

Fixed.

+@item j
+@item j/@var{delim}/
+Joins a list of values, separated by the string @var{delim}.  If
+@var{delim} is omitted, use a single space as the delimiter.

And if DELIM is not omitted, what should it be? a regexp?

A string. I mentioned this in the first sentence, but I think it was a bit too brief. I've reworded this to be more explicit.

+@item o
+Sorts a list of strings in ascending lexicographic order.
+
+@item O
+Sorts a list of strings in descending lexicographic order.

This should clarify what is considered the lexicographic order here.
Given the usual dependence on the locale, this is not self-evident.

Fixed, and added a cross-reference to the corresponding section of the Elisp manual. It would have been nice to link to the function itself: 'string<', though I'm not sure how to do that (or if that's something I shouldn't do).

Attachment: 0001-Add-unit-tests-and-documentation-for-Eshell-pattern-.patch
Description: Text document

Attachment: 0002-Add-unit-tests-and-documentation-for-Eshell-predicat.patch
Description: Text document

Attachment: 0003-Add-G-argument-predicate-in-Eshell.patch
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]