emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [ELPA] Update proposal: psgml: asl-like parsing in PI


From: Lucien Pullen
Subject: Re: [ELPA] Update proposal: psgml: asl-like parsing in PI
Date: Sun, 26 Nov 2017 22:11:18 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (darwin)

Also sprach Stefan Monnier on 2017-11-22:
[ I Cc'd Lennart, but I don't think he reads this email address any more,
  and AFAIK he's not interested in maintaining PSGML any more.
  If you're interested, you could be the new maintainer.  ]
I certainly use it enough and have gone digging into it. Let me consider the proposal.

See the attached patch file for specifics. I'd appreciate some feedback on the docstring and the possibility of turning on and off floating literals as
an optional argument so it could be used as the parser for
sgml-parse-attribute-specification-list as well.

Not sure what you mean by "turning on and off floating literals".

+As an extension, the atom string can appear as a literal instead
+of only a token.

I don't know what that means.

It's formal names of SGML syntax, except the 'floating' part. I don't have a copy of ISO 8859 in front of me to give the correct name for the "attribute specification without name and value indicator" syntax so I called it 'floating'.

There was talk of amending SGML to allow

   <!ATTLIST my-el data CDATA #IMPLIED>

to be minimized as

   <my-el "My data= value">

when there is only one CDATA attribute, like with enumerated tokens, but nothing appears to have come of it.

This also lets you write '<?my-pi "data"="My data= value">' if you wanted, which is illegal if a start tag. Perhaps it is better to forbid this syntax in a PI parser for SGML attribute specification list syntax since it contradicts SGML's attribute specification list syntax.

-(defun sgml--pi-element-handler ()
+(defun sgml-parse-name-or-literal ()
   (sgml-parse-s)
-  (let ((eltype (sgml-lookup-eltype (sgml-parse-name)))
-        name value)
+  (if (looking-at "['\"]")
+      (sgml-parse-literal)
+    (sgml-parse-name)))

(
This new function has nothing to do with the existing function except by git's insistence on making a small patch file that works by location of character strings and not lisp syntax.
)

In SGML you can't write '<el "value">' as an alternative to '<el value>' because one is a literal and the other is a name token. You can write '<el name="value">', though, with or without the quotes, since "value" appears in the value position of the attribute specification, and psgml calls this frequently enough but doesn't have a function for it.

AFAICT the old code used (read (current-buffer)) instead of
(sgml-parse-name) here.  What's the implication of this change?

+(defun sgml-pi-asl-parser (&optional value-fn)

I'd personally just make `value-fn` into a mandatory argument.

The old code uses the following form:

<?PSGML ELEMENT MY-EL face=(elisp face forms)>

instead of

<?PSGML ELEMENT MY-EL face="(elisp face forms)">

so it mixes SGML parsing and elisp parsing. I made `value-fn' a parameter so the same procedure could be used to read regular SGML and SGML+Elisp. Pretty much the only thing I can think of it being used for is this one case.

[ Which also justifies using the name `sgml-parse-name-or-literal` without double dash, since that function might be useful for external
  callers to provide as value-fn argument.
But sgml-parse-name-or-literal probably needs a docstring then. ]

Almost none of the other parse functions have docstrings. For some reason `sgml-parse-minimum-literal' does. Added one in its style.

       (cond ((sgml-parse-delim "VI")
+ (setq val (funcall (if value-fn value-fn 'sgml-parse-name-or-literal))))
+            (t (setq val nil)))

(if A A B) is better written (or A B):

Can facepalms work by homeopathy? Surely my screen has the negative imprint of one to prevent future forgetfulness of logic macros.

          (cond ((sgml-parse-delim "VI")
(setq val (funcall (or value-fn #'sgml-parse-name-or-literal))))
                (t (setq val nil)))

And the `setq val` can be hoisted outside the conditional:

          (setq val (cond ((sgml-parse-delim "VI")
(funcall (or value-fn #'sgml-parse-name-or-literal)))
                          (t nil)))

And the trailing (t nil) can be dropped:

          (setq val (if (sgml-parse-delim "VI")
(funcall (or value-fn #'sgml-parse-name-or-literal))))

Some people may prefer `when` over `if` here.

Also, rather than let-bind `val` outside the loop and `setq` it at each
turn, we can do:

        (let ((val (if (sgml-parse-delim "VI")
(funcall (or value-fn #'sgml-parse-name-or-literal)))))
          (push (if (null value) name (cons name value)) acc))

With the use of `lexical-binding` this form is just as efficient.

This is mostly from the original. I didn't do much style rewriting, and kinda copied the structure without much thinking. I think I wrote the patch before going to work; it needs cleanup.

+      (push (if (null val) name (cons name val)) acc)

AFAICT the old code used `t` rather than `nil` to mark the "no
value" case. Is there a risk that sgml-parse-name-or-literal can return nil? If so, is the resulting change in behavior on purpose?

If it can't return nil, then the previous code can be simplified further
to have a single `if` test.

sgml-parse-name-or-literal is convenience for "if looking at a LIT or LITA use parse literal code otherwise use parse token code", which always returns a string (I think). The function in question is `sgml-pi-asl-parser' that returns the list of strings and conses.

I'm reading over the code again and saving "token" and "token=nil" as "token" (but ONLY when using a parser other than `sgml-parse-name-or-literal', since that will return "nil" and not 'nil) in the return is incorrect.

   <?PSGML ELEMENT my-el face=nil>

I was probably thinking in terms of "bool-option (bool-option) #IMPLIED", where, when omitted, the attribute is false and true otherwise. Changed it to "(let ((default (cons nil nil))) ...)" style to catch that.

+(defun sgml--pi-element-handler ()
+  (destructuring-bind (elname &rest options)

the new psgml-parse.el code uses `cl-lib` rather than `cl` so the above
needs to be changed to `cl-destructuring-bind`.

Whoops. Normally I program CL and this doesn't show up as a warning while compiling.



Here is a patch showing the new changes. I haven't made note that you can use `sgml-pi-asl-parser' in an `sgml-pi-function' procedure, and `sgml-do-processing-instruction' likely needs modified to make this work:

Instead of calling `narrow-to-region' in `sgml--pi-psgml-handler' call it like this instead?

(let ((next (point)))
 (save-restriction ... (cond (psgml-pi ...) ...))
 (goto-char next))



>From a3670a3ebb64496b868bc10ec997e9cc31bc900a Mon Sep 17 00:00:00 2001
From: Lucien Pullen <address@hidden>
Date: Sun, 26 Nov 2017 21:49:32 -0700
Subject: [PATCH 2/2] * psgml-parse.el: clean up PI parser code

(sgml-parse-name-or-literal): Add a docstring.

(sgml-pi-asl-parser): Follow Elisp conventions for the docstring.  Write
about literals in terms of SGML syntax.

Use functional style in loop body instead of declarative.

Differentiate the value parser returning 'nil and the name=value form
not being used.

(sgml--pi-element-handler): `cl-lib' library prefixes names with "cl-".
---
 psgml-parse.el | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/psgml-parse.el b/psgml-parse.el
index 0710287..3cd762b 100644
--- a/psgml-parse.el
+++ b/psgml-parse.el
@@ -1621,6 +1621,7 @@ in any of them."
                                  command)))))))
 
 (defun sgml-parse-name-or-literal ()
+  "Parse an SGML token or literal and return it as a string."
   (sgml-parse-s)
   (if (looking-at "['\"]")
       (sgml-parse-literal)
@@ -1629,30 +1630,32 @@ in any of them."
 (defun sgml-pi-asl-parser (&optional value-fn)
   "Parse the processing instruction like an attribute specification list.
 
-Returns a list of strings and (name-string . value-string) pairs
-in the order they were written in; atom strings when there is no
-value indicator and conses when there is.
+Returns a list of VALUE and (NAME . VALUE) pairs in the order
+they were written in; atom strings when there is no value
+indicator and conses when there is.
 
-As an extension, the atom string can appear as a literal instead
-of only a token.
+  The NAME and VALUE can be an SGML literal.  This is an
+  extension to SGML syntax that requires NAME and a VALUE without
+  a 'name=' to be a name token.
 
-As an extension, the value of a name-value pair can be parsed by
-calling value-fn with no arguments instead of as a token or
-literal."
+The optional argument VALUE-FN (default
+`sgml-parse-name-or-literal') is called with zero arguments to
+parse the value part of a name=value pair."
   (let ((acc '())
-        name val)
+        name
+        (implied (cons nil nil)))
     (sgml-parse-s)
     (while (setq name (sgml-parse-name-or-literal))
       (sgml-parse-s)
-      (cond ((sgml-parse-delim "VI")
-             (setq val (funcall (if value-fn value-fn 
'sgml-parse-name-or-literal))))
-            (t (setq val nil)))
-      (push (if (null val) name (cons name val)) acc)
+      (let ((val (if (sgml-parse-delim "VI")
+                     (funcall (or value-fn 'sgml-parse-name-or-literal))
+                   implied)))
+        (push (if (eq val implied) name (cons name val)) acc))
       (sgml-parse-s))
     (nreverse acc)))
 
 (defun sgml--pi-element-handler ()
-  (destructuring-bind (elname &rest options)
+  (cl-destructuring-bind (elname &rest options)
       (sgml-pi-asl-parser (lambda () (read (current-buffer))))
     (let ((eltype (sgml-lookup-eltype elname)))
       (dolist (option options)
-- 
2.3.4


reply via email to

[Prev in Thread] Current Thread [Next in Thread]