emacs-devel

Re: Problems with xml-parse-string


From: Lars Magne Ingebrigtsen
Subject: Re: Problems with xml-parse-string
Date: Fri, 24 Sep 2010 18:46:11 +0200
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (gnu/linux)

Chong Yidong <address@hidden> writes:

>> The main difference between sxml and xml.el output is that it has the
>> weird and unnecessary "@" node for the attributes and that it wastes a
>> cons in the attributes, isn't it?
>
> The xml.el output always has an alist for attributes after each tag; if
> there are no attributes, the element after the tag name is nil.  In
> sxml, the `@' denotes an attribute list, which is omitted if no
> attributes exist.

Yes.  So it's yet another irregularity you have to check for.

To take a concrete example: You want the src of the img node you have.

xml.el:  (cdr (assq 'src (cadr node)))
sxml.el: (if (and (consp (cadr node))
                  (eq (caadr node) '@))
             (cadr (assq 'src node)))

(And I'm not even sure that's correct.  It's probably not.  Which is my
point.)

libxml: (cdr (assq :src (cdr node)))

(The difference between libxml and xml.el for attributes is minuscule.)
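
To make the comparison concrete, here is a minimal sketch of the three lookups run against hand-written sample trees.  The sample data and the `my-' names are hypothetical, and the libxml shape (attributes as keyword conses directly after the tag) is an assumption inferred from the one-liner above rather than a documented format.  The sxml variant steps past the `@' marker before searching.

```elisp
;; Hand-written sample nodes for <img src="foo.png">; illustrative only.
(defvar my-xml-el-node '(img ((src . "foo.png"))))
(defvar my-sxml-node   '(img (@ (src "foo.png"))))
(defvar my-libxml-node '(img (:src . "foo.png")))  ; assumed shape

(defun my-xml-el-attr (node attr)
  "Return ATTR from NODE, whose second element is always an alist."
  (cdr (assq attr (cadr node))))

(defun my-sxml-attr (node attr)
  "Return ATTR from NODE, whose `@' attribute list may be absent."
  (let ((maybe-attrs (cadr node)))
    (when (and (consp maybe-attrs)
               (eq (car maybe-attrs) '@))
      ;; Skip the `@' marker; each attribute is a two-element list.
      (cadr (assq attr (cdr maybe-attrs))))))

(defun my-libxml-attr (node attr)
  "Return ATTR from NODE, where attributes are keyword conses."
  (cdr (assq attr (cdr node))))
```

Only the sxml version has to test for the presence of the attribute list before it can look anything up.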
             
>> Other than that it has the same problem that xml.el has, in that text
>> nodes have to be special-cased, so you can't say assq or use simple
>> descent without testing.
>
> It is illogical to criticize sxml for wasting conses, while arguing for
> wrapping each text node in a cons.

No, it is not.  I'm sacrificing space for speed and regularity.  sxml is
wasting cons cells and adding slowdowns at the same time.
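
A small sketch of what that regularity buys, assuming text nodes are wrapped as (text "...").  The wrapper symbol `text' and both function names are hypothetical, chosen only for illustration.

```elisp
(defun my-child-tags-mixed (children)
  "Return the tags in CHILDREN, where text nodes are bare strings.
Every child has to be tested before `car' can be applied to it."
  (delq nil
        (mapcar (lambda (child)
                  (and (consp child) (car child)))
                children)))

(defun my-child-tags-regular (children)
  "Return the tags in CHILDREN, where every child is a cons.
Text nodes show up under the tag `text'."
  (mapcar #'car children))
```

With bare strings, (mapcar #'car children) would signal an error on the first text node; with wrapped text it needs no guard.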

> Anyway, it is difficult to see how real the problem is without a
> concrete example.  Could you provide one?  I suspect that the real
> problem, if one exists, is Elisp's relatively weak support for list
> mapping and reduction; if that's the case, the correct solution is to
> pull in some of the relevant functions from the CL package.

Here's a pretty piece of code, chosen at random:

(defun nnrss-find-el (tag data &optional found-list)
  "Find all the matching elements in the data.
Careful with this on large documents!"
  (when (consp data)
    (dolist (bit data)
      (when (car-safe bit)
        (when (equal tag (car bit))
          ;; Old xml.el may return a list of strings.
          (when (and (consp (caddr bit))
                     (stringp (caaddr bit)))
            (setcar (cddr bit) (caaddr bit)))
          (setq found-list
                (append found-list
                        (list bit))))
        (if (and (consp (car-safe (caddr bit)))
                 (not (stringp (caddr bit))))
            (setq found-list
                  (append found-list
                          (nnrss-find-el
                           tag (caddr bit))))
          (setq found-list
                (append found-list
                        (nnrss-find-el
                         tag (cddr bit))))))))
  found-list)

The horror!  
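
For comparison, a sketch of the same kind of search when the parser can be assumed to return one consistent shape, (TAG ATTRS . CHILDREN), with every child either a string or another node.  The name `my-find-elements' is hypothetical; unlike nnrss-find-el, this version does not handle the older list-of-strings output and does not modify the tree.

```elisp
(defun my-find-elements (tag node)
  "Return every node at or below NODE whose tag is TAG, in document order.
NODE is assumed to have the shape (TAG ATTRS . CHILDREN)."
  (when (consp node)
    (let ((found (and (eq (car node) tag)
                      (list node))))
      ;; Skip the tag and the attribute alist; strings among the
      ;; children are ignored by the `consp' test in the recursive call.
      (dolist (child (cddr node))
        (setq found (nconc found (my-find-elements tag child))))
      found)))
```

The single `consp' test is the only special case left, and it exists solely because text nodes are bare strings.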

-- 
(domestic pets only, the antidote for overdose, milk.)
  address@hidden * Lars Magne Ingebrigtsen



