emacs-devel

Re: HTML-Info design


From: Ivan Shmakov
Subject: Re: HTML-Info design
Date: Mon, 29 Dec 2014 14:24:44 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

>>>>> Lars Ingebrigtsen <address@hidden> writes:
>>>>> Nic Ferrier <address@hidden> writes:

 >> It's certainly the case that definite ending is easier to process.

 > I don't really know what to say.  "HTML parsing is a solved problem"?

        Granted, my Libxml2 installation may be out of date, but for the
        attached HTML5 document (valid per http://validator.w3.org/check),
        libxml-parse-html-region (surprisingly) produces the following:

(html
 ((lang . "en") (dir . "ltr"))
 (head nil (title nil "HTML parsing"))
 (body nil (dl nil
               (dt nil "This\n")
               (dd nil "is\n"
                   (dd nil "a\n"
                       (dd nil "perfectly\n"
                           (dd nil "valid\n"
                               (dd nil "HTML5\n"
                                   (dd nil "document.\n")))))))))
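
        To reproduce this, something along the following lines should
        do.  It is only a sketch: the markup in my-example-html is a
        reconstruction from the tree above, not the original attachment,
        and both my-* names are made up for illustration.  The result
        depends on the Libxml2 version in use, so no output is claimed.

```elisp
;; Assumed reconstruction of the test document.  The dt and dd end
;; tags are left out on purpose, which HTML5 permits.
(defvar my-example-html
  (concat "<!DOCTYPE html>\n"
          "<html lang=\"en\" dir=\"ltr\">\n"
          "<title>HTML parsing</title>\n"
          "<dl>\n"
          "<dt>This\n"
          "<dd>is\n"
          "<dd>a\n"
          "<dd>perfectly\n"
          "<dd>valid\n"
          "<dd>HTML5\n"
          "<dd>document.\n"
          "</dl>\n")
  "Hypothetical HTML5 source with omitted dt and dd end tags.")

(defun my-parse-example ()
  "Parse `my-example-html' with Libxml2 and return the resulting tree.
Signal an error if this Emacs was built without Libxml2 support."
  (unless (fboundp 'libxml-parse-html-region)
    (error "This Emacs lacks libxml-parse-html-region"))
  (with-temp-buffer
    (insert my-example-html)
    (libxml-parse-html-region (point-min) (point-max))))
```

        Evaluating (pp (my-parse-example)) then prints the tree, and
        nested dd elements in it indicate the problem described here.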

        Naturally, since SHR works from this tree, its rendition of the
        document would be just as unreasonable as the tree itself.

        By contrast, using Lynx to render the very same document
        results in:

$ lynx --dump --stdin --force-html < example.html 
   This
          is
          a
          perfectly
          valid
          HTML5
          document.
$ 

        The relevant part of the specification [1] is as follows.

    A dt element’s end tag may be omitted if the dt element is
    immediately followed by another dt element or a dd element.

    A dd element’s end tag may be omitted if the dd element is
    immediately followed by another dd element or a dt element, or if
    there is no more content in the parent element.

[1] http://www.w3.org/TR/html5/syntax.html#optional-tags
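
        The rules above mean that a dt or dd can never be a direct child
        of another dt or dd, so a tree that nests them can be repaired
        after the fact by hoisting each nested item up to become a
        sibling.  What follows is a minimal sketch of such a workaround,
        not a fix to the parser; my-dl-hoist and my-dl-fix are
        hypothetical names, and any text that follows a nested item
        inside its parent is kept with the parent, which is a
        simplification.

```elisp
(defun my-dl-hoist (item)
  "Split ITEM, a dt or dd node, into a flat list of sibling nodes.
Nested dt and dd children are moved out, in document order."
  (let (own hoisted)
    (dolist (child (cddr item))
      (if (and (consp child) (memq (car child) '(dt dd)))
          ;; A nested item: flatten it too and queue it as a sibling.
          (setq hoisted (append hoisted (my-dl-hoist child)))
        ;; Text or any other element stays inside ITEM.
        (push child own)))
    (cons (append (list (car item) (cadr item)) (nreverse own))
          hoisted)))

(defun my-dl-fix (node)
  "Return a copy of NODE with dt and dd un-nested inside every dl.
NODE is a tree of the (TAG ATTRIBUTES . CHILDREN) form shown above."
  (if (not (consp node))
      node                              ; text nodes are left alone
    (let ((children (mapcar #'my-dl-fix (cddr node))))
      (when (eq (car node) 'dl)
        (setq children
              (apply #'append
                     (mapcar (lambda (child)
                               (if (and (consp child)
                                        (memq (car child) '(dt dd)))
                                   (my-dl-hoist child)
                                 (list child)))
                             children))))
      (append (list (car node) (cadr node)) children))))
```

        For instance,

            (my-dl-fix '(dl nil (dt nil "This\n")
                                (dd nil "is\n" (dd nil "a\n"))))

        returns (dl nil (dt nil "This\n") (dd nil "is\n") (dd nil "a\n")).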

-- 
FSF associate member #7257  http://boycottsystemd.org/  … 3013 B6A0 230E 334A
This
is
a
perfectly
valid
HTML5
document.
