
Re: texi to epub

From: Kurt Hornik
Subject: Re: texi to epub
Date: Thu, 16 Dec 2021 18:33:32 +0100

>>>>> Patrice Dumas writes:

> On Wed, Dec 15, 2021 at 07:06:37PM +0100, Kurt Hornik wrote:
>> Friends,
>> calibre does not guarantee that an EPUB produced by it is valid. The
>> only guarantee it makes is that if you feed it valid XHTML 1.1 + CSS
>> 2.1 it will output a valid EPUB.
>> and of course makeinfo gives HTML 4.01 Transitional: I also tried the
>> effect of going through HTML tidy to turn that into XHTML, but that did
>> not make epubcheck happy.

> My wild guess is that outputting valid XHTML directly with texi2any is
> probably the simplest way to go.  This is actually probably a
> prerequisite for generating epub anyway.  I recall somebody else wanting
> XHTML too.

> I could have a try, but before I would like to have an XHTML
> command-line offline validator, is there something like that existing?

See my previous message about this.

Directly outputting valid XHTML would of course be great.  

I had actually played with a pipeline which converts .texi to .html,
then uses HTML tidy a la

  tidy --output-xhtml yes --doctype strict

to get strict XHTML, and then calls calibre to create the ebook, but
epubcheck was never happy with the result.  After some playing around, I
now understand why: epub needs XHTML 1.1 a la

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"

but HTML tidy only gives

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

and the difference matters a lot: by simply changing the DOCTYPE it
seems that I can now use the W3C validator (service) to reproduce the
warnings by epubcheck.
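
For illustration, the DOCTYPE swap described above could be scripted
roughly as follows.  This is only a sketch: the function name
force_xhtml11_doctype is made up for this example, and the system
identifier (the xhtml11.dtd URL) is the standard W3C one rather than
something shown in the truncated DOCTYPE lines above.

```python
import re

# Full XHTML 1.1 DOCTYPE.  The public identifier matches the one quoted in
# the message; the system identifier is the standard W3C one (an addition).
XHTML11_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"\n'
    '    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
)

# Matches a whole "<!DOCTYPE html ...>" declaration, including line breaks.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def force_xhtml11_doctype(text):
    """Return text with its first DOCTYPE replaced by the XHTML 1.1 one."""
    # A callable replacement avoids any backslash processing by re.
    new_text, count = _DOCTYPE_RE.subn(lambda m: XHTML11_DOCTYPE, text, count=1)
    if count == 0:
        raise ValueError("no DOCTYPE declaration found")
    return new_text
```

This only relabels the document; it does not make the markup itself
conform to XHTML 1.1, which is why the validator then starts reporting
the real problems.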

There are two major sources of warnings in my case (using texinfo 6.8):

* The data-manual attribute in the hyperlinks

* Tables which put their header and footer inside thead and tfoot, but
  do not put the body rows inside tbody (which seems to be valid in
  XHTML 1.0 but not in 1.1).
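
To locate these two patterns without a full validator run, something
like the following sketch might help.  It assumes the input is
well-formed XHTML without undefined named entities (otherwise the XML
parser will refuse it); the function name find_problems and the labels
"data-manual" and "bare-tr" are made up for this example.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def find_problems(xhtml_text):
    """Return a list of (kind, detail) tuples in document order."""
    root = ET.fromstring(xhtml_text)
    problems = []
    for elem in root.iter():
        name = local(elem.tag)
        if name == "a" and "data-manual" in elem.attrib:
            # Hyperlink carrying the data-manual attribute.
            problems.append(("data-manual", elem.get("href")))
        elif name == "table":
            children = [local(child.tag) for child in elem]
            has_section = "thead" in children or "tfoot" in children
            if has_section and "tr" in children:
                # Rows sit directly under table next to thead/tfoot.
                problems.append(("bare-tr", None))
    return problems
```

A table consisting only of plain tr rows is not flagged, since that
form is allowed in both XHTML 1.0 and 1.1.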

Now if I look at e.g. <https://www.w3.org/TR/html4/sgml/loosedtd.html>
it seems that tbody should also be ok for HTML 4.01 Transitional which
texinfo currently targets, so perhaps this could be added in?
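
Until something like that is in texinfo itself, the rows could be
wrapped after the fact.  The sketch below is one possible
post-processing step, not what texi2any does: it assumes namespaced,
well-formed XHTML, and the name wrap_rows_in_tbody is made up for this
example.  It moves the bare rows into a new tbody appended at the end
of the table, so the result is thead, tfoot, tbody as long as thead
already precedes tfoot.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def q(name):
    """Qualify a local element name with the XHTML namespace."""
    return "{%s}%s" % (XHTML_NS, name)


def wrap_rows_in_tbody(root):
    """Wrap direct tr children of tables that use thead/tfoot in a tbody.

    Modifies the tree in place and returns the number of tables changed.
    """
    changed = 0
    # Materialize the list first, since the tree is modified in the loop.
    for table in list(root.iter(q("table"))):
        rows = table.findall(q("tr"))  # direct children only
        has_section = (table.find(q("thead")) is not None
                       or table.find(q("tfoot")) is not None)
        if not rows or not has_section:
            continue
        tbody = ET.Element(q("tbody"))
        for row in rows:
            table.remove(row)
            tbody.append(row)
        table.append(tbody)
        changed += 1
    return changed
```

When writing the tree back out, calling
ET.register_namespace("", XHTML_NS) beforehand keeps ElementTree from
inventing ns0: prefixes.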


> -- 
> Pat
