Re: Standalone Info reader cannot read Info files with CR-LF EOLs

From: Eli Zaretskii
Subject: Re: Standalone Info reader cannot read Info files with CR-LF EOLs
Date: Sun, 28 Dec 2014 22:57:12 +0200

> Date: Sun, 28 Dec 2014 20:06:54 +0000
> From: Gavin Smith <address@hidden>
> Cc: Texinfo <address@hidden>
> On Fri, Dec 26, 2014 at 9:52 PM, Eli Zaretskii <address@hidden> wrote:
> > It's a broken file.  I have no idea how they produced it, but it
> > wasn't by stock makeinfo 4.8 on Windows, because that version already
> > did both count byte offsets in makeinfo disregarding the CR
> > characters, and had the EOL conversion function in the Info reader.  I
> > just checked its code, which I still have on my disk.
> >
> I couldn't quickly find the code in C makeinfo for this - is it
> something to do with file modes under Windows?

Yes.  Makeinfo simply counted the bytes in memory, and the CR
characters were added by C library functions as a result of text-mode
output.
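
To illustrate the mismatch, here is a minimal Python sketch (not makeinfo's code).  It assumes the node separator is the 0x1f byte and imitates a text-mode write by replacing every LF with CR-LF; the function name is made up for the example.

```python
def separator_offsets(info_text: str) -> tuple:
    """Return the offset of the first node separator (0x1f) as counted
    in memory (LF only) and as found on disk after a CR-LF conversion."""
    in_memory = info_text.encode("ascii")
    # Imitate what a text-mode write on Windows does to each newline.
    on_disk = in_memory.replace(b"\n", b"\r\n")
    return in_memory.find(b"\x1f"), on_disk.find(b"\x1f")


text = "This is demo.info\n\n\x1f\nFile: demo.info,  Node: Top\n"
mem, disk = separator_offsets(text)
# The two offsets differ by one byte per newline preceding the separator.
print(mem, disk, disk - mem)
```

A tag table written from the in-memory count is therefore only right for a reader that strips the CRs before looking up offsets.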

> You are probably right that it wasn't produced by makeinfo under
> Windows, but I did reproduce something similar when running makeinfo
> 4.13 under GNU/Linux with a Texinfo source file with CR-LF line
> endings. See the attached input and output files. The whitespace in
> the output Info file doesn't make a lot of sense, but the point is
> that the preamble of the info file does contain a line with a CR-LF
> ending, but the tag table doesn't take this into account - the node
> separator is at byte 113 of the file exactly. It's possible that this
> file was produced in a similar way.
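
A rough way to check a file for this kind of mismatch is sketched below in Python.  It assumes tag table entries of the form "Node: NAME<DEL>OFFSET" and that each offset is meant to land on the 0x1f node separator; `check_tag_table` is a hypothetical helper, not part of any Texinfo tool.

```python
import re

# One tag table entry: "Node: <name>\x7f<offset>", tolerating a trailing CR.
TAG_ENTRY = re.compile(rb"^Node: ([^\x7f\n]+)\x7f(\d+)\r?$", re.MULTILINE)


def check_tag_table(data: bytes) -> dict:
    """Map each node name to True if its tag offset points at a 0x1f byte."""
    start = data.find(b"Tag Table:")
    if start == -1:
        return {}
    results = {}
    for m in TAG_ENTRY.finditer(data, start):
        offset = int(m.group(2))
        name = m.group(1).decode("ascii", "replace")
        results[name] = data[offset:offset + 1] == b"\x1f"
    return results


lf_file = (b"This is demo.info\n\n"
           b"\x1f\nFile: demo.info,  Node: Top\n\nHello\n"
           b"\x1f\nTag Table:\nNode: Top\x7f19\n\x1f\nEnd Tag Table\n")
print(check_tag_table(lf_file))
# Same content with CR-LF line endings but the tag table left unchanged.
print(check_tag_table(lf_file.replace(b"\n", b"\r\n")))
```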

Maybe.  There's of course any number of ways to produce a broken Info
file.

> There may be similar results if a file has mixed kinds of line endings
> (or if it includes other files with different line endings). We can't
> exactly say that the tag tables in files like these are "incorrect".
> Same goes for files produced under Windows where the CR bytes aren't
> counted. We're just left with the problem of loading the files that
> are out there properly.

The most important requirement is to be able to read and display files
that were produced from valid Texinfo sources, either on Unix or on
Windows, and do it in a way that will work in at least the stand-alone
Info and in Emacs.  The situation we have right now with texi2any
doesn't fulfill this requirement, which is not good, I think.

> > Its tag table accounts for the CR characters, which is wrong.  That's
> > why the Info reader from 4.13 cannot read it correctly.  And that's
> > exactly what will happen with Info files created by makeinfo 5.2 when
> > someone tries to read them with Info from 4.13.
> >
> > Moreover, the same problem will happen with the Emacs Info reader.
> > Emacs removes the CR characters when it reads files into buffers (any
> > files, not just Info files), so it must have the tag table with
> > offsets that disregard the CRs.
> If it turns out there are files out there where the 1000-byte slack in
> looking for a node isn't enough, we could tweak it, maybe by
> increasing the slack as we get later on in the file. Maybe something
> similar could be done in Emacs Info. If we could stop makeinfo
> producing files with CR bytes it would stop this problem for newly
> produced files.
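
The slack idea could look roughly like the Python sketch below.  It is not the stand-alone reader's actual code: the function name and the `growth` parameter are invented for illustration, and the node start is assumed to be the 0x1f separator byte.

```python
NODE_SEP = b"\x1f"


def find_node_start(data: bytes, tag_offset: int,
                    base_slack: int = 1000, growth: float = 0.0) -> int:
    """Return the position of the separator nearest to tag_offset within
    the slack window, or -1 if the window contains no separator.

    growth > 0 widens the window for nodes later in the file, where
    uncounted CR bytes would have accumulated."""
    slack = base_slack + int(growth * tag_offset)
    lo = max(0, tag_offset - slack)
    hi = min(len(data), tag_offset + slack + 1)
    best = -1
    pos = data.find(NODE_SEP, lo, hi)
    while pos != -1:
        if best == -1 or abs(pos - tag_offset) < abs(best - tag_offset):
            best = pos
        pos = data.find(NODE_SEP, pos + 1, hi)
    return best


data = b"x" * 50 + NODE_SEP + b"y" * 2000 + NODE_SEP + b"z" * 10
print(find_node_start(data, 45))                    # fixed 1000-byte slack
print(find_node_start(data, 1500, base_slack=100))  # window too narrow
print(find_node_start(data, 1500, base_slack=100, growth=0.5))
```

A wider window also raises the chance of landing on the wrong node, so a real reader would presumably still compare the node name found there with the one in the tag table.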

There's still a problem of Info files produced by makeinfo 4.x on
Windows -- these will not be reliably readable with the new
stand-alone Info.  I think we need to have a solution for that as
well, at least as a user option, if not automatically.  (One way of
doing that automatically would be to detect the 4.x version from the
file's preamble.)
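
A minimal sketch of that detection in Python, assuming the preamble contains wording like "produced by makeinfo version 4.13 from ..." (the exact wording may vary between versions, and the function name is made up):

```python
import re
from typing import Optional

VERSION_RE = re.compile(r"produced by makeinfo version (\d+)\.(\d+)")


def makeinfo_major_version(preamble: str) -> Optional[int]:
    """Return the major makeinfo version named in the preamble, or None
    if the preamble does not carry a recognizable version string."""
    m = VERSION_RE.search(preamble)
    return int(m.group(1)) if m else None


old = "This is demo.info, produced by makeinfo version 4.13 from demo.texi."
new = "This is demo.info, produced by makeinfo version 5.2 from demo.texi."
for line in (old, new):
    major = makeinfo_major_version(line)
    # For 4.x files the reader could assume offsets that disregard CRs.
    print(major, "offsets ignore CRs" if major == 4 else "no assumption")
```

Files with no recognizable version string would still need the user option as a fallback.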
