Re: Standalone Info reader cannot read Info files with CR-LF EOLs
From: Gavin Smith
Subject: Re: Standalone Info reader cannot read Info files with CR-LF EOLs
Date: Sun, 28 Dec 2014 20:06:54 +0000
On Fri, Dec 26, 2014 at 9:52 PM, Eli Zaretskii <address@hidden> wrote:
> It's a broken file. I have no idea how they produced it, but it
> wasn't by stock makeinfo 4.8 on Windows, because that version already
> did both count byte offsets in makeinfo disregarding the CR
> characters, and had the EOL conversion function in the Info reader. I
> just checked its code, which I still have on my disk.
>
I couldn't quickly find the code in C makeinfo for this - is it
something to do with file modes under Windows?
You are probably right that it wasn't produced by makeinfo under
Windows, but I did reproduce something similar when running makeinfo
4.13 under GNU/Linux with a Texinfo source file with CR-LF line
endings. See the attached input and output files. The whitespace in
the output Info file doesn't make a lot of sense, but the point is
that the preamble of the info file does contain a line with a CR-LF
ending, but the tag table doesn't take this into account - the node
separator is at byte 113 of the file exactly. It's possible that this
file was produced in a similar way.
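
To illustrate the mismatch described above, here is a minimal sketch in Python. The file contents and the helper name `separator_offsets` are made up for illustration and are not taken from makeinfo or the attached files. It only shows that a single CR-LF line in the preamble shifts the real position of the node separator by one byte relative to an offset computed with CRs disregarded.

```python
# Hypothetical miniature of an Info file; real files are much larger.
NODE_SEP = b"\x1f"  # the ^_ byte that starts a node in an Info file


def separator_offsets(data):
    """Return (offset counting CR bytes, offset with CR bytes disregarded)
    for the first node separator in data."""
    raw = data.index(NODE_SEP)
    # CR bytes that precede the separator push its real position forward.
    crs = data[:raw].count(b"\r")
    return raw, raw - crs


preamble_lf = b"This is demo.info, produced by makeinfo.\n\n"
preamble_crlf = b"This is demo.info, produced by makeinfo.\r\n\n"
body = NODE_SEP + b"\nFile: demo.info,  Node: Top\n"

lf_offsets = separator_offsets(preamble_lf + body)
crlf_offsets = separator_offsets(preamble_crlf + body)

print(lf_offsets)    # both values agree when there are no CR bytes
print(crlf_offsets)  # the first value is one larger than the second
```

A tag table written with the second value points one byte short of the separator in the CR-LF file, and the gap grows with every further CR-LF line before the node.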
There may be similar results if a file has mixed kinds of line endings
(or if it includes other files with different line endings). We can't
exactly say that the tag tables in files like these are "incorrect".
Same goes for files produced under Windows where the CR bytes aren't
counted. We're just left with the problem of loading the files that
are out there properly.
> Its tag table accounts for the CR characters, which is wrong. That's
> why the Info reader from 4.13 cannot read it correctly. And that's
> exactly what will happen with Info files created by makeinfo 5.2 when
> someone tries to read them with Info from 4.13.
>
> Moreover, the same problem will happen with the Emacs Info reader.
> Emacs removes the CR characters when it reads files into buffers (any
> files, not just Info files), so it must have the tag table with
> offsets that disregard the CRs.
If it turns out there are files out there where the 1000-byte slack in
looking for a node isn't enough, we could tweak it, maybe by
increasing the slack as we get later on in the file. Maybe something
similar could be done in Emacs Info. If we could stop makeinfo
producing files with CR bytes it would stop this problem for newly
produced files.
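
A rough sketch of the growing-slack idea, in Python rather than the reader's C, and not the actual Info reader code. The function name `find_node_start` and the `growth` parameter are hypothetical; only the 1000-byte base slack comes from the discussion above.

```python
NODE_SEP = b"\x1f"  # the ^_ byte that starts a node in an Info file


def find_node_start(data, tag_offset, base_slack=1000, growth=0.0):
    """Return the position of the node separator nearest to tag_offset,
    or -1 if none lies within the search window.

    growth is a hypothetical factor: the window widens in proportion to
    tag_offset, so nodes late in the file tolerate a larger drift.
    """
    slack = base_slack + int(tag_offset * growth)
    lo = max(0, tag_offset - slack)
    hi = min(len(data), tag_offset + slack + 1)
    best = -1
    pos = data.find(NODE_SEP, lo, hi)
    while pos != -1:
        if best == -1 or abs(pos - tag_offset) < abs(best - tag_offset):
            best = pos
        pos = data.find(NODE_SEP, pos + 1, hi)
    return best


# 2000 CR-LF lines precede the node, so a tag table that disregards CRs
# records 4000 while the separator really sits at byte 6000.
data = b"x\r\n" * 2000 + NODE_SEP + b"\nFile: demo.info,  Node: Late\n"

print(find_node_start(data, 4000))              # fixed slack: not found
print(find_node_start(data, 4000, growth=0.5))  # widened slack: found
```

A wider window also raises the chance of landing on the wrong separator, so a reader using this approach would presumably still want to check the node name that follows.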
cr-lf-endings-4.texi
Description: TeXInfo document
cr-lf-endings-4.info
Description: Binary data