Re: CR-LF line endings in Texinfo files (was texinfo-6.3.90 pretest)
From: Gavin Smith
Subject: Re: CR-LF line endings in Texinfo files (was texinfo-6.3.90 pretest)
Date: Thu, 18 May 2017 08:04:48 +0100
User-agent: Mutt/1.5.23 (2014-03-12)
On Wed, May 17, 2017 at 05:19:56AM +0300, Eli Zaretskii wrote:
> > From: Gavin Smith <address@hidden>
> > Date: Tue, 16 May 2017 22:05:41 +0100
> > Cc: Texinfo <address@hidden>
> >
> > On 16 May 2017 at 20:06, Eli Zaretskii <address@hidden> wrote:
> > >> From: Gavin Smith <address@hidden>
> > >> Date: Tue, 16 May 2017 19:29:48 +0100
> > >>
> > >> Note: Info files with CR-LF line endings should carry on being
> > >> supported on all operating systems.
> > >
> > > But actually, they aren't supported on any OS. That's why we changed
> > > texi2any to emit Unix-style LF-only EOLs a while back, remember?
> >
> > They shouldn't be generated but info can still read them.
>
> Info can indeed read them, but cross-references to anchors don't work
> correctly then. They land the reader at the wrong place, because byte
> counts don't match. This started happening some versions ago, because
> of some change whose details I no longer remember.
The Info files produced by old versions of makeinfo on MS-DOS or
MS-Windows ended lines with a CR-LF sequence, but each sequence was
counted as only one byte in the tags table (which gives the byte
offsets of nodes within the file). So info stripped the CR bytes so
that these files could be read correctly.
Eventually it appeared that there could be files with some lines
ending in CR-LF which were counted as _two_ bytes, so in an attempt
to support these, the CR bytes were only stripped after a node
couldn't be found in a file. Apparently this doesn't work completely
reliably, especially for anchors, where there is nothing at the offset
to confirm that you found the right place (unlike nodes, which
have a node separator).
Could we unconditionally strip the CRs, on DOS and Windows only? This
could be done by calling the 'convert_eols' function (currently in
info/nodes.c), or else by opening the file in "text" mode in
filesys_read_info_file in info/filesys.c (currently it opens the file
with the "O_BINARY" flag).
This would mean that Info files with CR-LF line endings could only
be read on Windows, and moreover Info on Windows could not read
(the few) Info files where the CR bytes were counted. I believe
this could result in quite a bit of simplification of the code, as the
code that conditionally calls 'convert_eols' is a bit difficult
(see find_node_from_tag in info/nodes.c), so I'd like to make
this change.
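For reference, unconditional stripping amounts to something like the
following sketch. This is not the actual 'convert_eols' from
info/nodes.c; the name strip_crlf and its exact behaviour (a lone CR
not followed by LF is kept) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper; not the real convert_eols from info/nodes.c.
   Collapse every CR-LF pair in BUF (LEN bytes) to a single LF, in
   place, and return the new length.  A CR that is not followed by
   an LF is left alone.  */
size_t
strip_crlf (char *buf, size_t len)
{
  size_t src, dst = 0;

  assert (buf != NULL || len == 0);

  for (src = 0; src < len; src++)
    {
      if (buf[src] == '\r' && src + 1 < len && buf[src + 1] == '\n')
        continue;  /* Drop the CR; the LF is copied on the next pass.  */
      buf[dst++] = buf[src];
    }
  return dst;
}
```

Run once on the whole file contents right after reading, this would
make every tags-table offset that counts CR-LF as one byte line up with
the buffer, with no need for a retry when a node isn't found.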
The upside for Windows users would be that following xrefs to anchors
would be more reliable. (The info/t/crs-not-counted.sh test would have
to be skipped on Windows, or we could just remove this test as there
wouldn't be much left for it to test.)
If anyone really wanted to read an Info file with CR-LF line endings
on GNU/Linux, they would have to convert the line endings themselves.