Re: [bug #38795] texi2any makes CR in output when input is mixed CR-LF a
From: Vincent Belaïche
Subject: Re: [bug #38795] texi2any makes CR in output when input is mixed CR-LF and LF files
Date: Wed, 21 Aug 2013 23:04:22 +0200
User-agent: Thunderbird 2.0.0.24 (Windows/20100228)
Vincent Belaïche wrote:
[...]
Hello Karl,
[...]
Now, there is something which I had not noticed at first: the info
files do not contain the same number of CRs. It seems that the problem
is rather stranger than I had initially thought :-/, I can't figure out
what is happening ...
I have to go to work now, I will post more information later on (like
what texi2any I am using, what activeperl version, and also info files
resulting from launching the wrapper from other environments, or from
doing by hand the command line that the wrapper is doing).
Anyway, please note that this problem is not causing me any trouble:
the files in all the projects I am working on are consistently
encoded. My point was rather to contribute to the texinfo project by
reporting the strange behaviour.
VBR,
Vincent.
More experiment results.
Trying from an EMACS "bash" shell buffer, I get the following output:
-----
/c/Documents and Settings/Vincent/Local Settings/Temp/bug_38795>makeinfo
bbdb.texi
Locales dir for document strings not found
makeinfo-dos.cpp cmdline=c:\msys\1.0\lib\activePerl\bin\perl.exe
c:\Programme\GNU\installation\texinfo-install\trunk.old\tp\texi2any.pl
"bbdb.texi"
/c/Documents and Settings/Vincent/Local Settings/Temp/bug_38795>
-----
And the produced info file is that one:
https://savannah.gnu.org/bugs/download.php?file_id=28901
Please note that the *bash* buffer is an MSYS bash that is launched in a
comint mode buffer by the w32utils package `M-x bash' command. I also
attached the w32utils package here --- this is a package which I wrote;
I will make it public someday in a more open way, for instance by
importing it to some forge.
http://savannah.gnu.org/bugs/download.php?file_id=28902
When I do `M-x list-processes' I get:
-----
shell run *bash* --
C:/msys/1.0/bin/bash --posix --noediting -i
shell<1> run *shell* --
C:/Programme/GNU/Emacs/bin/cmdproxy.exe -i
-----
where *bash* is the bash buffer
Here I note that I seem to get the same thing as with AUCTeX
(mixed LF and CRLF).
Now, another experiment: I call the command directly in an EMACS
*shell* buffer, without using the wrapper. Here is the output I
get:
----
c:\Documents and Settings\Vincent\Local
Settings\Temp\bug_38795>c:\msys\1.0\lib\activePerl\bin\perl.exe
c:\Programme\GNU\installation\texinfo-install\trunk.old\tp\texi2any.pl
"bbdb.texi"
c:\msys\1.0\lib\activePerl\bin\perl.exe
c:\Programme\GNU\installation\texinfo-install\trunk.old\tp\texi2any.pl
"bbdb.texi"
Locales dir for document strings not found
c:\Documents and Settings\Vincent\Local Settings\Temp\bug_38795>
----
and the info file is that one:
http://savannah.gnu.org/bugs/download.php?file_id=28903
Here it seems that the info file is the same as when I use the wrapper
in a *shell* buffer. I also ran the same command line:
c:\msys\1.0\lib\activePerl\bin\perl.exe
c:\Programme\GNU\installation\texinfo-install\trunk.old\tp\texi2any.pl
"bbdb.texi"
from a cmd.exe console application launched directly from MSWindows ---
i.e. not a *shell* buffer --- and also got the same info file as
http://savannah.gnu.org/bugs/download.php?file_id=28903
More interesting, I typed the following command line:
/c/msys/1.0/lib/activePerl/bin/perl.exe
'c:\Programme\GNU\installation\texinfo-install\trunk.old\tp\texi2any.pl'
bbdb.texi
from an MSYS rxvt console application, and I also got the same info file
as http://savannah.gnu.org/bugs/download.php?file_id=28903 --- I would
have expected the same thing as with AUCTeX and *bash* with the wrapper,
but that did not happen.
The last trial was the command w/o the wrapper, i.e.
/c/msys/1.0/lib/activePerl/bin/perl.exe
'c:\Programme\GNU\installation\texinfo-install\trunk.old\tp\texi2any.pl'
bbdb.texi
from the EMACS *bash* buffer, and I also got the same info file as
http://savannah.gnu.org/bugs/download.php?file_id=28903
So, quite funny: I have two types of info output:
- type #1: all lines terminated by CRLF
- type #2: lines ending in CRLF only from line #1250 onwards, while
lines at the beginning of the file are terminated by LF
After repeating the experiments I get the following:
- AUCTeX: sometimes type #1, and sometimes type #2 --- I could get type
#2 only once, and could not reproduce it.
- MSYS/rxvt w/o wrapper: sometimes type #1, and sometimes type #2 --- I
get type #2 less often than type #1, and it seems that it can happen
only the first time the console is launched
- MSYS/rxvt with wrapper: type #1
- cmd.exe with or without wrapper: type #1
- w32utils' *bash* w/o wrapper: sometimes type #1, and sometimes type #2
- w32utils' *bash* with wrapper: type #1
- *shell* with or without wrapper: type #1
My first gut feeling was that the way line endings are handled in
perl depends on some detection of whether the environment is MSWindows
or Linux, and that this detection does not always give the same result:
it varies with the launching environment and with whether you look at
input or at output. It seemed to me that when you use something like
MSYS bash to launch the command, this affects what perl detects. So it
would be better to do this detection explicitly in the perl script, once
and for all, and then to apply it explicitly to all the output files,
so that the output is consistent.
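
To make the "detect once, apply everywhere" idea concrete, here is a
minimal sketch. It is in Python rather than perl, it is not taken from
texi2any, and the function names (pick_eol, normalize_eol,
write_output) are hypothetical. It only assumes that the line ending is
chosen a single time and that every output file is then written without
any implicit newline translation.

```python
import os


def pick_eol(platform_name=os.name):
    """Decide the output line ending once, from the platform name."""
    return "\r\n" if platform_name == "nt" else "\n"


def normalize_eol(text, eol):
    """Rewrite every CRLF or LF line ending in text as eol."""
    return eol.join(text.replace("\r\n", "\n").split("\n"))


def write_output(path, text, eol):
    """Write text with eol as the only line ending."""
    # newline="" turns off Python's own newline translation on output,
    # so the file receives exactly the endings produced above.
    with open(path, "w", encoding="utf-8", newline="") as out:
        out.write(normalize_eol(text, eol))
```

With this shape, the launching environment (cmd.exe, MSYS bash, an
EMACS buffer) can influence the result only through the single call to
pick_eol, never file by file.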
But now, I am really wondering whether there is any difference at all
between type #1 and type #2, or whether the difference is only a display
artefact of EMACS when I visit the info file. It seems that my EMACS
version will hide none, or some, of the ^M endings when I visit the
file, and that this behaviour is not systematic, which completely
confused me about what the real info output is !!
So I am now not completely sure that type #2 output really exists at
all, because it happens far less often and is hard to reproduce, and now
I could not reproduce it again to inspect the file with hexl-mode or to
check its size, to be sure that it really has mixed LF and CRLF and that
it is not a display artefact of visiting the file.
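
A check that does not depend on how any editor displays the file is to
count the line endings in the raw bytes. The following is a minimal
sketch of such a check; the names eol_report and report_file are
hypothetical, and the labels "crlf", "lf" and "mixed" are only my own
shorthand (type #1 would be "crlf", type #2 would be "mixed").

```python
def eol_report(data):
    """Count line endings in raw bytes and classify the file."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf
    # Find the first line (1-based) that ends in CRLF. The last chunk of
    # the split is not followed by LF, so it is skipped.
    first_crlf_line = None
    chunks = data.split(b"\n")
    for number, chunk in enumerate(chunks[:-1], start=1):
        if chunk.endswith(b"\r"):
            first_crlf_line = number
            break
    if crlf and lf_only:
        kind = "mixed"
    elif crlf:
        kind = "crlf"
    elif lf_only:
        kind = "lf"
    else:
        kind = "none"
    return {"crlf": crlf, "lf_only": lf_only,
            "first_crlf_line": first_crlf_line, "kind": kind}


def report_file(path):
    """Read path in binary mode, so no newline translation occurs."""
    with open(path, "rb") as handle:
        return eol_report(handle.read())
```

For example, report_file("bbdb.info") on a genuine type #2 file should
give kind "mixed" with a non-zero lf_only count, whereas a type #1 file
should give kind "crlf" with lf_only equal to 0.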
Maybe in the end it is just an EMACS display artefact and the output
info file always has type #1 (consistently CRLF endings). Maybe I was
just puzzled by the fact that even with type #1 EMACS explicitly
displays the ^M endings, which usually happens only when the line
endings are not consistent, but surely that also happens when the file
is handled like a binary file --- which seems to be the case for info
files.
I am now thinking that when I visited bbdb.info and the display looked
like type #2, what happened was that EMACS started out treating the
file as a text file and then changed its mind during the visit, which
resulted in two kinds of display (type #1 & type #2). At the end of the
day, that could reveal some bug in EMACS EOL format detection.
I will try again to see whether I can really produce some type #2 file
(or file display) again.
BR,
Vincent.
PS: Sorry for the very lengthy email, I spent my evening trying to
reproduce the issue. But now I cannot get type #2 display any longer...