lilypond-devel

Encoding of LilyPond console output


From: Wilbert Berendsen
Subject: Encoding of LilyPond console output
Date: Sat, 31 Dec 2011 14:44:01 +0100

Hi all,

While dealing with files and directories whose names contain accented
letters (non-ASCII filenames), I came across a number of small issues
that I need to handle correctly in Frescobaldi, my LilyPond companion.

Everything boils down to the question of which encoding LilyPond uses
for its 8-bit console output. I had always assumed that this was UTF-8,
which worked correctly on Linux: both filenames with accented letters
and translated messages (such as the French ones, with many accented
letters) always showed up correctly in the LilyPond console output,
which Frescobaldi reads as an 8-bit byte stream and decodes into
Unicode strings using the UTF-8 encoding.

Then, on Windows, I discovered that filenames do not use UTF-8
encoding, but rather 'mbcs' or something like that (what is returned
by sys.getfilesystemencoding() in Python).

So I changed Frescobaldi to use that encoding when reading LilyPond
console output, but then we discovered that translated messages (such
as the French ones) with accented characters show up garbled
(clearly something like UTF-8 displayed as Latin-1).
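
The garbling described above can be reproduced in a few lines of
Python. This is only an illustration: the message text is made up and
is not an actual LilyPond translation, and Latin-1 stands in for
whatever 8-bit encoding the reader wrongly assumes.

```python
# Illustration only: a made-up French message, not real LilyPond output.
message = "Avertissement : répertoire créé"

raw = message.encode("utf-8")        # the bytes as written in UTF-8
garbled = raw.decode("latin-1")      # what a reader assuming Latin-1 sees
correct = raw.decode("utf-8")        # what a reader assuming UTF-8 sees

print(garbled)
print(correct)
```

Each accented letter takes two bytes in UTF-8, so the wrong decoding
shows two characters where there should be one.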

So again I changed Frescobaldi, and now it reads the console output
byte stream and parses that for file references (such as: file.ly:12:3:
error: blabla) and decodes those filenames using the filesystem
encoding, and the rest using UTF-8.
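
A minimal sketch of that split decoding might look like the following.
It is not Frescobaldi's actual code: the regular expression, the
function name decode_line and the fs_encoding parameter are all
assumptions made for illustration, and only references at the start of
a line are handled. The encoding is passed explicitly because what
sys.getfilesystemencoding() returns depends on the platform and the
Python version.

```python
import re
import sys

# Hypothetical pattern for "file.ly:12:3:" style references at the start
# of a line. The lazy match keeps a Windows drive letter ("C:\...") as
# part of the filename, since "C:" is not followed by digits.
FILE_REF = re.compile(rb"^(?P<name>.+?):(?P<line>\d+):(?P<col>\d+):")


def decode_line(raw, fs_encoding=None):
    """Decode one line of console output given as bytes.

    The filename of a leading file reference is decoded with the
    filesystem encoding, everything else with UTF-8.
    """
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    match = FILE_REF.match(raw)
    if match is None:
        # No file reference: treat the whole line as UTF-8.
        return raw.decode("utf-8", "replace")
    name = match.group("name").decode(fs_encoding, "replace")
    rest = raw[match.end("name"):].decode("utf-8", "replace")
    return name + rest


# Example: a cp1252 filename followed by a UTF-8 message (both made up).
line = "café.ly".encode("cp1252") + b":12:3: " + "erreur : répété".encode("utf-8")
print(decode_line(line, "cp1252"))
```

Using the "replace" error handler means a wrongly guessed encoding
yields replacement characters instead of an exception, which seems the
safer choice for a log view.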

This seems to work well: file references are parsed correctly and
messages are still readable.

The only remaining problem is with other LilyPond messages that show
filenames, like "processing `file.ly'...": they show the filename in
the wrong encoding, because the filename is written as-is in the
filesystem encoding, intermingled in a message encoded as UTF-8. This
can also be seen on the Windows console (both CMD and the Git bash
console).
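
To illustrate why no single codec can decode such a line, here is a
small reconstruction. The message wording is invented, and cp1252 is
only assumed as a typical Windows filesystem encoding.

```python
# Hypothetical mixed line: UTF-8 message text around a cp1252 filename.
filename = "café.ly"
raw = (
    "Traitement de « ".encode("utf-8")
    + filename.encode("cp1252")
    + " »...".encode("utf-8")
)

# Strict UTF-8 decoding fails on the filename bytes.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decoding everything as cp1252 gets the filename right but garbles
# the non-ASCII characters of the surrounding message.
as_cp1252 = raw.decode("cp1252", "replace")

print(valid_utf8)
print(as_cp1252)
```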

So, putting everything together: is my analysis of the mixed output
encodings LilyPond uses on stdout and stderr correct?

And in line with this: can LilyPond be made more aware of this, and use
the same encoding for all output (correctly encoding filenames)? Or am
I wrong?

With many regards and best wishes to all for the new year!
Wilbert

-- 
Wilbert Berendsen
(http://www.wilbertberendsen.nl)



