libcdio-devel



From: Thomas Schmitt
Subject: Re: [Libcdio-devel] How tolerant to be towards CD-TEXT character set mislabeling ?
Date: Sun, 28 Apr 2019 13:40:01 +0200

Hi,

Ludolf Holzheid wrote:
> I would vote for even switching to CP1252, which further extends
> ISO-8859-1.

A good point, provided that it is really true that each ISO-8859-1
character has the same byte code in CP1252.
The following documents confirm this:
  https://www.i18nqa.com/debug/table-iso8859-1-vs-windows-1252.html
  https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Windows-1252

Afaik, no ISO-8859-X with X != 1 is a true superset of ISO-8859-1.


So let's consider replacing "ASCII" and "ISO-8859-1" with "CP1252" for
utmost reader tolerance:

          case CDTEXT_CHARCODE_ISO_8859_1:
            /* default */
            /* ISO-8859-1 is a subset of CP1252. If non-ISO-8859-1
             * characters are present against the CD-TEXT specification,
             * CP1252 gives more hope for a readable result than telling
             * iconv to be picky.
             */
            charset = (char *) "CP1252";
            break;
          case CDTEXT_CHARCODE_ASCII:
            /* ASCII is a subset of ISO-8859-1. Some CDs announce it but then
             * have 8-bit characters in their text. Trying CP1252 gives
             * more hope for a readable result than telling iconv to be picky.
             */
            charset = (char *) "CP1252";
            break;

But unlike "ISO-8859-1", which has already been tested extensively,
libcdio has never converted CD-TEXT from CP1252 so far.
So there should be some extra testing.
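
For a quick check outside of libcdio, something like the following
sketch might help. It is not libcdio code: to_utf8() is a hypothetical
helper made up for illustration, and it assumes a POSIX iconv which
knows the names "CP1252" and "UTF-8". It just converts a single-byte
string to UTF-8 so that the result of a given charset name can be
inspected.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libcdio: convert the NUL-terminated
 * string src from the charset named by "from" to UTF-8 in dst.
 * Returns 0 on success, -1 if iconv refuses or dst is too small.
 */
static int to_utf8(const char *from, const char *src,
                   char *dst, size_t dst_size)
{
    iconv_t cd;
    char *in = (char *) src;      /* iconv() wants a non-const pointer */
    char *out = dst;
    size_t in_left = strlen(src);
    size_t out_left;
    size_t rc;

    if (dst_size == 0)
        return -1;
    out_left = dst_size - 1;      /* keep room for the trailing NUL */

    cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t) -1)
        return -1;                /* charset name unknown to iconv */

    rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t) -1)
        return -1;                /* invalid input or output too small */

    *out = '\0';
    return 0;
}
```

Called as to_utf8("CP1252", "\xE9\x80", buf, sizeof(buf)), this should
yield the UTF-8 bytes C3 A9 E2 82 AC, i.e. "é" followed by the Euro
sign. Byte 0xE9 maps to the same character under "ISO-8859-1", while
byte 0x80 is only a control code there.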

To Serge (the bug reporter):
Will you find time, and maybe a few more audio CDs, to test it?


Have a nice day :)

Thomas



