pspp-dev

Re: Outputting .sav files


From: Ben Pfaff
Subject: Re: Outputting .sav files
Date: Fri, 02 Jan 2009 22:25:34 -0800
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

John Darrington <address@hidden> writes:

> On Thu, Jan 01, 2009 at 03:33:10PM -0800, Ben Pfaff wrote:
>      And, of course, we should support the encoding for .sav files,
>      too.
>
> My concern would be ensuring that we don't break anything which
> has previously "worked".   There are a lot of older files in the
> wild which have the character encoding of 2 but contain non-ascii
> characters.

I think that we could do pretty well with a strategy like:

        1. If the file has a specific character encoding, use it.

        2. Otherwise, if the file has no characters with the high
           bit set ("non-ASCII"), don't worry about it.

        3. Otherwise, if the user specified an encoding on the GET
           command, use that.

        4. Otherwise, default to an encoding based on the locale
           and issue a warning that says what we just did and how
           the user could override it by specifying an encoding.
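To make the order of the checks concrete, here is a rough sketch of that strategy in Python. It is only an illustration, not PSPP code: the function name `choose_encoding` and its parameters are made up, it assumes the caller passes `None` for `file_encoding` when the file declares nothing more specific than plain ASCII, and returning "ascii" for step 2 is my own choice for "don't worry about it".

```python
import locale


def choose_encoding(file_encoding, raw_strings, user_encoding=None):
    """Return (encoding, warning) following the four-step fallback.

    file_encoding: encoding name declared by the file, or None (hypothetical).
    raw_strings:   iterable of bytes objects read from the file.
    user_encoding: encoding the user gave on GET, or None (hypothetical).
    """
    # 1. The file names a specific encoding: trust it.
    if file_encoding:
        return file_encoding, None

    # 2. No byte has the high bit set, so any ASCII-compatible
    #    encoding decodes the data identically.
    if all(byte < 0x80 for s in raw_strings for byte in s):
        return "ascii", None

    # 3. The user told us what to use.
    if user_encoding:
        return user_encoding, None

    # 4. Fall back to the locale and say so.
    fallback = locale.getpreferredencoding(False)
    warning = (
        "file contains non-ASCII characters but declares no encoding; "
        "assuming %s from the locale (specify an encoding on GET to override)"
        % fallback
    )
    return fallback, warning
```

Only step 4 ever produces a warning, so files that worked before and really are pure ASCII stay silent.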

>      > I seem to remember that we had a discussion about converting codepage
>      > numbers to posix encodings a few months ago, but I couldn't find it by
>      > searching the mailing list archives.
>      
>      I think that Gnulib or gettext might include a mapping table.
>
> I had a brief look but couldn't find anything.  There's some
> information on other websites, but nothing which is either complete or
> authoritative. 

The list here looks pretty extensive:
    http://demo.icu-project.org/icu-bin/convexp?s=WINDOWS

The main page of that site makes the bold claim that "ICU's
conversion tables are based on charset data collected by IBM over
the course of many decades, and is the most complete available
anywhere."

Of course it's always hard to say that something like this is
complete.  You never know when another encoding might pop up.
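Whatever table we end up using, the lookup itself should probably be allowed to fail cleanly for code pages it has never heard of. A minimal sketch in Python, assuming a tiny hand-written table (the four entries are well-known Windows code page numbers, not taken from the ICU list) and falling back to Python's own "cpNNNN" codec names; `KNOWN_CODEPAGES` and `codepage_to_encoding` are hypothetical names:

```python
import codecs

# A few well-known Windows code page numbers; deliberately incomplete.
KNOWN_CODEPAGES = {
    1252: "windows-1252",
    20127: "ascii",
    28591: "iso-8859-1",
    65001: "utf-8",
}


def codepage_to_encoding(codepage):
    """Map a code page number to a codec name, or None if unknown."""
    # Prefer the explicit table, then try the conventional "cpNNNN" name.
    name = KNOWN_CODEPAGES.get(codepage, "cp%d" % codepage)
    try:
        # codecs.lookup() raises LookupError for names it does not know.
        return codecs.lookup(name).name
    except LookupError:
        return None
```

Returning None instead of guessing lets the caller drop through to the locale-based default and its warning.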
-- 
I love deadlines.
I love the whooshing noise they make as they go by.
--Douglas Adams



