Re: even more about character encoding names
From: John Darrington
Subject: Re: even more about character encoding names
Date: Mon, 3 Jan 2011 10:23:01 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

On Sun, Jan 02, 2011 at 01:59:05PM -0800, Ben Pfaff wrote:
     I had been under the impression, from previous discussions, that
     the string in .sav file record 7, subtype 20
     (e.g. "windows-1252") provided additional important information
     on top of what was in the code page number in record 7 subtype 3
     (e.g. 1252).

     But now that I go through my trove of .sav files that I have
     found around the Internet, I can only find one where this is the
     case.  That one is one written by PSPP itself (version
     0.7.4-g44daa4)!  In all the others, the encoding string just
     repeats what we already know from the code page number.

     Have you seen any .sav files where the character encoding name
     provides more information than the codepage number?

Based on discussions I've had with SPSS users, it seems that the datum
provided by 7.3 determines the encoding for strings in the dictionary
(i.e., variable names, variable labels, value label keys AND value label
values), whereas the string provided by 7.20 determines the encoding of
string data in the file records.  At least this is what recent SPSS
versions appear to do.
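
To make that split concrete, here is a rough sketch in Python. It is only
an illustration of the behaviour described above, not SPSS's or PSPP's
actual code: the function names are hypothetical, and the mapping from a
code page number to a codec name (including 65001 for UTF-8) is an
assumption.

```python
import codecs


def codec_for_codepage(codepage: int) -> str:
    """Hypothetical helper: map the 7.3 code page number to a Python codec."""
    # Assumption: 65001 is the Windows code page number for UTF-8.
    if codepage == 65001:
        return "utf-8"
    name = f"cp{codepage}"
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


def decode_sav_strings(dict_bytes: bytes, data_bytes: bytes,
                       codepage: int, encoding_name: str):
    """Decode dictionary strings per 7.3 and string data per 7.20."""
    dict_text = dict_bytes.decode(codec_for_codepage(codepage))
    data_text = data_bytes.decode(encoding_name)
    return dict_text, data_text


label = "caf\u00e9"

# The usual case: 7.3 and 7.20 agree.
consistent = decode_sav_strings(
    label.encode("cp1252"), label.encode("cp1252"), 1252, "windows-1252")

# A crafted file: dictionary in code page 1252, string data in UTF-8.
crafted = decode_sav_strings(
    label.encode("cp1252"), label.encode("utf-8"), 1252, "UTF-8")

# What a reader that trusted only the code page would make of the data.
misread = label.encode("utf-8").decode("cp1252")
```

A reader that applies a single encoding to both kinds of string would
garble one of them whenever the two records disagree, which is the sort
of display inconsistency one would expect from such a file.
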

Now, I have never seen a system file generated by a recent SPSS where the
two data did not correspond.  However, when I crafted such a file and
asked my friendly SPSS user to run it, they reported inconsistencies in
the way strings (especially value label keys) were displayed.

J'
--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.