Greetings,
I've used PSPP to extract columnar data from a
2008 public use data file named NAMCS08.exe.
It's a medical care survey.
The extraction script I used is pasted
below.
To keep the script small, I've left unneeded data
lumped together in large filler columns (DN1 through DN7).
The DIAG1 and DIAG2 columns contain non-numeric
diagnostic codes.
The PVW column is a weight value.
The CSTRATM column is a strata value and the CPSUM
column is a cluster value, for use in variance estimation.
My understanding is that if I isolate a class of
DIAG codes within the data and then sum their respective PVW
weights, I'll have a national annual estimate of incidence for
that type of diagnosis. However, the estimate needs to
be accompanied by a relative standard error (RSE). The CSTRATM and
CPSUM variables are for use in calculating the RSE.
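To make the calculation described above concrete, here is a rough sketch in Python (not PSPP syntax, purely for illustration). It sums the weights of the flagged records and estimates the RSE with a stratified, with-replacement "ultimate cluster" formula. That choice of formula is an assumption: it has not been checked against the script on page 89 of the NAMCS documentation or against what the SPSS Complex Samples module does. The function name and the tuple layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def weighted_total_and_rse(records):
    """Estimate a weighted total and its relative standard error.

    records: iterable of (stratum, cluster, weight, in_domain) tuples,
    one per survey record. in_domain is True when the record has one of
    the diagnosis codes of interest.

    ASSUMPTION: variance is estimated with the with-replacement
    ultimate-cluster formula, summed over strata:
        var = sum_h  n_h / (n_h - 1) * sum_i (z_hi - mean_h)^2
    where z_hi is the weighted total of cluster i in stratum h.
    """
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for stratum, cluster, weight, in_domain in records:
        # Every record registers its cluster, even when it adds zero,
        # so clusters without any matching record still count in n_h.
        cluster_totals[stratum][cluster] += weight if in_domain else 0.0

    total = 0.0
    variance = 0.0
    for clusters in cluster_totals.values():
        z = list(clusters.values())
        n = len(z)
        total += sum(z)
        if n < 2:
            # Simplification: a stratum with a single cluster adds no
            # variance here; real software may handle this differently.
            continue
        mean = sum(z) / n
        variance += n / (n - 1) * sum((v - mean) ** 2 for v in z)

    rse = sqrt(variance) / total if total else float("nan")
    return total, rse
```

Each record would be passed in as (CSTRATM, CPSUM, weight, flag), where the flag says whether the record's diagnosis code belongs to the class being estimated.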
Apparently SPSS will do this with a script partly
provided on page 89 of the NAMCS file documentation. They mention "SPSS
Complex Samples 12.0 Module".
I've read the PSPP manual and attempted a script
myself, but I'm not even close, and I'm in over my head.
Before I completely abandon this exploration into NAMCS
"public use data files", I thought I'd put this in front of the community.
My objectives:
1 - aggregate the weights of all records that have a
DIAG1 value of 38200, 38201, 3824-, or 3829-, to provide a national estimate of
incidence for those codes, with RSE calculated based on the strata and cluster
values.
2 - if possible, expand objective 1 to
also include all records that have a DIAG2 value of 38200, 38201, 3824-, or
3829- and a DIAG1 code of 49390, 4659-, or V202.
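The two selection rules above can be written out as a small Python sketch (again Python rather than PSPP syntax, for illustration only). The field positions are copied from the extraction script below and treated as 0-based, inclusive column ranges. The diagnosis codes are copied exactly as written in the objectives; how the file actually pads a four-character code such as V202 has not been checked, so that is an assumption. The function names are hypothetical.

```python
# Codes copied from the objectives as written (padding in the file unverified).
TARGET_CODES = {"38200", "38201", "3824-", "3829-"}
COMPANION_CODES = {"49390", "4659-", "V202"}


def parse_record(line):
    """Pull the needed fields out of one fixed-width line.

    Column ranges follow the extraction script (0-based, inclusive),
    so DIAG1 54-58 becomes the Python slice [54:59].
    """
    return {
        "DIAG1": line[54:59].strip(),
        "DIAG2": line[59:64].strip(),
        "PVW": float(line[302:308]),
        "CSTRATM": int(line[969:977]),
        "CPSUM": int(line[977:983]),
    }


def meets_objective_1(diag1):
    """Objective 1: DIAG1 is one of the target codes."""
    return diag1.strip() in TARGET_CODES


def meets_objective_2(diag1, diag2):
    """Objective 2: objective 1, or a target code in DIAG2 together
    with one of the companion codes in DIAG1."""
    d1, d2 = diag1.strip(), diag2.strip()
    return d1 in TARGET_CODES or (d2 in TARGET_CODES and d1 in COMPANION_CODES)
```

A record that passes either test would then contribute its PVW weight to the estimate; all other records contribute zero but are still kept for the variance calculation.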
Is PSPP capable of this?
FWIW - I have a respectable skill set in various arenas,
but statistical assessment is not part of it. I'm simply in over my
head.
If my objectives are difficult to achieve with PSPP,
then I'm done and will move on to something else.
However, if this little project requires only moderate
effort from a person with expertise, then I would be immensely grateful if
someone could provide a script or show me how to do it. I'm not sure about
the protocols of this mailing list, but if I'm allowed to say so,
compensation for the work is available.
Thanks!
DS
Extraction script:
set workspace=100000000.
GET DATA /TYPE=TXT
/FILE='C:\Documents and Settings\Owner\My Documents\WPI New\Papers\Research Data\NAMCS\2008 raw data\NAMCS08'
/ARRANGEMENT=FIXED /FIRSTCASE=2 /IMPORTCASE=ALL
/VARIABLES=VMONTH 0-1 F
VYEAR 2-5 F
VDAYR 6-6 F
AGE 7-9 F
SEX 10-10 F
ETHNIC 11-12 F
RACE 13-14 F
DN1 15-51 A
Reason 52-53 F
DIAG1 54-58 A
DIAG2 59-63 A
DIAG3 64-68 A
DN2 69-222 A
MED 223-223 F
MED1 224-228 F
MED2 229-233 F
MED3 234-238 F
MED4to8 239-263 A
NCMED1 264-265 F
NCMED2 266-267 F
NCMED3 268-269 F
DN3 270-301 A
PVW 302-307 F
DN35 308-327 A
DRUGID1 328-333 A
DN4 334-384 A
DRUGID2 385-390 A
DN5 391-441 A
DRUGID3 442-447 A
DN6 448-968 A
CSTRATM 969-976 F
CPSUM 977-982 F
DN7 983-996 A.