pspp-users

Reading apache log files


From: John Darrington
Subject: Reading apache log files
Date: Thu, 30 Nov 2006 19:02:26 +0800
User-agent: Mutt/1.5.4i

A number of people have asked me recently how to use PSPP to read the log
files generated by Apache.  It's awkward, because some of the fields
are fixed width, whereas others are delimited, variable-width fields.
Thus, a combination of DATA LIST FIXED and DATA LIST LIST is needed.
A further complication is that some fields are numeric, but Apache
writes "-" in place of zero.

One way to read these files is as follows:

INPUT PROGRAM.
FILE HANDLE logfile /NAME='access.log'.
DATA LIST LIST NOTABLE FILE=logfile
    /IP (a15)
    CLIENTID (a255)
    USERID (a255)
    .

COMPUTE #POS =
    LENGTH (RTRIM(IP)) + 1
    + LENGTH (RTRIM (CLIENTID)) + 1
    + LENGTH (RTRIM (USERID)) + 1
    + 2.

REREAD COLUMN=#POS.


DATA LIST FIXED NOTABLE FILE=logfile
    /TIMEDATE 1-26 (A)
    .

COMPUTE #POS = #POS + 26 + 2.
REREAD COLUMN=#POS.

DATA LIST LIST NOTABLE FILE=logfile
    /REQUEST (a512)
    STATUS (f8.0)
    ASIZE (a10)
    REFERER (a512)
    USER_AGENT (a512)
    .
NUMERIC BYTES.
FORMATS BYTES (F8.0).
DO IF ASIZE <> "-".
COMPUTE BYTES=NUMBER (ASIZE, F8.0).
END IF.

* Recode to the natural logarithm.
COMPUTE LNBYTES=LN(BYTES).

END INPUT PROGRAM.



DISPLAY DICTIONARY.

EXAMINE BYTES LNBYTES
  /STATISTICS=DESCRIPTIVES
  /PLOT=NPPLOT, BOXPLOT, HISTOGRAM.
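
The column arithmetic in the two COMPUTE #POS steps can be checked
outside PSPP.  What follows is only a rough Python sketch, not part of
the syntax above.  It assumes the Apache "combined" log format and a
26-character timestamp, and it uses the sample line from the Apache
documentation, since this post contains no log data.  The names
first_pos and second_pos are made up for illustration.

```python
# Sample line in Apache "combined" format, as given in the Apache
# documentation (an assumption: the original post shows no log data).
line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
        '"http://www.example.com/start.html" '
        '"Mozilla/4.08 [en] (Win98; I ;Nav)"')

# The first three fields are space-delimited, like the first DATA LIST LIST.
ip, clientid, userid = line.split(" ")[:3]

# First COMPUTE #POS: each field plus its trailing space, then 2 more to
# land on the first character after the "[" (columns are 1-based).
first_pos = len(ip) + 1 + len(clientid) + 1 + len(userid) + 1 + 2

# The fixed-width read: 26 characters starting at that column.
timedate = line[first_pos - 1:first_pos - 1 + 26]

# Second COMPUTE #POS: skip the timestamp, the "]" and the space after it.
second_pos = first_pos + 26 + 2
rest = line[second_pos - 1:]

print(first_pos, repr(timedate))
print(second_pos, rest[:30])
```

For this sample line the timestamp starts at column 20 and the quoted
request at column 48, which is where the second DATA LIST LIST needs
to begin.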


Further refinements might be needed to read all files of the form
access.log.* in sequence.
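
One way to sidestep the problem, rather than solve it in PSPP, is to
concatenate the rotated files beforehand and point FILE HANDLE at the
result.  The Python sketch below does that under some assumptions that
are mine, not anything PSPP or Apache guarantees: rotated files carry
a numeric suffix (access.log.1, access.log.2, ...), a higher number
means an older file, and compressed files are left out.  The function
names and the output name combined.log are hypothetical.

```python
import glob
import re


def rotation_number(path):
    """Return the numeric suffix of a path like access.log.3, else None."""
    match = re.search(r"\.(\d+)$", path)
    return int(match.group(1)) if match else None


def combine_logs(pattern, out_path):
    """Concatenate the numbered files matching pattern into out_path.

    Files are written oldest first, assuming a higher suffix is older.
    Files without a numeric suffix (e.g. *.gz) are skipped.
    """
    paths = [p for p in glob.glob(pattern) if rotation_number(p) is not None]
    paths.sort(key=rotation_number, reverse=True)
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as src:
                for line in src:
                    out.write(line)
    return paths
```

Calling combine_logs("access.log.*", "combined.log") and then using
/NAME='combined.log' in the FILE HANDLE command would let the syntax
above run unchanged over all the rotated files.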

J'

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.

