RE: seeking a good data set for CTABLES examples

Hi Ben,

The table looks great! Good job.

It would be helpful if gridlines could be shown. I realize the tables could be imported into Word, Excel, or other applications and the lines added there. I assume that the tables in PSPP have hidden lines (cells) that carry over on import into another application. In SPSS, I used the Tables option mainly to present data to lay personnel rather than for my own use, since I was okay with the non-table formats generated by SPSS and, thanks to you, PSPP. My data tables were used in grant applications for federal and state funding and in discussions with college and university personnel about using big data findings. It amazes me that most colleges and universities do not see the value of data mining and reporting.

I would be interested in knowing which cell reporting options (row, column, and sheet counts and percentages, etc.) you plan to use. I do not need to know this now. When I used SPSS, there were some annoying limitations on displaying stats and formats. I am excited about what you have done and realize that enhancements will be made over time, just as you have always done with PSPP.

Take care,

John

___________________________

Email: jhwhite@techwriteinc.com

From: Pspp-users <pspp-users-bounces+jhwhite=techwriteinc.com@gnu.org> On Behalf Of Ben Pfaff
Sent: Monday, January 17, 2022 1:37 AM
To: pspp-users <pspp-users@gnu.org>
Subject: Re: seeking a good data set for CTABLES examples

Here's an example of what I can do currently with this dataset and CTABLES. Syntax:

CTABLES /TABLE QN105BA[c] + QN105BB[c] + QN105BC[c] + QN105BD[c]
/CLABELS ROWLABELS=OPPOSITE.

Output:

Custom Tables
╭──────────────────────────────────────────────────────────────┬───────────┬────────┬───────────┬─────────────┬──────────╮
│                                                              │  Almost   │  Very  │ Somewhat  │  Somewhat   │   Very   │
│                                                              │  certain  │ likely │  likely   │  unlikely   │ unlikely │
│                                                              ├───────────┼────────┼───────────┼─────────────┼──────────┤
│                                                              │   Count   │ Count  │   Count   │    Count    │  Count   │
├──────────────────────────────────────────────────────────────┼───────────┼────────┼───────────┼─────────────┼──────────┤
│105b. How likely is it that drivers who have had too much to  │        700│    1502│       2763│         1307│       609│
│drink to drive safely will A. Get stopped by the police?      │           │        │           │             │          │
│105b. How likely is it that drivers who have had too much to  │       1100│    2819│       2417│          430│       140│
│drink to drive safely will B. Have an accident?               │           │        │           │             │          │
│105b. How likely is it that drivers who have had too much to  │       1149│    2037│       2032│          994│       622│
│drink to drive safely will C. Be convicted for drunk driving? │           │        │           │             │          │
│105b. How likely is it that drivers who have had too much to  │       1101│    1834│       2307│         1095│       549│
│drink to drive safely will D. Be arrested for drunk driving?  │           │        │           │             │          │
╰──────────────────────────────────────────────────────────────┴───────────┴────────┴───────────┴─────────────┴──────────╯

So, progress!

On Sun, Jan 16, 2022 at 2:29 PM Ben Pfaff <blp@cs.stanford.edu> wrote:

Here is one data set that seems to suit the purpose:
https://catalog.data.gov/dataset/2008-national-survey-of-drinking-and-driving-attitudes-and-behaviors
It's not perfect because the variable names are poor (they are simply
named for question numbers) and because a lot of the variables have a
wrong measurement level, but I'm going to start from it.

Please feel free to send me more data sets.

On Sat, Jan 15, 2022 at 10:40 AM Ben Pfaff <blp@cs.stanford.edu> wrote:
>
> Hi! I'm getting to the point with work on CTABLES that I need a good
> data set for use in examples. A good data set would need to be:
> * Publicly available and freely redistributable.
> * Medium size (at least hundreds of cases).
> * Have a mix of categorical and scale variables.
> * Contain some variables suitable for multiple response sets.
>
> I can't use the data sets that come with SPSS because it's not clear
> that they are freely redistributable.
>
> I'd appreciate advice and pointers.
>
> Thanks,
>
> Ben.

From:	jhwhite
Subject:	RE: seeking a good data set for CTABLES examples
Date:	Mon, 17 Jan 2022 11:50:28 -0500