Re: Are these bugs in cluster?

From: Alan Mead
Subject: Re: Are these bugs in cluster?
Date: Fri, 29 May 2015 09:33:30 -0500

John suggested that I post to pspp-dev.  I'm adding code to the k-means
(i.e., quick-cluster.c) procedure to show cluster membership.

CLUSTER works perfectly on a trivial two-dimensional problem but it
fails miserably on some real data. For example, in one analysis
requesting 3 clusters on 98 cases, it put every case in cluster 3 and
none in clusters 1 and 2.  I think part of the problem is that the
starting values seem to be a pattern of 1s and 0s, even though the
comments describe selecting random individuals as starting values.
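
To illustrate how starting centers alone could produce this symptom,
here is a minimal standalone sketch. It is not taken from
quick-cluster.c: the function names nearest_center and count_members,
the row-major array layout, and any data passed in are assumptions made
for the example. It assigns each case to the nearest center by squared
Euclidean distance and counts the members of each cluster.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper, not PSPP code: return the index of the center
   nearest to case X, using squared Euclidean distance.  CENTERS is a
   row-major N_CLUSTERS x N_VARS array. */
static size_t
nearest_center (const double *x, const double *centers,
                size_t n_clusters, size_t n_vars)
{
  size_t best = 0;
  double best_dist = DBL_MAX;
  size_t g, v;

  assert (n_clusters > 0);
  for (g = 0; g < n_clusters; g++)
    {
      double dist = 0.0;
      for (v = 0; v < n_vars; v++)
        {
          double d = x[v] - centers[g * n_vars + v];
          dist += d * d;
        }
      if (dist < best_dist)
        {
          best_dist = dist;
          best = g;
        }
    }
  return best;
}

/* Hypothetical helper: count how many of the N_CASES rows in DATA
   (row-major, N_VARS columns) fall in each cluster.  COUNTS must have
   room for N_CLUSTERS entries. */
static void
count_members (const double *data, size_t n_cases,
               const double *centers, size_t n_clusters,
               size_t n_vars, size_t *counts)
{
  size_t i, g;

  for (g = 0; g < n_clusters; g++)
    counts[g] = 0;
  for (i = 0; i < n_cases; i++)
    counts[nearest_center (data + i * n_vars, centers,
                           n_clusters, n_vars)]++;
}
```

With made-up cases such as (10,10), (11,12), (50,52), (51,49) and
starting centers (0,0), (1,0), (1,1), every case is nearest to the last
center, so the first two clusters start out empty. Taking three of the
cases themselves as centers spreads the membership across all three
clusters.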

My question is about accessing the data.  I copied other code to use a
"casereader" to iterate over the rows of data. Below are the relevant
parts of the code I've added that seems to display cluster membership.
If I want to randomly select cases as starting values, is there a way to
retrieve random records directly?


quick_cluster_show_membership (struct Kmeans *kmeans,
                               const struct casereader *reader,
                               const struct qc *qc)
{
  struct ccase *c;
  struct casereader *cs = casereader_clone (reader);
  [...]
  for (i = 0; (c = casereader_read (cs)) != NULL; i++, case_unref (c))
    [...]
}
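
On the random-selection question: if only sequential access through
casereader_read is assumed, one option is reservoir sampling
(Algorithm R), which picks k rows uniformly at random in a single pass
without knowing the row count in advance. The sketch below is
standalone and works on row indices rather than PSPP cases; the name
pick_random_rows is hypothetical, and rand () is used only for brevity
(its modulo introduces a slight bias).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not PSPP code: choose up to K of the row
   indices 0 .. N_ROWS - 1 uniformly at random, visiting the rows once
   and in order, as a sequential reader would.  PICKED must have room
   for K entries.  Returns the number of entries filled in, which is
   less than K only when there are fewer than K rows. */
static size_t
pick_random_rows (size_t n_rows, size_t k, size_t *picked)
{
  size_t i;

  assert (picked != NULL);
  for (i = 0; i < n_rows; i++)
    {
      if (i < k)
        picked[i] = i;          /* The first K rows fill the reservoir. */
      else
        {
          /* Row I replaces a random slot with probability K / (I + 1). */
          size_t j = (size_t) rand () % (i + 1);
          if (j < k)
            picked[j] = i;
        }
    }
  return n_rows < k ? n_rows : k;
}
```

Inside a loop like the one above, the same test would decide whether
the current case replaces one of the k saved starting cases, so the
data needs to be read only once.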


Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.

science + technology = better workers

+815.588.3846 (Office)
+267.334.4143 (Mobile)

Announcing the Journal of Computerized Adaptive Testing (JCAT), a
peer-reviewed electronic journal designed to advance the science and
practice of computerized adaptive testing:
