Re: K-Means Clustering
From: Duane Currie
Subject: Re: K-Means Clustering
Date: Wed, 9 Mar 2011 09:56:46 -0400

Hi folks,
Just a quick jump-in:
k-means does need a distance matrix, but it's only k x n in size
(k clusters by n cases), not n x n.
You can also partition the computation if n is extremely large, but
in general, when n is very large you can typically subsample uniformly
(note: imho, the user should decide whether and how to subsample) and get
approximately the same clustering results.
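
To make that concrete, here is a rough sketch in plain Python (not PSPP or
GSL code; every function name below is made up for illustration). It assumes
squared Euclidean distance and ordinary Lloyd iterations. The assignment step
only ever looks at k distances per case, and the subsampled variant fits the
centers on a uniform random sample before labelling all n cases in one pass.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Label each point with the index of its nearest center.

    Only k distances are computed per point, so the full k x n table
    never has to be held in memory at once.
    """
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def update(points, labels, centers):
    """Recompute each center as the mean of the points assigned to it."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, j in zip(points, labels):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    # An empty cluster keeps its previous center.
    return [[s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
            for j in range(k)]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations, starting from k randomly chosen points."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def kmeans_subsampled(points, k, sample_size, seed=0):
    """Fit centers on a uniform subsample, then label every point once."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    centers, _ = kmeans(sample, k, seed=seed)
    return centers, assign(points, centers)
```

The sample size is left as a parameter on purpose, since whether and how to
subsample should be the user's call.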
Duane
On Wed, Mar 9, 2011 at 9:21 AM, John Darrington
<address@hidden> wrote:
> Hi Mehmet!
>
> As I mentioned, the CLUSTER command is something which I think would be
> great to support.
>
> One issue with clustering is its memory complexity. It requires O(n^2)
> memory, where n is the number of cases being clustered.
> Have you tested your algorithm with large numbers of cases?
>
> Maybe Ben has some ideas about how an efficient distance matrix can be
> implemented in PSPP (maybe sparse-array.c can help?).
>
> In any case, I'd be interested to see your code, and the results of your
> comparisons. Can you post them somewhere?
>
>
> J'
>
> On Tue, Mar 08, 2011 at 05:56:37AM -0800, Mehmet Hakan Satman wrote:
> hi everybody,
>
> I am interested in PSPP and I read something about the need for
> developing some functionality.
> I implemented a k-means clustering library using the GNU Scientific
> Library and sent an informative e-mail to John. He suggested that I join
> this group and share my ideas with the staff.
>
>
> I compared the results with SPSS outputs. The analysis of variance table
> is not completed, but we may add this feature.
>
> I would be glad to integrate something into PSPP and work with you.
>
> What do you think about this?
>
>
>