
Re: [sc-users] Similarity Clumping



On Sun, Dec 1, 2019 at 4:00 PM <jables.deutsch@xxxxxxxxx> wrote:
> I'm thinking the example you cautioned against is actually better for my purposes, since it doesn't alter the array values.

I'm looking at KMeans now. The only place where the 'data' array is
modified in any way is when adding a new data point from the user:
`data = data ++ [datum]`.

I can't see any place in this class where user data are altered.

The thing that I always found tricky about this class is that you have
to specify the number of clumps in advance. I think it would be more
typical that you don't know the distribution ahead of time, and you
want the analysis to figure out where the clusters are. But that might
be a different statistical technique.
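
For one-dimensional values, one such different technique (not KMeans,
and not anything from that class) is to sort the values and split
wherever the gap between neighbors is larger than some threshold. This
is only a rough sketch: '~clumpByGap' and 'maxGap' are names I made up
for illustration, and you would still have to choose the threshold by
hand. It does not need the number of clumps in advance, and it works
on a sorted copy, so the input array is left alone.

```supercollider
(
// Hypothetical helper, not part of KMeans: split 1-D data into clumps
// wherever two sorted neighbors are more than maxGap apart.
~clumpByGap = { |values, maxGap = 3|
    // sort a copy, because 'sort' works in place on the receiver
    var sorted = values.copy.sort;
    // 'separate' makes a cut between a and b when the function is true
    sorted.separate({ |a, b| (b - a) > maxGap })
};

~clumpByGap.([0, 1, 2, 8, 9, 10, 20, 21, 22], 3);
// -> [ [ 0, 1, 2 ], [ 8, 9, 10 ], [ 20, 21, 22 ] ]
)
```

The number of clumps falls out of the data here, but a poorly chosen
'maxGap' will merge or split clumps, so it trades one parameter for
another.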

The other trick is that it seems to expect each data point to be an
array, so that 'data' is a two-dimensional array. For one-dimensional
values, you can fake it by adding single-item arrays:

k = KMeans(3);

[0, 1, 2, 8, 9, 10, 20, 21, 22].do({ |datum| k.add([datum]) });
k.update

k.centroids;
-> [ [ 1.0 ], [ 9.0 ], [ 21.0 ] ]

^^ That's correct; 1, 9 and 21 are in the middle of each of their
respective ranges.

k.assignments;
-> [ 0, 0, 0, 1, 1, 1, 2, 2, 2 ]

^^ Individual data points are associated with the nearest centroid.

k.data;
-> [ [ 0 ], [ 1 ], [ 2 ], [ 8 ], [ 9 ], [ 10 ], [ 20 ], [ 21 ], [ 22 ] ]

^^ And the 'data' array is unchanged.

hjh
