
Does Anyone Know of Any *Discrete* Clustering Algorithms?

Andrew (awwsmm) (he/him)

I'm doing some research into clustering algorithms, and every source I find seems to discuss 2D (or higher-dimensional) clustering of continuous data. The nearest thing I've found to what I'm looking for is this article, which discusses discrete-continuous clustering (where the x and y axes are quantized into cells, but the z axis is allowed to vary continuously).

Has anyone come across any algorithms which perform cluster analysis of purely discrete data? Specifically 2D?

Discussion

Bernhard Wittmann

How about single-linkage or complete-linkage clustering? Both are hierarchical clustering methods; you just have to choose a distance metric that works on the grid, such as Manhattan distance.
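To make that concrete, here is a minimal sketch using SciPy's hierarchical clustering with the Manhattan ("cityblock") metric. The sample points and the choice of two clusters are made up for illustration, not taken from the original question.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical integer grid points: two well-separated groups.
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [8, 8], [8, 9], [9, 8], [9, 9]])

# Pairwise Manhattan ("cityblock") distances in condensed form.
dists = pdist(points, metric="cityblock")

# method="single" gives single linkage; use "complete" for complete linkage.
tree = linkage(dists, method="single")

# Cut the tree into (at most) two flat clusters.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Swapping `method="single"` for `method="complete"` changes only how the distance between two clusters is measured; the grid metric stays the same.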

Actually, shouldn't it be possible to adapt any clustering algorithm? Take k-means as an example: you need to choose an appropriate distance metric, as above, and then adjust the calculation of the prototypes so that each one is a point on the grid.
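A rough sketch of that k-means adaptation might look like the following. The function name `grid_kmeans`, the sample points, and the specific prototype rule (coordinate-wise median rounded to the nearest integer) are assumptions of mine; it is one way to keep prototypes on the grid, not the only one.

```python
import numpy as np

def grid_kmeans(points, init_centers, n_iter=20):
    """k-means variant: Manhattan distance, prototypes snapped to the integer grid."""
    points = np.asarray(points)
    centers = np.asarray(init_centers).copy()
    for _ in range(n_iter):
        # Manhattan distance from every point to every prototype: shape (n, k).
        dists = np.abs(points[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members):  # keep the old prototype if a cluster is empty
                # The coordinate-wise median minimizes total Manhattan distance;
                # rounding snaps it back onto a grid cell.
                new_centers[j] = np.rint(np.median(members, axis=0)).astype(int)
        if np.array_equal(new_centers, centers):
            break  # prototypes stopped moving
        centers = new_centers
    return labels, centers

# Hypothetical grid data; the initial prototypes are two of the points.
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 1],
                   [8, 8], [8, 9], [9, 8], [9, 9]])
labels, centers = grid_kmeans(points, init_centers=points[[0, 5]])
print(labels, centers)
```

If the prototype has to be an actual data point rather than just any grid cell, picking the cluster member with the smallest total distance to the others (a k-medoids-style update) would be the alternative.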

Dylan

Can you be a bit more specific about what your data looks like? Are x and y categorical features and z continuous? At some point I had an SO thread about combining data-specific distance functions in a nearest-neighbor search. I can't find it anymore, but it would be sort of like def custom_distance(a, b): return dice(a_categorical, b_categorical) + euclidean(a_continuous, b_continuous), with dice and euclidean coming from scipy.spatial.distance.
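Fleshed out a little, that idea could look like the sketch below. The column layout (`N_CATEGORICAL` boolean flags first, continuous values after) and the sample rows are assumptions for illustration, and the two terms are simply added without any weighting, so the continuous part can easily dominate.

```python
import numpy as np
from scipy.spatial.distance import dice, euclidean, pdist, squareform

N_CATEGORICAL = 3  # assumption: the first 3 columns are boolean (one-hot) flags

def custom_distance(a, b):
    """Dice dissimilarity on the boolean columns plus Euclidean on the rest."""
    a, b = np.asarray(a), np.asarray(b)
    cat = dice(a[:N_CATEGORICAL].astype(bool), b[:N_CATEGORICAL].astype(bool))
    cont = euclidean(a[N_CATEGORICAL:], b[N_CATEGORICAL:])
    return cat + cont

# Hypothetical rows: 3 boolean flags followed by 2 continuous values.
X = np.array([[1, 0, 1, 0.0, 0.0],
              [1, 1, 0, 3.0, 4.0],
              [0, 1, 0, 3.0, 5.0]])

# pdist accepts a callable metric, so the result can feed a clustering routine.
D = squareform(pdist(X, metric=custom_distance))
print(D)
```

One caveat: Dice dissimilarity is undefined when both boolean vectors are all false, so rows with no flags set would need special handling.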

It looks sort of like: members.cbio.mines-paristech.fr/~j...

Found it! Hopefully something in this thread is helpful.
datascience.stackexchange.com/ques...