Cluster of songs using distances
Once the distance metric is calculated (the output from foucluster.distance.distance_matrix), these distances between the songs are used as features for clustering.
Several methodologies from sklearn are imported:
- Indicating the number of clusters (KMeans, AgglomerativeClustering, SpectralClustering).
- Without the number of clusters (AffinityPropagation, MeanShift).
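The difference between the two types can be sketched with sklearn alone. This is a minimal illustration, not foucluster code: the toy distance matrix, the song names and the MeanShift bandwidth are assumptions chosen so the example is small and reproducible.

```python
import pandas as pd
from sklearn.cluster import KMeans, MeanShift

# Toy symmetric distance matrix for four hypothetical songs:
# "a" and "b" are close to each other, as are "c" and "d".
songs = ["a", "b", "c", "d"]
dist_df = pd.DataFrame(
    [[0.0, 0.1, 0.9, 1.0],
     [0.1, 0.0, 1.0, 0.9],
     [0.9, 1.0, 0.0, 0.1],
     [1.0, 0.9, 0.1, 0.0]],
    index=songs, columns=songs,
)

# Each row (one song's distances to every song) is used as a feature vector.
features = dist_df.to_numpy()

# First type: the number of clusters is indicated up front.
fixed_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# Second type: the estimator decides how many clusters exist.
# The bandwidth is set by hand here because the toy data set is tiny.
auto_labels = MeanShift(bandwidth=0.5).fit_predict(features)

print(dict(zip(songs, fixed_labels)))
print(dict(zip(songs, auto_labels)))
```

In both cases the labels themselves are arbitrary integers; only which songs share a label is meaningful.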
For the first type of clusters,
foucluster.cluster.determinist_cluster(dist_df, method, n_clusters)
Clustering of the songs from the dataframe, indicating the number of clusters to use.
Returns: pandas.DataFrame with a column with clusters.
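As a rough idea of what such a function does, the sketch below looks up an sklearn estimator by name, fits it with a fixed number of clusters and appends the labels as a column. It is a hypothetical stand-in, not the foucluster implementation: the helper name cluster_with_k, the "Cluster" column name and the use of bare class names for method are all assumptions.

```python
import pandas as pd
from sklearn import cluster

# Estimators that require the number of clusters up front.
FIXED_K_METHODS = ("KMeans", "AgglomerativeClustering", "SpectralClustering")


def cluster_with_k(dist_df, method, n_clusters):
    """Hypothetical stand-in: label each row of dist_df with a cluster id."""
    if method not in FIXED_K_METHODS:
        raise ValueError(f"{method!r} does not take a number of clusters")
    estimator = getattr(cluster, method)(n_clusters=n_clusters)
    labels = estimator.fit_predict(dist_df.to_numpy())
    result = dist_df.copy()
    result["Cluster"] = labels  # column name is an assumption
    return result


# Toy distance matrix for four hypothetical songs.
songs = ["a", "b", "c", "d"]
dist_df = pd.DataFrame(
    [[0.0, 0.1, 0.9, 1.0],
     [0.1, 0.0, 1.0, 0.9],
     [0.9, 1.0, 0.0, 0.1],
     [1.0, 0.9, 0.1, 0.0]],
    index=songs, columns=songs,
)

result = cluster_with_k(dist_df, "AgglomerativeClustering", 2)
print(result["Cluster"])
```

The input dataframe is copied so that the distance matrix itself is left unchanged.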
For both types of clusters,
foucluster.cluster.automatic_cluster(dist_df, method)
Parameters:
- dist_df (pd.DataFrame)
- method (str) – name of the sklearn.cluster class to use:
  - cluster.AffinityPropagation.
  - cluster.MeanShift.
  - cluster.AgglomerativeClustering.
  - cluster.SpectralClustering.
  - cluster.KMeans.
Returns: pandas.DataFrame with a column with clusters.
When an algorithm that needs the number of clusters, such as KMeans, is used with automatic_cluster, the function calls the jump method to calculate the number of clusters.
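For orientation, the jump method (Sugar and James) can be sketched as follows: run KMeans for k = 1..K, compute the per-dimension distortion d_k, transform it to d_k^(-p/2) where p is the number of features, and pick the k with the largest increase over k - 1. This is a minimal sketch on synthetic data, not the foucluster implementation; the function name jump_method, the max_clusters parameter and the three-blob data set are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans


def jump_method(features, max_clusters=10, random_state=0):
    """Estimate the number of clusters as the k with the largest jump."""
    n_samples, n_dims = features.shape
    power = n_dims / 2.0
    transformed = [0.0]  # by convention, the transformed distortion for k = 0 is 0
    for k in range(1, max_clusters + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(features)
        # Average squared distance to the closest centre, per dimension.
        # Assumes k stays well below the number of distinct points,
        # so the distortion is never exactly zero.
        distortion = km.inertia_ / (n_samples * n_dims)
        transformed.append(distortion ** (-power))
    jumps = np.diff(transformed)  # jumps[i] is the jump for k = i + 1
    return int(np.argmax(jumps)) + 1


# Three well-separated synthetic blobs in two dimensions.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centres])

estimated_k = jump_method(features, max_clusters=6)
print(estimated_k)
```

The estimated value can then be passed as n_clusters to an estimator such as KMeans.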