Clustering Uni-variate Time Series Using Sklearn
I have a pandas DataFrame and would like to cluster each column separately. I am using sklearn, and this is what I have: data= pd.read_csv('data.csv') data=pd.DataFrame(dat
Solution 1:
KMeans expects a 2D array of shape (n_samples, n_features), where each row is one data point. A data point may have just a single feature. In your case, convert the pandas column to an array and reshape it into a matrix with len(data) rows and 1 column. Here is a working example:
from sklearn.cluster import KMeans
import pandas as pd

data = {'one': [1., 2., 3., 4., 3., 2., 1.], 'two': [4., 3., 2., 1., 2., 3., 4.]}
data = pd.DataFrame(data)

n_clusters = 2
for col in data.columns:
    kmeans = KMeans(n_clusters=n_clusters)
    # A pandas Series has no reshape() in current versions, so convert it
    # to a NumPy array first, then make it a (len(data), 1) column vector.
    X = data[col].to_numpy().reshape(-1, 1)
    kmeans.fit(X)
    print("{}: {}".format(col, kmeans.predict(X)))
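If you want to keep the cluster labels rather than just print them, the loop above can be wrapped in a small helper. This is only a sketch: `cluster_columns` is a hypothetical name, and the `n_init=10` and `random_state=0` settings are assumptions added here so that repeated runs give the same result. Note that the label numbers themselves (0 or 1) are arbitrary; only the grouping is meaningful.

```python
import pandas as pd
from sklearn.cluster import KMeans


def cluster_columns(df, n_clusters=2, random_state=0):
    """Run KMeans on each column independently and return the labels.

    Hypothetical helper: the result has the same index and columns as df,
    and each cell holds the cluster label of that value within its column.
    """
    labels = {}
    for col in df.columns:
        # Double brackets select a one-column DataFrame, so the array is
        # already 2D with shape (len(df), 1), which is what KMeans expects.
        X = df[[col]].to_numpy()
        kmeans = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=random_state)
        labels[col] = kmeans.fit_predict(X)
    return pd.DataFrame(labels, index=df.index)


data = pd.DataFrame({'one': [1., 2., 3., 4., 3., 2., 1.],
                     'two': [4., 3., 2., 1., 2., 3., 4.]})
labels = cluster_columns(data, n_clusters=2)
print(labels)
```

Selecting with `df[[col]]` instead of `df[col]` avoids the explicit reshape, since a one-column DataFrame is already two-dimensional.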