
How To Plot Several Kernel Density Estimates Using Matplotlib?

I want to plot several 'filled' kernel density estimates (KDE) in matplotlib, like the upper halves of vertical violinplots or a non-overlapping version of the cover art of Joy Divi

Solution 1:

This answer shows how to modify Matplotlib's violinplots. Those violinplots can also be adapted to only show the upper half of a violin plot.

import numpy as np
import matplotlib.pyplot as plt

pos = np.arange(1, 6) / 2.0
data = [np.random.normal(0, std, size=1000) for std in pos]

violins = plt.violinplot(data, positions=pos, showextrema=False, vert=False)

# Flatten the lower half of each violin onto its centre line
for body in violins['bodies']:
    path = body.get_paths()[0]
    mean = np.mean(path.vertices[:, 1])
    path.vertices[:, 1][path.vertices[:, 1] <= mean] = mean

plt.show()

[Image: KDE plot]

A nice-looking overlapping variant can easily be created by making the bodies fully opaque (alpha of 1), adding an edge color, and making sure to plot the underlying KDEs first:

import numpy as np
import matplotlib.pyplot as plt

pos = np.arange(1, 6) / 2
data = [np.random.normal(0, std, size=1000) for std in pos]

# Reverse the order so that the KDEs lying underneath are drawn first
violins = plt.violinplot(
    data[::-1],
    positions=pos[::-1] / 5,
    showextrema=False,
    vert=False,
)

for body in violins['bodies']:
    path = body.get_paths()[0]
    mean = np.mean(path.vertices[:, 1])
    path.vertices[:, 1][path.vertices[:, 1] <= mean] = mean
    body.set_edgecolor('black')
    body.set_alpha(1)

plt.show()

[Image: Joy Division plot]

Solution 2:

Note that there is an existing package called joypy, building on top of matplotlib to easily produce such "Joyplots" from dataframes.

Apart from that, there is little reason not to use scipy.stats.gaussian_kde, because it provides the KDE directly. violinplot also uses a Gaussian KDE internally.
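As a minimal sketch of what gaussian_kde returns on its own (the sample, the grid padding of 3 and the variable names are assumptions made for illustration), the object is callable and yields density values on any grid you pass in:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Illustrative sample; any 1-D array of observations works
rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=1000)

kde = gaussian_kde(sample)  # default bandwidth (Scott's rule)

# Evaluate the estimate on a grid padded beyond the data range
x = np.linspace(sample.min() - 3, sample.max() + 3, 500)
density = kde(x)

# Trapezoidal rule: a density should integrate to roughly 1
area = np.sum((density[1:] + density[:-1]) / 2 * np.diff(x))
```

Because the result is just an array of y values, it can be scaled and shifted freely before plotting, which violinplot does not allow as directly.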

So the plot in question would look something like

from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt
import numpy as np

pos = np.arange(1, 6) / 2.0
data = [np.random.normal(0, std, size=1000) for std in pos]

def plot_kde(data, y0, height, ax=None, color="C0"):
    if ax is None:
        ax = plt.gca()
    x = np.linspace(data.min(), data.max())
    y = gaussian_kde(data)(x)
    # Scale the curve so its peak sits `height` above the baseline y0
    ax.plot(x, y0 + y / y.max() * height, color=color)
    ax.fill_between(x, y0 + y / y.max() * height, y0, color=color, alpha=0.5)

for i, d in enumerate(data):
    plot_kde(d, i, 0.8, ax=None)

plt.show()
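The overlapping variant from Solution 1 can probably be reproduced with gaussian_kde as well. The following is only a sketch: plot_ridgeline and its spacing and height parameters are hypothetical names chosen for this example, and the white fill with a black edge is just one possible styling.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

def plot_ridgeline(samples, spacing=0.3, height=1.0, ax=None):
    """Hypothetical helper: one opaque, black-edged KDE per sample."""
    if ax is None:
        ax = plt.gca()
    # Share one x grid so all curves span the same range
    lo = min(s.min() for s in samples)
    hi = max(s.max() for s in samples)
    x = np.linspace(lo, hi, 200)
    for i, s in enumerate(samples):
        y0 = i * spacing                  # baseline of this row
        y = gaussian_kde(s)(x)
        y = y0 + y / y.max() * height     # peak reaches y0 + height
        # Higher rows get a lower zorder, so the rows below cover them
        ax.fill_between(x, y, y0, facecolor="white",
                        edgecolor="black", zorder=-i)
    return ax

rng = np.random.default_rng(0)
pos = np.arange(1, 6) / 2.0
data = [rng.normal(0, std, size=1000) for std in pos]

fig, ax = plt.subplots()
plot_ridgeline(data, ax=ax)
```

Call plt.show() afterwards to display the figure. Choosing spacing smaller than height is what makes neighbouring curves overlap; with spacing of at least height they stay separate, as in the plot above.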
