How To Add Counts Of Points As A Label In A Sparse Scatter Plot

I have a sparse scatter plot to visualize the comparison of predicted vs actual values. The values range from 1 to 4 and there are no decimal points. I have tried plotly so far wi

Solution 1:

This answer uses matplotlib.

To answer the initial question first: to annotate the points, you need to find out how often the data produces a point at a given coordinate. If all values are integers, this can easily be done using a 2D histogram. From the histogram, select only those bins where the count is nonzero and annotate the respective values in a loop:

x = [3, 0, 1, 2, 2, 0, 1, 3, 3, 3, 4, 1, 4, 3, 0]
y = [1, 0, 4, 3, 2, 1, 4, 0, 3, 0, 4, 2, 3, 3, 1]

import matplotlib.pyplot as plt
import numpy as np

x = np.array(x)
y = np.array(y)

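# histogram2d puts its first argument along the rows, so y is passed first
# to make hist[row, col] line up with the meshgrid below
hist, xbins, ybins = np.histogram2d(y, x, bins=range(6))
X, Y = np.meshgrid(xbins[:-1], ybins[:-1])
# keep only the coordinates and counts of the non-empty bins
X = X[hist != 0]; Y = Y[hist != 0]
Z = hist[hist != 0]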

fig, ax = plt.subplots()
ax.scatter(x,y, s=49, alpha=0.4)

for i in range(len(Z)):
    ax.annotate(str(int(Z[i])), xy=(X[i], Y[i]), xytext=(4, 0),
                textcoords="offset points")
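
As an alternative to the histogram, the counting step can be sketched with `collections.Counter`, which also works when the values are not integers. This is only a minimal sketch: the helper name `annotate_counts` and its `offset` parameter are made up for illustration, and `ax` is assumed to be a matplotlib Axes such as the one created above.

```python
from collections import Counter

x = [3, 0, 1, 2, 2, 0, 1, 3, 3, 3, 4, 1, 4, 3, 0]
y = [1, 0, 4, 3, 2, 1, 4, 0, 3, 0, 4, 2, 3, 3, 1]

# Count how many times each (x, y) pair occurs in the data
counts = Counter(zip(x, y))


def annotate_counts(ax, counts, offset=(4, 0)):
    """Write each count next to its point (hypothetical helper).

    `ax` is assumed to be a matplotlib Axes; `offset` is in points.
    """
    for (px, py), n in counts.items():
        ax.annotate(str(n), xy=(px, py), xytext=offset,
                    textcoords="offset points")
```

After `ax.scatter(x, y, s=49, alpha=0.4)`, calling `annotate_counts(ax, counts)` should place the same labels as the loop above.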

You may then decide not to plot all points but only the result of the histogramming, which offers the chance to vary the color and size of the scatter points with the count:

ax.scatter(X,Y, s=(Z*20)**1.4, c = Z/Z.max(), cmap="winter_r", alpha=0.4)
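
To get a feel for the marker sizes this produces, the expression `(Z*20)**1.4` can be evaluated on its own. In matplotlib, `s` is the marker area in points squared, so the exponent controls how strongly the area grows with the count. The helper below is a hypothetical illustration, not part of the answer's code; `base` and `power` simply name the two constants used above.

```python
def marker_area(count, base=20, power=1.4):
    """Marker area in points**2 for a given count (hypothetical helper)."""
    return (count * base) ** power


# Areas for the counts 1, 2 and 3
areas = [marker_area(n) for n in (1, 2, 3)]

# With power > 1 the area grows faster than the count:
# doubling the count multiplies the area by 2**1.4
ratio = marker_area(2) / marker_area(1)
```

A smaller exponent would keep high-count markers from overlapping their neighbors on a dense grid.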


Since all values are integers, you may also opt for an image plot,

fig, ax = plt.subplots()
ax.imshow(hist, cmap="PuRd")

for i in range(len(Z)):
    ax.annotate(str(int(Z[i])), xy=(X[i], Y[i]), xytext=(0, 0), color="w",
                ha="center", va="center", textcoords="offset points")
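
The image plot relies on the layout of `hist`: rows correspond to y and columns to x. A plain-Python sketch of the same count grid, assuming integer values from 0 to 4 as in the sample data, makes that layout visible:

```python
x = [3, 0, 1, 2, 2, 0, 1, 3, 3, 3, 4, 1, 4, 3, 0]
y = [1, 0, 4, 3, 2, 1, 4, 0, 3, 0, 4, 2, 3, 3, 1]

size = 5  # assumes integer values in the range 0..4

# grid[row][col]: the row index is y, the column index is x
grid = [[0] * size for _ in range(size)]
for xi, yi in zip(x, y):
    grid[yi][xi] += 1

for row in grid:
    print(row)
# [1, 0, 0, 2, 0]   <- y = 0
# [2, 0, 0, 1, 0]   <- y = 1
# [0, 1, 1, 0, 0]   <- y = 2
# [0, 0, 1, 2, 1]   <- y = 3
# [0, 2, 0, 0, 1]   <- y = 4
```

Note that `imshow` draws row 0 at the top by default, so the image appears vertically flipped relative to the scatter plot. Passing `origin="lower"` to `imshow` should give the same orientation as the scatter plot.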


Without the need to calculate the number of occurrences yourself, another option is to use a hexbin plot. Due to the hexagonal binning, this gives slightly inaccurate positions of the dots, but I still wanted to mention this option.

import matplotlib.pyplot as plt
import matplotlib.colors
import numpy as np

x = np.array(x)
y = np.array(y)

fig, ax = plt.subplots()

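# The base colormap is missing from this snippet; "plasma" is an assumed stand-in
base = plt.get_cmap("plasma")
cmaplist = [base(i) for i in range(base.N)]
cmaplist[0] = (1.0, 1.0, 1.0, 1.0)  # make the lowest color white so empty bins vanish
cmap = matplotlib.colors.LinearSegmentedColormap.from_list('mcm', cmaplist, base.N)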

ax.hexbin(x,y, gridsize=20, cmap=cmap, linewidth=0 )
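
If the only goal of the custom colormap is to hide empty hexagons, the `mincnt` parameter of `hexbin` may be a simpler route. The following is a minimal sketch under that assumption; the colormap name `"plasma"` and the colorbar label are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([3, 0, 1, 2, 2, 0, 1, 3, 3, 3, 4, 1, 4, 3, 0])
y = np.array([1, 0, 4, 3, 2, 1, 4, 0, 3, 0, 4, 2, 3, 3, 1])

fig, ax = plt.subplots()

# mincnt=1 draws only hexagons that contain at least one point,
# so empty cells stay blank without editing the colormap
hb = ax.hexbin(x, y, gridsize=20, cmap="plasma", mincnt=1, linewidth=0)
fig.colorbar(hb, ax=ax, label="count")

# One count value per drawn hexagon
counts = hb.get_array()
```

Call `plt.show()` afterwards to display the figure. The colorbar then takes over the role of the text labels used in the earlier variants.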


