How to create a histogram with points rather than bars


I would like to plot a histplot but using points rather than bars.

import pandas as pd
import seaborn as sns
from scipy.stats import binom

x_n10_p0_6 = binom.rvs(n=10, p=0.6, size=10000, random_state=0)
x_n10_p0_8 = binom.rvs(n=10, p=0.8, size=10000, random_state=0)
x_n20_p0_8 = binom.rvs(n=20, p=0.6, size=10000, random_state=0)

df = pd.DataFrame({
    'x_n10_p0_6': x_n10_p0_6, 
    'x_n10_p0_8': x_n10_p0_8, 
    'x_n20_p0_8': x_n20_p0_8
    })

sns.histplot(df)

This is what I'm getting:

[Image: histplot output drawn with bars]

I would like to see something like this:

[Image: binomial distribution pmf drawn with points]

Source: https://en.wikipedia.org/wiki/Binomial_distribution#/media/File:Binomial_distribution_pmf.svg

histplot has an element parameter, but it only accepts the values {"bars", "step", "poly"}.

Best answer

You are working with discrete distributions. A kde plot, by contrast, approximates a continuous distribution by smoothing the input values, so a kdeplot of your discrete values gives only a crude approximation of the plot you seem to be after.
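To make the contrast concrete, here is a minimal sketch (not part of the original answer) that evaluates a Gaussian KDE halfway between two integers. It assumes scipy's gaussian_kde with its default bandwidth as a stand-in for the smoothing a kde plot performs; seaborn's own bandwidth settings may differ.

```python
import numpy as np
from scipy.stats import binom, gaussian_kde

# Same kind of sample as in the question: integers between 0 and 10
samples = binom.rvs(n=10, p=0.6, size=10000, random_state=0)

# Fit a Gaussian KDE (default bandwidth, chosen here only for illustration)
kde = gaussian_kde(samples.astype(float))

# Share of samples exactly equal to 5.5: zero, because the data are integers
observed_at_half = np.mean(samples == 5.5)

# The KDE still reports a positive density there, because it smooths
# the mass of the neighbouring integers across the gap
density_at_half = kde(5.5)[0]

print(observed_at_half, density_at_half)
```

The positive density at 5.5 is an artifact of smoothing, which is why a point per integer value represents this data more faithfully than a continuous curve.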

Seaborn's histplot currently only implements bars for discrete distributions. However, you can mimic such a plot via matplotlib. Here is an example:

import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
from scipy.stats import binom
import pandas as pd
import numpy as np

x_n10_p0_6 = binom.rvs(n=10, p=0.6, size=10000, random_state=0)
x_n10_p0_8 = binom.rvs(n=10, p=0.8, size=10000, random_state=0)
x_n20_p0_8 = binom.rvs(n=20, p=0.6, size=10000, random_state=0)

df = pd.DataFrame({
  'x_n10_p0_6': x_n10_p0_6,
  'x_n10_p0_8': x_n10_p0_8,
  'x_n20_p0_8': x_n20_p0_8
})
for col in df.columns:
  xmin = df[col].min()
  xmax = df[col].max()
  # one unit-wide bin centered on each integer from xmin to xmax
  counts, _ = np.histogram(df[col], bins=np.arange(xmin - 0.5, xmax + 1, 1))
  # draw one point per integer value instead of a bar
  plt.scatter(range(xmin, xmax + 1), counts, label=col)
plt.legend()
plt.gca().xaxis.set_major_locator(MaxNLocator(integer=True))  # force integer ticks for discrete x-axis
plt.ylim(bottom=0)  # start the y-axis at zero
plt.show()

[Image: discrete histogram drawn with dots]

Note that seaborn's histplot has many more options than shown in this example (e.g. scaling the counts down to densities).
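As a rough sketch of that scaling done by hand, the counts can be divided by the sample size so that the points land on a probability scale and can be compared with the theoretical pmf. The helper name relative_frequencies is hypothetical, and np.unique is swapped in for np.histogram here, so integer values that never occur are omitted rather than reported with a zero count.

```python
import numpy as np
from scipy.stats import binom


def relative_frequencies(samples):
    """Return the sorted distinct values and the fraction of samples at each."""
    values, counts = np.unique(samples, return_counts=True)
    return values, counts / counts.sum()


n, p = 10, 0.6
samples = binom.rvs(n=n, p=p, size=10000, random_state=0)

values, freqs = relative_frequencies(samples)

# Theoretical binomial pmf at the same integer positions, for comparison
exact = binom.pmf(values, n=n, p=p)

# Largest absolute gap between empirical and theoretical probabilities
max_gap = np.max(np.abs(freqs - exact))
print(max_gap)
```

Passing values and freqs to plt.scatter in place of the raw counts gives points that are comparable across columns of different sample sizes, and plotting exact alongside them shows how closely the sample follows the distribution.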