Bayesian network in Python: both construction and sampling


For a project, I need to create synthetic categorical data containing specific dependencies between the attributes. This can be done by sampling from a pre-defined Bayesian network. After some exploration on the internet, I found that Pomegranate is a good package for Bayesian networks; however, as far as I can tell, it seems impossible to sample from such a pre-defined Bayesian network. For example, model.sample() raises a NotImplementedError (even though this solution suggests otherwise).

Does anyone know if there exists a library which provides a good interface for the construction and sampling of/from a Bayesian network?

5 Answers

BEST ANSWER

I found out that pyAgrum (https://agrum.gitlab.io/pages/pyagrum.html) does the job. It can be used both to create a Bayesian network via the BayesNet() class and to sample from such a network via the drawSamples() method of the BNDatabaseGenerator class.
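
A minimal sketch of that workflow (the network, variable names, and CPT values below are only illustrative, and method names may differ slightly between pyAgrum versions):

import pyAgrum as gum

# build a small network A -> B by hand
bn = gum.BayesNet()
a = bn.add(gum.LabelizedVariable("A", "A", 2))
b = bn.add(gum.LabelizedVariable("B", "B", 2))
bn.addArc(a, b)

# fill the conditional probability tables
bn.cpt("A").fillWith([0.4, 0.6])
bn.cpt("B")[{"A": 0}] = [0.8, 0.2]
bn.cpt("B")[{"A": 1}] = [0.1, 0.9]

# draw 1000 samples and write them to a CSV file
gen = gum.BNDatabaseGenerator(bn)
gen.drawSamples(1000)
gen.toCSV("samples.csv")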


Another option is pgmpy which is a Python library for learning (structure and parameter) and inference (statistical and causal) in Bayesian Networks.

You can generate forward and rejection samples as a Pandas dataframe or numpy recarray.

The following code generates 20 forward samples from the Bayesian network "diff -> grade <- intel" as a recarray.

from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
from pgmpy.sampling import BayesianModelSampling

# network structure: diff -> grade <- intel
student = BayesianModel([('diff', 'grade'), ('intel', 'grade')])

# conditional probability tables for the three variables
cpd_d = TabularCPD('diff', 2, [[0.6], [0.4]])
cpd_i = TabularCPD('intel', 2, [[0.7], [0.3]])
cpd_g = TabularCPD('grade', 3,
                   [[0.3, 0.05, 0.9, 0.5],
                    [0.4, 0.25, 0.08, 0.3],
                    [0.3, 0.7, 0.02, 0.2]],
                   evidence=['intel', 'diff'], evidence_card=[2, 2])

student.add_cpds(cpd_d, cpd_i, cpd_g)

# draw 20 forward samples as a numpy recarray
inference = BayesianModelSampling(student)
df_samples = inference.forward_sample(size=20, return_type='recarray')

print(df_samples)
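
For samples that are consistent with given evidence, the same object also offers rejection sampling. A minimal sketch continuing the example above, assuming the State helper and rejection_sample signature shown in older pgmpy docstrings (the exact API has changed across pgmpy versions):

from pgmpy.factors.discrete import State

# keep only samples in which diff == 0
evidence = [State(var='diff', state=0)]
df_rej = inference.rejection_sample(evidence=evidence, size=20, return_type='dataframe')
print(df_rej)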

Another option is BayesPy (https://www.bayespy.org/index.html). You build the network out of nodes, and on every node you can call random(), which samples from that node's distribution: https://www.bayespy.org/dev_api/generated/generated/bayespy.inference.vmp.nodes.stochastic.Stochastic.random.html#bayespy.inference.vmp.nodes.stochastic.Stochastic.random
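
A minimal sketch of that pattern for a single categorical node, assuming the Categorical node type implements random() (not every BayesPy node type does, and the probabilities here are purely illustrative):

from bayespy.nodes import Categorical

# a categorical node with fixed prior probabilities
A = Categorical([0.3, 0.7])

# draw a sample from the node's current distribution (here, its prior)
print(A.random())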


Using pyAgrum, you just have to:

# import pyAgrum
import pyAgrum as gum

# create a BN
bn = gum.fastBN("A->B[3]<-C{yes|No}->D")

# specify some CPTs (fastBN fills them randomly by default)
bn.cpt("A").fillWith([0.3, 0.7])

# and then generate a database; generateCSV returns the log-likelihood of the generated database
gum.generateCSV(bn, "sample.csv", 1000, with_labels=True, random_order=False)

the code in a notebook

See http://webia.lip6.fr/~phw/aGrUM/docs/last/notebooks/ for more notebooks using pyAgrum

Disclaimer: I am one of the authors of pyAgrum :-)


I was also searching for a Python library to work with Bayesian networks (learning, sampling, inference) and I found bnlearn. I tried a couple of examples and it worked. It is possible to import several existing repositories or any network in .bif format. As per the library's documentation:

Sampling of data is based on forward sampling from the joint distribution of the Bayesian network. In order to do that, it requires as input a DAG connected with CPDs. It is also possible to create a DAG manually (see the create DAG section) or load an existing one.
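
A minimal sketch of that workflow, assuming bnlearn's import_DAG and sampling functions and its bundled 'sprinkler' example network (function names may differ between bnlearn versions):

import bnlearn

# load one of the example networks that ships with bnlearn
model = bnlearn.import_DAG('sprinkler')

# forward-sample 1000 records from the joint distribution as a pandas DataFrame
df = bnlearn.sampling(model, n=1000)
print(df.head())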