I have a Bayesian network whose CPTs I learned from existing data. Suppose I receive a new data instance. Ideally, I don't want to reprocess all the data to update the probabilities.
Is there a way to incrementally update the CPTs of the existing network each time new data comes in? I think there should be, and I feel like I'm missing something :)
It's easiest to maintain the joint probability table (JPT), and rebuild the CPT from that as needed. Along with the JPT, keep a count of how many examples were used to produce it. When adding the nth example, multiply all probabilities by 1 - 1/n, and then add probability 1/n to the new example's associated probability.

If you're going to do this a bunch, you should maintain a count of examples for each row in the JPT instead of a probability. That'll cut down on numerical drift.
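Here is a minimal sketch of the count-based version, in case it helps. The class name `JointCounts`, its methods, and the variable names in the usage are all made up for illustration, and it assumes discrete variables with every example fully observed. It stores one count per joint assignment seen so far, so adding an example is a single increment, and a CPT is rebuilt by summing counts over the other variables.

```python
from collections import Counter


class JointCounts:
    """Hypothetical helper: one count per observed joint assignment."""

    def __init__(self, variables):
        self.variables = tuple(variables)
        self.counts = Counter()  # joint assignment (tuple) -> number of examples
        self.n = 0               # total number of examples seen

    def add(self, example):
        """Incorporate one fully observed example (dict: variable -> value)."""
        key = tuple(example[v] for v in self.variables)
        self.counts[key] += 1
        self.n += 1

    def joint(self, assignment):
        """Estimated joint probability of one full assignment."""
        key = tuple(assignment[v] for v in self.variables)
        return self.counts[key] / self.n

    def cpt(self, child, parents):
        """Rebuild P(child | parents) by marginalising the stored counts."""
        child_idx = self.variables.index(child)
        parent_idx = [self.variables.index(p) for p in parents]
        pair_counts = Counter()    # (parent values, child value) -> count
        parent_counts = Counter()  # parent values -> count
        for key, count in self.counts.items():
            parent_vals = tuple(key[i] for i in parent_idx)
            pair_counts[(parent_vals, key[child_idx])] += count
            parent_counts[parent_vals] += count
        return {
            (parent_vals, child_val): count / parent_counts[parent_vals]
            for (parent_vals, child_val), count in pair_counts.items()
        }


# Usage with made-up data: three binary variables, four examples.
table = JointCounts(["rain", "sprinkler", "wet"])
for rain, sprinkler, wet in [(1, 0, 1), (0, 1, 1), (0, 0, 0), (1, 0, 1)]:
    table.add({"rain": rain, "sprinkler": sprinkler, "wet": wet})

wet_given_rain = table.cpt("wet", ["rain"])
```

Only assignments that have actually been observed take up space, and parent configurations that never occurred simply have no entry in the returned dictionary. If the full joint table is too large for your network, the same idea should work with one counter per node, keyed by (parent values, child value).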