Information content of a (1D) curve (e.g. spectroscopy)


I am looking for a measure that quantifies the information content of a 1D curve. To explain, here is an example in Python:

import numpy as np
import matplotlib.pyplot as plt

x, dx = np.linspace(0, 10, 1001, retstep=True)
# curve 1: a single Gaussian peak
y1 = 2 * np.pi * np.exp(-.5 * (x - 5)**2)
# curve 2: the sum of three Gaussian peaks
y2 = np.pi * np.exp(-.5 * (x - 5)**2) \
   + 2 * np.exp(-5 * (x - 2)**2) \
   + 3 * np.exp(-5 * (x - 8)**2)

plt.plot(x, y1, label='curve 1')
plt.plot(x, y2, label='curve 2')
plt.legend()
plt.show()


Curve 1 is a single Gaussian, which has an "information content" of 3 numbers: x-position, amplitude, and width. Curve 2 contains 3 such peaks and hence carries 9 numbers as information content. Here is the first issue: we can only say this because we have a model of the curve (Gaussian peaks). Is there any function or approach that can calculate the information content/entropy of such a curve without assuming a model?

I looked into the Shannon entropy, which gives some numbers if a probability density function of the y-values is estimated first, here with np.histogram:

>>> pdf1, x1 = np.histogram(y1, 31, density=True)
>>> pdf2, x2 = np.histogram(y2, 31, density=True)
>>> -np.sum(pdf1 * np.log2(pdf1)) # shannon entropy
5.684224974417829
>>> -np.sum(pdf2 * np.log2(pdf2))
11.151687052639227
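One caveat on the numbers above: with density=True, np.histogram returns density values (normalized so that they integrate to 1 over the bin widths) rather than probabilities, so the sums are not bounded by log2(31). A minimal sketch of the same idea using bin probabilities instead, assuming the curves from the first snippet; the helper name histogram_entropy is hypothetical and not part of NumPy:

```python
import numpy as np

def histogram_entropy(y, bins=31):
    """Shannon entropy (in bits) of the binned amplitude distribution of y."""
    counts, _ = np.histogram(y, bins=bins)
    p = counts / counts.sum()   # bin probabilities, summing to 1
    p = p[p > 0]                # skip empty bins; 0 * log2(0) is taken as 0
    return -np.sum(p * np.log2(p))

x = np.linspace(0, 10, 1001)
y1 = 2 * np.pi * np.exp(-.5 * (x - 5)**2)
y2 = (np.pi * np.exp(-.5 * (x - 5)**2)
      + 2 * np.exp(-5 * (x - 2)**2)
      + 3 * np.exp(-5 * (x - 8)**2))

h1 = histogram_entropy(y1)
h2 = histogram_entropy(y2)
print(h1, h2)  # both values lie between 0 and log2(31) bits
```

The result still depends on the number of bins, so it is only comparable between curves binned the same way.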

The problem is that this approach ignores correlations between neighboring data points, which are clearly present.
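To illustrate this limitation, here is a small sketch: randomly shuffling the y-values destroys the shape of the curve but leaves the histogram, and therefore the histogram-based entropy, unchanged. The helper histogram_entropy and the fixed random seed are assumptions made for this illustration only.

```python
import numpy as np

def histogram_entropy(values, bins=31):
    """Shannon entropy (in bits) of the binned amplitude distribution."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()  # probabilities of non-empty bins
    return -np.sum(p * np.log2(p))

x = np.linspace(0, 10, 1001)
y = (np.pi * np.exp(-.5 * (x - 5)**2)
     + 2 * np.exp(-5 * (x - 2)**2)
     + 3 * np.exp(-5 * (x - 8)**2))

# Same values in random order: the smooth three-peak structure is gone.
rng = np.random.default_rng(0)
y_shuffled = rng.permutation(y)

h_curve = histogram_entropy(y)
h_shuffled = histogram_entropy(y_shuffled)
print(h_curve, h_shuffled)  # the two values agree
```

Any measure that is supposed to reflect the peak structure therefore has to use the ordering of the samples, not only their distribution.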

This is what my actual data looks like:

Actual data

By the way, any approximation would be fine; mathematical rigor is not needed here.

Any ideas?
