I'm trying to generate a Geometric Brownian motion "price" time series, i.e., based on log-normal returns. The goal:
- The time series should start with a price value of 1.0,
- it should have a length of n,
- and the final/last price should have an expectation value of 1.0.
My understanding of the log-normal distribution is that it has a mean of exp(mu + sigma^2 / 2). So in order to obtain a mean of 1.0, one has to use mu = -sigma^2 / 2.
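As a quick sanity check of that formula (a standalone snippet I wrote for this question, not part of the implementation below), a single log-normal draw with mu = -sigma^2 / 2 does have a sample mean close to 1.0:

```python
import numpy as np

# With mu = -sigma^2 / 2, a single draw exp(N(mu, sigma^2))
# should have expectation exp(mu + sigma^2 / 2) = exp(0) = 1.0.
rng = np.random.default_rng(0)
sigma = 1.0
mu = -(sigma**2) / 2
samples = np.exp(rng.normal(mu, sigma, size=1_000_000))
print(samples.mean())  # close to 1.0
```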
My Python implementation looks like this:
import numpy as np

def gen_prices(n: int, start_price: float = 1.0, sigma: float = 1.0) -> np.ndarray:
    mu = -(sigma**2) / 2
    log_returns = np.random.normal(mu, sigma, size=n - 1)
    prices = np.cumprod(
        np.concatenate([np.array([start_price]), np.exp(log_returns)]),
    )
    return prices
Let's check the mean of the final price by repeatedly generating time series and collecting their final price values:
num_iter = 100000
num_prices = 100
final_prices = [gen_prices(n=num_prices)[-1] for _ in range(num_iter)]
print(np.mean(final_prices))
To my surprise this returns something like 2.363741046091954e-08, which is far off from 1.0. I'm wondering:
- Is there a fundamental/theoretical flaw in the approach, i.e., is mu = -sigma^2 / 2 the wrong formula to use?
- Or is this an issue with numerical stability?
Since the result is off by many orders of magnitude, it looks like more than "just a numerical" issue, but the formula seems to make sense at first glance.
Any ideas what is the issue here, and how to fix it?
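For context, here is a small standalone diagnostic I also ran (it re-declares a gen_prices mirroring the one above, with a seeded generator of my own): the sample median of the final prices is nowhere near the sample mean, which shows how extremely skewed the distribution of final prices is with sigma = 1.0:

```python
import numpy as np

rng = np.random.default_rng(0)

def gen_prices(n, start_price=1.0, sigma=1.0):
    mu = -(sigma**2) / 2
    log_returns = rng.normal(mu, sigma, size=n - 1)
    return np.cumprod(np.concatenate([[start_price], np.exp(log_returns)]))

n = 100
finals = np.array([gen_prices(n)[-1] for _ in range(10_000)])
# Mean and median differ by many orders of magnitude.
print(np.mean(finals), np.median(finals))
```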
For the record, I also tried an implementation based on np.cumsum instead of np.cumprod, but it seems to behave the same:
def gen_prices(n: int, start_price: float = 1.0, sigma: float = 1.0) -> np.ndarray:
    mu = -(sigma**2) / 2
    log_returns = np.random.normal(mu, sigma, size=n - 1)
    prices = np.exp(
        np.cumsum(
            np.concatenate([np.array([np.log(start_price)]), log_returns]),
        )
    )
    return prices
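To confirm the two variants really behave the same, I compared them with a shared seed (standalone sketch; the seeded-generator wrappers are my own, not part of the implementations above):

```python
import numpy as np

# Both variants consume the same random log-returns when seeded
# identically, so they should produce (numerically) the same paths.
def gen_prices_cumprod(n, start_price=1.0, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    mu = -(sigma**2) / 2
    log_returns = rng.normal(mu, sigma, size=n - 1)
    return np.cumprod(np.concatenate([[start_price], np.exp(log_returns)]))

def gen_prices_cumsum(n, start_price=1.0, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    mu = -(sigma**2) / 2
    log_returns = rng.normal(mu, sigma, size=n - 1)
    return np.exp(np.cumsum(np.concatenate([[np.log(start_price)], log_returns])))

a = gen_prices_cumprod(100)
b = gen_prices_cumsum(100)
print(np.allclose(a, b))  # True, up to floating-point rounding
```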