I have a question regarding multi-output kernels in GPflow 2. For the application I am working on, I want to create an independent multi-output kernel that shares kernels across some output dimensions, but not all. Two relevant classes in GPflow are the SharedIndependent and the SeparateIndependent multi-output kernel classes. These, however, use either one shared kernel for all P output dimensions, or P separate kernels for P output dimensions.
The code attached below is a small adaptation of the multi-output kernels notebook (https://gpflow.readthedocs.io/en/master/notebooks/advanced/multioutput.html). The 4 output dimensions can be divided into 2 groups of 2 output dimensions each: one group with high flexibility and one with low flexibility. I would like to use 2 kernels for this task, so that 2 different length-scale parameters can be learned for the SquaredExponential kernel. For completeness, I have kept the SharedIndependent and SeparateIndependent setups from the example notebook in the code below.
I have currently tried to combine these as follows: SeparateIndependent([SharedIndependent(SquaredExponential() + Linear(), output_dim=2) for _ in range(2)]). This leads to the following error (15 is the number of inducing points):
ValueError: Dimensions must be equal, but are 2 and 15 for '{{node add_8}} = AddV2[T=DT_DOUBLE](stack, mul_13)' with input shapes: [2,15,2,15,2], [1,15,15].
Presumably this fails because SeparateIndependent expects its sub-kernels to be single-output kernels rather than nested multi-output kernels.
I have also tried to create a new multi-output kernel class for this purpose, one that largely mimics the SeparateIndependent class but alters the list comprehensions in the K(X, X2) and K_diag(X) methods. This did not turn out successful, because I do not know how to register the correct Kuu(), Kuf() and conditional() functions for the new multi-output kernel class. An attempt to create a class identical to SeparateIndependent, with only a different name, failed as well. A rough sketch of the class I tried is shown below.
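For reference, this is roughly the structure of that class (the name PartlySharedIndependent and the groups argument are my own invention; K and K_diag mirror the SeparateIndependent implementation, with the list comprehensions indexing into one shared kernel per group):

import tensorflow as tf
from gpflow.kernels import Combination, MultioutputKernel

class PartlySharedIndependent(MultioutputKernel, Combination):
    # groups[p] gives the index of the (shared) kernel used for output dimension p,
    # e.g. groups=[0, 0, 1, 1] shares kernels[0] over outputs 0-1 and kernels[1] over outputs 2-3.
    def __init__(self, kernels, groups, name=None):
        super().__init__(kernels=kernels, name=name)
        self.groups = list(groups)

    @property
    def num_latent_gps(self):
        return len(self.groups)

    @property
    def latent_kernels(self):
        return tuple(self.kernels)

    def K(self, X, X2=None, full_output_cov=True):
        if full_output_cov:
            Kxxs = tf.stack([self.kernels[g].K(X, X2) for g in self.groups], axis=2)  # [N, N2, P]
            return tf.transpose(tf.linalg.diag(Kxxs), [0, 2, 1, 3])  # [N, P, N2, P]
        return tf.stack([self.kernels[g].K(X, X2) for g in self.groups], axis=0)  # [P, N, N2]

    def K_diag(self, X, full_output_cov=False):
        stacked = tf.stack([self.kernels[g].K_diag(X) for g in self.groups], axis=1)  # [N, P]
        return tf.linalg.diag(stacked) if full_output_cov else stacked  # [N, P, P] or [N, P]

Constructing this kernel works, but using it in an SVGP model fails because there are no Kuu, Kuf or conditional implementations registered for the class, which is exactly the registration problem described above.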
I would be happy to know whether it is possible to combine the SharedIndependent and SeparateIndependent classes in this way, or whether constructing a new multi-output kernel class is the preferable solution to the problem. If so, what is the best approach to solving the issues with registering the conditional expressions (Kuu(), Kuf() and conditional())?
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import gpflow
from gpflow.kernels import SquaredExponential, Linear, SharedIndependent, SeparateIndependent
from gpflow.inducing_variables import SharedIndependentInducingVariables, InducingPoints
from gpflow.utilities import print_summary
from gpflow.ci_utils import ci_niter
gpflow.config.set_default_float(np.float64)
np.random.seed(0)
MAXITER = ci_niter(2000)
N = 100 # number of points
D = 1 # number of input dimensions
M = 15 # number of inducing points
L = P = 4 # number of latent GPs, number of observations = output dimension
def generate_data(N=100):
    X = np.random.rand(N)[:, None] * 10 - 5  # Inputs = N x D
    G = np.hstack(
        (0.5 * np.sin(X / 2) + X, 3.0 * np.cos(X / 2) - X,
         0.5 * np.sin(4 * X) + X, 3.0 * np.cos(4 * X) - X)
    )  # G = N x L
    W = np.array([[0.5, -0.3, 0, 0], [0.5, -0.3, 0, 0], [0, 0, -0.4, 0.6], [0.0, 0.0, 0.6, -0.4]])  # L x P
    F = np.matmul(G, W)  # N x P
    Y = F + np.random.randn(*F.shape) * [0.2, 0.2, 0.2, 0.2]
    return X, Y
X, Y = data = generate_data(N)
print(X.shape, Y.shape)
Zinit = np.linspace(-5, 5, M)[:, None]
def plot_model(m, name, lower=-7.0, upper=7.0):
    pX = np.linspace(lower, upper, 100)[:, None]
    pY, pYv = m.predict_y(pX)
    if pY.ndim == 3:
        pY = pY[:, 0, :]
    plt.plot(X, Y, "x")
    plt.gca().set_prop_cycle(None)
    plt.plot(pX, pY)
    for i in range(pY.shape[1]):
        top = pY[:, i] + 2.0 * pYv[:, i] ** 0.5
        bot = pY[:, i] - 2.0 * pYv[:, i] ** 0.5
        plt.fill_between(pX[:, 0], top, bot, alpha=0.3)
    plt.xlabel("X")
    plt.ylabel("f")
    plt.title(f"{name} kernel.")
    plt.show()
# initialization of inducing input locations (M random points from the training inputs)
Z = Zinit.copy()
# create multi-output inducing variables from Z
iv = SharedIndependentInducingVariables(InducingPoints(Z))
def optimize_model_with_scipy(model):
    optimizer = gpflow.optimizers.Scipy()
    optimizer.minimize(
        model.training_loss_closure(data),
        variables=model.trainable_variables,
        method="l-bfgs-b",
        options={"disp": True, "maxiter": MAXITER},
    )
# create multi-output kernel
kernels = [
    (SeparateIndependent([SquaredExponential() + Linear() for _ in range(P)]), "Separate Independent"),
    (SharedIndependent(SquaredExponential() + Linear(), output_dim=P), "Shared Independent"),
    # this combination raises the ValueError quoted above:
    (SeparateIndependent([SharedIndependent(SquaredExponential() + Linear(), output_dim=2) for _ in range(2)]), "Partially shared independent"),
]
for (kernel, name) in kernels:
    m = gpflow.models.SVGP(kernel, gpflow.likelihoods.Gaussian(), inducing_variable=iv, num_latent_gps=P)
    print_summary(m)
    optimize_model_with_scipy(m)
    print_summary(m)
    plot_model(m, name)
Update: I have solved this issue by creating a new multi-output kernel class that contains one SharedIndependent multi-output kernel per group of output dimensions. The conditional expression iterates over these kernels and applies each one to its own group of output dimensions. Note that this is only implemented for SharedIndependent inducing variables; a sketch is given below.
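A minimal sketch of that solution, under my own naming and assumptions (GPflow 2.x class layout): GroupedSharedIndependent is a hypothetical class name, only SharedIndependentInducingVariables are supported, q_sqrt is assumed to have the default SVGP shape [P, M, M], and only the diagonal-covariance case (full_cov=False, full_output_cov=False, which is what the SVGP ELBO and predict_y use) is handled. The registered conditional simply re-dispatches on each SharedIndependent sub-kernel, so GPflow's existing Kuu, Kuf and conditional implementations do the per-group work and no separate Kuu/Kuf registrations are needed for the new class.

import tensorflow as tf
from gpflow.conditionals import conditional
from gpflow.inducing_variables import SharedIndependentInducingVariables
from gpflow.kernels import Combination, MultioutputKernel, SharedIndependent

class GroupedSharedIndependent(MultioutputKernel, Combination):
    # Hypothetical container: one SharedIndependent kernel per group of output dimensions.
    def __init__(self, kernels, name=None):
        super().__init__(kernels=kernels, name=name)

    @property
    def num_latent_gps(self):
        return sum(k.output_dim for k in self.kernels)

    @property
    def latent_kernels(self):
        return tuple(k.kernel for k in self.kernels)

    def K(self, X, X2=None, full_output_cov=True):
        # Repeat each group's base kernel over that group's output dimensions.
        Ks = [k.kernel.K(X, X2) for k in self.kernels for _ in range(k.output_dim)]
        if full_output_cov:
            Kxxs = tf.stack(Ks, axis=2)  # [N, N2, P]
            return tf.transpose(tf.linalg.diag(Kxxs), [0, 2, 1, 3])  # [N, P, N2, P]
        return tf.stack(Ks, axis=0)  # [P, N, N2]

    def K_diag(self, X, full_output_cov=False):
        stacked = tf.stack(
            [k.kernel.K_diag(X) for k in self.kernels for _ in range(k.output_dim)], axis=1
        )  # [N, P]
        return tf.linalg.diag(stacked) if full_output_cov else stacked

@conditional.register(object, SharedIndependentInducingVariables, GroupedSharedIndependent, object)
def grouped_shared_conditional(
    Xnew, inducing_variable, kernel, f,
    *, full_cov=False, full_output_cov=False, q_sqrt=None, white=False,
):
    # Sketch for the diagonal case only (full_cov=False, full_output_cov=False).
    f = tf.convert_to_tensor(f)  # q_mu, [M, P]
    q_sqrt = None if q_sqrt is None else tf.convert_to_tensor(q_sqrt)  # assumed [P, M, M]
    means, variances = [], []
    start = 0
    for sub_kernel in kernel.kernels:
        stop = start + sub_kernel.output_dim
        # Re-dispatch on the SharedIndependent sub-kernel so that GPflow's existing
        # Kuu/Kuf/conditional registrations handle this group of output dimensions.
        mu, var = conditional(
            Xnew, inducing_variable, sub_kernel, f[:, start:stop],
            full_cov=full_cov, full_output_cov=full_output_cov,
            q_sqrt=None if q_sqrt is None else q_sqrt[start:stop], white=white,
        )
        means.append(mu)  # [N, output_dim of this group]
        variances.append(var)
        start = stop
    return tf.concat(means, axis=-1), tf.concat(variances, axis=-1)

With this in place, the failing third entry in the kernels list above can be replaced by GroupedSharedIndependent([SharedIndependent(SquaredExponential() + Linear(), output_dim=2) for _ in range(2)]), which learns one set of kernel hyperparameters per group of two outputs.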