Prediction with AR model

I would like to predict a wave signal using autoregression. I have fitted an AR model to the data and tested the prediction on test data, but my problem is that the mean squared error is lowest for the first-order polynomial. What does that mean?

I also can't predict further than the length of the training set. When I use a longer training set, I get only zeros at the end. Is this correct, or what we would expect, and if so, why? See picture.


% load data
clear;
close all;
load Cv52.mat; 
format long

eta_snl    = Cv52.PG.eta_snl;
t_snl_dnum = Cv52.PG.t_datenum;

t_snl_time = datetime(t_snl_dnum','Format','HH:mm:ss.SSS','convertFrom','datenum');

initial_t = datetime('09 07, 2019, 17:11:43.000','Format','MM dd, uuuu, HH:mm:ss.SSS');
final_t   = datetime('09 07, 2019, 17:18:59.000','Format','MM dd, uuuu, HH:mm:ss.SSS');

% split data: training/validation/test
init = 1;
init_val = 2; 
final = 2000;
split = 400;
ssize = 2;

data       = eta_snl(1:final);

trein_data = data(init:ssize:split); 
val_data  = data(init_val:ssize:split); 
test_data = data(split:ssize:final);

t_snl_time = t_snl_time(init:final);
t_train    = t_snl_time(init:ssize:split);
t_val      = t_snl_time(init_val:ssize:split);
t_test     = t_snl_time(split:ssize:final);

% hyperparameters
deg = 40; % maximum polynomial degree (AR model order, i.e. number of lags) to try

% preallocate one score per degree
trein_score = zeros(1, deg);
test_score  = zeros(1, deg);


for j = 1 : deg
    % construct matrices: row p of X holds the j samples after sample p, Y(p) is sample p
    m = length(trein_data)- j;
    X = zeros(m, j);
    Y = zeros(1, m);
    for p = 1:m
        X(p, 1:j) = trein_data(p+1:p+j);
        Y(p) = trein_data(p);
    end
    
    % AR model
    % the weights beta are found by solving the least-squares problem (normal equations)
    beta      = (X'*X)\ X' * Y';
    
    Y_pred_trein = beta' * X';
    
    % test and validation of model
    % construct matrices
    X_val  = zeros(length(val_data) - j, j);
    X_test = zeros(length(test_data) - j, j);
    for p = 1:length(val_data) - j
        X_val(p, 1:j) = val_data(p+1:p+j);
        X_test(p, 1:j) = test_data(p+1:p+j);
    end
    
    % predict
    Y_pred_val  = X_val * beta;
    Y_pred_test  = X_test * beta;
    
    % error / validation score
    test_score(j)  =  mean((Y_pred_test - test_data(1:length(Y_pred_test))).^2);
    trein_score(j) =  mean((Y_pred_trein - Y).^2);
    
    if j == 10
        figure;
        hold on;
        plot(t_snl_time, data, 'b')
        plot(t_train(1:length(Y_pred_trein)), Y_pred_trein, 'r')
        plot(t_val(1:length(Y_pred_val)), Y_pred_val, 'k')
        plot(t_test(1:length(Y_pred_test)), Y_pred_test, 'r')
        legend('Actual Data', 'Model fit' , 'Model validation', 'Model test');
        xlabel('Time');
        title('AR-model, CV52 pole 2');
    end
end

figure;
hold on;
plot(1:1:deg, trein_score, 'b.-')
plot(1:1:deg, test_score, 'r.-')
legend('Training score', 'Test score', 'Location','Best');
xlabel('Polynomial degree');
ylabel('Mean squared error');
title('AR-model evaluation');

This is the code I have written in MATLAB.
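
Written out, the model that the loop fits for degree j appears to be the following. This is a sketch read off the code, not a separate derivation: y stands for `trein_data` and N for its length, while the symbols \hat{y} and \mathrm{MSE} are notation introduced here for illustration.

```latex
\begin{align}
  X_{p,k} &= y_{p+k}, \qquad Y_p = y_p,
      \qquad p = 1,\dots,m, \quad k = 1,\dots,j, \quad m = N - j \\
  \beta &= (X^{\top} X)^{-1} X^{\top} Y \\
  \hat{y}_p &= \sum_{k=1}^{j} \beta_k \, y_{p+k} \\
  \mathrm{MSE}(j) &= \frac{1}{m} \sum_{p=1}^{m} \left( \hat{y}_p - y_p \right)^2
\end{align}
```

In this form each sample y_p is regressed on the j samples that follow it in the series, and the training score plotted at the end is MSE(j) for j = 1, ..., deg.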