Postgres-MADlib prediction is taking longer than training

I am training on my data using the following code:

start_time := clock_timestamp();
  PERFORM madlib.create_nb_prepared_data_tables( 'nb_training',
  training_time := 1000* (extract(epoch FROM clock_timestamp()) - extract(epoch FROM start_time));

And my prediction code goes as follows:

start_time := clock_timestamp();
  PERFORM madlib.create_nb_probs_view( 'categ_feature_probs', 
                                       'probs_view' );

select * from probs_view
prediction_time := 1000 * (extract(epoch FROM clock_timestamp()) - extract(epoch FROM start_time));

The training dataset contains 450000 records, whereas the testing dataset contains 50000 records.

Still, my average training_time is around 17173 ms, whereas prediction_time is 26481 ms. As I understand naive Bayes, prediction_time should be less than training_time. What am I doing wrong here?


There is 1 best solution below


Naive Bayes classification is at an early stage in MADlib, which means that its interface and implementation are still preliminary. There are a number of open JIRA issues, which tells me it needs more work before being promoted to a top-level module.
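
Separately from the module's maturity, it may be worth checking where the prediction time is actually spent. If probs_view is an ordinary PostgreSQL view, as the function name suggests, then creating it only stores a query definition, and the per-row work happens when you select from it. The following is a minimal sketch of timing the two steps separately in plain PostgreSQL. It does not call MADlib at all: demo_rows and demo_view are hypothetical stand-ins, and the aggregate in the view is only a placeholder for whatever probs_view computes.

```sql
-- Sketch only: demo_rows and demo_view are hypothetical stand-ins,
-- not MADlib objects. The temp objects persist until the session ends.
DO $$
DECLARE
  t0        timestamptz;
  create_ms numeric;
  scan_ms   numeric;
BEGIN
  -- Stand-in data: 50000 rows spread over 10 buckets.
  CREATE TEMP TABLE demo_rows AS
    SELECT g AS id, g % 10 AS bucket
    FROM generate_series(1, 50000) AS g;

  -- Creating the view only stores its definition.
  t0 := clock_timestamp();
  CREATE TEMP VIEW demo_view AS
    SELECT bucket, count(*) AS n
    FROM demo_rows
    GROUP BY bucket;
  create_ms := 1000 * extract(epoch FROM clock_timestamp() - t0);

  -- Reading the view is what executes the underlying query.
  t0 := clock_timestamp();
  PERFORM * FROM demo_view;
  scan_ms := 1000 * extract(epoch FROM clock_timestamp() - t0);

  RAISE NOTICE 'create view: % ms, scan view: % ms',
    round(create_ms, 1), round(scan_ms, 1);
END
$$;
```

Applying the same split to the code in the question (one timer around madlib.create_nb_probs_view, another around the select on probs_view) should show which of the two steps accounts for the 26481 ms.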