I have a regression dataset where approximately 95% of the target values are zeros (the remaining 5% are between 1 and 30), and I am trying to design a TensorFlow model for it. My idea is to combine a classifier and a regressor: compare the output of the classifier submodel to a threshold and, depending on the result, either predict zero or pass the input to the regression submodel. My intuition is that this should be built with the functional API, but I couldn't find helpful resources on that. Any ideas? (A rough sketch of the kind of model I have in mind is at the end of this question.)
Here is the code that generates the data I am using to reproduce the problem:
import numpy as np

n = 10000
zero_percentage = 0.95

# Regression target: ~95% zeros, ~5% integers in [1, 30)
zeros = np.zeros(round(n * zero_percentage))
non_zeros = np.random.randint(1, 30, size=round(n * (1 - zero_percentage)))
y = np.concatenate((zeros, non_zeros))
np.random.shuffle(y)

# Input feature: a separate value range where the target is zero,
# otherwise a linear transform of the target
a = 50
b = 10
x = np.array([np.random.randint(31, 60) if element == 0 else (element - b) / a for element in y])

# Binary target for the classifier submodel: 1 where y is non-zero
y_classification = np.array([0 if element == 0 else 1 for element in y])
Note: I experimented with probabilistic models (Poisson regression and regression with a discretized logistic mixture distribution); they gave good results, but training was unstable (the loss diverged very often).
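For completeness, here is a rough sketch of the combined model I am imagining, written with the functional API. The layer sizes, the 0.5 threshold, and the equal loss weights are arbitrary placeholders, and it reuses x, y, and y_classification from the snippet above; I am not sure this is the right way to wire the two heads together.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Shared trunk over the single input feature
inputs = keras.Input(shape=(1,), name="x")
hidden = layers.Dense(32, activation="relu")(inputs)

# Classifier head: probability that the target is non-zero
clf_out = layers.Dense(1, activation="sigmoid", name="is_nonzero")(hidden)

# Regressor head: predicted value, only meaningful when the classifier fires
reg_out = layers.Dense(1, activation="relu", name="value")(hidden)

model = keras.Model(inputs=inputs, outputs=[clf_out, reg_out])
model.compile(
    optimizer="adam",
    loss={"is_nonzero": "binary_crossentropy", "value": "mse"},
    loss_weights={"is_nonzero": 1.0, "value": 1.0},
)
model.fit(x.reshape(-1, 1), {"is_nonzero": y_classification, "value": y},
          epochs=10, batch_size=32)

# At inference time, gate the regression output with the classifier output
p_nonzero, value = model.predict(x.reshape(-1, 1))
y_pred = np.where(p_nonzero > 0.5, value, 0.0).squeeze()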
Instead of trying to find a heuristic to balance training between the zero values and the others, you might want to try an input preprocessing method that handles imbalanced training sets better (usually by mapping to another space before running the model, then applying the inverse transform to the results); for example, an embedding layer. Alternatively, normalize the values to a small range (like [-1, 1]) and apply a matching activation function before evaluating the model on the data.
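As an illustration, here is a minimal sketch of the second suggestion, assuming you keep the x and y arrays from the question; the scaling range, network size, and tanh output are just illustrative choices, not a fix tuned to your data:

from tensorflow import keras
from tensorflow.keras import layers

# Map the targets from [0, 30] into [-1, 1] so the model trains on a small, symmetric range
y_min, y_max = 0.0, 30.0
y_scaled = 2.0 * (y - y_min) / (y_max - y_min) - 1.0

model = keras.Sequential([
    keras.Input(shape=(1,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="tanh"),  # output range matches the scaled targets
])
model.compile(optimizer="adam", loss="mse")
model.fit(x.reshape(-1, 1), y_scaled, epochs=10, batch_size=32)

# Undo the scaling on the predictions
y_pred = (model.predict(x.reshape(-1, 1)).squeeze() + 1.0) / 2.0 * (y_max - y_min) + y_min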