I want to use the SGD optimizer in tf.keras, but its documentation describes it as:
Gradient descent (with momentum) optimizer.
Does this mean SGD doesn't support the "randomly shuffle examples in the data set" phase?
I checked the SGD source code, and it seems there is no random-shuffle method.
My understanding of SGD is that it applies gradient descent to randomly sampled examples, but this optimizer only does gradient descent with momentum and Nesterov momentum.
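From reading the source, the update it applies per step looks roughly like this (my own simplified sketch in NumPy, ignoring the nesterov=True variant):

import numpy as np

def sgd_momentum_step(w, v, grad, learning_rate=0.01, momentum=0.9):
    # velocity accumulates a decaying sum of past gradients
    v = momentum * v - learning_rate * grad
    # the weights move along the velocity, not directly along the raw gradient
    w = w + v
    return w, v

Nothing here touches the data; the optimizer just consumes whatever gradient it is handed.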
Does the batch size I defined in my code correspond to SGD's random-sampling phase?
If so, it shuffles randomly but never reuses the same example within an epoch, doesn't it?
Is my understanding correct?
I wrote the code for batching as below.
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
I'm not sure if it's what you are looking for, but try using tf.data.Dataset for your dataset. For example, for MNIST you can easily create the dataset, shuffle the samples, and divide them into batches. Here is a sketch, assuming the standard tf.keras.datasets.mnist loader:
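import tensorflow as tf

# Load MNIST as plain NumPy arrays (tf.keras ships a loader for it).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# shuffle(10000) keeps a buffer of 10,000 examples and picks the next one
# uniformly at random from that buffer; it samples without replacement, so
# every example is still seen exactly once per epoch.
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)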
You can have a look at the tutorial about datasets: tf.data (https://www.tensorflow.org/guide/data).
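As for your "never use the same example twice" question: shuffle() only permutes the order. Each epoch still visits every example exactly once, and by default the order is reshuffled on the next pass (reshuffle_each_iteration=True). You can convince yourself with a toy dataset:

import tensorflow as tf

ds = tf.data.Dataset.range(10).shuffle(10)  # buffer covers the whole dataset

for epoch in range(2):
    seen = [int(x) for x in ds]
    print(sorted(seen))  # always [0, 1, ..., 9]: no repeats within an epoch
    print(seen)          # the order differs from epoch to epoch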