Tensorflow - Run Optimizer Op On A Large Batch
Solution 1:
This depends mainly on your GPU memory size; it is usually hard to fit your entire dataset along with the model and its required operations (e.g. computing predicted probabilities). Thus, you need to think about batching from a different perspective. I assume your code goes along these lines:
# Model Definition
X = tf.placeholder(tf.float32, shape=[None, DIM,DIM,3], name='X')
y = tf.placeholder(tf.float32, shape=[None, N_CLASSES], name='y')
...
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
...
# Training your model
sess.run([train_step], feed_dict={X: X_batch, y: y_batch})
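For reference, the elided part is where cross_entropy is produced from the placeholders. A minimal sketch, assuming a simple dense head rather than your actual model, and keeping the loss per example (no reduction) so the batch results can be stacked later, might look like:
# Illustrative only: the real network is not shown in the question
flat = tf.layers.flatten(X)
hidden = tf.layers.dense(flat, 128, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, N_CLASSES)
# Per-example cross entropy (one value per input example)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits)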
Instead of feeding X and y to the train_step directly, you could accumulate the cross_entropy over all batches (i.e. over the entire dataset) and then run the train_step once. For example:
import numpy as np

cross_entropy_all = []
for X_batch, y_batch in batches_generator():
    # Accumulate the cross entropy computed on each batch
    cross_entropy_all += sess.run([cross_entropy], feed_dict={X: X_batch, y: y_batch})

# NumPy (or TensorFlow) equivalent of `vstack`
cross_entropy_all = np.vstack(cross_entropy_all)

# Run the optimizer on the entire dataset (not just on a specific batch)
sess.run([train_step], feed_dict={cross_entropy: cross_entropy_all})
This should achieve your goal without running your GPU out of memory. The suggested approach runs the optimization step against the cross entropies accumulated over all batches, so you no longer need to feed X and y (they are only needed to produce cross_entropy, and that value is already being fed to the optimization step). Note that in TensorFlow 1.x a feed_dict key can be any tensor in the graph, not just a placeholder, which is what allows cross_entropy to be fed directly here.
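The batches_generator above is not defined in the question; a minimal sketch, assuming the full dataset is already available as NumPy arrays X_data and y_data (hypothetical names), could be:
def batches_generator(batch_size=64):
    # Yield successive (X_batch, y_batch) slices of the full dataset.
    # X_data and y_data are assumed to hold all examples as NumPy arrays.
    n = X_data.shape[0]
    for start in range(0, n, batch_size):
        yield X_data[start:start + batch_size], y_data[start:start + batch_size]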