
Tensorflow - Run Optimizer Op On A Large Batch

Normally, we call the run command with the optimizer operation as input to update the trainable parameters of some model: session.run(model.optimizer_op, feed_dict={model.X: X_batch, model.y: y_batch})
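For context, a typical per-batch update looks roughly like this minimal sketch (the model.X, model.y, and model.optimizer_op names are the ones used in the question; batches_generator() is assumed to yield mini-batches of inputs and labels):

# One optimizer step per (small) mini-batch
for X_batch, y_batch in batches_generator():
    session.run(model.optimizer_op,
                feed_dict={model.X: X_batch, model.y: y_batch})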

Solution 1:

This depends mainly on your GPU memory size. It is unlikely that your entire dataset will fit in memory alongside the model and the operations it needs (e.g. predicting probabilities), so you need to think about batching from a different perspective. I assume your code goes along these lines:

# Model Definition    
X = tf.placeholder(tf.float32, shape=[None, DIM,DIM,3], name='X')
y = tf.placeholder(tf.float32, shape=[None, N_CLASSES], name='y')

...

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

...

# Training your model
sess.run([train_step], feed_dict={X: X_batch, y: y_batch})

Instead of feeding X and y to the train_step, you could accumulate the cross_entropy over all batches (i.e. over the entire dataset) and then run the train_step once. For example:

import numpy as np

# Accumulate the per-batch cross entropy values over the whole dataset
cross_entropy_all = []
for X_batch, y_batch in batches_generator():
    cross_entropy_all += sess.run([cross_entropy], feed_dict={X: X_batch, y: y_batch})

# Stack the per-batch results into a single array
# (NumPy or TensorFlow equivalent of `vstack`)
cross_entropy_all = np.vstack(cross_entropy_all)

# Run the optimizer on the entire dataset (not just on a specific batch)
sess.run([train_step], feed_dict={cross_entropy: cross_entropy_all})

This should achieve your goal without running your GPU out of memory. The suggested approach runs the optimization step against the accumulated cross entropies, so you no longer need to feed X and y at that point: they are only needed to produce cross_entropy, and that value is already being fed to the optimization step.
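For completeness, the batches_generator() used above is not defined in the answer. A minimal sketch, assuming the full dataset lives in two NumPy arrays X_all and y_all (hypothetical names; adjust to your setup), could look like this:

import numpy as np

def batches_generator(batch_size=64):
    """Yield (X_batch, y_batch) slices of the full dataset.

    X_all and y_all are assumed to be NumPy arrays holding the whole
    dataset; the names and batch_size are placeholders for your own code.
    """
    n_samples = X_all.shape[0]
    for start in range(0, n_samples, batch_size):
        end = start + batch_size
        yield X_all[start:end], y_all[start:end]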
