Difference Between Keras model.fit Using Only batch_size and Using Only steps_per_epoch
When I run model.fit with both the batch_size and steps_per_epoch parameters, I receive the following error: ValueError: If steps_per_epoch is set, the `batch_size` must be None.
Solution 1:
That's a good question. What I observe from the source code ([1] and [2]) is that:
- When you set batch_size, the training data is sliced into batches of this size (see L184).
- When you set steps_per_epoch, if the training inputs are not framework-native tensors (this is the most common case), the whole training set is being fed into the network in a single batch (see L152), and that's why you get the memory error.
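The slicing described in the first bullet can be sketched in plain Python. This is only an illustration, not Keras's actual code: the helper name batch_slices is hypothetical, and it assumes the last batch is simply smaller when the sample count is not a multiple of batch_size.

```python
import math

def batch_slices(num_samples, batch_size):
    """Return (start, end) index pairs that cover num_samples in order."""
    num_batches = math.ceil(num_samples / batch_size)
    return [(i * batch_size, min(num_samples, (i + 1) * batch_size))
            for i in range(num_batches)]

# 10 samples in batches of 4: two full batches and one partial batch.
slices = batch_slices(10, 4)
print(slices)  # [(0, 4), (4, 8), (8, 10)]

# Feeding the whole set at once corresponds to a single slice.
print(batch_slices(10, 10))  # [(0, 10)]
```

With a single slice, every sample has to fit in memory for one forward and backward pass, which is consistent with the memory error mentioned above.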
Therefore, based on the implementation, I would advise using the steps_per_epoch argument only when feeding framework-native tensors (i.e. TensorFlow tensors whose first dimension is the batch size); that is indeed a requirement. To do this, the arguments x and y of model.fit need to be set to None.
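As a rough illustration of why steps_per_epoch pairs with pre-batched input, the toy loop below draws a fixed number of batches per epoch from a source that already yields batches. It is a plain-Python sketch, not Keras code: prebatched_source and run_epoch are hypothetical names, and the point is only that the batch size lives in the data source rather than in the training call.

```python
from itertools import cycle, islice

def prebatched_source(samples, batch_size):
    """Endlessly yield batches; the source, not the caller, fixes the batch size."""
    batches = [samples[i:i + batch_size]
               for i in range(0, len(samples), batch_size)]
    return cycle(batches)

def run_epoch(source, steps_per_epoch):
    """Draw exactly steps_per_epoch batches from the source and return them."""
    return list(islice(source, steps_per_epoch))

source = prebatched_source(list(range(10)), batch_size=5)
epoch = run_epoch(source, steps_per_epoch=2)
print([len(batch) for batch in epoch])  # [5, 5]
```

Here steps_per_epoch only says how many batches make up one epoch, so passing a separate batch_size to the training call would have nothing to act on.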