Why Don't the First Convolutional Layer Weights Change During Training?

January 26, 2024

I got the TensorFlow MNIST training example from here https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/convolutional_network.py and added

Solution 1:

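Most likely the gradients reaching this layer are very small, so the weight updates are too small to notice when you inspect the weights.

Vanishing gradients are a common problem in deep networks: the gradient is multiplied by another factor at every layer on its way back from the loss, so the earliest layers tend to receive the smallest values.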

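You can check whether this is what is happening in your case:

Print out the gradient values for the weights of the first convolution. If the gradients are vanishing, they will be very small (e.g. around 1e-5).
Increase the learning rate to a much larger value (e.g. 20x the current one). The weights should start changing visibly (note that a network trained with such a high learning rate quickly diverges, so this is only a diagnostic).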
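
Both checks can be illustrated with a toy model. The sketch below is not the TensorFlow example from the question: it is a pure-Python chain of scalar sigmoid layers, and the names `layer_gradients`, `weights`, and `base_lr`, the depth of 8, and the learning rate of 0.001 are all assumptions chosen for illustration. It computes the gradient for every layer by hand, then compares the size of one update step (learning rate times gradient) for the first and last layer at the base learning rate and at 20x that value.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_gradients(weights, x, target):
    """Return dLoss/dw for each layer of a chain of scalar sigmoid layers."""
    # Forward pass: remember each layer's input and activation.
    inputs, acts = [], []
    h = x
    for w in weights:
        inputs.append(h)
        h = sigmoid(w * h)
        acts.append(h)

    # Backward pass for loss = 0.5 * (output - target) ** 2.
    grads = [0.0] * len(weights)
    delta = acts[-1] - target  # dLoss/d(output)
    for i in reversed(range(len(weights))):
        local = acts[i] * (1.0 - acts[i])   # derivative of the sigmoid
        grads[i] = delta * local * inputs[i]
        delta = delta * local * weights[i]  # pass the gradient to the previous layer
    return grads


# Hypothetical setup: 8 layers, every weight initialised to 0.5.
weights = [0.5] * 8
grads = layer_gradients(weights, x=1.0, target=0.0)

# Check 1: print the gradient of the first layer next to the last one.
print(f"first-layer gradient: {grads[0]:.3e}")
print(f"last-layer gradient:  {grads[-1]:.3e}")

# Check 2: compare the size of one update step at the base and the 20x learning rate.
base_lr = 0.001
for lr in (base_lr, 20 * base_lr):
    print(f"lr={lr:g}: first-layer step={lr * abs(grads[0]):.3e}, "
          f"last-layer step={lr * abs(grads[-1]):.3e}")
```

In this toy chain the first-layer gradient comes out several orders of magnitude smaller than the last-layer one, so a single update barely moves the first weight. In the real network the same comparison would be made on the gradient of the first convolution's weight tensor rather than on scalars.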

