CS231n Notes, Lec5: Convolutional Neural Networks
-
For image recognition, only zero-centering the data (subtracting the mean) is needed; normalization and other, more complex preprocessing are not.
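A minimal sketch of zero-centering with NumPy. The function name `zero_center`, the (N, H, W, C) array layout, and the choice of a per-channel mean (rather than a per-pixel mean image) are assumptions made for illustration; the notes only say to subtract the mean.

```python
import numpy as np

def zero_center(train, test):
    """Subtract the per-channel mean, computed on the training set only."""
    # train, test: float arrays of shape (N, H, W, C) -- an assumed layout
    mean = train.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, C)
    # The same training mean is reused for the test data.
    return train - mean, test - mean, mean

# Random stand-in data in place of real images (hypothetical sizes).
rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(100, 32, 32, 3))
test = rng.uniform(0, 255, size=(20, 32, 32, 3))
train_c, test_c, mean = zero_center(train, test)
```

Note that the test set is shifted by the training mean, so its own mean ends up close to zero but not exactly zero.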
-
A loss of NaN means the loss has blown up; the learning rate is probably too high.
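The effect can be reproduced on a toy problem. The sketch below runs plain gradient descent on the one-dimensional loss L(w) = w * w, which is an assumed stand-in for a real network loss, and stops as soon as the loss is no longer finite (inf or NaN). The function name `run_gd` and the two learning rates are hypothetical.

```python
import math

def run_gd(lr, steps=2000):
    """Gradient descent on the toy loss L(w) = w * w.

    Returns (steps_taken, last_loss) and stops early if the loss
    stops being finite.
    """
    w = 1.0
    for step in range(steps):
        loss = w * w
        if not math.isfinite(loss):
            return step, loss          # diverged: loss is inf or NaN
        w -= lr * 2.0 * w              # gradient of w * w is 2 * w
    return steps, w * w

ok_steps, ok_loss = run_gd(lr=0.1)     # each step shrinks w by a factor of 0.8
bad_steps, bad_loss = run_gd(lr=1.5)   # each step multiplies w by -2
```

With the small learning rate the loss shrinks toward zero; with the large one |w| doubles every step until the loss overflows. A check like `math.isfinite(loss)` inside a training loop is a cheap way to catch this early.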
-
Weight Initialization
-
Initialization too small: Activations go to zero, gradients also zero, No learning
-
Initialization too big: Activations saturate (for tanh), Gradients zero, no learning
-
Initialization just right:
Nice distribution of activations at all layers, Learning proceeds nicely
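The three cases above can be checked with a small experiment: push random data through a stack of tanh layers and record the standard deviation of the activations at each layer. This is only a sketch. The depth, width, the scales 0.01 and 1.0, and the use of 1/sqrt(fan_in) (Xavier-style scaling) as the "just right" case are assumptions, and `activation_stds` is a hypothetical helper.

```python
import numpy as np

def activation_stds(scale_fn, n_layers=10, width=500, n_samples=1000, seed=0):
    """Return the std of tanh activations at each layer of a random network."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((n_samples, width))   # unit-Gaussian input data
    stds = []
    for _ in range(n_layers):
        # Gaussian weights, scaled by a factor that may depend on fan-in
        W = rng.standard_normal((width, width)) * scale_fn(width)
        h = np.tanh(h @ W)
        stds.append(float(h.std()))
    return stds

small = activation_stds(lambda fan_in: 0.01)                    # too small
big = activation_stds(lambda fan_in: 1.0)                       # too big
xavier = activation_stds(lambda fan_in: 1.0 / np.sqrt(fan_in))  # assumed "just right"

for name, stds in [("small", small), ("big", big), ("xavier", xavier)]:
    print(name, ["%.4f" % s for s in stds])
```

With the small scale the stds collapse toward zero layer by layer; with the large scale they sit near 1 because almost every unit is saturated at -1 or +1; with the fan-in scaling they stay in a usable range across all layers.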
-