Understanding Optimization of Neural Networks
The idea of this collection is to understand how the optimization process of neural networks works, what kind of minima they end up in, etc.
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscape
For the loss landscape of deep networks, the volume of the basin of attraction of good minima dominates that of poor minima, which guarantees that optimization methods with random initialization converge to good minima. The authors justify this theoretically by analyzing 2-layer neural networks, showing that low-complexity solutions have a small norm of the Hessian matrix with respect to the model parameters. For deeper networks, extensive numerical evidence supports the argument.
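To make the "small Hessian norm at good minima" idea concrete, here is a minimal sketch (not code from the paper): it trains a toy 2-layer network with plain gradient descent and then measures the Frobenius norm and largest eigenvalue of the loss Hessian with respect to all parameters as a proxy for how flat the reached basin is. The network size, toy regression data, and training settings are illustrative assumptions.

import jax
import jax.numpy as jnp

# Illustrative sizes for a 2-layer network (assumptions, not from the paper).
D_IN, D_HIDDEN = 2, 8

def unpack(theta):
    # Split the flat parameter vector into the two weight matrices.
    w1 = theta[: D_IN * D_HIDDEN].reshape(D_IN, D_HIDDEN)
    w2 = theta[D_IN * D_HIDDEN :].reshape(D_HIDDEN, 1)
    return w1, w2

def loss(theta, x, y):
    # 2-layer network with tanh hidden units, mean-squared error.
    w1, w2 = unpack(theta)
    pred = jnp.tanh(x @ w1) @ w2
    return jnp.mean((pred - y) ** 2)

key = jax.random.PRNGKey(0)
kx, kp = jax.random.split(key)
x = jax.random.normal(kx, (64, D_IN))       # toy inputs
y = jnp.sin(x[:, :1])                       # toy regression target
theta = 0.5 * jax.random.normal(kp, (D_IN * D_HIDDEN + D_HIDDEN,))

# Plain gradient descent from a random initialization.
grad_fn = jax.jit(jax.grad(loss))
for _ in range(2000):
    theta = theta - 0.1 * grad_fn(theta, x, y)

# Hessian of the loss w.r.t. all parameters at the reached minimum.
# A small Hessian norm corresponds to the flat, low-complexity minima
# discussed above.
H = jax.hessian(loss)(theta, x, y)
print("train loss:", loss(theta, x, y))
print("Frobenius norm of Hessian:", jnp.linalg.norm(H))
print("largest Hessian eigenvalue:", jnp.max(jnp.linalg.eigvalsh(H)))

Comparing these Hessian statistics across minima reached from different random initializations (or with different batch sizes) is one simple way to probe the flatness argument numerically.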