
This academic paper explores the training dynamics of neural networks, focusing on gradient flow for fully connected feedforward networks with a range of smooth activation functions. The authors establish a dichotomy: gradient flow either converges to a critical point or diverges to infinity while the loss converges to a generalized critical value. Using the mathematical framework of o-minimal structures, they prove that for certain nonlinear polynomial target functions, sufficiently large networks and datasets force the loss to approach zero only asymptotically, so that gradient flow diverges under well-chosen initializations. The paper supports these theoretical findings with numerical experiments on polynomial regression and real-world tasks, observing that the parameter norm increases as the loss decreases.
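To make the described phenomenon concrete, here is a minimal sketch (not the authors' code) of a discretized gradient flow: a small one-hidden-layer tanh network fit to an illustrative polynomial target y = x^2 by plain gradient descent with a small step size. The network width, step size, initialization, and target are assumptions chosen for illustration; the point is simply to show the reported qualitative behavior, with the loss decreasing toward zero while the parameter norm keeps growing.

```python
# Minimal sketch (illustrative, not the paper's experiments): Euler-discretized
# gradient flow on a tiny one-hidden-layer tanh network fitting y = x^2.
# Tracks the training loss and the parameter norm over time.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)                    # inputs
y = x ** 2                                        # nonlinear polynomial target
width = 20
W1 = rng.normal(0.0, 1.0, (width, 1))             # hidden-layer weights
b1 = np.zeros(width)                              # hidden-layer biases
W2 = rng.normal(0.0, 1.0 / np.sqrt(width), width) # output weights

lr = 1e-2                                         # small step approximating gradient flow
for step in range(100001):
    z = x[:, None] * W1.T + b1                    # pre-activations, shape (50, width)
    h = np.tanh(z)                                # hidden activations
    pred = h @ W2                                 # network output, shape (50,)
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)

    # Backpropagate the mean squared loss by hand.
    g_pred = err / len(x)
    g_W2 = h.T @ g_pred
    g_h = np.outer(g_pred, W2)
    g_z = g_h * (1.0 - h ** 2)                    # tanh'(z) = 1 - tanh(z)^2
    g_W1 = (g_z * x[:, None]).sum(axis=0)[:, None]
    g_b1 = g_z.sum(axis=0)

    # Gradient descent step.
    W1 -= lr * g_W1
    b1 -= lr * g_b1
    W2 -= lr * g_W2

    if step % 20000 == 0:
        norm = np.sqrt((W1 ** 2).sum() + (b1 ** 2).sum() + (W2 ** 2).sum())
        print(f"step {step:6d}  loss {loss:.3e}  parameter norm {norm:.3f}")
```

Running this typically shows the loss shrinking steadily while the printed parameter norm grows, mirroring the divergence-with-vanishing-loss behavior the paper analyzes; the exact numbers depend on the assumed width, initialization, and step size.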