AI: post transformers

Xavier Initialization: Training Difficulties and Solutions in Deep Feedforward Networks


This document explores the challenges of training deep feedforward neural networks, specifically why standard gradient descent from random initialization performs poorly. The authors examine how different non-linear activation functions (the sigmoid, the hyperbolic tangent, and the softsign) affect network performance and unit saturation. They then analyze how activations and gradients vary across layers and over the course of training, which motivates a new normalized initialization scheme, now widely known as Xavier (or Glorot) initialization, designed to accelerate convergence. The findings suggest that appropriate activation functions and initialization techniques are crucial to the learning dynamics and overall effectiveness of deep neural networks.
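
The proposed scheme draws each weight from a uniform distribution scaled by the layer's fan-in and fan-out, W ~ U[-sqrt(6/(n_in + n_out)), +sqrt(6/(n_in + n_out))], so that activation and gradient variances stay roughly constant across layers. A minimal NumPy sketch of that rule follows; the function name and layer sizes are illustrative, not from the paper:

    import numpy as np

    def xavier_uniform(fan_in, fan_out, rng=np.random.default_rng()):
        # Normalized ("Xavier"/Glorot) initialization from the paper:
        # W ~ U[-sqrt(6/(fan_in + fan_out)), +sqrt(6/(fan_in + fan_out))]
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return rng.uniform(-limit, limit, size=(fan_in, fan_out))

    # Example: weight matrix for a hypothetical 784 -> 256 fully connected layer
    W = xavier_uniform(784, 256)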


Source: https://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf


By mcgrof