In this episode:
• Welcome and Introduction: Professor Norris and Linda introduce the episode and the paper of the week: 'Backward Gradient Normalization in Deep Neural Networks'.
• The Ghost of Gradients Past: A discussion on the classic vanishing and exploding gradient problems, and why existing solutions like Batch Normalization and ResNets still leave room for improvement.
• Unpacking Backward Gradient Normalization: Linda explains the core mechanics of the BGN layer, detailing how it leaves the forward pass untouched while rescaling gradients during backpropagation (see the sketch after this list).
• Visualizing the Flow: The hosts walk through the paper's experiments on networks 90 layers deep, comparing how gradients decay across ReLU, Sigmoid, and Tanh activation functions.
• Results, Trade-offs, and Conclusions: A breakdown of BGN's accuracy improvements and training-time efficiency compared to Batch Normalization on the MNIST dataset, followed by final thoughts.
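
To make the "Unpacking Backward Gradient Normalization" segment concrete, here is a minimal PyTorch-style sketch of the idea discussed: a layer that acts as an identity in the forward pass and rescales the gradient in the backward pass. The class names (`BGNFunction`, `BGN`), the epsilon constant, and the specific norm-based rescaling are illustrative assumptions; the exact normalization rule used in the paper may differ.

```python
import torch

class BGNFunction(torch.autograd.Function):
    """Identity in the forward pass; gradient rescaling in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        # Forward pass is left untouched: activations flow through unchanged.
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Rescale the incoming gradient so its magnitude stays controlled as it
        # continues toward earlier layers (illustrative choice; the paper's
        # exact scaling may differ).
        eps = 1e-8  # assumed small constant to avoid division by zero
        return grad_output / (grad_output.norm() + eps)


class BGN(torch.nn.Module):
    """Drop-in module wrapper around the custom autograd function."""

    def forward(self, x):
        return BGNFunction.apply(x)


# Usage: insert between layers of a deep stack. The forward output is unchanged,
# but gradients flowing back through this point are normalized.
block = torch.nn.Sequential(torch.nn.Linear(64, 64), BGN(), torch.nn.Sigmoid())
```

Because the layer is a no-op on activations, it adds essentially no cost at inference time; all of its effect is confined to backpropagation.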