Mechanical Dreams

Backward Gradient Normalization in Deep Neural Networks


In this episode:
• Welcome and Introduction: Professor Norris and Linda introduce the episode and the paper of the week: 'Backward Gradient Normalization in Deep Neural Networks'.
• The Ghost of Gradients Past: A discussion on the classic vanishing and exploding gradient problems, and why existing solutions like Batch Normalization and ResNets still leave room for improvement.
• Unpacking Backward Gradient Normalization: Linda explains the core mechanics of the BGN layer, detailing how it leaves the forward pass untouched while scaling gradients during backpropagation (a code sketch of the idea follows this list).
• Visualizing the Flow: The hosts delve into the paper's experiments with 90-layer deep networks, comparing gradient decay across ReLU, Sigmoid, and Tanh activation functions (see the gradient-logging sketch after the list).
• Results, Trade-offs, and Conclusions: A breakdown of the accuracy improvements and training-time efficiency of BGN compared to Batch Normalization on the MNIST dataset, followed by final thoughts.
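
These notes don't reproduce the paper's exact scaling rule, so here is a minimal PyTorch sketch of the mechanism as described above: a layer that acts as the identity in the forward pass and rescales the gradient flowing back through it. Normalizing the incoming gradient to unit L2 norm is an assumption for illustration, not necessarily the paper's formula.

```python
import torch

class BGNFunction(torch.autograd.Function):
    """Identity on the forward pass; rescales gradients on the backward pass."""

    @staticmethod
    def forward(ctx, x):
        # The forward pass is untouched: the layer simply passes x through.
        return x

    @staticmethod
    def backward(ctx, grad_output):
        # Assumed scaling rule: normalize the incoming gradient to unit L2 norm.
        # These notes do not spell out the paper's exact normalization.
        return grad_output / (grad_output.norm() + 1e-12)

class BGN(torch.nn.Module):
    """Drop-in module wrapping the custom autograd function."""

    def forward(self, x):
        return BGNFunction.apply(x)
```

Placed between the layers of a deep stack, such a layer changes nothing about the model's predictions but keeps gradient magnitudes from collapsing or exploding as they travel backward.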
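One way to reproduce the kind of per-layer comparison the hosts discuss is to record gradient norms during a backward pass. The sketch below is a hypothetical setup (the function name, depth, and width are illustrative, and it reuses the BGN layer from the previous snippet) that logs the gradient norm at each layer of a deep network.

```python
import torch
import torch.nn as nn

def gradient_norms(depth=90, width=64, act=nn.Sigmoid, use_bgn=True):
    """Build a deep MLP and return the weight-gradient norm at each layer."""
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(width, width), act()]
        if use_bgn:
            layers.append(BGN())  # BGN layer from the sketch above

    model = nn.Sequential(*layers)
    norms = []

    # Hooks record the norm of each Linear layer's weight gradient; they fire
    # in backward order, so the list starts at the output layer.
    for m in model:
        if isinstance(m, nn.Linear):
            m.weight.register_hook(lambda g: norms.append(g.norm().item()))

    x = torch.randn(8, width)
    model(x).sum().backward()
    return norms
```

Running this with `use_bgn=False` and Sigmoid activations should show the norms shrinking rapidly toward the early layers, which is the decay pattern the episode contrasts against the normalized runs.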

Mechanical Dreams, by Mechanical Dirk