June 03, 2021

Why is autograd so complicated

15 minutes

Why is autograd so complicated? What are the constraints and features that go into making it complicated? What's up with it being written in C++? What's with derivatives.yaml and code generation? What's going on with views and mutation? What's up with hooks and anomaly mode? What's reentrant execution? Why is it relevant to checkpointing? What's the distributed autograd engine?

Further reading.

Autograd notes in the docs https://pytorch.org/docs/stable/notes/autograd.html
derivatives.yaml https://github.com/pytorch/pytorch/blob/master/tools/autograd/derivatives.yaml
Paper on autograd engine in PyTorch https://openreview.net/pdf/25b8eee6c373d48b84e5e9c6e10e7cbbbce4ac73.pdf


By Edward Yang, Team PyTorch

