October 28, 2021

Parameter Prediction for Unseen Deep Architectures

40 minutes

Deep learning has been successful in automating the design of features in machine learning pipelines. However, the algorithms optimizing neural network parameters remain largely hand-designed and computationally inefficient. We study whether deep learning can directly predict these parameters by exploiting past knowledge from training other networks. We introduce a large-scale dataset of diverse computational graphs of neural architectures, DEEPNETS-1M, and use it to explore parameter prediction on CIFAR-10 and ImageNet. By leveraging advances in graph neural networks, we propose a hypernetwork that can predict performant parameters in a single forward pass that takes a fraction of a second, even on a CPU.
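To make the idea of predicting parameters from a computational graph more concrete, here is a minimal NumPy sketch of a toy graph hypernetwork. It is not the model from the paper: the function name `predict_parameters`, the one-hot node features, the two rounds of mean-aggregation message passing, and the linear decoder are all assumptions chosen for illustration. The hypernetwork weights are random rather than trained, so the output only shows the data flow (graph in, one parameter tensor per node out, in a single forward pass), not performant parameters.

```python
import numpy as np


def predict_parameters(adjacency, node_features, target_shapes,
                       hidden_dim=16, rounds=2, seed=0):
    """Toy graph hypernetwork (illustrative only, not the paper's model).

    adjacency:     (N, N) array; entry [i, j] = 1 if nodes i and j exchange messages.
    node_features: (N, F) array, e.g. a one-hot encoding of each node's operation.
    target_shapes: list of N shapes, one parameter tensor per node.
    """
    rng = np.random.default_rng(seed)
    num_nodes, feat_dim = node_features.shape

    # Hypernetwork weights. Random here; a real system would train them.
    w_in = rng.normal(scale=0.1, size=(feat_dim, hidden_dim))
    w_msg = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    max_size = max(int(np.prod(shape)) for shape in target_shapes)
    w_out = rng.normal(scale=0.1, size=(hidden_dim, max_size))

    # Embed each node, then add self-loops and row-normalise the adjacency
    # so that each round averages a node's state with its neighbours' states.
    h = np.tanh(node_features @ w_in)
    a = adjacency + np.eye(num_nodes)
    a = a / a.sum(axis=1, keepdims=True)
    for _ in range(rounds):
        h = np.tanh(a @ h @ w_msg)

    # Decode every node embedding to a flat vector, then slice and reshape
    # it to the parameter shape requested for that node.
    flat = h @ w_out
    return [flat[i, :int(np.prod(shape))].reshape(shape)
            for i, shape in enumerate(target_shapes)]


# A hypothetical 3-node chain: conv -> conv -> linear.
chain = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]], dtype=float)
op_types = np.eye(3)  # one-hot operation type per node (assumed encoding)
shapes = [(8, 3, 3, 3), (16, 8, 3, 3), (10, 16)]

params = predict_parameters(chain, op_types, shapes)
print([p.shape for p in params])
```

The decoder here simply truncates one shared output vector to the size each node needs, which is the crudest way to handle layers of different shapes; it is used only to keep the sketch short.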

2021: Boris Knyazev, Michal Drozdzal, Graham W. Taylor, Adriana Romero-Soriano

https://arxiv.org/pdf/2110.13100v1.pdf
