
Model/Knowledge distillation is a technique to transfer knowledge from a cumbersome model, like a large neural network or an ensemble of models, to a smaller, more efficient model. The smaller model is trained using "soft targets," which are the class probabilities produced by the larger model, rather than the usual "hard targets" of correct class labels. These soft targets contain more information, including how the cumbersome model generalizes and the similarity structure of the data. A temperature parameter is used to soften the probability distributions, making the information more accessible for the smaller model to learn. This process improves the smaller model's generalization ability and efficiency. Distillation allows the smaller model to achieve performance comparable to the larger model with less computation.
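To make the idea concrete, here is a minimal sketch of a distillation loss in PyTorch. The function name, the temperature value, and the soft/hard weighting alpha are illustrative assumptions, not taken from the episode; it simply combines the temperature-softened teacher probabilities with the usual hard-label loss as described above.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend the soft-target loss (from the teacher) with the hard-target loss.

    student_logits, teacher_logits: (batch, num_classes) tensors
    labels: (batch,) tensor of correct class indices
    temperature: softens both distributions so the smaller model can see
        the similarity structure the larger model has learned
    alpha: weight on the soft-target term (illustrative choice)
    """
    # Soften both probability distributions with the temperature parameter.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between the softened distributions; scaling by T^2
    # keeps gradient magnitudes comparable as the temperature changes.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * temperature ** 2

    # Standard cross-entropy against the hard labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Example usage with random logits standing in for real model outputs:
if __name__ == "__main__":
    torch.manual_seed(0)
    student = torch.randn(8, 10)
    teacher = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    print(distillation_loss(student, teacher, labels).item())
```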