Learning GenAI via SOTA Papers

EP081: Replacing MLPs With Interpretable KANs



Kolmogorov-Arnold Networks (KANs) are proposed as a promising alternative to standard Multi-Layer Perceptrons (MLPs), grounded in the Kolmogorov-Arnold representation theorem. Unlike MLPs, which apply fixed activation functions on nodes (neurons), KANs place learnable activation functions on the edges (where an MLP would have weights). In a KAN, every weight parameter is replaced by a univariate function parameterized as a spline, so the network contains no linear weight matrices at all; each node simply sums its incoming signals.
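To make the "functions on edges" idea concrete, here is a minimal NumPy sketch of a single KAN-style layer (forward pass only, no training). It is an illustration under simplifying assumptions, not the authors' implementation: the class name `ToyKANLayer` and helper `hat_basis` are hypothetical, the splines are piecewise-linear (degree-1 B-splines on a uniform grid) rather than higher-order ones, and other details of the paper's parameterization are omitted.

```python
import numpy as np


def hat_basis(x, grid):
    """Evaluate degree-1 B-spline ("hat") basis functions at x.

    x has any shape; the result has shape x.shape + (len(grid),).
    Assumes a uniformly spaced grid.
    """
    h = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x[..., None] - grid) / h, 0.0, None)


class ToyKANLayer:
    """One layer where every edge (i -> j) carries its own univariate spline."""

    def __init__(self, n_in, n_out, grid_size=8, lo=-1.0, hi=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(lo, hi, grid_size)
        # One coefficient vector per edge: these are the only learnable
        # parameters. There is no weight matrix and no fixed activation.
        self.coef = rng.normal(scale=0.1, size=(n_in, n_out, grid_size))

    def __call__(self, x):
        # x: (batch, n_in), with values inside [lo, hi]
        basis = hat_basis(x, self.grid)  # (batch, n_in, grid_size)
        # phi_{i->j}(x_i) = sum_g coef[i, j, g] * basis[b, i, g];
        # output node j sums phi_{i->j}(x_i) over all inputs i.
        return np.einsum("big,iog->bo", basis, self.coef)


layer = ToyKANLayer(n_in=3, n_out=2)
x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 3))
print(layer(x).shape)  # (5, 2)
```

Stacking several such layers gives a deeper network; in this sketch the nonlinearity lives entirely in the per-edge splines, and the nodes only add.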

According to the authors, this architectural shift allows KANs to outperform MLPs in two key areas:

  • Accuracy and Scaling: Smaller KANs can match or exceed the accuracy of much larger MLPs on data fitting and function regression tasks. Theoretically and empirically, KANs exhibit faster neural scaling laws than MLPs. When the target function has compositional structure, they can beat the "curse of dimensionality" by learning that structure (the external degrees of freedom) while approximating each univariate function accurately with splines (the internal degrees of freedom). Additionally, because spline bases are local, KANs exhibit local plasticity, which helps them avoid catastrophic forgetting in continual learning tasks.
  • Interpretability and Interactivity: Because every learned function is univariate, KANs can be visualized directly and simplified through sparsification, pruning, and symbolification. This transparency lets human users interact with the network, test hypotheses, and extract symbolic formulas from the model's learned activation functions.
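The local-plasticity point above can be illustrated with a small sketch. It assumes a piecewise-linear spline on a uniform grid (a simplification chosen for brevity; the function name `spline` and the specific grid are hypothetical). Changing a single spline coefficient alters the function only near the corresponding knot, which is the property that lets new data in one region leave previously learned regions untouched.

```python
import numpy as np

# Uniform grid of 11 knots on [-1, 1] (spacing 0.2).
grid = np.linspace(-1.0, 1.0, 11)
h = grid[1] - grid[0]


def spline(x, coef):
    """Piecewise-linear spline: a sum of hat functions, one per knot."""
    basis = np.clip(1.0 - np.abs(x[:, None] - grid) / h, 0.0, None)
    return basis @ coef


x = np.linspace(-1.0, 1.0, 201)
coef = np.sin(np.pi * grid)      # stand-in for an "old" learned function
before = spline(x, coef)

updated = coef.copy()
updated[8] += 0.5                # "learn" something new near the knot at x = 0.6
after = spline(x, updated)

# Points where the function actually changed.
changed = x[np.abs(after - before) > 1e-12]
print(changed.min(), changed.max())
```

The printed range lies strictly between the two neighboring knots (roughly 0.4 to 0.8); everywhere else the function is unchanged. An MLP weight update, by contrast, generally shifts the output across the whole input domain.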

Ultimately, the authors position KANs as promising foundational models for small-scale AI + Science tasks. In their experiments, KANs act as scientific "collaborators," helping researchers (re)discover mathematical relations in knot theory, map phase boundaries in physics (Anderson localization), and solve partial differential equations (PDEs).


Learning GenAI via SOTA Papers, by Yun Wu