Louise Ai agent - David S. Nishimoto

Louise Ai agent - Cannistraci-Hebb Topological Self-Sparsification


Hebbian learning, rooted in the neuroscience principle that “neurons that fire together wire together,” strengthens connections between co-activated neural units, offering a biologically inspired alternative to backpropagation by prioritizing local activation patterns over gradient-based updates. Unlike traditional backpropagation, which adjusts weights using error gradients propagated backward through a network, Hebbian learning reinforces connections based on simultaneous activity, mimicking how biological neurons strengthen synapses through repeated co-firing.

This principle has been extended to large-scale transformer architectures through techniques like Cannistraci-Hebb Topological Self-Sparsification (CHTss), which integrates Hebbian dynamics with topological connectivity rules to dynamically prune and regrow connections based on local community organization. The CHTss cycle can be broken down as follows:

(1) Identify co-activation: during training, CHTss monitors which neurons activate together frequently, using Hebbian rules to quantify their correlation.
(2) Prune weak connections: connections with low co-activation are removed, reducing network density.
(3) Regrow strategically: new connections are formed according to topological rules that prioritize local community structures (e.g., clusters of highly correlated neurons), preserving functional integrity.

This dynamic rewiring contrasts sharply with static sparsity, where a fixed portion of weights is permanently eliminated via magnitude pruning, often degrading performance. CHTss was tested on the LLaMA-130M backbone, where it outperformed fully connected models at 5–30% connectivity, achieving substantial computational savings without a proportional loss in performance. In fact, performance often improved, because carefully guided topological sparsity fosters structured, community-based subgraphs that reduce overfitting and encourage semantically coherent representations.

This adaptive process mirrors biological synaptic pruning and cortical plasticity, where the brain refines neural pathways by eliminating weak synapses and strengthening active ones, allowing function-specific subnetworks to emerge over time. In CHTss, for instance, persistently co-activated node-to-node pathways are reinforced across training iterations, forming modular subnetworks tailored to specific functions, much as the visual cortex specializes in processing visual data.

CHTss’s versatility was validated across LLaMA-60M, 130M, and 1B models, where pruned models matched or surpassed their dense counterparts, particularly in zero-shot and few-shot tasks on the GLUE and SuperGLUE benchmarks, demonstrating better generalization in data-limited settings. This suggests that sparsity forces networks to focus on essential, semantically meaningful features, reducing noise and overfitting. The topological plasticity of CHTss also directs activations toward organized subgraphs rather than random, entropic pathways, which improves interpretability: with fewer active connections, knowledge pathways are easier to visualize.

Compared to dropout or regularization, which randomly mask weights or penalize complexity, CHTss learns structural inductive biases directly from the data, producing partitioned subgraphs that resemble neurobiological functional modules. These modules align with transformer layer stacks, facilitating structured transfer learning in which knowledge from one task can be efficiently applied to another.
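
To make the prune-regrow cycle above concrete, here is a minimal PyTorch sketch of one rewiring step on a single sparse layer. The function names (hebbian_score, prune_and_regrow), the layer sizes, and the regrowth heuristic, a raw common-neighbour count used as a stand-in for the published Cannistraci-Hebb topological score, are illustrative assumptions rather than the authors' implementation.

    # Illustrative sketch, not the CHTss reference code: the regrowth rule is a
    # simplified common-neighbour count standing in for the Cannistraci-Hebb score.
    import torch

    def hebbian_score(pre_acts, post_acts):
        # "Fire together, wire together": average co-activation of each
        # (output unit, input unit) pair over a batch of activations.
        # pre_acts: [batch, in_dim], post_acts: [batch, out_dim]
        return (post_acts.T @ pre_acts) / pre_acts.shape[0]   # [out_dim, in_dim]

    def prune_and_regrow(mask, score, prune_frac=0.3):
        # mask:  binary [out_dim, in_dim] adjacency of the sparse layer
        # score: Hebbian co-activation scores accumulated during training
        active = mask.bool()
        n_prune = int(prune_frac * active.sum().item())

        # Steps (1)-(2): remove the active links with the weakest co-activation.
        weakest = torch.topk(score.masked_fill(~active, float("inf")).view(-1),
                             n_prune, largest=False).indices
        new_mask = mask.clone()
        new_mask.view(-1)[weakest] = 0.0

        # Step (3): score absent links by how many length-3 paths they would
        # close in the surviving bipartite graph (a crude "local community"
        # signal), and re-add the same number of links to keep density constant.
        adj = new_mask
        community = (adj @ adj.T @ adj) * (1.0 - adj)   # zero wherever a link exists
        strongest = torch.topk(community.view(-1), n_prune).indices
        new_mask.view(-1)[strongest] = 1.0
        return new_mask

    # One rewiring step at 10% density on a 256x256 layer, with random
    # activations in place of the layer's real pre/post activations.
    mask = (torch.rand(256, 256) < 0.10).float()
    score = hebbian_score(torch.randn(64, 256), torch.randn(64, 256))
    mask = prune_and_regrow(mask, score)

In a full training loop, a step like this would run periodically between gradient updates, with hebbian_score accumulated from the layer's actual pre- and post-activations rather than the random tensors used here.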
Implemented in PyTorch and Hugging Face, CHTss achieves lower perplexity and higher accuracy in autoregressive language modeling (e.g., WikiText, The Pile) and classification, with zero-shot accuracy gains of 2–5% at 70% pruning density, a significant leap for billion-parameter transformers. The prune-regrow cycle, akin to long-term potentiation in biology (where repeated stimulation strengthens synapses), adds a second learning channel through activation topology, complementing gradient descent.
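
On the implementation side, the sketch below shows one common PyTorch pattern for holding a learned topology fixed between rewiring steps: a linear layer whose binary mask multiplies the weight in every forward pass, so pruned weights contribute nothing and receive no gradient through the masked product. MaskedLinear and its density argument are illustrative names, not the paper's API; the same idea applies to the attention and MLP projections inside a Hugging Face transformer block.

    # Hypothetical drop-in layer; the mask is a buffer, so it is changed only by
    # the prune-regrow step, never by gradient descent.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedLinear(nn.Linear):
        def __init__(self, in_features, out_features, density=0.10):
            super().__init__(in_features, out_features)
            # Start from a random sparse topology at the requested density.
            mask = (torch.rand(out_features, in_features) < density).float()
            self.register_buffer("mask", mask)

        def forward(self, x):
            # Masked weights are multiplied by zero, so they neither contribute
            # to the output nor accumulate gradient until they are regrown.
            return F.linear(x, self.weight * self.mask, self.bias)

    layer = MaskedLinear(512, 512, density=0.10)
    out = layer(torch.randn(8, 512))
    print(f"active connections: {int(layer.mask.sum())} / {layer.mask.numel()}")

Perplexity in this setting is simply the exponential of the model's mean token-level cross-entropy, so lower loss on WikiText or The Pile under a fixed sparse mask translates directly into the lower perplexity figures cited above.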
