This is a guest post by my student Ruiqi Zhong, who has some very exciting work defining new families of statistical models that can take natural language explanations as parameters. The motivation is that existing statistical models are bad at explaining structured data. To address this problem, we augment these models with natural language parameters, which can represent interpretable abstract features and be learned automatically.
Imagine the following scenario: It is the year 3024. We are historians trying to understand what happened between 2016 and 2024 by looking at how Twitter topics changed across that time period. We are given a dataset of user-posted images sorted by time, $x_1, x_2, \ldots, x_T$, and our goal is to find trends in this dataset to help interpret what happened. If we achieved our goal, we would discover, for instance, (1) a recurring spike of images depicting athletes every four [...]
---
Outline:
(04:21) Warm-up Example: Logistic Regression with Natural Language Parameters
(07:12) Formalizing Regression with Natural Language Parameters
(10:47) A Modeling Language for Specifying More Complex Models
(14:20) Example Applications:
(14:49) Application 1: monitoring trends in LLM usage via time series modeling
(16:01) Application 2: taxonomizing product reviews via clustering
(18:05) Application 3: explaining memorable visual features via classification
(20:12) Conclusion
The original text contained 5 footnotes which were omitted from this narration.
The original text contained 2 images which were described by AI.
---