Show notes
00:00:00 - Speaker 1: There’s been very little innovation and research
more generally into what is a good interface for inputting equations. So
I think most people are probably familiar with how Microsoft Word or Excel
have these equation editors where you basically open this palette and
there is a preview and there is a button for every possible mathematical
symbol or operator you can imagine.
00:00:28 - Speaker 2: Hello and welcome to Meta Muse. Muse is a tool for
thought on iPad and Mac. This podcast isn’t about Muse the product,
it’s about Muse the company and the small team behind it. I’m Adam
Wiggins here today with my colleague Mark McGranaghan. Hey, Adam. And
joined by our guest Sarah Lim, who goes by Slim. Hello, hello, and Slim,
you’ve got various interesting affiliations including UC Berkeley,
Notion, and Ink & Switch, but what I’m interested in right now is the
lessons you’ve learned from playing classic video games. Tell me about
00:01:01 - Speaker 1: So this arose when I was deciding whether to get
the 14 inch or 16 inch M1 MacBook Pro and a critical question of our
00:01:10 - Speaker 1: honest. Exactly, exactly. I couldn’t decide. I
posted a request for comments on Twitter, and then I had this
realization that when I was 6 years old playing Oregon Trail 5, which is
a remake of Oregon Trail 2, which is itself a remake of the original. I
was in the initial outfitting stage, and you have 3 choices for your
farm wagon. You can get the small farm wagon, the large farm wagon, and
the Conestoga wagon. I actually don’t know if I’m pronouncing that
correctly, but let’s assume I am. So I just naively chose the Conestoga
wagon because as a 6 year old, I figured that bigger must be better and
being able to store more supplies for your expedition would make it more
successful. I eventually learned that the fact that the wagon is much
larger and can store a lot more weight means that it’s a lot easier to
overload it. Among other things, this requires constantly abandoning
supplies to cut weight. It makes the river fording minigame much
more perilous. It’s a lot harder to control the wagon. And yeah, I
never chose that wagon again on subsequent playthroughs, and I decided
to get the 14-inch laptop.
00:02:12 - Speaker 2: Makes perfect sense to me, and what a great
lesson for a six-year-old. Trade-offs, I feel like, are one of the most
important kind of fundamental concepts to understand as a human in this
world, and I think many folks struggle with that well into adulthood. At
least I feel like I’ve often been in certainly business conversations
where trying to explain trade-offs is met with confusion.
00:02:35 - Speaker 1: They should just play Oregon Trail.
00:02:37 - Speaker 2: Clearly that’s the solution. And tell us a little
bit about your background.
00:02:42 - Speaker 1: Yeah, so I’ve been interested in basically all
permutations really of user interfaces and programming languages for a
really long time, so this includes the very different programming
languages as user interfaces and programming languages for user
interfaces, and then, you know, the combination of the two. So right now
I’m doing a PhD in programming languages, interested in more of
the theoretical perspective, but in the past, I’ve worked on, I guess,
end-user computing, which is really the broader vision of Notion. I was
at Khan Academy for a while on the long-term research team.
00:03:18 - Speaker 2: Yeah, and there I think you worked with Andy
Matuschak, who’s a good friend of ours and a previous guest on the
00:03:24 - Speaker 1: Yes, definitely. That was the first time I worked
with Andy in real depth, and I still really enjoy talking to him and
occasionally collaborating with him today.
So, I guess, prior to that, I was doing a lot of research at the
intersection of HCI, or human-computer interaction, and programming tools,
programming systems, I guess. So, one of the big projects that I worked
on as an undergrad was focused on inspecting CSS on a webpage, or more
generally trying to understand what are the properties of the code that
influence how the page looks or a visual outcome of interest. There I was
really motivated by the fact that these software tools have their own
mental model, I guess,
or just model of how code works and how different parts of the program
interact to produce some output and then you have the user who has often
this entirely different intuitive model of what matters, what’s
So they don’t care if this line of code is or isn’t evaluated, they
care whether it actually has a visible effect on the output. So trying
to reconcile those two paradigms, I think is a recurring theme in a lot
00:04:30 - Speaker 2: And I remember seeing a little demo maybe of some
of the, I don’t know if it was a prototype or a full open source tool,
but essentially a visualizer that helps you better understand which CSS
rules are being applied. Am I remembering that right?
00:04:43 - Speaker 1: Yeah, so that was both part of the prototype and
the eventual implementation in Firefox, but the idea there is the syntax
of CSS really elides the complexity, I think, because syntactically it
looks like you have all of these independent properties like color: red,
you know, font-size: 16 pixels, and they seem to be all equally
important and at the same level of nesting, I guess, and what that
really hides is the fact that there are a lot of dependencies between
properties. So a certain property like z-index, you know, the perennial
favorite z-index: 9999999, doesn’t take effect unless the element has
something like position: relative, for example, and it’s not at all apparent if
you’re writing those two properties that there is a dependency between
So I was working on visualizing kind of what those dependencies were.
This actually arose because I wrote to Bert, who is one of the
co-creators of CSS and was like, Hi, I’m interested in building a tool
that visualizes these dependencies. Where can I find the computer
readable list of all such dependencies? And he was like, oh, we don’t
have one, you know, we have this SVG that tries to map out the
dependencies between CSS 2.1 modules, and even there you can see all
these circular dependencies, but we don’t have anything like what
you’re looking for. That to me was totally bananas because it was the
basic blocker to most people being able to go from writing really
trivial CSS to more complicated layouts. So I was like, well, I guess
this thing doesn’t exist, so I’d better go invent it.
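
A minimal sketch of what a "computer readable list" of such dependencies might look like, and how a tool could use it to flag declarations that have no visible effect. The table contents and the names `DEPENDENCIES`, `INITIAL_VALUES`, and `inactive_properties` are hypothetical and deliberately simplified (real CSS has more cases, for example z-index also applies to flex and grid items); this is not the prototype or the Firefox implementation discussed above.

```python
# Hypothetical, hand-written table: property -> (prerequisite property,
# prerequisite values under which the property has no effect).
# Illustrative only; not an authoritative or complete list.
DEPENDENCIES = {
    "z-index": ("position", {"static"}),
    "top": ("position", {"static"}),
    "left": ("position", {"static"}),
}

# Initial value assumed when a rule does not set the prerequisite itself.
INITIAL_VALUES = {"position": "static"}


def inactive_properties(declarations):
    """Return the declared properties that cannot take effect as written."""
    inactive = []
    for prop in declarations:
        if prop not in DEPENDENCIES:
            continue  # no known dependency, assume it applies
        prereq, disabling_values = DEPENDENCIES[prop]
        actual = declarations.get(prereq, INITIAL_VALUES[prereq])
        if actual in disabling_values:
            inactive.append(prop)
    return inactive


# z-index alone is flagged; adding position: relative activates it.
print(inactive_properties({"color": "red", "z-index": "9999999"}))
print(inactive_properties({"position": "relative", "z-index": "9999999"}))
```

The point of the table form is that the dependency, which the CSS syntax hides, becomes data a visualizer can walk.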
00:06:12 - Speaker 2: Perfect way to find good research problems. Now,
more recently, two projects I wanted to make sure we reference because
they connect to what we’ll talk about today, which is that you recently
worked on the equation editor at Notion, and then you worked on a rich text
CRDT called Peritext at Ink & Switch. Would love to hear a little bit
00:06:34 - Speaker 1: Yeah, definitely. So I guess the Peritext project,
which was the most recent one, was a collaboration with Geoffrey Litt,
Martin Kleppmann, and Peter van Hardenberg, and that one was really
exciting because we were trying to build a CRDT that could handle rich
text formatting and traditionally, you have all of these CRDTs that are
designed for fairly bespoke applications. They’re things like a counter
data type or a set data type that has certain behavior when you combine
two sets, and we’re still at the stage of CRDT development where aside
from things like JSON CRDTs like Automerge, we don’t really have a
one-size-fits-all CRDT framework or solution. You still mostly have to
hand-design and implement the CRDT for a given application.
And it turns out that in the case of something like rich text, it’s a
lot harder than just saying, oh, you know, we’ll store annotations in
an array and call it a day, because the semantics for how you want
different types of formatting to combine when people split and rejoin
sessions and things like that are all very complex and it turns out that
we have a lot of learned behaviors that arise, even from just like,
design decisions in Microsoft Word, where you expect certain annotations
to be able to extend, certain annotations to not extend, things like
that. Capturing all of the nuance in that behavior turns out to be
really difficult and requires a lot of domain specific thinking.
But we think we have an approach that works and I would really encourage
everyone to read the essay that we published and try to poke holes in it
too. This was like the 5th version of the algorithm, right? So like
months ago, we were like, all right, let’s start writing, and then
Martin, who has just an incredible talent for these things, is like, hey,
everyone, you know, I found some issues with the approach, and, you know,
oh no, oh no, and we’d sort of fix those, we’re like, all right, you know,
this one’s good, and just repeat this like week after week. So I really
have to give him a ton of credit for both coming up with a lot of these
problems and also figuring out ways to work around them.
00:08:33 - Speaker 2: We talked with Peter a little bit recently, Peter
van Hardenberg, about the pencils down element of the lab, but also just
research generally, which is there’s always more to solve, you know,
it’s the classic XKCD, more research needed is always the end of every
paper ever written, which is indeed the pursuit of the unknown. That’s
part of what makes science and seeking new knowledge exciting and
interesting, but at some point you do have to say we have a new quantum
of knowledge and it’s worth publishing that. But then I think if it’s
just straight up wrong or you see major problems that you feel
embarrassed by, then you want to invest more.
00:09:09 - Speaker 1: Right, exactly. I think in this case there was a
distinction between, there’s always more we can tack on versus we
wanted to get it right, you know, and in particular, the history of both
operational transforms or OT and CRDT for rich text, just text in
general is such that it’s this minefield of I guess to use kind of a
gruesome visual metaphor, just dead bodies everywhere.
You’re like, oh, you know, such and such algorithm was published at
such and such time and it was the new hotness for a while and then we
realized, oh, it was actually wrong and this new paper came out which
proved like 4 of the algorithms were wrong and so on.
And so with correctness being such an important part of any algorithm,
of course, but also kind of this white whale in the rich text field, we
thought it was important to at least make a credible effort to having a
00:09:57 - Speaker 2: Yeah, makes sense. Yeah, I can highly recommend
One of the things I found interesting about it, maybe just for anyone
who’s listening, whose head is spinning from all the specialized jargon
here, CRDTs are a data structure for doing collaborative software,
collaborative documents, and then, yeah, rich text, Microsoft Word
is the canonical example there.
You can bold things, you can italic things, you can make things bigger
Well, part of what I enjoyed about this paper was actually that I felt,
even if you have no interest in CRDTs, it has these lovely
visualizations that show kind of the data representation of a sentence
like the quick brown fox, and then if you bold quick, and then later
someone else bolds fox, you know, how do those things merge together.
But even aside from the merging and the collaborative aspect, which
obviously is the research, the novel research here. I felt it gave me a
greater understanding of just how rich text editing works under the
hood, which I guess I had a vague idea of, but hadn’t thought about it
so deeply. So, highly recommend that paper. Just skim the figures,
even if you don’t want to read the thousands of words.
00:11:05 - Speaker 1: I’m glad you like the figures. They were a real
00:11:08 - Speaker 2: Perfect, yeah, so.
00:11:10 - Speaker 1: The one thing I would add is that CRDTs are a
technology for collaboration, but the way they differ from operational
transforms or OTs is that a CRDT is basically designed to operate in a
decentralized setting, so you don’t need a persistent network
connection to all the parts. You don’t need a centralized server. The
idea is you can fluidly recover from network partitions by merging all
of the data and operations that happened while you were offline, and
this turns out to be really important to our vision of how collaborative
editing should work because we think it’s really important for people
to be able to do things like not always be editing in the same document
at the same time as everyone. Maybe I want to take some space for myself
to write in private and then have my changes sync up with everyone else
thereafter. Maybe I’m, you know, self-conscious about other people
editing or seeing my work in progress, but I think that it would be
interesting and helpful to look at what the main document looks like and
how that’s evolving while I’m working in private, and you can have
that kind of one way visibility with something like a CRDT versus with
something like Google Docs, where you’re either always online or
off editing in your own personal editor. Conversely, maybe I’m
OK with everyone else seeing the work that I’m doing in progress, but I
just find it really visually jarring to have all these cursors and
different colors jumping around and people inserting text, bumping my
paragraphs down the page. I’ve definitely been there. I’m not
particularly precious about people seeing my work in progress, but I
just cannot focus on writing when the page is just changing all around
me. So in that situation, maybe I would want to allow other people to
see my work in progress, so that we don’t duplicate effort or something
like that, but I just have like a focus mode where incoming changes
don’t disrupt my writing environment. And these kinds of fork-join,
one-way window, micro-Git-style branching paradigms are really only enabled
by a technology like CRDTs where you have the flexibility to separate
and then come back together.
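
To make the "separate and then come back together" idea concrete, here is a minimal sketch of the counter data type mentioned earlier, written as a textbook grow-only counter. It is not Peritext and is far simpler than a rich text CRDT; the class name `GCounter` and the replica names are hypothetical. Each replica only ever raises its own slot, and merging takes the per-replica maximum, so merges can happen in any order, any number of times, and still converge.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> number of increments seen from it

    def increment(self, amount=1):
        # A replica only ever changes its own slot.
        mine = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = mine + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Taking the max per slot is commutative, associative, and idempotent.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas work offline (a network partition), then sync in both directions.
alice, bob = GCounter("alice"), GCounter("bob")
alice.increment(2)
bob.increment(3)
alice.merge(bob)
bob.merge(alice)
bob.merge(alice)  # merging again changes nothing
print(alice.value(), bob.value())
```

No central server decides the result; each side only needs the other's state at some point, which is what makes the fork and join workflows described above possible.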
00:13:12 - Speaker 2: And I’m incredibly excited by the design research
that needs to go into that.
Now at this point, I think we’re still on the technology level, you
know, one way to think of it is Google Docs came along, I don’t know,
15, it’s almost 20 years ago now, I can’t even remember, let’s say 15
years ago, and this novel idea that we could both have a shared document
or several people could have a shared document, all see the up-to-date
version and type into it and get, you know, a reasonable response or
have that be coherent was an amazing breakthrough at the time and has
since been kind of widely copied: Notion, Figma, many others.
But now maybe we can go beyond that, much more granularity, like you
said, maybe borrowing from the developer version control workflows a
little bit in a lightweight way, giving a lot more control and
flexibility, and giving us a lot more choices about how we want to work
But before we can even get onto those design decisions and how do we
present all these different things to the user, what are the different
options? We need this like fundamental underlying merge technology,
hence the endless fascination that we at the lab, and increasingly the
technology industry generally has with CRDTs because it has the
potential to enable all that.
00:14:23 - Speaker 1: Yeah, when we were working on the Peritext project,
Peter was pushing really hard for, don’t make this just a technology
It’s a socio-technical endeavor and we need to invest a lot of time in
the design component, also just doing user interviews, identifying how
people interact with and how people collaborate on text in the status
quo, and Geoffrey and I
actually did do a bunch of user interviews with people from all kinds of
backgrounds. We’ve talked to people who write plays, people who produce
a dramatic podcast kind of in the style of Night Vale.
I love Night Vale. Yeah, people who are in the writer’s room kind of
working together with their collaborators on that, people who write
lessons, video lessons for educational platforms. And there was a ton of
really interesting insights into user behavior around collaborative
We ended up just torn because we had this 12 week project and we were
like, how should we best spend our time? Clearly, this is not just a
technical area and we need to invest a lot in getting the design right,
understanding what the design space even looks like since it hasn’t
I really want to avoid, and this is a recurring theme in my work, I
really want to avoid publishing or shipping something and having it be
this like, very broad, very shallow exploration into all the things that
are possible. I think that this kind of work plays an important role,
and there are a lot of people who do this well, just fermenting the
space of possibilities and getting these ideas in a lot of people’s
heads, who can then go on and do really cool things with them.
My personal style, I never want to feel like something is half baked, I
guess, I would much rather ship this cohesive contribution like, here is
an algorithm for building rich text. We think that this is a technical
prerequisite to all of these interesting design choices, but the
alternative with a 12 week period, and in fact, you know, this, the
correctness and revision phase extended way over that. So thanks a lot
to Martin and Geoffrey for leading during that part.
But it’s just already so hard to get it correct that trying to tack on
a really substantive design exploration that does the area justice on
top of that, I was just really worried it would be stretched too thin.
So absolutely lots of room for future work in this particular project.
It’s very much a challenge in any area where you have simultaneously
this rich design space that’s just asking to be explored with tons of
prototypes and things like that, and then also to even realize the most
simple of those prototypes, you require fundamentally new technology.
00:16:53 - Speaker 2: Yeah, I’ve been down that same path on many
research projects as well, and often it’s that I’m excited for what
the technology will enable, but also that in many cases it’s a
combination, you know, some kind of peer to peer networking thing, but
with that will enable us to provide a certain benefit to the user and I
want to explore both of those things, but then that’s too much and then
the whole thing is half baked exactly as you said. I’ve never found a
perfect or even a good way to really manage that tradeoff. You just
kind of pick your battles and hope for the best. Yeah, definitely. Well,
I do want to hear about the equation editor project, but first I feel I
should introduce our topic here, which I think folks could probably have
gleaned is going to be rich text and rich text editing, and maybe we
could just step back a moment and define that a little bit.
I think we know that texts, you know, symbolic representation of
language is a pretty key thing, writing and the printing press and all
that sort of thing. We wrote about that a little bit in our text blocks
memo, which I’ll link in the show notes. But typically, I think
text on computers, for a lot of their early time and even now with
something like computer code, is typically plain text; the .txt file is kind
of almost the native style of text that you have and then rich text
typically layers something on top of that. I don’t know, so maybe you
could better define rich text for us to have a more concrete discussion
00:18:21 - Speaker 1: Yeah, I think rich text for most people basically
evokes things like bold, italic, underline, the ability to augment plain
text with annotations that are useful in formatting. Actually, I think
Notepad to WordPad is the archetypal jump in software, if you’re
thinking about it from the old Windows perspective.
In the past few years, I think we’ve started to see a real expansion of
what rich text can look like. So, of course, we started out with
something like Markdown, which is, of course, a plain text
representation. But it’s designed to be able to capture more nuance in
plain text and be rendered to something like HTML which very much
So in Markdown, you have not only these kinds of inline formatting
elements like bold and italic and hyperlinks as well.
You also have support for images, which you could think of as more block
level rich text elements, I guess, and I don’t think there’s a real
clear consensus across editors on how block level rich text elements
Of course, in between you have things like bulleted lists and those tend
to be handled in a fairly standard manner with nested lists and so on,
but it quickly becomes like a question of taste. Which kinds of
So in editors like Coda or Notion, you have all these different block
types where the block is really the atom of collaboration and editing,
and then you can have things like, you know, file embeds or even
database views, things like that.
So I think we’re at a point now where block-based editors, and I’m
using block-based editors in the text or writing sense, not the
structured editors for programming sense, although I have other things
to say about that, are starting to appear, and I think that there are a
lot of really interesting patterns that this permits that the paragraph
as a linear sequence of characters, including newlines and whitespace,
does
not permit, or at least doesn’t allow you to build as structured
00:20:30 - Speaker 2: I’m trying to think what is actually the core of
the difference between a block-based editor, that’s Notion, Roam, and Muse
is working on its own block text implementation, and a flow of characters,
so that’s Microsoft Word, Google Docs, maybe even text editors. I guess
it’s sort of like paragraphs are separated by these sort of
nested elements or have a parent to the document versus like two new
lines embedded in the stream of characters, but I don’t know, that
seems too unsophisticated, maybe have a better definition for us.
00:21:03 - Speaker 1: So, I actually think about this very similarly to
how, in the programming languages and editor tools space,
there is a distinction between structured editors and regular plain text
editors for programs. The idea is that you might have a text-based
programming language and you can write that perfectly fine in any buffer
that allows you to put sequential characters, often ASCII is sufficient for
some languages, and then on the other hand, these programs might have a
lot of inherent structure. A simple example is with Lisps, which are
built out of these parenthesized S-expressions, everything is, you know,
an S-expression. You can think about the structure of the tree
formed by, I guess a forest, formed by having these S-expressions
with subelements and stuff like that, and then you can do manipulations
directly on the structure in a way that allows you to always have a
syntactically correct program or at least a partial syntactically
correct program by doing things like I’m just going to take this
subtree, which is a sub-expression and move it somewhere else where
there’s room for another subexpression. So, I think of block-based
editors as capturing a very similar zeitgeist to structured editors for
code, because instead of just having this linear buffer of characters
that can have, you know, formatting or things like that, you can have
new lines, you actually have more of a forest structure where you have
lots of like individual blocks, and then you can have blocks that are
children of other blocks and so on, and that allows you to do things
like move an entire subtree representing an outline to another position
in the document without selecting all of the characters, you know, cutting
them and then pasting them somewhere else. So things like reparenting
become a lot easier, things like setting the background of an entire
subtree becomes a lot easier. Just in general, you have more structure
and there’s more things you can do with that structure, I guess is how
I would phrase it. One of my favorite things that you can do with this
model in Notion is you can change the type of a block very easily. So
let’s say I have a bullet list item, and then I hit enter and enter
these sub-notes or something like that as children of the initial
bullet list item. I can turn the bullet list item into a page, and then
all of a sudden it’s just a subpage in the document, and the sub
bullets that were there before are just like top level bullets in that
page. And this is particularly important for my workflow because I care
a lot about starting out with something like really rough and sketchy
and then progressively improving it or moving up and down the ladder of
like fidelity into something more polished. So you might, for instance,
start off with just an outline list or even a one dimensional list of to
do blocks when you’re trying to do project planning or something. And
then later on, let’s say I want to put these into like a tasks database
with support for like a kanban view or something like that. I don’t
actually want to sit there and like recreate all of these tasks in Jira.
I’ve been there, you know, I’ve been the person making all the tasks
in Jira after the meeting and then assigning them to people. The
workflow that I think Notion is poised to enable, and can certainly do a
better job on, but already offers some benefits on, is like,
can I just highlight all of these blocks because everything is a block,
move them into some existing database and have them match the schema.
That kind of like allowing people to do fast and loose prototyping with
very unstructured primitives and then promote them into something more
structured like in a relational database setting or similar, I think is
the sweet spot, structured editing provides the sweet spot between like
just completely unstructured text and these very high fidelity, high
effort interfaces that allows you to kind of move between them.
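
The block model described in this answer can be sketched as a small tree. This is a toy illustration under assumed names (`Block`, `reparent`, `outline`), not Notion's actual data model: a document is a forest of typed blocks, so changing a block's type or moving a whole subtree is one operation on a node rather than a cut and paste of characters.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare blocks by identity, not by content
class Block:
    type: str  # e.g. "page", "bullet", "todo"
    text: str
    children: list = field(default_factory=list)


def reparent(block, old_parent, new_parent):
    """Move a block, with its entire subtree, under a different parent."""
    old_parent.children.remove(block)
    new_parent.children.append(block)


def outline(block, depth=0):
    """Render the tree as indented lines, for inspection."""
    lines = ["  " * depth + f"[{block.type}] {block.text}"]
    for child in block.children:
        lines.extend(outline(child, depth + 1))
    return lines


doc = Block("page", "Planning doc")
plan = Block("bullet", "Launch plan",
             [Block("bullet", "Write spec"), Block("bullet", "Ship it")])
archive = Block("bullet", "Archive")
doc.children.extend([plan, archive])

# Turn the bullet into a page: its sub-bullets simply come along as children.
plan.type = "page"

# Move the whole subtree elsewhere without touching any characters.
reparent(plan, doc, archive)
print("\n".join(outline(doc)))
```

Promoting blocks into a database row, as described above, would be another operation of the same kind: the block keeps its identity and children while its type and parent change.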
00:24:47 - Speaker 3: Yeah, I really like that direction and framing,
and if I can extend it a little bit, I think we can also look at a
continuum of richness in terms of the content itself.
So you have plain text, what you might classically call rich text with
links and bold and underline. And then you maybe start to throw a few
images in, and then what if you can put in videos and what if you
have a whole table, and that table is actually a database query, and you
can nest a Figma document, and this way you can see that there’s
sort of a continuum on the richness of the document. One reason I think
Notion has been so successful, they’ve been pushing along that
continuum while maintaining a sort of foundation of rich textness, which
is very familiar and the important basic use case for a lot of people.
A related idea is that I think we’re seeing a lot of the classic
document types converge. So if you look at a rich text like a Microsoft
Word and a PowerPoint and increasingly spreadsheets, those all used to
be 3 distinct Microsoft Office applications, and we’re seeing the value
of them being in or being the same document.
This is actually one of the motivating ideas behind Muse and a lot of the
research we’ve done in the lab, and it’s kind of something Slim was
saying, you want to take your idea continuously through different media
and different modalities and different degrees of fidelity, and you
don’t want to jump between different applications to do that. You want to
be able to do it on the same canvas. That’s by the way, one of the
reasons I like Canvas. It’s not only because it’s a free multimedia
surface, but also it evokes this idea of like flexibility and
potentiality, and I think that’s one of the things that’s really
exciting about these mixed media documents.
00:26:16 - Speaker 2: And I know if Geoffrey were here, he might jump in
and say that one downside to our current application silo world is that
the only way to have this deeply rich text where it’s images, video, a
table, a database query, something like that, is to have the Uber
application, to have the everything app, and certainly Notion has
probably gotten pretty far on that, but others in some ways
are forced to do that, like we have to do some of that in Muse as well.
People come in and ask for all these different types here as well, and
there’s more of like an OpenDoc-inspired or Unix-inspired future that
maybe Geoffrey and others, including me, would hope for, which would be
more that applications could be these individual data types and you
could put them all together through some kind of more operating system
But that is so completely reversed from kind of how all our computing
devices work today. It’s hard to see how we might get to that.
00:27:14 - Speaker 3: Yeah, I’m certainly sympathetic to that concern,
although I suspect the way out is through, and you get platforms from
And so the way we got the whole Unix ecosystem was they wanted to build
a computer for, you know, writing and running programs and then
eventually got all this generalized text processing stuff, but it’s not
like they started in like, oh, I’m gonna make a generalized text
I don’t think that was really the way that they approached it and
developed a success. So, I’m still hopeful we could do this, but I
think you got to extract it from something that’s already working as an
app, but it always helps to have an eye towards that, and I think we’ve
done some of that with Muse.
00:27:46 - Speaker 1: I was just going to say that it’s not me talking
about texts, unless I bring up my favorite piece of software of all
And I think that Pandoc actually is very relevant to this discussion. So
for those who aren’t as familiar with it, Pandoc brands itself as this
Swiss Army knife for document formats, and its sort of headline
contribution is that it allows you to convert between all kinds of
For instance, I can take a Word document and convert it to a PDF, Word
documents to something like, I don’t know, IPython notebook, Jupyter
notebook, back and forth across this incredible bipartite graph of
formats, but I think that the subtler contribution that Pandoc makes,
which is extremely significant, is that Pandoc has this form of Markdown
called Pandoc Markdown that essentially aligns and supersedes all of the
different fragments of markdown that we’ve seen before.
So the problem with markdown basically is that the original
specification is sort of ill-defined. There are several cases in which
the behavior is not super clear and then on top of that, it’s not very
There aren’t very many constructs. So things like fenced code blocks,
which many people associate very closely with Markdown today, that was
only added by GitHub Flavored Markdown, which is certainly widely used
among the programming community, but not everyone is on GitHub, of
course. And then you have things like table formatting or even like
strikethrough, really, strikethrough wasn’t defined in the original
Markdown specification either. And so you have Markdown and then you
have like GitHub Flavored Markdown, CommonMark is sort of this unifying
effort, R Markdown, all these different, it’s the Markdown cinematic
universe. I tried to make a joke about this. I had this joke ready for
the Markdown cinematic universe when the last Marvel movie came out.
But then like, it didn’t get nearly the traction in my timeline as
Dune did, perhaps understandably. So really, I’m just going to have to
wait till the next movie comes out. It’s a real, real tragedy. No, but
like, I guess you have this real pluralism of forms and it becomes very
difficult to use markdown truly as a portable format because the way it
renders in one editor or even parses can very much differ from editor to
editor. So, Pandoc provides this format that essentially serves as an IR
or intermediate representation between all these kinds of documents
using a markdown superset that somehow magically encapsulates
00:30:18 - Speaker 2: And that includes not just markdown, but also like
PDFs or Microsoft Word, that seems.
00:30:24 - Speaker 1: Well, so the way it works is it’s this
compilation pipeline, I guess, that allows you to go from a markdown
It compiles it to PDF using pdfLaTeX or something. It outputs
LaTeX, it outputs HTML, various things, and you can think of it as
being this intermediate representation because you start with this like
Word document, you can turn that into markdown and you can go from that
markdown format into any of these output formats, which turns out to be
like really powerful because the main issue with these kinds of
conversions is that it’s often lossy, there are features that are
supported by LaTeX, for instance, that aren’t supported by the web
natively, there are features that are part of like Word documents that
aren’t necessarily supported by HTML and so on and so forth.
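
A minimal Python sketch of the intermediate-representation idea described above. This is a toy model, not Pandoc's actual data model or API: the `Block` type and the `read_markdown`, `write_html`, and `write_latex` names are hypothetical, and the writers skip escaping for brevity. It only illustrates that each format needs one reader into the shared representation and one writer out of it, rather than a converter for every pair of formats.

```python
from dataclasses import dataclass


# Toy intermediate representation: a document is a list of blocks.
# (Hypothetical types for illustration; a real converter's IR is far richer.)
@dataclass
class Block:
    kind: str  # "heading" or "para"
    text: str


def read_markdown(source: str) -> list[Block]:
    """Parse a tiny markdown subset: '# ' headings and plain paragraphs."""
    blocks = []
    for chunk in source.strip().split("\n\n"):
        chunk = " ".join(chunk.split())  # collapse internal line breaks
        if chunk.startswith("# "):
            blocks.append(Block("heading", chunk[2:]))
        elif chunk:
            blocks.append(Block("para", chunk))
    return blocks


def write_html(blocks: list[Block]) -> str:
    """Emit HTML from the shared representation (no escaping in this sketch)."""
    tags = {"heading": "h1", "para": "p"}
    return "\n".join(f"<{tags[b.kind]}>{b.text}</{tags[b.kind]}>" for b in blocks)


def write_latex(blocks: list[Block]) -> str:
    """Emit LaTeX from the same representation."""
    out = []
    for b in blocks:
        out.append(f"\\section{{{b.text}}}" if b.kind == "heading" else b.text)
    return "\n\n".join(out)


doc = read_markdown("# Title\n\nHello\nworld.")
html = write_html(doc)
latex = write_latex(doc)
```

Adding a new output format in this model means writing one more `write_*` function; none of the readers change.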
So Pandoc serves this role of like basically saying, OK, what is an
intermediate language that can encapsulate all the different
implementations of the same concept across different input and output
And what I think is so remarkable about it is that oftentimes when you
are using a piece of software and you’re like, oh darn, you know, now I
need to support this other thing too. You quickly end up in a situation
where you have the snowball and things start to feel tacked on.
So you’re like, Oh man, it’s very clear that they just glommed on this
additional syntax for this feature. And with Pandoc, everything feels
like very principled in its inclusion. And at the same time, whenever
I’m using Pandoc and I’m like, darn, I really wish there was a
construct that I could use to express this particular thing, I look it up
in the documentation and it’s always supported. So, as one of my
favorite examples, one of the output formats that Pandoc supports is
various slideshow frameworks. So Beamer for people who use LaTeX and
reveal.js for people who use HTML and CSS, and these slideshow frameworks
basically allow you to replace something like PowerPoint, Keynote,
Google Slides with essentially like a text-based format. I really like
doing slideshows in Pandoc Markdown. There are a few reasons for that.
The first reason is that it’s really useful to be able to reuse some of
the same content from like my blog post or essay even in the slideshow.
There are some really minor and almost petty, but really significant
reasons. Like, I like to have equations or code blocks with syntax
highlighting in my slideshows, and there’s not really a good solution
to putting like a syntax highlighted code block in Keynote right now.
00:32:39 - Speaker 2: Last I remembered, the gold standard at the Ruby
conferences I used to frequent was to take a screenshot of TextMate and
00:32:47 - Speaker 1: Yeah, it’s awful. I don’t want to see your like
Monokai editor with like the weird background that contrasts weirdly
with the slide background. I just, ah, and it doesn’t scale on a huge
conference display. Anyway, I digress, but the other reason why I really
like doing my slideshows in text is actually that there is often a
hierarchical structure to my presentations, right? I’ll have like these
main top level sections and then I’ll have subsections, and then I’ll
have like sub-subsections and all of these manifest in slides. But in
the GUI thumbnail view of most of these existing slideshow editors
like PowerPoint or Google Slides, it reduces it all to like this linear
list. It’s like, here are all of your thumbnails in order. And it makes
it very hard, as soon as I have like an hour-long conference talk, how
do I like jump to this subsection that I know exists, aside from like
scrolling past like 117 thumbnails and trying to find the right one,
right? And moreover, let’s say I want to reorder a certain part of the
talk because I think it better fits the narrative structure. Now I have
to like figure out which thumbnails I need to drag to which other place
or worse, go into the individual slide, select the text from that, move
that somewhere else, and it’s just way, way clunkier actually than
reordering some text in like a bullet list outline in my editor.
And then the other part is that I was talking about how Pandoc has
really great support, expressive support for idioms of different
formats, and one thing you often have in slideshows is that I have some
element on the screen and then I press, you know, the next button again
and then another element will appear.
So in Pandoc you can denote this with just like an ellipsis basically so
like dot dot dot and then if I have a slide where I have a paragraph and
then the dot dot dot and then another paragraph, it will render with
just the first paragraph visible and then I press next and then like the
subsequent paragraph comes in.
And that’s like just a very lightweight way to handle these stepped
animations compared to going to the animation pane and then clicking the
element that I want to animate in and so on and so forth.
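
In Pandoc's slide-show syntax, the pause marker is a paragraph containing three dots separated by spaces (`. . .`). The Python sketch below shows one way such a marker could be turned into incremental steps. The `reveal_steps` function is hypothetical and is not how Pandoc itself implements pauses; it only models the behaviour described above.

```python
def reveal_steps(slide_source: str) -> list[str]:
    """Return what is visible after each press of the next button.

    A paragraph consisting only of ". . ." is treated as a pause marker,
    following the convention in Pandoc's slide-show syntax.
    """
    paragraphs = [p.strip() for p in slide_source.strip().split("\n\n")]
    steps, visible = [], []
    for p in paragraphs:
        if p == ". . .":
            steps.append("\n\n".join(visible))  # snapshot before the pause
        elif p:
            visible.append(p)
    steps.append("\n\n".join(visible))  # final state with everything shown
    return steps


slide = """First paragraph.

. . .

Second paragraph."""

steps = reveal_steps(slide)
```

Here `steps` has two entries: the slide with only the first paragraph, then the slide with both.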
So it started off with me being like, I’ll just prototype in this
format, but then it ended up supporting columns, it supports all these
things that you actually want. And I was like, this is in many ways a
more ergonomic way to handle long technical slideshows. Anyway, I have
to shill for Pandoc anytime I talk about rich text, I’m contractually
00:35:08 - Speaker 2: Yeah, it’s a great piece of software, use it here
and there. I think I was doing some AsciiDoc kind of manuals many years
ago and yeah, just in general, it’s also worth looking at the homepage
that you mentioned the plot they have where it shows all the different
formats that can convert between is quite fun. You click on that, you
00:35:26 - Speaker 1: Yeah, I had this really elaborate plan when I
decided to go to Berkeley, that I was going to print out a door-sized
poster of like that graph that shows all the formats they convert
between and then show up at John MacFarlane’s door and ask him to sign
it. But then the pandemic interfered with some of those plans.
Nonetheless, it remains on my list.
00:35:48 - Speaker 2: Good bucket list item, pretty unique one at that.
00:35:51 - Speaker 1: Also, I found my tweet, or I found the draft of my
tweet, which is about Eternals, and I said, directed by Chloé Zhao, the
latest entry in the Markdown Cinematic Universe features an ensemble
cast of MultiMarkdown, GitHub Flavored Markdown, PHP Markdown Extra, R
Markdown, and CommonMark as they join forces in battle against
mankind’s ancient enemy, DOCX. Nice.
00:36:12 - Speaker 2: Wow. You would have gotten the like from me.
00:36:16 - Speaker 1: Yeah, we’ll see if it ever sees the light of
00:36:20 - Speaker 2: You briefly mentioned there equations and LaTeX,
and maybe that’s a good chance to talk about the equation project you
did for Notion. And part of what I thought was so interesting or what I
think in general is interesting about equations is that they are
obviously an extremely important symbolic format, but in many ways
extremely different from the prose we’ve been talking about.
So English or other languages, even languages that are right to left or
something like that, they all have the same kind of basic flow and the
way that we represent sound. So with these little squiggly symbols, even
though the symbols themselves and sounds vary and how we put them
together into words across languages, that’s a common thing. If you go
to the mathematical realm, you have symbolic representation, but
equations are their own whole beast, and I think one that has gotten a lot
less attention from kind of the software and editing world. So tell us
00:37:16 - Speaker 1: Yeah, so just as context for people, Notion and
many other applications actually have long supported block equations, an
equation that basically takes up, you know, most of the page
What is much more uncommon in editors is support for inline equations
and so this can be something as simple as saying you want to type “let x
be a variable,” and x should be formatted or stylized mathematically.
Being able to refer to elements of a block level equation in inline text
is a prerequisite for being able to do any kind of serious mathematical
writing, yet because this is kind of this niche area that has
historically been the purview of Overleaf and other LaTeX editors,
it’s really not implemented.
So I pushed really hard to add inline equations and inline math to
Notion, because I was like, there’s a huge opportunity for people to
write scientific or mathematical documents that take advantage of all of
Notion’s other features like being able to embed Figma or embed
illustrations and things like that, right? So, it turns out that it’s
kind of difficult, exactly as you’re describing to do this equation
There’s been very little innovation and research more generally into
what is like a good interface for inputting equations. So I think most
people are probably familiar with Microsoft Word or Excel, which have these
equation editors, or even like operating system level sometimes where
you basically like open this palette, and there is a preview and there
is a button for every possible mathematical symbol or operator you can
imagine. And then for composite symbols like the fraction bar or
integral or something like that, you find the button for that, you click
it, and then you click into like the little subboxes and then you find
whatever symbol you want and you put those there too. So it’s kind of a
structured editor, but like in an unimaginably cumbersome interface.
This is what I used to do my lab reports in high school, for example.
And then at the other end of the spectrum, you have things like
LaTeX. LaTeX is basically how everyone, at least in computer science
and mathematics, chooses to typeset their work, to typeset complex
mathematics. One of the real selling points of LaTeX, I think, is that
it turns out that operator spacing is really important, and there’s a
big difference between, say, a dash that’s used like a hyphen or a dash
character that’s used in text, and a hyphen or a dash character that’s
used as a minus sign in an equation, the spacing is subtly different.
And one of the big things that LaTeX does is it basically allows you
to declare certain operations in certain contexts as like a math
operator versus just a symbol versus just like a tagged group of
characters, and it correctly handles the spacing depending on what kinds
of characters are around the operator in question. And so LaTeX
basically produces really nice looking mathematics at the cost of this
markup which looks like I kind of smashed my keyboard that only had
like 3 characters. It’s the exact opposite of the equation editors:
instead of having a button for every imaginable character, you only have
3 buttons. The buttons are backslash, open curly brace, and closed curly
brace, and somehow like permuting those characters is supposed to get
you like any possible mathematical output. There’s just two ends of the
00:40:41 - Speaker 3: Yeah, I used to do my analysis homework in college
in LaTeX, and I remember when I first looked up how you would input
these formulas in LaTeX, I was like, that can’t be right. This is not the
best way in the world to do this. In fact, that’s it, that’s the one
00:40:53 - Speaker 1: It really is, it’s terrifying. It’s the one and
only way and the wild part is there are people who are like super, super
good at LaTeX. They can like live-TeX their lecture notes. I was
never nearly like that fast, but some people can do it, usually with
extensive use of macros, and macros are another selling point of
LaTeX, as you can define these kind of custom shorthands for operators you
use a lot. But anyway, yeah, so you have LaTeX sort of at the
other end of the spectrum, like really quite unreadable, oftentimes,
like, it’s like a write-only format, many times.
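
A small LaTeX fragment illustrating the two selling points mentioned above: spacing that depends on a symbol's role, and macros as custom shorthand. It assumes the `amsmath` and `amssymb` packages; the macro names `\R` and `\argmax` are examples chosen here, not anything from the conversation.

```latex
\documentclass{article}
\usepackage{amsmath, amssymb}

% Custom shorthand for symbols used often (names chosen for illustration).
\newcommand{\R}{\mathbb{R}}
\DeclareMathOperator*{\argmax}{arg\,max}

\begin{document}

% In text, "-" is a hyphen; in math mode the same key is a binary minus
% and gets operator spacing on both sides.
A well-known result: $a - b$ versus a well-known hyphen.

% \mathbin marks a symbol as a binary operator, \mathrel as a relation;
% TeX picks the surrounding space from that class.
$x \mathbin{\star} y \quad x \mathrel{\star} y$

\[
  \hat{x} = \argmax_{x \in \R^n} f(x)
\]

\end{document}
```

The markup is dense, but the author declares what each symbol is and leaves the spacing to the typesetter.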
00:41:23 - Speaker 2: Kind of, regular expressions come to mind on that
00:41:26 - Speaker 1: It’s exactly the same zeitgeist, I think. It
turns out that figuring out how to have like a combination GUI and plain
text interface that allows you to be like in a rich text editor like
Notion, then go into an inline equation field to have like an inline
symbol and then go back into the GUI editor was like just very
And it kind of makes sense that lots of people don’t prioritize this
because many people at Notion rightfully had the question like, oh, is
this something we should be working on? But first of all, it turned out
that if you actually tallied up like our user requests, inline math was
Of editor feature based requests. And then more generally, it turns out
that because this is like a prerequisite for many researchers and for
students, you can get a lot of people on your platform who rely on it,
you know, as a student to take notes and something like that, because
there’s literally no alternative. And then they are able to stick
around and use the platform for all kinds of other things.
So this is just kind of a plug that more editors should implement this.
But yeah, I thought that this project was really interesting because in
the interaction paradigm, you want to capture a lot of the things that
are very fluid about editing regular text. So for instance, we knew it
was important that you should be able to use the arrow keys to move left
and right, kind of straight through a token without editing it if you
wanted, or if you wanted to be able to go into a token and edit it
using the arrow keys, you shouldn’t have to like use the mouse to
click, although, of course, you should also be able to use the mouse to
click. And when you have this formatted equation, we made the decision
that the rendered equation would be represented as this atomic token. So
if you were highlighting text to copy and paste and move around, it
would be like highlighting a single character that would just be like
the whole equation. But of course, you could go in and edit the
equation any way you wanted in kind of this pop-up text editing
I think another thing that’s the subtle interface challenge here is
that, like Mark was saying, there is often a disproportionately large
number of characters used to represent the equivalent of like one
character with a formatted output. And so that’s something you don’t
really take into account. The output is like x with a hat in a sans-serif
font, and then there’s like 25 characters of markup that goes into
that, and you just need to like scale the interface appropriately to
But I think that it’s really interesting because it shows the power of
combining different input and output formats in like the same atom,
right? So you have like a single line of text, and you want to have rich
text that’s formatted and stylized and so on, hyperlinks, and then also
equations or whatever inline rendered output of another input format
that you have. I think that that’s really where GUI editors and
WYSIWYG editors can shine, is being able to combine these like input formats
and output formats like in the same line, in situ. Yeah, I guess you
can’t really do that at all with the terminal or something like that,
and I say this as someone who uses like CLI Vim for everything.
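
A hedged Python sketch of the "equation as an atomic token" idea described in this answer. It is not Notion's implementation; the `Text` and `Equation` types and the `cursor_length` and `select` functions are hypothetical. The model is simply that a rendered equation occupies one cursor position in the line, however long its markup is.

```python
from dataclasses import dataclass


@dataclass
class Text:
    content: str


@dataclass
class Equation:
    source: str  # LaTeX markup, edited separately from the surrounding text


def cursor_length(token) -> int:
    """How many arrow-key presses it takes to move across a token."""
    # An equation counts as one unit no matter how long its markup is.
    return 1 if isinstance(token, Equation) else len(token.content)


def select(tokens, start: int, end: int):
    """Return the tokens covered by the cursor range [start, end)."""
    selected, pos = [], 0
    for tok in tokens:
        size = cursor_length(tok)
        lo, hi = max(start, pos), min(end, pos + size)
        if lo < hi:
            if isinstance(tok, Equation):
                selected.append(tok)  # all or nothing
            else:
                selected.append(Text(tok.content[lo - pos:hi - pos]))
        pos += size
    return selected


line = [Text("Let "), Equation(r"\hat{\mathsf{x}}"), Text(" be a variable")]
total = sum(cursor_length(t) for t in line)
picked = select(line, 2, 7)
```

In this example the equation's markup is 16 characters long but contributes a single position to the line, which is the mismatch between markup length and rendered size that the editing field has to accommodate.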
00:44:34 - Speaker 3: This is bringing back so many memories. I wish I
had Notion with equation support back when I was a math undergrad. It’s
00:44:41 - Speaker 1: I’m like the Notion math stan, guardian, I don’t
know, something like that. And I’m always keeping track of like all the
cool things people are doing using equations in Notion.
A lot of people are doing like math blogs in Notion, which is really
awesome for me to see. Also, I just feel like, having tried lots
of other things, there just really isn’t a good alternative
short of like actually writing LaTeX for your blog, which no one
really likes. And yeah, I mean, certainly it’s the kind of thing that I
implemented originally, kind of, I was like, I’m gonna do this for
myself, and then realized that lots of people would be able to benefit
It’s been really cool to see the reception it gets, like the
inline math tweets on the Notion Twitter account overwhelmingly get
the most engagement and interaction.
And initially, like the marketing team was shocked. They thought this
would be this super niche feature, but no, it turns out that people love
math and like, they may not be the most vocal proponents or they’re
used to no one caring about math typesetting, things like that.
For a while, I think it was the case that when I did find an editor that
had support for equations of some kind, to me, it was overwhelmingly
obvious that the people who implemented it did not regularly use
equations for writing. I think you can often tell that with different
features. So I think that having that kind of representation is not
quite the right word, but being able to see a feature that was designed
by someone who really cares about using it themselves is really cool for
students, researchers, people who are interested in typesetting more
mathematical text.
00:46:11 - Speaker 3: Yeah, and I think it’s really important, like you
were saying that it’s mixed media because you’re combining the
equations, the inline equation and the block equation, by the way, in
the world-class form, which is LaTeX-based, with a world-class rich
text editor with text and images and stuff. It’s really nice. I do
think there’s still one frontier here, especially for math, which is
the fully gradual process from you’re taking handwritten notes and
you’re working out a problem and you’re drawing squiggly diagrams all
the way up through your finished homework. I remember when I was a math
undergrad, I would basically have to do the homework twice. You do it
once on paper. Nobody could read that, including myself, so then, you
know, do it in LaTeX again. And I always wish there was a way to do it
incrementally. You sort of change it equation by equation and diagram by
diagram into the final product. And I know there has been some research
on turning equations into LaTeX formulas with machine learning. I
don’t know if it can do handwriting, but perhaps someday we’ll get the
new support for equations and you can go all the way to the end.
00:47:02 - Speaker 1: Yeah, like you, I share exactly the same
frustration that you have to essentially do lots of things twice, and
the relative position of everything is ambiguous, and LaTeX is what
allows you to do things like have subscripts of subscripts, which would
be really inscrutable in most people’s handwriting, including my own,
and, you know, subscripts of subscripts along with superscripts and
things like that. There are just so many ambiguous details and it turns
out in my experience with like, anything that tries to automate the
transition is that I always end up going through and like really
rewriting all of the details to be structured in a readable way.
You have this other problem which back in the days of like WYSIWYG web
editors like Dreamweaver and Microsoft FrontPage and things like that,
you would often end up with this problem where you try to do like any
edit in the WYSIWYG side and then you look at the generated HTML and
it’s ridiculous. There’s just like 16 nested empty span tags, and no
one would ever be able to maintain that.
And my worry is basically that when you automatically create markup for
something that has a very complex graphical representation, it’s really
like one way, you know, maybe it will help you produce a compiled
output, but it doesn’t actually help you go back in and like edit and
tweak the representation later or it’s just so inscrutable if you do
that it’s kind of also a regex-type situation.
I think we really need to get to some kind of like good intermediate
representation that allows you to flexibly go both ways.
And that goes back to something that I think Adam and I were chatting
about earlier, which is that a lot of people gripe and complain that
like LaTeX is the best we have and, you know, I’m one of them, but
it really is the case that, you know, LaTeX was just this like
monumental effort by really a few people, an amount of effort that would
be like considered really impressive if I were to try to do the same
thing but better today, and not a lot of people just have like spare time
to do this all-in-one text formatting, packaging, document
representation project, even though it would have huge impact on the way
people write and publish these kinds of documents. And so in many ways
we’re sort of just bottlenecked on the fact that it’s hard to do
incremental improvements to this particular area. We really depend on
these like software monoliths to keep us afloat.
00:49:19 - Speaker 2: I’m not nearly as mathy as either of you, but I
can’t help but make the comparison on this equation editing to what
you mentioned earlier with kind of structured editors and programming,
where whether there’s lightweight help from your text editor, things
like code folding, syntax highlighting and autocomplete, or full
structured editing, some of the visual programming stuff we talked about
with Maggie Appleton, like Scratch, for example, or these flow based
systems that are fully graphical, and you sort of can’t have it in a bad
state. And I can’t help but to think there might be some direction like
that that is not necessarily the write-only inscrutable TeX, but is not
the Microsoft Word one button literally for every symbol you might ever
It does seem like there might be some other path, and yeah, I agree
it’s a monumental effort, but I mean, mathematics is so important and
foundational to so much of human endeavor that certainly seems like one
worth investing in, although perhaps hard to reap a profit from, and
that makes it harder to put concentrated capital behind it.
00:50:20 - Speaker 1: Yeah, I think that there’s definitely very clear
demand for I think something exactly like what you’re describing, which
is somewhere in between the two extremes, and it is really relevant
because ACM, which is the Association for Computing Machinery, the
academic and professional body really for computer science, they are
currently undergoing this
fiasco, maybe, I probably shouldn’t go on the record as calling it a
The ACM is currently undergoing this initiative called TAPS, which is
the ACM Publishing System, where they are attempting to revise the
template by which all computer science research is published and
disseminated, and the idea behind this is that right now, computer
science research is published to these PDFs. Initially they were all two
column PDFs, now I think there’s some one column PDFs. They want to
output HTML as the archival format for various reasons, including that
it offers much better reading experience on different screen widths, so
like phones or tablets, which are increasingly how people are reading
papers, not just printed out. And they are much more accessible than
PDFs. PDFs are just like really quite inaccessible, especially to screen
readers and other assistive technologies that are trying to parse out
all the different math or whatever arbitrary formatting you’ve decided
to use. The upshot of this, I guess, is that there are currently a group
of very smart people who are trying to figure out how in the world
we’re going to get people to start writing all of their papers and
outputting them in a different format, in a world where everyone is
already used to preparing their publications and preprints in LaTeX.
And turns out that even if you solve the problem of like what the input
syntax should be, rendering math in the browser is like an extremely
00:52:05 - Speaker 3: Yeah, isn’t the state of the art that it like
generates PNG and sticks it in the web page?
00:52:09 - Speaker 1: Not exactly, but like almost. OK. So MathML, which
is like an XML dialect or like mathematical markup language, was this
HTML XML style syntax for typesetting mathematics.
Naturally, it is only implemented in Firefox, so that’s really
unfortunate. So in terms of the state of the art, there are basically
two libraries that you can use to typeset mathematics. There’s math
MathJax supports basically all valid LaTeX, including, you know,
different environments and equations and things like that.
The problem is that MathJax is very slow. So if you ever go on
MathOverflow or another like related Stack Exchange and you see like all of
these answers with like weird gaps, and then as you watch before you,
the page starts to like load all of the rendered equations like bumping
everything down one level at a time. That’s MathJax in action.
And oftentimes it is doing what you’re describing where it is
outputting like an SVG or a PNG or something like that, and it’s just
like reflowing the page with every equation.
So then you have KaTeX, which was a library developed at Khan Academy
where they realized that MathJax’s performance was basically just
like not satisfactory for their exercises and things like that. So KaTeX
supports a much more limited subset of all of LaTeX syntax, but it
does it all using CSS basically, and it doesn’t reflow the page for
every equation. It’s basically instant to render.
So KaTeX is what we use at Notion, it’s also what’s used in like
Facebook Messenger, which supports equations if you ever tried that, and
many other websites, and basically it means that your options, if you
want to render math, are: only target Firefox, use a limited subset of
math that’s supported by KaTeX, or consign yourself to like extremely
slow, dozens of reflows, full expressive power rendering to inline
And so that’s just not like a great situation to be in, and we haven’t
even gotten to the question of like how people write math. So I would
say that people underestimate like how open this problem space is.
00:54:17 - Speaker 3: Yeah, man.
00:54:19 - Speaker 1: Just take a moment of silence to like recognize
the gravity of the situation.
00:54:23 - Speaker 3: This is an aside, I don’t know if you want to put
this in the episode, but now I’m curious. It sounds like both of those
are interpreted in the sense that the equations are rendered at load
time instead of being compiled down to some like HTML and CSS that you
can render without JavaScript. Like, basically, do you need JavaScript
00:54:39 - Speaker 1: Yeah, basically, I should say you also need
JavaScript, unless you’re doing the pre-compile to MathML and then hope
that people are using Firefox.
00:54:47 - Speaker 3: Man, I feel like there’s no way that that stuff
loads in 10 years, but we’ll see.
00:54:52 - Speaker 1: I actually had this exact argument, again, I
don’t know if you want to put this in the episode.
I had this exact argument with Jonathan Aldrich, who’s on the TAPS
committee when we were talking about this, and I think the point was not
so much that you can guarantee that the artifact loads exactly the same
way in 10 years, but that the representation is rich enough that one
could feasibly build software that renders it the same way in 10 years.
So it’s more about the fidelity of the like underlying representation
where like a team of, I guess, digital, you know, archaeologists could
recover the work that we were doing and not so much like we trust in the
vendors to like keep everything stable, which is obviously never going
to happen. You know, the only reason like PDFs are stable is because how
many trillions of dollars of IP depend on being able to load the PDF the
same way as it was written, you know, 30 years ago.
00:55:45 - Speaker 3: Yeah, interesting.
00:55:46 - Speaker 1: Nice. Going back to this idea earlier that Mark
mentioned of the spectrum of like plain text, rich text, WYSIWYG
One recurring theme for me is thinking about decoupling this spectrum
into like what is the format and then what are like the editors and
tools that we can use to interact with this format, so they structured,
I want to call out Bear, which is a native application for macOS and iOS
that does a really great job with this, which is that Bear is basically
something in between a WYSIWYG and a plain text editor in that
you’re always editing markdown documents and indeed, when you have
something that’s bold, you can see the like asterisks around it that
But all of these standard, you know, Control B, U, editor shortcuts work
And more importantly, you can see like the formatting applied in real
So that when you do star star, hello, star star, it suddenly becomes bold
And so in many ways it combines like the fluidity and the real-time
preview of a rich text editor or previewer with the flexibility of like
ultimately just writing plain text characters. And I think this is like
I don’t just mean something like open VS Code or Vim and type
characters and then see like different formatting labels attached to the
I mean like a native application that’s really designed like for end
use or end users, that doesn’t fully obscure the input syntax but does
real time rendering in place.
It’s not even like in monospace font, right? It makes it feel much more
like this is actually the output that you’re targeting. And not just
like an input step that needs to be pre-processed. I think that there is
a lot of room for applications that are kind of in between and in that
same space, where it doesn’t entirely obscure what you are writing, but
it does give you a lot of the benefits of previewing things and having
like a GUI application outside of the terminal in terms of like
capturing the richness of the possible results.
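
A minimal Python sketch of the in-place rendering approach described above: the source text keeps its asterisks, and the editor only layers a style over the matching ranges. The `bold_ranges` function and the single-pattern regex are assumptions made for illustration; this is not Bear's implementation, and a real Markdown parser handles many more cases.

```python
import re

# Non-greedy match for **bold** spans in a tiny markdown subset.
BOLD = re.compile(r"\*\*(.+?)\*\*")


def bold_ranges(source: str) -> list[tuple[int, int]]:
    """Return (start, end) character ranges that should be drawn bold.

    The source text is left untouched, so the asterisks remain visible
    and editable; only a style is layered on top.
    """
    return [m.span() for m in BOLD.finditer(source)]


text = "say **hello** to plain text"
ranges = bold_ranges(text)
styled = [text[a:b] for a, b in ranges]
```

Because the ranges index into the unmodified source, the file on disk stays plain text while the display can still show the formatting as you type.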
00:57:52 - Speaker 3: Yeah, I like the Bear approach a lot. Now, are
there particular domains or types of documents that you think would be
susceptible to this approach, or is it just for rich text specifically?
00:58:01 - Speaker 1: So I was making a list of like all of the
different traditionally graphical outputs that have corresponding plain
text representations and a lot of them I was thinking about, for
example, in engraving sheet music, right, traditionally you would use a
desktop program like Finale or Sibelius. Nowadays you have options like
MuseScore and Flat, which are more web-based editors, but you see the
staff and you click notes in the staff like corresponding to where you
want the note, and you know you use the quarter note or the eighth note
cursor to pick the duration and so on.
And then at the other end of the spectrum you have LilyPond, which is
kind of like LaTeX I guess for engraving sheet music where you type a
very like LaTeX-esque syntax and out comes, you know, beautifully
typeset sheet music. For me this is like a little bit too GNU-edgy,
just because when I think of like composing music, I’m very much
thinking about like what the staff looks like, just to be able to
visualize chords and counterpoints and things like that.
But I think the upshot is that like you could very easily have something
in between where you have like a text-based or non-binary representation
of like a piece of music or a composition, and then you can edit it
either using like the text editor or using the structured editor of an
existing WYSIWYG GUI, like composition software or notation software
rather, and edit the same representation both ways. And then likewise,
you have for diagram generation. This is an area that’s been a real
pain point for me historically because you can basically do something
like really low fidelity, like sketching on paper, but then if you
don’t want to like take a picture and upload it to whatever document,
right? All of the options are like very high fidelity, like there’s
omnigraphle and whimsical and Sigma, which is even more involved where
you get all of these nice things like lots of styles and force directed
layout and so on and so forth, but it’s like quite cumbersome to input
a diagram that you sketched in all of 30 seconds into omnigraphle in its
full glory. And then you have like on the plain text end of the
spectrum, software like Graphviz or TikZ for LaTeX. Things I really like
are Mermaid, which is a Markdown-type syntax for quickly generating
diagrams. There’s Svgbob, which is incredible. It basically lets you
turn ASCII art into formatted SVG, though as a brief aside, I don’t
actually know what problem this is solving aside from being incredibly
cool, because at least for me, and I consider myself someone who’s
fairly artistic, it takes at least as much effort to figure out how to
make a really nice ASCII art thought bubble as it does to figure out how
to actually do the SVG. I’ve always really wanted
an intermediate representation for diagrams where you can edit it on the
text end, using something like Mermaid to do really fast prototyping for
a flow chart or something like that. And then if I wanted to have more
precision and control, I could also pull it into software like
OmniGraffle or Figma and make fine-grained tweaks, get my nice
force-directed layout, or control where individual nodes are if I want
fine-grained control over positioning, things like that. I guess I think
there are lots of different areas outside of just traditional documents
that are ripe for an editor or a representation that learns some things
from the plain text approach and some things from the WYSIWYG approach.
I think that we are, yeah, we’re getting close to being able to explore
those, but I would love to see more work in this area.
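As a rough illustration of the intermediate representation idea
described above, here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the Diagram class, the parse_text,
to_text, and move_node helpers, and the "A --> B" line syntax, which
only imitates Mermaid-style arrows and is not Mermaid's actual grammar.
It shows one shared structure that a text editor and a direct
manipulation editor could both modify.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Diagram:
    # Hypothetical shared representation: edges carry the structure,
    # positions are optional layout hints set by a visual editor.
    edges: list = field(default_factory=list)
    positions: dict = field(default_factory=dict)


def parse_text(source: str, previous: Optional[Diagram] = None) -> Diagram:
    """Parse lines such as 'A --> B' (an illustrative syntax only)."""
    diagram = Diagram()
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        left, arrow, right = line.partition("-->")
        if not arrow:
            raise ValueError(f"expected 'A --> B', got: {line!r}")
        diagram.edges.append((left.strip(), right.strip()))
    if previous is not None:
        # Keep manual positions for nodes that survived the text edit.
        names = {name for edge in diagram.edges for name in edge}
        diagram.positions = {
            name: pos for name, pos in previous.positions.items() if name in names
        }
    return diagram


def to_text(diagram: Diagram) -> str:
    """Render the structure back to the text view."""
    return "\n".join(f"{a} --> {b}" for a, b in diagram.edges)


def move_node(diagram: Diagram, name: str, x: float, y: float) -> None:
    """Stand-in for dragging a node in a visual editor."""
    diagram.positions[name] = (x, y)
```

A text edit followed by parse_text(new_source, previous=old_diagram)
keeps the positions of nodes that still exist, and move_node leaves the
text view unchanged, which is the two-way property being asked for.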
01:01:40 - Speaker 3: Yeah, this is very interesting. One challenge here
I think is with plain text and rich text, the structure of the text and
structure of the final output are going to be pretty close.
And so that makes it most feasible to have the thing where you’re
seeing both worlds superimposed, with the double asterisks on both sides
and the bold text, for example. With something like a diagram, if you
were to represent a diagram in just, like, plain text input, it would be
a complete mess. It would bear basically no resemblance to the final
output. It would just be a string of really opaque characters, and then
it
would compile out to a nice graph, but it’s kind of hard to go back and
One way to combine these two worlds would be to invoke the command
palette metaphor that we see emerging so often, as you can imagine, OK,
you’re editing a score or you’re editing a graph. And instead of having
1000 buttons around the edge of your screen like you do with these
typical applications, the only interface is that you can click on stuff
and then you can type stuff in the command palette. So you click where
you want to add a note and you say, like, B, you know, BQ, and it puts
in the B quarter note and so on. And similarly with graphs, you could
click on a node and you could invoke little commands with your text
editor, or perhaps edit the little node locally, represented as a little
text box. That’s kind of a way to bridge this issue that a pure text
representation would have no obvious correspondence to a 2D or a 3D
image, but if you have some way to get more local nodes, it could work
well.
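The click-then-type interaction described above could be sketched
roughly as follows. This is a hypothetical illustration: the Note class,
the run_command helper, and the duration codes (w, h, q, e) are invented
here and do not come from any real notation program. The position
argument stands in for wherever the user clicked.

```python
from dataclasses import dataclass

# Assumed duration codes, as fractions of a whole note.
DURATIONS = {"w": 1.0, "h": 0.5, "q": 0.25, "e": 0.125}


@dataclass(frozen=True)
class Note:
    pitch: str
    duration: float


def run_command(score: list, position: int, command: str) -> None:
    """Insert a note at the clicked position from a command like 'BQ' or 'B q'."""
    compact = command.replace(" ", "")
    pitch, code = compact[:1].upper(), compact[1:].lower()
    if pitch not in "ABCDEFG" or not pitch:
        raise ValueError(f"unknown pitch in command: {command!r}")
    if code not in DURATIONS:
        raise ValueError(f"unknown duration in command: {command!r}")
    score.insert(position, Note(pitch, DURATIONS[code]))
```

Clicking at position 0 of an empty score and typing BQ would leave the
score holding a single B quarter note.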
01:03:00 - Speaker 1: Yeah, definitely.
01:03:01 - Speaker 2: And the thing that brings to mind for me is our
oft-cited favorite tool for thought, which is the spreadsheet where you
do have this, it’s a very, very simple version of that, this 2D layout,
but in fact you do click on cells and type in symbolic there, so you are
mixing a visual spatial layout, a very lightweight one with some
01:03:23 - Speaker 3: Spreadsheet remains undefeated.
01:03:25 - Speaker 1: One thing that I find really interesting about
spreadsheets, that I think is often very underexplored, is that many
applications like Airtable, and Notion is also very much guilty of this,
try to capture the power of the spreadsheet as a relational database,
or, like, what happens if we impose better structure onto the different
columns and things like that. But there’s a separate, totally untapped,
underexplored area of spreadsheets, which is that it’s basically this
canvas, right? Spreadsheets capture everything that people liked about
table-based layouts in HTML with none of the stigma associated with it.
And so you can create these really complex interfaces that basically
just do data and things like that, and put things wherever, like, OK,
I’m going to copy this data and bring it over closer to where I’m
working now so I can reference it more easily. It’s basically just this
grid, right? And that’s totally unstructured. It doesn’t correspond to
any kind of relational format, but it’s also a really powerful
computation paradigm.
01:04:20 - Speaker 3: Yeah, totally. I think people really love to be
able to click somewhere and put stuff there. And a lot of spreadsheet
use is just that. They just want to click there and put text or put a
color, and there’s no formulas at all. And by the way, this goes back
to our idea of convergence of the Office document types. I see people
using Figma for this a lot, like they’re not designers, they’re not
designing interfaces. They want to click and put pictures on a 2D canvas,
and they want to click and put text there and you could see a sort of
continuation of this world where these things continue to merge as the
software gets more sophisticated.
01:04:49 - Speaker 1: Yeah, and then on the subject of diagrams real
quickly, I remembered that I wanted to mention Sketch-n-Sketch, which is
this project by Brian Hempel, Justin Lubin, and Ravi Chugh at the
University of Chicago from a couple of years ago, and the idea there is
you have direct manipulation programming for SVG.
So you have this editor, and on the left side you might see the code
that outputs a certain SVG, and on the right side you see the SVG
itself, and you should be able to do things like directly go in with the
mouse, click an anchor point and drag it somewhere, or do other kinds of
transformations that people are used to when SVG editing, and it
should obviously be reflected in the output, but also change the code
that goes into it, and then you can make changes to the code and it will
I think this is one of the most successful examples I’ve seen of an
editor that actually manages to keep this bidirectional linkage working.
When you make manual edits, the direct manipulation edits with the
cursor, it doesn’t totally botch your code, and when you make changes
with the code, it doesn’t lose all of your edits on the visual side. I
think it would be great to see more things like this for more structured
areas like diagramming or things like that.
01:05:59 - Speaker 3: So many research projects to do.
01:06:02 - Speaker 1: Yes, lots to do.
01:06:04 - Speaker 2: So Slim, I see a recurring theme in how you think
about all of this, whether it’s equations, prose, rich text, musical
scores, or diagrams, which is this intermediate format concept. And
maybe a straw man or an outside view might come at this thinking, well,
being able to see something like Markdown is sort of exposing plumbing
that nerdy programmer types might like, but the reason we invented
what-you-see-is-what-you-get word processors, whatever, 40 years ago or
whatever it was, was to potentially liberate us from that.
But I see that you see the future is not one where those go away.
We want to expose that. There’s some value to that separately from a
fully visual, 100% mapping where the rendered output and the way you
edit it are the same. So I think that illuminates somewhat how I would
imagine you would answer the question I was going to ask you about the
future, but with that in mind, I’ll basically say, yeah, if you look
forward, say, 5 or 10 years to what advances either have happened or
that you hope to see happen in terms of how rich text works on our
computing devices, what does that look like?
01:07:16 - Speaker 1: Yeah, I think it’s exactly like you were
describing. We originally had this idea that you would be able to get a
WYSIWYG editor, something like Microsoft Word, and totally decouple
yourself from this underlying representation. I think that works up
until the point where you have lots of different output formats or
different ways of viewing the document that people would like to use.
And as soon as you are in a world where even something like, let’s say
I want to have two different views in a GUI application, all of a sudden
it becomes much more beneficial to have some kind of intermediate format
so that you don’t have to do, like, N different renderers and parsers
and compilation pipelines for all of these internal things.
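One way to read this point, offered as an assumption rather than
something stated in the conversation: with N input formats and M output
views, converting directly between every pair needs N times M
converters, while going through a shared intermediate format needs only
N parsers plus M renderers. The Python sketch below uses a deliberately
tiny, hypothetical intermediate format (a list of bold-or-not text
spans) and a toy parser that only understands double-asterisk bold.

```python
import html

# Hypothetical intermediate format: a list of (is_bold, text) spans.
Span = tuple[bool, str]


def parse_markdown(src: str) -> list[Span]:
    """Toy parser: text between '**' markers is bold (assumes balanced markers)."""
    parts = src.split("**")
    return [(i % 2 == 1, part) for i, part in enumerate(parts) if part]


def render_html(spans: list[Span]) -> str:
    return "".join(
        f"<b>{html.escape(text)}</b>" if bold else html.escape(text)
        for bold, text in spans
    )


def render_plain(spans: list[Span]) -> str:
    return "".join(text for _, text in spans)


def render_markdown(spans: list[Span]) -> str:
    return "".join(f"**{text}**" if bold else text for bold, text in spans)


def converters_needed(inputs: int, outputs: int) -> tuple[int, int]:
    """Return (direct pairwise converters, converters via a shared format)."""
    return inputs * outputs, inputs + outputs
```

Adding a new view means writing one more render function over the same
spans, without touching any parser.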
01:08:01 - Speaker 2: So there’s a simple example of that. Earlier you
mentioned reading academic papers on different size screens, you know, a
phone versus a desktop versus a printout, and even just the basic reflow
of the text to a narrower or wider screen, simple as that seems, is
actually pretty complicated. There was an approach of designing for
several different screen sizes, but now we know that that’s not very
futureproof and doesn’t work the way we want. And so as soon as you have
anything that’s even slightly dynamic, even something as simple as text
reflowing, that’s the place where you think an intermediate format is
necessary.
01:08:36 - Speaker 1: Yeah, exactly, it’s not tractable to design a
phone version of the website for every possible phone, and then a tablet
version for all the tablets, and then a desktop version, but also a
projector version, things like that.
So the layout and the appearance are driven by the content itself. And I
think that there’s an idea of that for outputting a paper.
If you’re thinking about outputting another artifact like a diagram or
something, I think there are situations where it’s really useful to be
able to do standard direct manipulation diagram editing, and then also
situations where it’s really useful to be able to select all of the text
that corresponds to a certain subgraph and just move it somewhere else.
And allowing people the flexibility of choosing between
those different edit options, depending on what task they’re trying to
perform, what problem they’re trying to solve is like a really big area
So I think we’re still at a stage where, with all of these different new
editors like Coda or Notion or even Bear or Craft, editors are still
very much borrowing from each other a lot and periodically striking out
in the direction of, like, here’s a new kind of block or a new kind of
cell or type of text that you can have. And I think that while we’re
still in the stage of feature churn around what are the editing
primitives that people care
about, what things go in a document, it’s going to be hard to develop
any kind of like unifying framework or IR for these documents to work
I’m hopeful that once we reach a scenario where there’s a little more
stasis and maybe more overlap in the capabilities and interests of
different editors, you could have this intermediate platform that
extends from things like Roam to Notion or Notion to Airtable or
something like that, for the components that make sense to go into those
other platforms, and then you can actually really flexibly move your
data around between these areas. And likewise, within applications,
maybe you want to be able to start off with something really low
fidelity and gradually get something higher fidelity. It would be really
nice to have a slider, almost, that allows you to move up and down the
ladder of abstraction, but failing that, an intermediate tool that you
can plug in and be like, OK, I want to take this bullet list of to-do
items and upgrade it into a database for, like, things or something like
that. Something that’s more plug and play, that also handles structured
data in the same way we have a tool like Pandoc for text. That’s what
I’m really excited to see, because I really think of rich text as slowly
expanding to include all the things you might want to have in a
document, which might include embedded views of other databases or
things like that. So just having a more expansive interpretation of rich
text that is less constraining with respect to the kinds of artifacts
that you can produce, allows you to combine more things together, and
has a notion of structure that enables these kinds of really powerful
edits, like re-parenting an entire subtree, while also allowing you to
do things like select a linear region of text and copy it somewhere
else. I think that’s kind of the direction we’re moving in, where we
combine a lot of the flexibility of plain text editors that we’ve seen
to date with some of the power of having more structure.
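The "upgrade a bullet list of to-do items into a database" step
mentioned above might look something like this minimal Python sketch.
The upgrade_todo_list helper, the row fields (title, done), and the
accepted bullet syntax are all assumptions made for illustration, not
the behavior of any particular editor.

```python
def upgrade_todo_list(text: str) -> list:
    """Turn plain-text to-do bullets into structured rows (hypothetical schema)."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        # Check the checkbox forms first, since they also start with "- ".
        if stripped.startswith("- [x] "):
            rows.append({"title": stripped[6:], "done": True})
        elif stripped.startswith("- [ ] "):
            rows.append({"title": stripped[6:], "done": False})
        elif stripped.startswith("- "):
            rows.append({"title": stripped[2:], "done": False})
        # Lines that are not bullets are ignored in this sketch.
    return rows
```

Once the items are rows with named fields, the same data could be
filtered, sorted, or shown in a table view, while the original bullet
list remains a valid text view of it.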
01:11:51 - Speaker 3: It’s a pretty exciting future.
01:11:53 - Speaker 1: Yeah, let’s hope we get there.
01:11:54 - Speaker 2: Well, let’s wrap it there. Thanks everyone for
listening. If you have feedback, write us on Twitter at @museapphq. You can
reach us on email, hello at museapp.com. You can also leave us a review
on Apple Podcasts. And Slim, your drive and passion for all things text,
and in fact, my mind has been expanded on what we would even think of
text as being and what these intermediate formats can do for us in the
future. So I’m really excited that you’re on the forefront of this and
pushing forward our tools.
01:12:29 - Speaker 1: Yeah, thanks so much for having me. This was great