Mike Dillinger
Knowledge graphs provide the digital foundation for some of the most visible companies on the web.
Mike Dillinger led the team that built LinkedIn's Economic Graph, the knowledge graph that powers the social media giant's recommendation systems.
Mike now helps people understand knowledge graph technology and how it can complement and improve generative AI, whether by acting as "jet fuel" to better train LLMs or by providing "adult supervision" for their unruly, adolescent behavior.
We talked about:
how he describes knowledge graphs
how the richness of information in a knowledge graph helps computers better understand the things in a system
the differences between knowledge graphs and LLMs
how LinkedIn's Economic Graph, which Mike's team built, works
how LLMs can help build knowledge graphs, and how knowledge graphs can act as "jet fuel" to train LLMs
the RDF "triples" that are at the foundation of knowledge graphs
the importance of distinguishing between unique concepts in a knowledge graph and how practitioners do this
the two main crafts needed to build knowledge graphs: linguistic expertise and software engineering
the job opportunities for language professionals in the LLM and knowledge graph worlds
the propensity of tech companies to staff knowledge graph efforts with engineers while there is actually a need for a variety of talent, as well as better collaboration skills
his assertion that "language professionals aren't janitors," put on teams only to clean up data for software engineers
how knowledge graphs provide "adult supervision" for unruly, adolescent LLMs
his hypothesis that using KGs as a separate modality of data rather than as training data for LLMs will advance AI
Mike's bio
Mike Dillinger, PhD is a technical advisor, consultant, and thought leader who champions the importance of capturing and leveraging reusable, explicit human knowledge to enable more reliable machine intelligence. He was Technical Lead for Knowledge Graphs in the AI Division at LinkedIn and for LinkedIn’s and eBay’s first machine translation systems. He was also an independent consultant specializing in deploying translation technologies for Fortune 500 companies, and Director of Linguistics at two machine translation software companies where he led development of the first commercial MT-TM integration. He was President of the Association for Machine Translation in the Americas and has two MT-related patents. Dr. Dillinger has also taught at more than a dozen universities in several countries, has been a visiting researcher on four continents, and has a weekly blog on Knowledge Architecture.
Connect with Mike online
LinkedIn
Video
Here’s the video version of our conversation:
https://youtu.be/wX2C3DwiWG4
Podcast intro transcript
This is the Knowledge Graph Insights podcast, episode number 2. If you've ever looked for a job or recruited talent on LinkedIn, you've seen Mike Dillinger's work. His team built LinkedIn's Economic Graph, the knowledge graph that powers the social media platform's recommendation system. These days, Mike thinks a lot about how knowledge graph technology can work with generative AI, seeing opportunities for each technology to help the other, like the ability of knowledge graphs to act as "jet fuel" to train large language models.
Interview transcript
Larry:
Hi everyone. Welcome to Episode Number 2 of the Knowledge Graph Insights podcast. I am really delighted today to welcome to the program Mike Dillinger. Mike is a cage-free consultant based in San Jose. He's been doing knowledge graph and other technical things for many years. So welcome, Mike. Tell the folks a little bit more about what you're up to these days.
Mike:
Thanks a lot, Larry. What am I up to? I'm trying to help people wrap their heads around knowledge graphs and how we can transform GenAI into Next-GenAI by leveraging more explicit content and human knowledge.
Larry:
Ooh, I love that. Next-GenAI. I hope you've copyrighted that.
Mike:
No.
Larry:
Yeah, no, but that's ... because right now, it seems like most of the oxygen in the room has been sucked up by OpenAI and LLMs and chatbots and GPTs, but knowledge graphs have something to offer as well. I guess the first thing I'd like to ask is, can you describe for folks, because I think a lot of folks listening to this podcast might not be as familiar as some of us with what a knowledge graph is and what it does. Can you sort of set out for folks what a knowledge graph is?
Mike:
Sure. The usual way I describe knowledge graphs is as collections of densely interconnected facts about individual things and categories of things, based on a range of different relations. So one thing that people get caught on is, oh, so it's a taxonomy? Not really, but taxonomies are a part of the knowledge graph. Oh, so it's an ontology? No, but ontologies are a part of the knowledge graph. So there are a range of different kinds of facts in a knowledge graph, so it's broader than a taxonomy or an ontology.
Larry:
And I think a lot of people come ... like, I work in the content world mostly, and information architecture. Most people in that world, I think the first time they go to organize stuff, they start thinking taxonomically, which I guess makes sense. But tell me the benefits of going beyond a simple taxonomy or just an ontology. How does it come together? How does it help people do more interesting and better stuff?
Mike:
Well, the question that we're talking about here is, you might call it the richness or the depth of the knowledge representation. With a taxonomy, you only have relations like this is a subcategory of that, or this is an instance of that, and you don't have information about what is this for or what are its attributes or what are its components? So when you move from a taxonomy to a knowledge graph, we're talking about giving algorithms more information in more detail about the things that we want them to think about.
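The contrast Mike draws can be sketched in code. In this hypothetical toy example (the entities and relations are invented for illustration, not from LinkedIn's graph), facts are stored as (subject, relation, object) triples; a taxonomy contains only hierarchy relations, while a knowledge graph adds purpose, components, and attributes:

```python
# Toy sketch (hypothetical data): facts as (subject, relation, object) triples.
# A taxonomy has only hierarchy relations; a knowledge graph adds many more.

taxonomy = [
    ("espresso machine", "subcategory_of", "kitchen appliance"),
    ("Model X-100", "instance_of", "espresso machine"),
]

knowledge_graph = taxonomy + [
    ("espresso machine", "used_for", "brewing espresso"),
    ("espresso machine", "has_component", "portafilter"),
    ("espresso machine", "has_attribute", "boiler pressure"),
]

def facts_about(graph, subject):
    """Collect every (relation, object) pair recorded for a subject."""
    return [(r, o) for s, r, o in graph if s == subject]

print(facts_about(taxonomy, "espresso machine"))         # only the hierarchy
print(facts_about(knowledge_graph, "espresso machine"))  # purpose, parts, attributes too
```

An algorithm handed the knowledge graph can answer "what is this for?" and "what are its parts?", which the taxonomy alone cannot.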
Larry:
Interesting. And I think a lot of people right now, a lot of the curiosity and interest around these kinds of technologies is around LLMs and OpenAI and ChatGPT and all those things. Can you contrast a knowledge graph and what it can do with what those kinds of systems are doing?
Mike:
Oh, sure. So language models focus very much on strings, and sequences of strings, and knowledge graphs focus more on facts or concepts. So concepts built into facts, as it were. So they're really focusing on very different things: sequences of words or strings, or graphs of concepts. So the notion of meaning is really different. They both define meaning in terms of similarity, but an LLM computes similar meaning in terms of context.
Mike:
So if two words have similar words around them, then those two words are considered in an LLM to have similar meanings. But in a knowledge graph, you compare two concepts by saying, oh, do they have similar components and similar characteristics? If so, then they're related in meaning.
Mike:
So they're very different ways of getting at a similar problem.
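The knowledge-graph side of that comparison can be sketched with a simple set-overlap score. This is a hypothetical toy example (the concepts, feature sets, and use of Jaccard similarity are illustrative assumptions, not a production KG algorithm):

```python
# Toy sketch: concept similarity from shared components and characteristics,
# the way a knowledge graph compares concepts (hypothetical data).

features = {
    "bicycle":    {"wheels", "pedals", "handlebars", "human-powered"},
    "motorcycle": {"wheels", "engine", "handlebars", "motor-powered"},
    "canoe":      {"hull", "paddle", "human-powered"},
}

def jaccard(a, b):
    """Overlap of two concepts' feature sets: |A & B| / |A | B|."""
    return len(features[a] & features[b]) / len(features[a] | features[b])

print(jaccard("bicycle", "motorcycle"))  # share wheels and handlebars
print(jaccard("bicycle", "canoe"))       # share only "human-powered"
```

An LLM would instead judge "bicycle" and "motorcycle" similar because they appear in similar textual contexts; here the similarity comes from explicitly shared parts and attributes.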
Larry:
Interesting.
Mike:
You might phrase it, for linguistic people, you might say LLMs focus on syntax and knowledge graphs focus on semantics, or LLMs focus on data and knowledge graphs focus on knowledge. There are a lot of different ways of describing it. So they're very, very much complementary technologies for getting at some of the same problems.
Larry:
You know what I'd love to do now is I'd love to ground what you just said in some examples. Like, what are your ... the first accomplishment of yours that I learned about was your work on the Economic Graph at LinkedIn. Can you talk a little bit about what the aim of that is and how a knowledge graph helped LinkedIn do better stuff with their data?
Mike:
Oh, sure. Okay. So the Economic Graph at LinkedIn is a model of the entities in the economy, focusing on schools that produce talent, people with talent, and companies that absorb that talent, okay, and then companies produce products. So we have things like companies, products, workers, schools, these are main entities, and there are a wide range of relationships between them. So when we want to find a worker who fits in a company in a particular position, we need to have a detailed and reliable description of both the position and the worker.
Mike:
This is what LinkedIn's technology is all about, is matching workers to openings, or now increasingly doing other things, like matching products to people in their ads business, that kind of thing. So knowledge graphs are all about making matching work more systematically and in a more understandable way.
Mike:
So this is what we did at LinkedIn. We built up a kind of vocabulary for describing people or workers and for describing jobs, but we used the same vocabulary for both, so that we could translate, as it were, your worker profile and this company's job profile into the same meta-language. And it made it much easier and much more accurate to compare one with the other. And that meta-language is what we call a knowledge graph.
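A minimal sketch of that idea (the vocabulary, mapping table, and profiles below are hypothetical illustrations, not LinkedIn's actual system): normalize both sides into the same controlled vocabulary, then match on the overlap:

```python
# Toy sketch: translate free-text worker and job terms into one shared
# vocabulary (the "meta-language"), then match on overlap. Hypothetical data.

VOCAB = {  # surface form -> canonical concept
    "js": "javascript", "javascript": "javascript",
    "ml": "machine learning", "machine learning": "machine learning",
    "python": "python",
}

def normalize(terms):
    """Map raw terms into the shared vocabulary, dropping unknown terms."""
    return {VOCAB[t.lower()] for t in terms if t.lower() in VOCAB}

worker = normalize(["JS", "Python", "ML"])
job = normalize(["JavaScript", "Machine Learning"])

overlap = worker & job  # both sides now speak the same meta-language
print(overlap)
```

Because both profiles are expressed in the same canonical concepts, "JS" on a résumé and "JavaScript" in a job posting match directly instead of failing a string comparison.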
Larry:
And some of the mechanisms that permit that ... I work mostly in the content world, and we are famous for being bad at our core competency, which is naming and labeling things. And so there are a lot of people doing content jobs who are doing the same thing, but they have a different job title. I know there are techniques in the knowledge graph world for resolving that kind of discrepancy. Can you talk a little bit about ... and I assume that must have happened at scale in the Economic Graph.
Mike:
Yes. Oh yeah. Yeah. So when I built a team there, we faced a little problem of having 150 million distinct job titles to navigate. So this is way bigger than anything that normal taxonomists usually deal with. So we had to cut that problem down to size,