Experiencing Data w/ Brian T. O’Neill  (UX for AI Data Products, SAAS Analytics, Data Product Management)

010 - Carl Hoffman (CEO, Basis Technology) on text analytics, NLP, entity resolution, and why exact match search is stupid



My guest today is Carl Hoffman, the CEO of Basis Technology, and a specialist in text analytics. Carl founded Basis Technology in 1995, and in 1999, the company shipped its first products for website internationalization, enabling Lycos and Google to become the first search engines capable of cataloging the web in both Asian and European languages. In 2003, the company shipped its first Arabic analyzer and began development of a comprehensive text analytics platform. Today, Basis Technology is recognized as the leading provider of components for information retrieval, entity extraction, and entity resolution in many languages. Carl has been directly involved with the company’s activities in support of U.S. national security missions and works closely with analysts in the U.S. intelligence community.
Many of you work all day in the world of analytics: numbers, charts, metrics, data visualization, etc. But today we’re going to talk about one of the other ingredients in designing good data products: text! As an amateur polyglot myself (I speak decent Portuguese and Spanish, and am attempting to learn Polish), I really enjoyed this discussion with Carl. If you are interested in languages, text analytics, search interfaces, or entity resolution, and are curious to learn what any of this has to do with offline events such as the Boston Marathon Bombing, you’re going to enjoy my chat with Carl. We covered:
How text analytics software is used by border patrol agencies, and its limitations
The role of humans in the loop, even with good text analytics in play
What actually happened in the case of the Boston Marathon Bombing?
Carl’s article, “Exact Match” Isn’t Just Stupid. It’s Deadly.
The two lessons Carl has learned about working with native-tongue source material
Why Carl encourages Unicode compliance when working with text, why having a global perspective is important, and how Carl actually implements this at his company
Carl’s parting words on why hybrid architectures are a core foundation to building better data products involving text analytics
Resources and Links:
Basis Technology
Carl’s article: “Exact Match” Isn’t Just Stupid. It’s Deadly.
Carl Hoffman on LinkedIn
Quotes from Today’s Episode
“One of the practices that I’ve always liked is actually getting people that aren’t like you, that don’t think like you, in order to intentionally tease out what you don’t know. You know that you’re not going to look at the problem the same way they do…” — Brian O’Neill
“Bias is incredibly important in any system that tries to respond to human behavior. We have our own innate cultural biases that we’re sometimes not even aware of. As you [Brian] point out, it’s impossible to separate human language from the underlying culture and, in some cases, geography and the lifestyle of the people who speak that language…” — Carl Hoffman
“What I can tell you is that context and nuance are equally important in both spoken and written human communication…Capturing all of the context means that you can do a much better job of the analytics.” — Carl Hoffman
“It’s sad when you have these gaps like what happened in this border crossing case where a name spelling is responsible for not flagging down [the right] people. I mean, we put people on the moon and we get something like a name spelling [entity resolution] wrong. It’s shocking in a way.” — Brian O’Neill
“We live in a world which is constantly shades of gray and the challenge is getting as close to yes or no as we can.” – Carl Hoffman

By Brian T. O’Neill from Designing for Analytics
