NLP Highlights

By Allen Institute for Artificial Intelligence

**The podcast is currently on hiatus. For more active NLP content, check out the Holistic Intelligence Podcast linked below.**

Welcome to the NLP Highlights podcast, where we invite researchers to ta...

· Science

4.2

2424 ratings

FAQs about NLP Highlights:

How many episodes does NLP Highlights have?

The podcast currently has 145 episodes available.

NLP Highlights episodes:

February 29, 2024 Are LLMs safe?
Curious about the safety of LLMs? 🤔 Join us for an insightful new episode featuring Suchin Gururangan, Young Investigator at Allen Institute for Artificial Intelligence and Data Science Engineer at Appuri. 🚀 Don't miss out on expert insights into the world of LLMs!
43min
January 08, 2024 "Imaginative AI" with Mohamed Elhoseiny
This podcast episode features Dr. Mohamed Elhoseiny, a true luminary in the realm of computer vision with over a decade of groundbreaking research. As an Assistant Professor at KAUST, Dr. Elhoseiny's work delves into the intersections of Computer Vision, Language & Vision, and Computational Creativity in Art, Fashion, and AI. Notably, he co-organized the 1st and 2nd Workshops on Closing the Loop between Vision and Language, demonstrating his commitment to advancing interdisciplinary research. With a rich educational background from Stanford University's Graduate School of Business Ignite Program, and Rutgers University as MS/PhD Researcher, coupled with influential stints at Stanford, Baidu Research, Facebook AI Research, Adobe Research, and SRI International, Dr. Elhoseiny brings a wealth of experience to our discussion.
24min
December 28, 2023 142 - Science Of Science, with Kyle Lo
Our first guest with this new format is Kyle Lo, the most senior lead scientist in the Semantic Scholar team at Allen Institute for AI (AI2), who kindly agreed to share his perspective on #Science of #Science (#scisci) on our podcast. SciSci is concerned with studying how people do science, and includes developing methods and tools to help people consume AND produce science. Kyle has made several critical contributions in this field which enabled a lot of SciSci work over the past 5+ years, ranging from novel NLP methods (e.g., SciBERT https://lnkd.in/gTP_tYiF), to open data collections (e.g., S2ORC https://lnkd.in/g4J6tXCG), to toolkits for manipulating scientific documents (e.g., PaperMage https://lnkd.in/gwU7k6mJ, which JUST received a Best Paper Award 🏆 at EMNLP 2023).
Kyle Lo's homepage: https://kyleclo.github.io/
49min
June 29, 2023 141 - Building an open source LM, with Iz Beltagy and Dirk Groeneveld
In this special episode of NLP Highlights, we discussed building and open-sourcing language models. What is the usual recipe for building large language models? What does it mean to open-source them? What new research questions can we answer by open-sourcing them? We particularly focused on the ongoing Open Language Model (OLMo) project at AI2, and invited Iz Beltagy and Dirk Groeneveld, the research and engineering leads of the OLMo project, to chat.
Blog post announcing OLMo: https://blog.allenai.org/announcing-ai2-olmo-an-open-language-model-made-by-scientists-for-scientists-ab761e4e9b76
Organizations interested in partnership can express their interest here: https://share.hsforms.com/1blFWEWJ2SsysSXFUEJsxuA3ioxm
You can find Iz at twitter.com/i_beltagy and Dirk at twitter.com/mechanicaldirk
30min
June 06, 2023 140 - Generative AI and Copyright, with Chris Callison-Burch
In this special episode, we chatted with Chris Callison-Burch about his testimony in the recent U.S. Congress Hearing on the Interoperability of AI and Copyright Law. We started by asking Chris about the purpose and the structure of this hearing. Then we talked about the ongoing discussion of how copyright law applies to content generated by AI systems, the potential risks generative AI poses to artists, and Chris’ take on all of this. We ended the episode with a recording of Chris’ opening statement at the hearing.
52min
March 24, 2023 139 - Coherent Long Story Generation, with Kevin Yang
How can we generate coherent long stories from language models? Ensuring that the generated story has long range consistency and that it conforms to a high level plan is typically challenging. In this episode, Kevin Yang describes their system that prompts language models to first generate an outline, and iteratively generate the story while following the outline and reranking and editing the outputs for coherence. We also discussed the challenges involved in evaluating long generated texts.
Kevin Yang is a PhD student at UC Berkeley.
Kevin's webpage: https://people.eecs.berkeley.edu/~yangk/
Papers discussed in this episode:
1. Re3: Generating Longer Stories With Recursive Reprompting and Revision (https://www.semanticscholar.org/paper/Re3%3A-Generating-Longer-Stories-With-Recursive-and-Yang-Peng/2aab6ca1a8dae3f3db6d248231ac3fa4e222b30a)
2. DOC: Improving Long Story Coherence With Detailed Outline Control (https://www.semanticscholar.org/paper/DOC%3A-Improving-Long-Story-Coherence-With-Detailed-Yang-Klein/ef6c768f23f86c4aa59f7e859ca6ffc1392966ca)
46min
January 20, 2023 138 - Compositional Generalization in Neural Networks, with Najoung Kim
Compositional generalization refers to the capability of models to generalize to out-of-distribution instances by composing information obtained from the training data. In this episode we chatted with Najoung Kim about how to explicitly evaluate specific kinds of compositional generalization in neural network models of language. Najoung described COGS, a dataset she built for this, some recent results in the space, and why we should be careful about interpreting the results given the current practice of pretraining models on lots of unlabeled text.
Najoung's webpage: https://najoungkim.github.io/
Papers we discussed:
1. COGS: A Compositional Generalization Challenge Based on Semantic Interpretation (Kim et al., 2020): https://www.semanticscholar.org/paper/b20ddcbd239f3fa9acc603736ac2e4416302d074
2. Compositional Generalization Requires Compositional Parsers (Weissenhorn et al., 2022): https://www.semanticscholar.org/paper/557ebd17b7c7ac4e09bd167d7b8909b8d74d1153
3. Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models (Kim et al., 2022): https://www.semanticscholar.org/paper/8969ea3d254e149aebcfd1ffc8f46910d7cb160e
Note that we referred to the final paper by an earlier name in the discussion.
49min
January 13, 2023 137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal
We invited Urvashi Khandelwal, a research scientist at Google Brain, to talk about nearest neighbor language and machine translation models. These models interpolate parametric (conditional) language models with non-parametric distributions over the closest values in some data stores built from relevant data. Not only are these models shown to outperform the usual parametric language models, they also have important implications for memorization and generalization in language models.
Urvashi's webpage: https://urvashik.github.io
Papers discussed:
1) Generalization through memorization: Nearest Neighbor Language Models (https://www.semanticscholar.org/paper/7be8c119dbe065c52125ee7716601751f3116844)
2) Nearest Neighbor Machine Translation (https://www.semanticscholar.org/paper/20d51f8e449b59c7e140f7a7eec9ab4d4d6f80ea)
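The interpolation mentioned in this episode's description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the squared L2 distance, the fixed softmax temperature of 1, and the toy datastore layout are all assumptions made for illustration. The idea is to turn the k nearest datastore entries into a distribution over next tokens, then mix that distribution with the parametric model's distribution using a weight `lam`.

```python
import math


def knn_distribution(query, datastore, k, vocab_size):
    """Build a next-token distribution from the k nearest datastore entries.

    `datastore` is a hypothetical list of (key_vector, next_token_id) pairs.
    """
    if not datastore:
        raise ValueError("datastore must contain at least one entry")
    # Squared L2 distance from the query to every key (an assumption;
    # any distance function could be substituted here).
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    # Softmax over negative distances, shifted by the smallest distance
    # for numerical stability.
    min_dist = scored[0][0]
    weights = [(math.exp(-(dist - min_dist)), token) for dist, token in scored]
    total = sum(weight for weight, _ in weights)
    probs = [0.0] * vocab_size
    for weight, token in weights:
        # Neighbors that share a target token pool their probability mass.
        probs[token] += weight / total
    return probs


def interpolate(p_lm, p_knn, lam):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    return [lam * pk + (1.0 - lam) * pl for pl, pk in zip(p_lm, p_knn)]
```

In this sketch `lam` is a hand-set hyperparameter; a larger value trusts the retrieved neighbors more, while `lam = 0` recovers the parametric model alone.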
36min
May 19, 2022 136 - Including Signed Languages in NLP, with Kayo Yin and Malihe Alikhani
In this episode, we talk with Kayo Yin, an incoming PhD student at Berkeley, and Malihe Alikhani, an assistant professor at the University of Pittsburgh, about opportunities for the NLP community to contribute to Sign Language Processing (SLP). We talked about the history of and misconceptions about sign languages, high-level similarities and differences between spoken and sign languages, distinct linguistic features of signed languages, representations, computational resources, SLP tasks, and suggestions for better design and implementation of SLP models.
1h 3min
March 02, 2022 135 - PhD Application Series: After Submitting Applications
This episode is the third in our current series on PhD applications.
We talk about what the PhD application process looks like after applications are submitted. We start with a general overview of the timeline, then talk about how to approach interviews and conversations with faculty, and finish by discussing the different factors to consider in deciding between programs.
The guests for this episode are Rada Mihalcea (Professor at the University of Michigan), Aishwarya Kamath (PhD student at NYU), and Sanjay Subramanian (PhD student at UC Berkeley).
Homepages:
- Aishwarya Kamath: https://ashkamath.github.io/
- Sanjay Subramanian: https://sanjayss34.github.io/
- Rada Mihalcea: https://web.eecs.umich.edu/~mihalcea/
The hosts for this episode are Alexis Ross and Nishant Subramani.
37min
