
ArXiv NLP research for Monday, June 10, 2024.
00:19: Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
00:59: HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs
02:29: The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models
03:24: MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
04:51: A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications
05:49: Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
07:10: Efficient k-Nearest-Neighbor Machine Translation with Dynamic Retrieval
09:08: Recurrent Context Compression: Efficiently Expanding the Context Window of LLM
10:35: Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation
11:26: Verifiable Generation with Subsentence-Level Fine-Grained Citations
12:36: Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog Systems
13:55: Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German
15:28: Can I understand what I create? Self-Knowledge Evaluation of Large Language Models
16:28: Language Models Resist Alignment
17:58: LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages
19:27: Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
20:27: Combining Embeddings and Domain Knowledge for Job Posting Duplicate Detection
21:37: MaskLID: Code-Switching Language Identification through Iterative Masking
22:49: Multi-Prompting Decoder Helps Better Language Understanding
24:22: Tx-LLM: A Large Language Model for Therapeutics
26:21: Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
27:43: A Parameter-efficient Language Extension Framework for Multilingual ASR
29:06: MedExQA: Medical Question Answering Benchmark with Multiple Explanations
30:36: Sustained Vowels for Pre- vs Post-Treatment COPD Classification
31:49: MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows
33:40: Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
35:00: Annotation alignment: Comparing LLM and human annotations of conversational safety
36:07: mHuBERT-147: A Compact Multilingual HuBERT Model
37:27: Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue
39:00: INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition
40:06: Meta Learning Text-to-Speech Synthesis in over 7000 Languages
40:59: Controlling Emotion in Text-to-Speech with Natural Language Prompts
41:55: Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain
43:29: Multimodal Contextualized Semantic Parsing from Speech
44:25: Interpretability of Language Models via Task Spaces
45:45: Evaluating the Retrieval Component in LLM-Based Question Answering Systems
46:52: Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies
48:08: Can Language Models Serve as Text-Based World Simulators?