
ArXiv NLP research for Thursday, June 06, 2024.
00:20: Efficient Knowledge Infusion via KG-LLM Alignment
01:25: NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human
02:34: Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure
03:30: XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags
04:59: End-to-End Trainable Soft Retriever for Low-resource Relation Extraction
06:07: Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
07:37: Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores
08:52: ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
10:29: Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies
11:39: Lean Workbook: A large-scale Lean problem set formalized from natural language math problems
12:56: Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism
14:18: Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As
16:24: Recovering document annotations for sentence-level bitext
17:40: BLSP-Emo: Towards Empathetic Large Speech-Language Models
19:01: Decoder-only Streaming Transformer for Simultaneous Translation
20:28: Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
21:53: Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
23:06: How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?
24:13: HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew
25:19: ArMeme: Propagandistic Content in Arabic Memes
26:26: Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
27:11: UltraMedical: Building Specialized Generalists in Biomedicine
28:43: Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
30:02: A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential
31:29: On The Persona-based Summarization of Domain-Specific Documents
33:14: Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing
34:28: American Sign Language Handshapes Reflect Pressures for Communicative Efficiency
By Brad Edwards