June 13, 2024

Ep. 262 - June 12, 2024

54 minutes

ArXiv NLP research for Wednesday, June 12, 2024.

00:19: VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

02:05: BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain

03:15: Designing a Dashboard for Transparency and Control of Conversational AI

04:46: Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection

05:51: Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions

06:53: Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations

07:52: Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

08:55: DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning

10:20: Automated Information Extraction from Thyroid Operation Narrative: A Comparative Study of GPT-4 and Fine-tuned KoELECTRA

11:35: Large Language Model Unlearning via Embedding-Corrupted Prompts

13:17: Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation

14:46: Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling

16:02: LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

17:18: Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation

18:37: It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF

20:02: Adversarial Evasion Attack Efficiency against Large Language Models

21:06: Learning Job Title Representation from Job Description Aggregation Network

21:59: Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey

23:35: AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection

24:38: Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation

25:56: Multimodal Table Understanding

27:20: CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems

28:51: Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling

30:36: Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

31:57: Semi-Supervised Spoken Language Glossification

33:16: Underneath the Numbers: Quantitative and Qualitative Gender Fairness in LLMs for Depression Prediction

34:37: A Dialogue Game for Eliciting Balanced Collaboration

35:23: Transformer-based Model for ASR N-Best Rescoring and Rewriting

36:16: SumHiS: Extractive Summarization Exploiting Hidden Structure

36:53: Figuratively Speaking: Authorship Attribution via Multi-Task Figurative Language Modeling

38:08: Leveraging Large Language Models for Web Scraping

39:51: M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation

41:15: Is Programming by Example solved by LLMs?

42:29: Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques

43:42: Towards Unsupervised Speech Recognition Without Pronunciation Models

44:50: cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

45:57: Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

47:02: Tailoring Generative AI Chatbots for Multiethnic Communities in Disaster Preparedness Communication: Extending the CASA Paradigm

48:12: Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

49:56: TasTe: Teaching Large Language Models to Translate through Self-Reflection

51:28: OLMES: A Standard for Language Model Evaluations

52:47: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

By Brad Edwards