June 04, 2024

Ep. 253 - June 3, 2024

1 hour 10 minutes

ArXiv NLP research for Monday, June 03, 2024.

00:19: Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost

01:38: Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer

03:06: Selectively Answering Visual Questions

04:11: Take its Essence, Discard its Dross! Debiasing for Toxic Language Detection via Counterfactual Causal Effect

05:36: Predicting Drug-Gene Relations via Analogy Tasks with Word Embeddings

06:51: SemCoder: Training Code Language Models with Comprehensive Semantics

08:39: Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

10:26: Combining Qualitative and Computational Approaches for Literary Analysis of Finnish Novels

11:45: Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors

13:26: Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs

14:34: MACT: Model-Agnostic Cross-Lingual Training for Discourse Representation Structure Parsing

15:48: Guiding ChatGPT to Generate Salient Domain Summaries

17:51: Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling

19:30: TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine

21:38: Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph

22:51: Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization

24:08: Are AI-Generated Text Detectors Robust to Adversarial Perturbations?

25:42: Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression

26:35: Improving Pseudo Labels with Global-Local Denoising Framework for Cross-lingual Named Entity Recognition

28:01: Demonstration Augmentation for Zero-shot In-context Learning

29:31: EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs

31:05: Towards Scalable Automated Alignment of LLMs: A Survey

32:19: EduNLP: Towards a Unified and Modularized Library for Educational Resources

33:44: Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification

35:07: Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses

36:36: When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs

37:58: CodeR: Issue Resolving with Multi-Agent and Task Graphs

38:54: Unsupervised Distractor Generation via Large Language Model Distilling and Counterfactual Contrastive Decoding

40:10: FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs

41:27: Probing Language Models for Pre-training Data Detection

42:45: R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

44:32: Privacy in LLM-based Recommendation: Recent Advances and Future Directions

45:23: Linguistic Analysis, Description, and Typological Exploration with Categorial Grammar (TheBench Guide)

46:52: D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models

48:52: Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function

50:07: Sparsity-Accelerated Training for Large Language Models

51:36: Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study

53:34: Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models

54:42: LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

55:55: Enabling ASR for Low-Resource Languages: A Comprehensive Dataset Creation Approach

57:10: Understanding Token Probability Encoding in Output Embeddings

By Brad Edwards