HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Express)

2024.11.27 Daily AI Papers | ShowUI improves GUI efficiency, F2F improves image editing.


The 18 papers in this episode are:

[00:28] 🖥 ShowUI: One Vision-Language-Action Model for GUI Visual Agent

[01:08] 🎥 Pathways on the Image Manifold: Image Editing via Video Generation

[01:45] ⭐ Star Attention: Efficient LLM Inference over Long Sequences

[02:24] ⚡ Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

[03:01] 📊 MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

[03:44] 🎨 TEXGen: a Generative Diffusion Model for Mesh Textures

[04:27] 🎨 SketchAgent: Language-Driven Sequential Sketch Generation

[05:11] 🔄 Learning 3D Representations from Procedural 3D Programs

[05:55] 🧠 VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

[06:50] 🔄 SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE

[07:27] 🖼 FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity

[08:09] 🎨 DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

[08:41] 📹 SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

[09:19] 📉 Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens

[10:05] 🧬 MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts

[10:40] 👕 Controllable Human Image Generation with Personalized Multi-Garments

[11:12] 🤖 Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

[11:55] 🎥 AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation

[Follow Us]

You can also find us on the following platform for more information beyond the podcast content:

Xiaohongshu: AI速递


By duan
