Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published Dec 23, 2025 • 63
Nested Browser-Use Learning for Agentic Information Seeking Paper • 2512.23647 • Published Dec 29, 2025 • 19
TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published Dec 26, 2025 • 25
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models Paper • 2512.15560 • Published Dec 17, 2025 • 25
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published Dec 27, 2025 • 51
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published Jan 20 • 29
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published Jan 29 • 75
Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation Paper • 2601.21406 • Published Jan 29 • 6
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning Paper • 2602.12099 • Published Feb 12 • 62
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published Mar 11 • 155
Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF Image-Text-to-Text • 4B • Updated Apr 6 • 19.1k • 126
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 98
Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding Paper • 2604.00528 • Published Apr 1 • 12
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention Paper • 2603.28458 • Published Mar 30 • 44
Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings Paper • 2605.22391 • Published 9 days ago • 35