Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams Paper • 2606.01770 • Published 3 days ago • 10
electricsheepasia/asia-owid-area-of-permanent-meadows-and-pastures Viewer • Updated 1 day ago • 2.71k • 15 • 1
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation Paper • 2605.23271 • Published 13 days ago • 79
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 21 days ago • 145
IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools Paper • 2605.20682 • Published 15 days ago • 83
Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection Paper • 2602.07892 • Published 23 days ago • 2
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 22 days ago • 270
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published 22 days ago • 59
Metal-Sci: A Scientific Compute Benchmark for Evolutionary LLM Kernel Search on Apple Silicon Paper • 2605.09708 • Published 25 days ago • 5
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 29 days ago • 101