Article SigLIP 2: A better multilingual vision language encoder • ariG23498, merve, qubvel-hf • Feb 21, 2025 • 215
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 309
Article Llama can now see and run on your device - welcome Llama 3.2 • merve, philschmid, osanseviero, reach-vb, lewtun, ariG23498, pcuenq • Sep 25, 2024 • 191
Article How to generate text: using different decoding methods for language generation with Transformers • patrickvonplaten • Mar 1, 2020 • 299
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models • andito, merve, SkalskiP • Jun 24, 2024 • 207
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model • merve, andsteing, pcuenq • May 14, 2024 • 287
Zero-Shot Detection and Segmentation Collection Demos of projects focused on zero-shot detection and segmentation. • 4 items • Updated Feb 7, 2024 • 3
OpenAI Vision API Collection Demos of projects using the OpenAI Vision API. • 3 items • Updated Nov 22, 2023 • 3
LMMs - Large Multimodal Models Collection Demos of LMM projects. • 5 items • Updated Apr 24, 2024 • 1