Saved LoRA adapter checkpoints from training Qwen2.5-7B to generate decision trees for the Abalone age regression dataset, using the REINFORCE++ algorithm.
Zhiyuan He
nickhe
Recent Activity
upvoted a paper about 1 month ago: Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction
upvoted a paper about 1 month ago: From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
upvoted a paper about 2 months ago: InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking