LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1
Updated 2024-09-16 03:07:24 +03:00
OSWorld: A real computer environment for multimodal agents to evaluate open-ended computer tasks
cli
llm
natural-language-processing
artificial-intelligence
language-model
reinforcement-learning
gui
multimodal
agent
benchmark
code-generation
large-action-model
rpa
vlm
Updated 2024-04-29 12:26:03 +03:00