OpenGameEval: Eval Framework to Benchmark Agentic AI Assistants
Posted by moneil971 19 hours ago
Comments
Comment by moneil971 19 hours ago
OpenGameEval offers a unique testing ground for evaluating core model capabilities related to agentic reasoning and long-horizon task solving.