AgentOdyssey: An Open-Ended Text Game Engine for Test-Time Continual Learning Agents

An open-ended game generation and evaluation framework designed to test the long-horizon planning and continual learning capabilities of LLM agents.

🔎 At a Glance

Category: AI Agent Evaluation Framework / LLM Research Tool
Pricing: Free / Open Source
Best For: AI researchers studying continual learning, long-horizon planning, and autonomous agents.
GitHub Stars: ⭐ 37

🚀 Introduction

AgentOdyssey is a lightweight interactive environment designed to challenge how well AI agents adapt to unknown environments. It generates novel long-horizon text games and also provides a rigorous evaluation mechanism for measuring an agent's continual learning performance at test time.

🛠️ Key Features

Open-Ended Long-Horizon Game Generation: Create rich games with novel entities, dynamics, and storylines via a single command.

Unified Agent Interface: A standardized framework built on class inheritance, so that different LLM-based agents share prompts and can be compared fairly.

Multifaceted Evaluation Metrics: Goes beyond raw game progress with finer-grained metrics that probe an agent's specific failure modes and cognitive abilities.

💡 Tech Highlights

Focuses on five key abilities: exploration, world knowledge acquisition, episodic memory, skill learning, and long-horizon planning.

Highly extensible architecture: researchers can implement and test a new agent by overriding only a few methods.
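To make the "override a few methods" idea concrete, here is a minimal sketch of what an inheritance-based agent interface could look like. Every name in it (`BaseAgent`, `act`, `update`, `FirstActionAgent`, `run_episode`) is hypothetical and chosen for illustration; AgentOdyssey's actual base class and method names may differ, so check the repository before relying on any of them.

```python
from abc import ABC, abstractmethod


class BaseAgent(ABC):
    """Hypothetical base class; not AgentOdyssey's real API."""

    def __init__(self, name: str) -> None:
        self.name = name
        # (observation, action) pairs seen so far in the episode
        self.history: list[tuple[str, str]] = []

    @abstractmethod
    def act(self, observation: str, valid_actions: list[str]) -> str:
        """Return one action string for the current observation."""

    def update(self, observation: str, action: str) -> None:
        """Record the step; subclasses may override to add memory or learning."""
        self.history.append((observation, action))


class FirstActionAgent(BaseAgent):
    """Trivial baseline that always picks the first valid action."""

    def act(self, observation: str, valid_actions: list[str]) -> str:
        return valid_actions[0]


def run_episode(agent: BaseAgent, steps: list[tuple[str, list[str]]]) -> list[str]:
    """Feed scripted (observation, valid_actions) steps to an agent and collect its actions."""
    chosen = []
    for observation, valid_actions in steps:
        action = agent.act(observation, valid_actions)
        agent.update(observation, action)
        chosen.append(action)
    return chosen
```

Because every agent goes through the same `act` and `update` entry points in this sketch, a new agent only has to subclass the base and fill in its own decision logic, which is what keeps comparisons between agents on equal footing.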

📦 Quick Start

Install: Clone the repository, then run 'pip install -e .' from its root directory.

Configuration: Set your LLM API key (e.g., OPENAI_API_KEY) as an environment variable.

Execution: Run 'python eval.py' to launch an evaluation, or play the game yourself with 'HumanAgent'.
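A missing API key is an easy thing to overlook before a long evaluation run, so a small pre-flight check can help. The sketch below is not part of AgentOdyssey; the helper name `missing_keys` is an assumption made for illustration, and it only inspects environment variables using the standard library.

```python
import os
from typing import Mapping, Sequence


def missing_keys(
    env: Mapping[str, str],
    required: Sequence[str] = ("OPENAI_API_KEY",),
) -> list[str]:
    """Return the required variable names that are unset or blank in env.

    Hypothetical helper for illustration; not provided by AgentOdyssey.
    """
    return [name for name in required if not env.get(name, "").strip()]


def preflight() -> None:
    """Stop with a readable message if a required key is missing."""
    missing = missing_keys(os.environ)
    if missing:
        raise SystemExit(f"Set these environment variables first: {', '.join(missing)}")
    print("API key found; you can now run: python eval.py")
```

Calling `preflight()` in a shell session before 'python eval.py' fails fast with the names of the missing variables instead of failing partway through an evaluation.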

Ready to try AgentOdyssey?

Go to the GitHub page →
