Decoding Claude Mythos: Exploring the Secrets of Recurrent-Depth Transformers with OpenMythos

Unveiling the mystery of "hidden reasoning" and recurrent-depth architectures in LLMs through open-source reconstruction.

🔎 AT A GLANCE

- **Category**: AI Architecture / Research
- **Pricing**: Free (Open Source)
- **Best For**: AI Researchers, LLM Architects, Deep Learning Enthusiasts
- **GitHub Stars**: ⭐ 1274

🚀 Introduction

In the current large language model (LLM) race, there is immense curiosity about how top-tier models like Claude achieve strong logical reasoning. Traditional Transformers rely on stacking many distinct layers of parameters, but is that the only path? OpenMythos advances a bold theoretical hypothesis: such models may employ a "Recurrent-Depth Transformer" (RDT), looping shared weights internally to perform "silent thinking."

🛠️ Key Features

- **Recurrent-Depth Architecture**: Divides the model into a Prelude, a Recurrent Block, and a Coda, increasing effective reasoning depth through weight sharing and looping.
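The Prelude → Recurrent Block → Coda split can be sketched in a few lines of PyTorch. This is a toy illustration under assumed shapes, not the OpenMythos implementation: the three stages here are stand-in linear layers, and "depth" comes from re-applying one shared block.

```python
import torch
import torch.nn as nn

class TinyRecurrentDepth(nn.Module):
    """Toy sketch: prelude -> shared recurrent block (looped) -> coda."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.prelude = nn.Linear(dim, dim)  # maps input into latent space
        self.core = nn.Linear(dim, dim)     # ONE weight set, reused every loop
        self.coda = nn.Linear(dim, dim)     # maps final latent state to output

    def forward(self, x: torch.Tensor, n_loops: int = 4) -> torch.Tensor:
        s = torch.tanh(self.prelude(x))
        for _ in range(n_loops):            # effective depth grows with n_loops,
            s = torch.tanh(self.core(s))    # parameter count does not
        return self.coda(s)

model = TinyRecurrentDepth()
x = torch.randn(2, 32)
shallow = model(x, n_loops=1)
deep = model(x, n_loops=16)                 # 16x the "thinking", same weights
print(shallow.shape, deep.shape)            # both torch.Size([2, 32])
```

The key property the sketch demonstrates: running more loops increases compute spent per input without adding a single parameter.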

- **Switchable Attention Mechanisms**: Supports both MLA (Multi-head Latent Attention) and GQA (Grouped-Query Attention), adapting flexibly to different memory and compute budgets.
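As a rough illustration of why GQA saves memory, here is a minimal sketch (not the OpenMythos code) in which eight query heads share two key/value heads, shrinking the KV cache fourfold:

```python
import torch

def gqa_scores(q: torch.Tensor, k: torch.Tensor, n_kv_heads: int) -> torch.Tensor:
    """Toy grouped-query attention scores: q has more heads than k;
    each KV head is shared by a whole group of query heads."""
    # q: (batch, n_heads, seq, head_dim), k: (batch, n_kv_heads, seq, head_dim)
    group = q.shape[1] // n_kv_heads
    k = k.repeat_interleave(group, dim=1)  # expand KV heads to match query heads
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1)

q = torch.randn(1, 8, 16, 32)  # 8 query heads
k = torch.randn(1, 2, 16, 32)  # only 2 KV heads -> 4x smaller KV cache
attn = gqa_scores(q, k, n_kv_heads=2)
print(attn.shape)              # torch.Size([1, 8, 16, 16])
```

Only the K/V tensors need to be cached during generation, so reducing their head count directly reduces memory traffic.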

- **Sparse Mixture of Experts (MoE)**: Uses an FFN with routed experts and shared experts to improve compute efficiency while expanding model capacity.
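The routed-plus-shared split can be sketched as follows. This is a hypothetical minimal version, assuming top-k softmax routing; the shared expert processes every token while only k of the routed experts fire per token:

```python
import torch
import torch.nn as nn

class TinySparseMoE(nn.Module):
    """Toy sketch: top-k routed experts plus one always-on shared expert."""
    def __init__(self, dim: int = 16, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.shared = nn.Linear(dim, dim)  # shared expert sees every token
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weights, idx = torch.topk(torch.softmax(self.router(x), dim=-1), self.k)
        routed = torch.zeros_like(x)
        for slot in range(self.k):         # only k of n_experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    routed[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return self.shared(x) + routed

moe = TinySparseMoE()
y = moe(torch.randn(8, 16))
print(y.shape)  # torch.Size([8, 16])
```

Capacity scales with the total number of experts, while per-token compute scales only with k plus the shared expert.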

- **Compute-Adaptive Reasoning**: Users can adjust `max_loop_iters` to match task complexity, controlling how long the model "thinks" within a single forward pass.

💡 Tech Highlights

- **Non-Token-Level Reasoning**: Unlike Chain-of-Thought (CoT), OpenMythos's recurrence happens in latent space, performing deep reasoning without emitting intermediate tokens.

- **Injection Mechanism**: Learned parameters $A$ and $B$ continuously re-inject the original input signal into the recurrent block, preventing signal drift across many loop iterations.
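One plausible reading of this mechanism is the update $s_{i+1} = f(A s_i + B e)$, where $e$ is the embedded input and $f$ a nonlinearity. The sketch below is an assumed, simplified form (linear $A$ and $B$, tanh activation), not the project's exact equations:

```python
import torch
import torch.nn as nn

class InjectedLoop(nn.Module):
    """Toy sketch of input injection: each iteration mixes the current latent
    state (via A) with the original embedded input (via B), so the state
    cannot drift away from the prompt over many loops."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.A = nn.Linear(dim, dim, bias=False)  # transforms latent state
        self.B = nn.Linear(dim, dim, bias=False)  # re-injects original input
        self.act = nn.Tanh()

    def forward(self, e: torch.Tensor, n_loops: int = 8) -> torch.Tensor:
        s = torch.zeros_like(e)                   # initial latent state
        for _ in range(n_loops):
            s = self.act(self.A(s) + self.B(e))   # e is injected at every step
        return s

loop = InjectedLoop()
e = torch.randn(2, 32)
s = loop(e, n_loops=8)
print(s.shape)  # torch.Size([2, 32])
```

Because $B e$ appears in every iteration, the input acts as a persistent anchor rather than fading after the first step.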

- **First-Principles Reconstruction**: This project is not an official release but an independent community implementation based on public research literature and theoretical speculation, aimed at exploring frontier AI architectures.

📦 Quick Start

**1. Install the package:**

```bash
pip install open-mythos
```

**2. Basic Implementation:**

```python
import torch
from open_mythos.main import OpenMythos, MythosConfig

# Configuration
cfg = MythosConfig(
    vocab_size=1000,
    dim=256,
    n_heads=8,
    max_loop_iters=4,
    prelude_layers=1,
    coda_layers=1,
    n_experts=8,
    n_shared_experts=1,
    n_experts_per_tok=2,
    expert_dim=64,
    lora_rank=8,
    attn_type="mla",
    n_kv_heads=8,
    kv_lora_rank=32,
    q_lora_rank=64,
    qk_rope_head_dim=16,
    qk_nope_head_dim=16,
    v_head_dim=16,
)

# Initialize the model
model = OpenMythos(cfg)

# Generation
ids = torch.randint(0, cfg.vocab_size, (2, 16))
out = model.generate(ids, max_new_tokens=8, n_loops=8)
print(f"Generated shape: {out.shape}")
```

Ready to try OpenMythos?

Visit the GitHub page →
