Decoding Claude Mythos: Exploring the Secrets of Recurrent-Depth Transformers with OpenMythos

Unveiling the mystery of "hidden reasoning" and recurrent-depth architectures in LLMs through open-source reconstruction.

🔎 AT A GLANCE

Category: AI Architecture / Research
Pricing: Free (Open Source)
Best For: AI Researchers, LLM Architects, Deep Learning Enthusiasts
GitHub Stars: ⭐ 1274

🚀 Introduction

In the current LLM race, there is immense curiosity about how top-tier models like Claude achieve powerful logical reasoning. Traditional Transformers rely on stacking hundreds of distinct layers, but is this the only path? OpenMythos proposes a bold theoretical hypothesis: such models may employ a "Recurrent-Depth Transformer," in which a shared block of weights is applied repeatedly in an internal loop, enabling a form of "silent thinking."
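To make the hypothesis concrete, here is a minimal, illustrative sketch of the recurrent-depth idea: a single shared block is iterated instead of stacking many distinct layers, so "thinking longer" costs more compute but no extra parameters. This is a toy example in NumPy, not code from OpenMythos; the function name `recurrent_depth_forward` and the simple tanh block are assumptions for illustration only.

```python
import numpy as np

def recurrent_depth_forward(x, W, steps=8):
    """Apply one weight-tied block repeatedly (depth via recurrence).

    A standard Transformer would use `steps` distinct layers; here the
    same matrix W is reused every iteration, so increasing `steps`
    deepens the computation without adding parameters.
    """
    h = x
    for _ in range(steps):
        # Toy shared block: linear map + nonlinearity + residual connection.
        h = h + np.tanh(h @ W)
    return h

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(16, 16))  # the single shared weight matrix
x = rng.normal(size=(4, 16))              # a batch of 4 hidden states

shallow = recurrent_depth_forward(x, W, steps=2)
deep = recurrent_depth_forward(x, W, steps=32)
# Identical parameter count in both calls; only the loop depth differs.
print(shallow.shape, deep.shape)
```

The design point the sketch captures is the trade-off the hypothesis relies on: recurrence converts inference-time compute into effective depth, which is one way a model could "reason silently" before emitting a token.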