LoongForge: A Modular Engine for Cross-Modal Large Model Training
🔎 AT A GLANCE

| Category | AI Training Framework |
| --- | --- |
| Pricing | Open Source |
| Best For | Large-scale model pre-training and SFT across heterogeneous hardware |
| GitHub Stars | ⭐ 136 |
🚀 Introduction

LoongForge is part of the Loong open-source series from Baidu's Baige AI infrastructure platform. Built on Megatron-LM with deep optimizations, it delivers an efficient and highly extensible solution covering pre-training, continued pre-training, and supervised fine-tuning (SFT).
🛠️ Key Features

Natively supports LLMs, VLMs, VLAs, and diffusion models, making it easy to add new modalities through flexible component abstractions.

Provides advanced optimizations in parallelism and memory management, significantly reducing training costs and accelerating model development.

Offers native, high-performance support for both NVIDIA GPUs and Kunlun XPUs, ensuring seamless migration and stable training across diverse hardware clusters.
💡 Tech Highlights

A configuration-driven approach assembles VLMs from interchangeable ViT and LLM components.

Different model components (e.g., the vision encoder and the LLM) can be assigned independent tensor/data-parallel sizes for optimal throughput and memory efficiency.

The vision encoder and LLM are split into independent tasks, eliminating pipeline bubbles and preventing ViT computation from blocking LLM throughput.

A load-aware data redistribution algorithm mitigates the data-parallel imbalance caused by sequence packing.
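The load-aware redistribution idea in the last highlight can be pictured as a greedy rebalance: packed samples of unequal token counts are reassigned so that every data-parallel rank carries roughly the same total load. The sketch below is only an illustration of the general technique (longest-processing-time-first scheduling), not LoongForge's actual algorithm; all names are hypothetical.

```python
import heapq

def rebalance(packed_lens, num_ranks):
    """Greedily assign packed sequences (token counts) to data-parallel
    ranks so per-rank total load stays as even as possible.
    Longest-processing-time-first: sort descending, then always hand the
    next sequence to the currently lightest-loaded rank."""
    heap = [(0, r) for r in range(num_ranks)]  # min-heap of (load, rank)
    assignment = [[] for _ in range(num_ranks)]
    for n in sorted(packed_lens, reverse=True):
        load, r = heapq.heappop(heap)
        assignment[r].append(n)
        heapq.heappush(heap, (load + n, r))
    return assignment

# Example: a skewed packed batch spread across 4 data-parallel ranks
lens = [4096, 512, 2048, 1024, 3072, 256, 1536, 2560]
groups = rebalance(lens, 4)
loads = [sum(g) for g in groups]  # per-rank token totals after rebalance
```

With this input, the gap between the heaviest and lightest rank shrinks to a single short sequence, which is the effect the highlight describes for packing-induced imbalance.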
📦 Quick Start

Install dependencies and configure the corresponding hardware environment (NVIDIA GPU or Kunlun XPU).

Define ViT and LLM components via configuration files to assemble the desired multi-modal model architecture.

Configure parallelism strategies (tensor/data parallel) and launch pre-training or SFT tasks.
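Taken together, the steps above amount to composing a model and its parallel layout from configuration. The snippet below is a hypothetical sketch of what such a component-wise config could look like; the keys and values are invented for illustration and are not LoongForge's actual schema. Note how the vision encoder and the LLM receive independent tensor/data-parallel sizes, matching the heterogeneous-parallelism highlight earlier.

```python
# Hypothetical config: assemble a VLM from interchangeable parts and give
# each component its own parallel layout. All keys here are illustrative.
vlm_config = {
    "model": {
        "vision_encoder": {"type": "vit", "hidden_size": 1024, "layers": 24},
        "language_model": {"type": "llama-like", "hidden_size": 4096, "layers": 32},
        "projector": {"type": "mlp"},  # maps vision features into LLM space
    },
    "parallelism": {
        # A smaller ViT typically needs less tensor parallelism than the LLM.
        "vision_encoder": {"tensor_parallel": 1, "data_parallel": 8},
        "language_model": {"tensor_parallel": 4, "data_parallel": 2},
    },
    "training": {"stage": "sft", "micro_batch_size": 1, "global_batch_size": 256},
}

def world_size(layout):
    """A component's process-group size is tensor_parallel x data_parallel."""
    return layout["tensor_parallel"] * layout["data_parallel"]

# In this sketch both layouts must map onto the same 8-GPU cluster.
sizes = {name: world_size(p) for name, p in vlm_config["parallelism"].items()}
```

The design point this illustrates: because each component carries its own parallel layout, a lightweight ViT can run with pure data parallelism while the LLM shards its weights, as long as both layouts resolve to the same total device count.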
Ready to try LoongForge?
Go to the GitHub page →