Fine-tuning | Free | Open Source

XTUNER

A next-generation training engine built for ultra-large MoE models

Apache-2.0

ABOUT

Training ultra-large MoE models traditionally requires complex, hard-to-configure 3D parallelism that scales poorly and demands excessive hardware resources. XTuner solves this by enabling dropless training of 200B-parameter MoE models with minimal expert parallelism, supporting 64k sequence lengths without sequence parallelism, and delivering higher FSDP throughput than conventional 3D parallel approaches on GPUs and Ascend NPUs.

INSTALL
git clone https://github.com/InternLM/xtuner.git
cd xtuner
pip install -e .
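
A quick post-install sanity check (this assumes the package exposes the usual __version__ attribute; adjust if your release differs):

python -c "import xtuner; print(xtuner.__version__)"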

INTEGRATION GUIDE

1. Supervised fine-tuning of LLMs and multimodal models on custom datasets (see the sketch after this list)
2. Pre-training and continual pre-training of ultra-large MoE models up to 1T parameters
3. Reinforcement learning with GRPO for reasoning and agentic capabilities
4. Efficient long-context training (64k+ sequences) for dense and MoE architectures
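
As an illustration of item 1, the sketch below shows a typical supervised fine-tuning flow with the xtuner command-line interface. The config name, the _copy.py suffix produced by copy-cfg, and the --deepspeed flag follow the upstream InternLM/xtuner examples and are assumptions here; substitute a config that ships with your installed version.

# Browse the bundled configs and copy one as a starting point
xtuner list-cfg
xtuner copy-cfg internlm2_chat_7b_qlora_oasst1_e3 .

# Edit the copied config to point at your custom dataset, then launch training
xtuner train ./internlm2_chat_7b_qlora_oasst1_e3_copy.py --deepspeed deepspeed_zero2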

TAGS

llm, moe, fine-tuning, pre-training, reinforcement-learning, multimodal, pytorch, gpu, ascend-npu