Instella
Open 3B language models from AMD
2025-03-10

Instella, from AMD, is a family of high-performance 3B language models. Model weights are released under a ResearchRAIL license and code under the MIT license. Trained on AMD Instinct MI300X GPUs.
Instella, developed by AMD, is a family of fully open 3-billion-parameter language models trained from scratch on AMD Instinct MI300X GPUs. According to AMD, the models outperform existing open models of similar size and are competitive with leading open-weight models such as Llama-3.2-3B and Gemma-2-2B. Training follows a multi-stage pipeline that uses high-quality datasets together with techniques such as FlashAttention-2 and Fully Sharded Data Parallel (FSDP), with the aim of strong natural language understanding and instruction following. AMD releases the model weights, training configurations, datasets, and code to encourage collaboration within the AI community. The release also serves to demonstrate that AMD hardware can train competitive language models while contributing to open-source AI research.
Open Source
Artificial Intelligence
GitHub