QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

Published in Workshop on Sparsity in Large Language Models (SLLM), ICLR 2025

Recommended citation: A. Panferov, J. Chen, S. Tabesh, R. L. Castro, M. Nikdan, D. Alistarh. (2025). "QuEST: Stable Training of LLMs with 1-Bit Weights and Activations." Workshop on Sparsity in Large Language Models (SLLM), ICLR 2025. https://arxiv.org/abs/2502.05003

QuEST pushes quantization further by showing, for the first time, that fully 1-bit weight and activation training is stable for transformer LLMs. It couples fast Hadamard-based normalization with an MSE-optimal fitting scheme and a novel “trust” gradient estimator that explicitly bridges the gap between noisy low-precision gradients and their full-precision counterparts. The result is Pareto-dominant accuracy–size trade-offs, backed by scalable GPU kernels for efficient execution.
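
To give a rough feel for how these ingredients fit together, here is a minimal sketch (not the authors' implementation) in PyTorch: a dense Hadamard rotation, the closed-form MSE-optimal scale for sign quantization, and a masked straight-through backward pass standing in for the trust estimator. The helper names (`hadamard_rotate`, `mse_optimal_scale`, `TrustQuant1Bit`, `trust_threshold`) and the specific masking rule are assumptions made for illustration; the exact formulation is in the paper.

```python
import torch


def hadamard_rotate(x: torch.Tensor) -> torch.Tensor:
    """Rotate the last dimension with an orthonormal Hadamard matrix.

    Built densely here for readability; fast Hadamard-transform kernels
    would be used in practice. Assumes the last dimension is a power of two.
    """
    n = x.shape[-1]
    h = torch.ones(1, 1, device=x.device, dtype=x.dtype)
    while h.shape[0] < n:
        h = torch.cat([torch.cat([h, h], dim=1),
                       torch.cat([h, -h], dim=1)], dim=0)
    return x @ (h / n ** 0.5)


def mse_optimal_scale(x: torch.Tensor) -> torch.Tensor:
    # For 1-bit quantization q = s * sign(x), the scale s minimizing
    # ||x - q||^2 is the mean absolute value of x (closed form).
    return x.abs().mean()


class TrustQuant1Bit(torch.autograd.Function):
    """Toy 1-bit quantizer with a trust-masked straight-through backward pass."""

    @staticmethod
    def forward(ctx, x, trust_threshold=1.0):
        xr = hadamard_rotate(x)        # normalize the value distribution
        scale = mse_optimal_scale(xr)  # MSE-optimal fit of the 1-bit scale
        q = scale * torch.sign(xr)     # quantized levels: +/- scale
        # "Trust" mask (assumed form): pass gradients only where the
        # quantization error is small relative to the scale.
        ctx.save_for_backward(
            ((xr - q).abs() <= trust_threshold * scale).to(x.dtype)
        )
        return q

    @staticmethod
    def backward(ctx, grad_out):
        (trust_mask,) = ctx.saved_tensors
        # Straight-through estimate on trusted coordinates, rotated back
        # through the (symmetric, orthonormal) Hadamard matrix.
        return hadamard_rotate(grad_out * trust_mask), None


if __name__ == "__main__":
    w = torch.randn(8, 16, requires_grad=True)
    y = TrustQuant1Bit.apply(w)
    print(torch.unique(y))          # the two quantized levels
    y.sum().backward()
    print(w.grad.abs().mean())      # nonzero only on trusted coordinates
```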

Access paper here