TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression
Published in Transactions on Machine Learning Research (TMLR), 2023
Recommended citation: D. Kuznedelev†, S. Tabesh†, K. Noorbakhsh†, E. Frantar†, S. Beery, E. Kurtic, D. Alistarh. (2023). "TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression." Transactions on Machine Learning Research (TMLR). https://arxiv.org/abs/2303.14409
TACO shows that large vision backbones (ResNet, ViT, ConvNeXt) can be specialized into efficient subnetworks using only a handful of task-specific samples. Its layer-wise, data-aware pruning-plus-distillation pipeline reduces non-zero parameters by up to 20× and cuts inference latency by 2–5×, while matching or exceeding the parent model's accuracy on narrow downstream tasks such as wildlife or vehicle classification.
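To give a flavor of the idea, here is a minimal NumPy sketch of data-aware, layer-wise pruning on a single layer. It is an illustration only, not TACO's actual criterion: the saliency score (weight magnitude scaled by the mean activation seen on a few calibration samples) and the 5% keep ratio are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single linear layer standing in for one block of a vision backbone.
W = rng.normal(size=(64, 128))

# "Few-shot" calibration set: a handful of task-specific inputs.
X = rng.normal(size=(16, 128))

# Data-aware saliency: weight magnitude scaled by the average input activation.
# This is a simplified stand-in for a layer-wise, data-aware pruning criterion.
act = np.abs(X).mean(axis=0)           # per-input-feature activation scale
saliency = np.abs(W) * act[None, :]

# Keep only the top 5% of weights (roughly a 20x reduction in non-zeros).
k = int(0.05 * W.size)
thresh = np.partition(saliency.ravel(), -k)[-k]
W_pruned = np.where(saliency >= thresh, W, 0.0)

sparsity = 1.0 - np.count_nonzero(W_pruned) / W.size
print(f"non-zero params kept: {np.count_nonzero(W_pruned)} / {W.size}")
print(f"sparsity: {sparsity:.2%}")
```

In the full pipeline, such pruning would be applied layer by layer and followed by distillation from the dense parent model to recover accuracy on the target task.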