TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression

Published in Transactions on Machine Learning Research (TMLR), 2023

Recommended citation: D. Kuznedelev, S. Tabesh, K. Noorbakhsh, E. Frantar, S. Beery, E. Kurtic, D. Alistarh. (2023). "TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression." Transactions on Machine Learning Research (TMLR). https://arxiv.org/abs/2303.14409

TACO shows that large vision backbones (ResNet, ViT, ConvNeXt) can be "specialized" into efficient subnetworks using only a handful of task-specific samples. Its layer-wise, data-aware pruning and distillation pipeline reduces non-zero parameters by up to 20× and cuts inference latency by 2–5×, while matching or exceeding the parent model's accuracy on narrow downstream tasks such as wildlife or vehicle classification.
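To make the idea concrete, below is a minimal, hypothetical sketch of few-shot task-aware compression: collect per-layer input statistics on a few task samples, prune each layer with an activation-weighted magnitude score (a Wanda-style criterion used here for illustration, not necessarily TACO's exact saliency), then fine-tune the sparse model with distillation from the dense teacher. All function names and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not TACO's actual code): layer-wise, data-aware pruning
# calibrated on a few task samples, followed by distillation from the dense model.
import torch
import torch.nn as nn
import torch.nn.functional as F


def collect_input_norms(model, calib_loader, device="cpu"):
    """Record per-input-channel RMS of each Linear/Conv2d input on a few calibration batches."""
    norms, hooks = {}, []

    def make_hook(name, module):
        ch_dim = 1 if isinstance(module, nn.Conv2d) else -1  # input-channel axis

        def hook(mod, inputs, output):
            x = inputs[0].detach()
            if ch_dim != -1:
                x = x.transpose(ch_dim, -1)          # move channels last
            stat = x.reshape(-1, x.shape[-1]).pow(2).mean(dim=0).sqrt()
            norms[name] = norms.get(name, 0) + stat  # sum over batches; only ranking matters
        return hook

    for name, module in model.named_modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            hooks.append(module.register_forward_hook(make_hook(name, module)))

    model.eval()
    with torch.no_grad():
        for images, _ in calib_loader:
            model(images.to(device))
    for h in hooks:
        h.remove()
    return norms


def prune_layerwise(model, norms, sparsity=0.9):
    """Zero low-saliency weights per layer, scoring |w| * input-activation norm."""
    for name, module in model.named_modules():
        if name not in norms:
            continue
        w = module.weight.data
        act = norms[name].to(w.device)
        if act.numel() != w.shape[1]:
            continue  # skip grouped/depthwise convs in this simple sketch
        score = w.abs() * act.view(1, -1, *([1] * (w.dim() - 2)))
        k = int(score.numel() * sparsity)
        if k == 0:
            continue
        threshold = score.flatten().kthvalue(k).values
        module.weight.data.mul_(score > threshold)   # keep only the top (1 - sparsity) weights


def distill(student, teacher, loader, epochs=5, lr=1e-4, temperature=2.0, device="cpu"):
    """Fine-tune the pruned student to match the dense teacher's soft predictions."""
    teacher.eval()
    optimizer = torch.optim.AdamW(student.parameters(), lr=lr)
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():
                t_logits = teacher(images)
            s_logits = student(images)
            kd = F.kl_div(
                F.log_softmax(s_logits / temperature, dim=-1),
                F.softmax(t_logits / temperature, dim=-1),
                reduction="batchmean",
            ) * temperature ** 2
            loss = kd + F.cross_entropy(s_logits, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```

In this sketch, `calib_loader` and `loader` would hold only the handful of task-specific samples mentioned above, and the dense pretrained backbone serves as both the pruning target and the distillation teacher.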

Access paper here