Lan, Shiyi and Yang, Xitong and Yu, Zhiding and Wu, Zuxuan and Alvarez, Jose M. and Anandkumar, Anima (2023) Vision Transformers Are Good Mask Auto-Labelers. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20230316-153757695
PDF - Submitted Version. Creative Commons Attribution. 6MB
Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20230316-153757695
Abstract
We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations. MAL takes box-cropped images as inputs and conditionally generates their mask pseudo-labels. We show that Vision Transformers are good mask auto-labelers. Our method significantly reduces the gap between auto-labeling and human annotation regarding mask quality. Instance segmentation models trained using the MAL-generated masks can nearly match the performance of their fully supervised counterparts, retaining up to 97.4% of the performance of fully supervised models. The best model achieves 44.1% mAP on COCO instance segmentation (test-dev 2017), outperforming state-of-the-art box-supervised methods by significant margins. Qualitative results indicate that masks produced by MAL are, in some cases, even better than human annotations.
Item Type: | Report or Paper (Discussion Paper)
---|---
Related URLs: |
ORCID: |
Additional Information: | Attribution 4.0 International (CC BY 4.0).
Record Number: | CaltechAUTHORS:20230316-153757695
Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20230316-153757695
Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 120089
Collection: | CaltechAUTHORS
Deposited By: | George Porter
Deposited On: | 16 Mar 2023 17:57
Last Modified: | 16 Mar 2023 17:57