Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis

Tseng, Albert and Sun, Jennifer J. and Yue, Yisong (2021) Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis. (Unpublished)

Obtaining annotations for large training sets is expensive, especially in behavior analysis settings where domain knowledge is required for accurate annotations. Weak supervision has been studied to reduce annotation costs by using weak labels from task-level labeling functions to augment ground truth labels. However, domain experts are still needed to hand-craft labeling functions for every studied task. To reduce expert effort, we present AutoSWAP: a framework for automatically synthesizing data-efficient task-level labeling functions. The key to our approach is to efficiently represent expert knowledge in a reusable domain-specific language and domain-level labeling functions, with which we use state-of-the-art program synthesis techniques and a small labeled dataset to generate labeling functions. Additionally, we propose a novel structural diversity cost that allows for direct synthesis of diverse sets of labeling functions with minimal overhead, further improving labeling function data efficiency. We evaluate AutoSWAP in three behavior analysis domains and demonstrate that AutoSWAP outperforms existing approaches using only a fraction of the data. Our results suggest that AutoSWAP is an effective way to automatically generate labeling functions that can significantly reduce expert effort for behavior analysis.
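
As a rough illustration of the weak supervision setting the abstract describes, the sketch below shows how a few labeling functions might vote to produce weak labels for unlabeled frames. It is not AutoSWAP itself and does not reflect the paper's synthesis procedure or diversity cost. The feature names (`distance`, `speed`, `angle`), the thresholds, and every function name are hypothetical, and the labeling functions here are written by hand, which is exactly the step the paper aims to automate.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Sequence

# A frame is a dict of hypothetical per-frame features (assumed for this sketch).
Frame = Dict[str, float]
# A labeling function returns a binary label, or None to abstain.
LabelingFunction = Callable[[Frame], Optional[int]]


def lf_close(frame: Frame) -> Optional[int]:
    """Vote 1 when the two agents are close together (threshold is assumed)."""
    return 1 if frame["distance"] < 2.0 else 0


def lf_fast(frame: Frame) -> Optional[int]:
    """Vote 1 when the agent is moving quickly (threshold is assumed)."""
    return 1 if frame["speed"] > 5.0 else 0


def lf_facing(frame: Frame) -> Optional[int]:
    """Vote 1 when the agent faces the other one, otherwise abstain."""
    return 1 if abs(frame["angle"]) < 0.5 else None


def weak_label(frame: Frame, lfs: Sequence[LabelingFunction]) -> Optional[int]:
    """Combine labeling function votes by majority; abstain on ties or no votes."""
    votes = [v for v in (lf(frame) for lf in lfs) if v is not None]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between labels, so no weak label is produced
    return ranked[0][0]


LFS = [lf_close, lf_fast, lf_facing]

frames = [
    {"distance": 1.0, "speed": 6.0, "angle": 0.1},  # all three vote 1
    {"distance": 5.0, "speed": 1.0, "angle": 2.0},  # two vote 0, one abstains
    {"distance": 1.0, "speed": 1.0, "angle": 2.0},  # one vote each way: tie
]
weak_labels = [weak_label(f, LFS) for f in frames]
```

Frames that receive a weak label could then be added to a small ground-truth set for training a downstream classifier. Majority voting is only one simple way to combine votes and is used here for brevity.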

Item Type: Report or Paper (Discussion Paper)
ORCID:
Sun, Jennifer J.: 0000-0002-0906-6589
Yue, Yisong: 0000-0001-9127-1989
Additional Information: We thank Adith Swaminathan of Microsoft Research and Pietro Perona of Caltech for their invaluable feedback and helpful discussions regarding this work. We also thank Microsoft Research for the compute resources for our experiments. This work is partially supported by NSF Award #1918839 (YY) and NSERC Award #PGSD3-532647-2019 (JJS).
Funding Agency: Natural Sciences and Engineering Research Council of Canada (NSERC)
Grant Number: PGSD3-532647-2019
Record Number: CaltechAUTHORS:20220224-200830238
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 113585
Deposited By: George Porter
Deposited On: 28 Feb 2022 17:19
Last Modified: 28 Feb 2022 17:19