CaltechAUTHORS
  A Caltech Library Service

AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models

Kocielnik, Rafal and Prabhumoye, Shrimai and Zhang, Vivian and Alvarez, R. Michael and Anandkumar, Anima (2023) AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20230316-153717662

PDF - Submitted Version (813kB)
Creative Commons Attribution.

Abstract

Social bias in Pretrained Language Models (PLMs) affects text generation and other downstream NLP tasks. Existing bias testing methods rely predominantly on manual templates or on expensive crowd-sourced data. We propose AutoBiasTest, a novel method that automatically generates sentences for testing bias in PLMs, providing a flexible and low-cost alternative. Our approach uses another PLM for generation and controls the generated sentences by conditioning on social group and attribute terms. We show that the generated sentences are natural and similar to human-produced content in terms of word length and diversity. We illustrate that larger models used for generation produce estimates of social bias with lower variance. We find that our bias scores are well correlated with those obtained from manual templates, but AutoBiasTest highlights biases not captured by these templates because its test sentences are more diverse and realistic. By automating large-scale test sentence generation, we enable better estimation of underlying bias distributions.


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
http://arxiv.org/abs/2302.07371 | arXiv | Discussion Paper
ORCID:
Author | ORCID
Kocielnik, Rafal | 0000-0001-5602-6056
Alvarez, R. Michael | 0000-0002-8113-4451
Anandkumar, Anima | 0000-0002-6974-6797
Additional Information: Attribution 4.0 International (CC BY 4.0). We would like to thank the Caltech SURF program for contributing to the funding of this project. This material is based upon work supported by the National Science Foundation under Grant # 2030859 to the Computing Research Association for the CIFellows Project. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Computing Research Association. Anima Anandkumar is partially supported by the Bren Named Chair Professorship at Caltech and is a paid employee of Nvidia.
Funders:
Funding Agency | Grant Number
Caltech Summer Undergraduate Research Fellowship (SURF) | UNSPECIFIED
NSF | CCF-2030859
Bren Professor of Computing and Mathematical Sciences | UNSPECIFIED
Record Number: CaltechAUTHORS:20230316-153717662
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20230316-153717662
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 120083
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 16 Mar 2023 18:57
Last Modified: 16 Mar 2023 18:57
