
Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

Prabhumoye, Shrimai and Kocielnik, Rafal and Shoeybi, Mohammad and Anandkumar, Anima and Catanzaro, Bryan (2021) Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases. . (Unpublished)

PDF - Submitted Version (Creative Commons Attribution)


Detecting social bias in text is challenging due to nuance, subjectivity, and the difficulty of obtaining good-quality labeled datasets at scale, especially given the evolving nature of social biases and society. To address these challenges, we propose a few-shot instruction-based method for prompting pre-trained language models (LMs). We select a few class-balanced exemplars from a small support repository that are closest, in the embedding space, to the query to be labeled. We then provide the LM with an instruction that consists of this subset of labeled exemplars, the query text to be classified, and a definition of bias, and prompt it to make a decision. We demonstrate that large LMs used in a few-shot context can detect different types of fine-grained biases with accuracy similar, and sometimes superior, to fine-tuned models. We observe that the largest 530B-parameter model is significantly more effective at detecting social bias than smaller models (achieving at least a 13% improvement in the AUC metric over other models). It also maintains a high AUC (dropping less than 2%) when the labeled repository is reduced to as few as 100 samples. Large pretrained language models thus make it easier and quicker to build new bias detectors.
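The pipeline the abstract describes (class-balanced nearest-neighbor exemplar selection, then prompt assembly from a bias definition, labeled exemplars, and the query) can be sketched as below. This is a minimal illustration, not the authors' implementation: the toy bag-of-characters `embed` function, the prompt wording, and the `k_per_class` parameter are all stand-ins; a real system would use a proper sentence encoder and the paper's actual instruction template.

```python
from math import sqrt

def embed(text):
    # Toy bag-of-characters embedding (stand-in for a real sentence encoder).
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(u, v):
    # Cosine similarity between two vectors, guarding against zero norms.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u)) or 1.0
    nv = sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)

def select_exemplars(query, repository, k_per_class=2):
    """Pick, for each class, the k exemplars from the support repository
    closest to the query in embedding space (class-balanced selection)."""
    q = embed(query)
    chosen = []
    for label in sorted({lab for _, lab in repository}):
        ranked = sorted((ex for ex in repository if ex[1] == label),
                        key=lambda ex: cosine(embed(ex[0]), q),
                        reverse=True)
        chosen.extend(ranked[:k_per_class])
    return chosen

def build_prompt(query, exemplars, definition):
    """Assemble the instruction: bias definition, labeled exemplars,
    then the unlabeled query for the LM to complete."""
    lines = [f"Definition: {definition}", ""]
    for text, label in exemplars:
        lines.append(f'Text: "{text}"\nBiased: {label}')
    lines.append(f'Text: "{query}"\nBiased:')
    return "\n".join(lines)
```

The resulting string would be sent to the LM, whose completion after the final `Biased:` serves as the classification decision; because selection is per-class, the prompt stays balanced even when one label dominates the repository.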

Item Type: Report or Paper (Discussion Paper)
Related URLs: Paper
ORCID: Anandkumar, Anima: 0000-0002-6974-6797
Additional Information: Attribution 4.0 International (CC BY 4.0). Warning: this paper contains content that may be offensive or upsetting.
Record Number: CaltechAUTHORS:20220714-224624824
Persistent URL:
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 115599
Deposited By: George Porter
Deposited On: 15 Jul 2022 23:19
Last Modified: 15 Jul 2022 23:19
