Large language models and surveys

Surveys have a long tradition in social science research. They form the basis of much of what we know today about human populations, and they inform policy decisions. For example, the American Community Survey is administered to more than 3 million people in the US every year to gather timely statistics on income, employment, health, transportation, and housing. Composed of a set of well-curated questions in natural language, surveys have recently also gained popularity as a tool to study large language models.
Surveys for alignment research
Inspired by social science research, prompting large language models with survey questions promises insights about the demographics, political opinions, and values represented by current models. Appealing as this methodology is, the resulting insights can be brittle if the way surveys are used does not provide a valid measurement tool for extracting subjective information from language models. We critically examine popular methodologies for prompting language models with survey questions and demonstrate how systematic biases present in LLM responses confound alignment evaluations []. By highlighting important pitfalls, we see our work as a first step towards a more rigorous study of survey methodology for LLMs, taking inspiration from the efforts that went into designing surveys for human populations.
Surveys for benchmarking LLMs in social prediction
In addition to questions, surveys also come with high-quality human reference data. This rich data source offers a valuable basis for evaluating LLMs on social prediction tasks. To make the data more accessible, we have developed a benchmarking suite to test models for accuracy and calibration [] on binary prediction tasks derived from survey data. Unlike traditional knowledge-testing tasks, human outcomes are rarely fully predictable and come with inherent uncertainty. The use of LLMs in this context therefore hinges on largely understudied capabilities such as calibrated uncertainty, a gap in the current evaluation landscape that our work aims to fill.
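To make the two evaluation criteria concrete, here is a minimal Python sketch of how accuracy and calibration could be scored on a binary prediction task. It is not the benchmarking suite itself: the function names, the 0.5 decision threshold, and the equal-width binning used for the expected calibration error are all illustrative assumptions.

```python
from typing import Sequence


def accuracy(probs: Sequence[float], labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """Fraction of examples where the thresholded prediction matches the label.

    `probs` are predicted probabilities of the positive class and `labels`
    are 0/1 outcomes. The 0.5 threshold is an assumption of this sketch.
    """
    correct = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return correct / len(labels)


def expected_calibration_error(probs: Sequence[float], labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """Expected calibration error for binary outcomes (hypothetical helper).

    Predictions are grouped into equal-width probability bins. Within each
    bin, the mean predicted probability is compared with the observed
    fraction of positive labels, and the absolute gaps are averaged with
    weights proportional to bin size.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    total = len(labels)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_positive = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - frac_positive)
    return ece


if __name__ == "__main__":
    # Toy data for illustration only, not real survey outcomes.
    toy_probs = [0.9, 0.8, 0.2, 0.6]
    toy_labels = [1, 1, 0, 0]
    print("accuracy:", accuracy(toy_probs, toy_labels))
    print("ECE:", expected_calibration_error(toy_probs, toy_labels))
```

The two metrics capture different things. A model that answers 0.5 on a task whose outcomes are a genuine coin flip is perfectly calibrated, yet its accuracy cannot exceed chance, which is why both are reported when outcomes carry inherent uncertainty.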