The Chameleon's Limit: Investigating Persona Collapse and Homogenization in Large Language Models

作者: Yunze Xiao, Vivienne J. Zhang, Chenghao Yang, Ningshan Ma, Weihao Xuan, Jen-tse Huang

单位: CMU, UChicago, MIT, 2077.ai, UTokyo, RIKEN AIP, JHU

主分类: cs.CL · 全部: cs.CL

命中关键词: large language model, llm, agent, multi-agent, rag, reasoning

自动分析不可用（claude CLI timeout）。展示原始摘要。

摘要

Applications based on large language models (LLMs), such as multi-agent simulations, require population diversity among agents. We identify a pervasive failure mode we term \emph{Persona Collapse}: agents each assigned a distinct profile nonetheless converge into a narrow behavioral mode, producing a homogeneous simulated population. To quantify persona collapse, we propose a framework that measures how much of the persona space a population occupies (Coverage), how evenly agents spread across it (Uniformity), and how rich the resulting behavioral patterns are (Complexity). Evaluating ten LLMs on personality simulation (BFI-44), moral reasoning, and self-introduction, we observe persona collapse along two axes: (1) Dimensions: a model can appear diverse on one axis yet structurally degenerate on another, and (2) Domains: the same model may collapse the most in personality yet be the most diverse in moral reasoning. Furthermore, item-level diagnostics reveal that behavioral variation tracks coarse demographic stereotypes rather than the fine-grained individual differences specified in each persona. Counter-intuitively, \textbf{the models achieving the highest per-persona fidelity consistently produce the most stereotyped populations}. We release our toolkit and data to support population-level evaluation of LLMs.

论文图表

图 1: Figure 1: Persona collapse in LLM-based population simulation. Although two personas differ across multiple identity dimensions, Qwen3-32B assigns both the same neutral response on a socially sensitive judgment task. At the population level, the most conservative and most liberal persona pools also concentrate on the same Likert rating.

图 1

图 2: (a) Coverage.

图 2

图 3: (b) Uniformity.

图 3

图 4: (c) Complexity.

图 4

图 5: Figure 4: Population-level diagnostics on BFI-44 (10 models, 1,144 personas each). Left : Coverage vs. Complexity (LID). The human reference occupies the upper-right; no model simultaneously achieves high coverage and high complexity. Right : Persona fidelity ( ρ \rho ) vs. trait polarization (Cohen’s d d ). Every model with ρ > 0.9 \rho>0.9 produces d > 6 d>6 , far exceeding the d = 2

图 5