Computational · Paper · 2026

Recovering LLM-Persona Accuracies from Unlabeled Votes

Documentation folder for catalog row 152 · Canonical work page

Folder: papers/2026_RecoveringLLMPersona/

Overview


Algebraic (NTQR) evaluation infers how accurate a group of noisy classifiers was on a finite test using only their responses, with no answer key. We test this end to end on real large language models. Three trader "personas" (optimistic, neutral, pessimistic), instantiated as system prompts, each make a binary bullish/bearish call on the same 64 market scenarios; we run the identical trio through six locally hosted models via Ollama. For each model we recover per-persona, per-label accuracy with ErrorIndependentEvaluation (unsupervised) and compare it against the accuracy scored on the authored ground truth (supervised), which is used only as a check. On the five models whose three judges all varied (mistral:latest, gemma4:latest, gemma3:4b, gemma2:2b, granite4.1:3b), the unsupervised algebra recovered persona accuracies to a mean absolute error of 0.012, within the 0.102 sampling-noise floor across all six per-labe
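The supervised check described above can be illustrated with a minimal sketch. It is not the paper's code and does not call the ntqr package: the function names, the data layout, and the four-scenario toy data are all assumptions made for illustration. It shows the two quantities the overview relies on: the joint vote-pattern counts of the three personas (the only information an answer-key-free evaluation can use) and the per-persona, per-label accuracies computed from ground truth, plus the mean absolute error between two sets of such accuracies.

```python
from collections import Counter
from itertools import product

LABELS = ("bullish", "bearish")


def vote_pattern_counts(votes):
    """Count each joint vote pattern (one label per persona) over all scenarios.

    `votes` is a list of tuples, one tuple per scenario. No ground truth is used.
    """
    n_judges = len(votes[0])
    counts = Counter(votes)
    # Include patterns that never occurred so all 2**n_judges keys are present.
    return {p: counts.get(p, 0) for p in product(LABELS, repeat=n_judges)}


def per_label_accuracy(votes, truth):
    """Supervised check: fraction of correct calls per (judge index, true label)."""
    n_judges = len(votes[0])
    acc = {}
    for label in LABELS:
        idx = [i for i, t in enumerate(truth) if t == label]
        if not idx:
            raise ValueError(f"no scenarios with true label {label!r}")
        for j in range(n_judges):
            correct = sum(votes[i][j] == label for i in idx)
            acc[(j, label)] = correct / len(idx)
    return acc


def mean_absolute_error(estimated, reference):
    """Mean absolute difference over the keys of `reference`."""
    return sum(abs(estimated[k] - reference[k]) for k in reference) / len(reference)


# Hypothetical toy data: 4 scenarios, 3 personas (illustrative only, not paper data).
truth = ["bullish", "bullish", "bearish", "bearish"]
votes = [
    ("bullish", "bullish", "bearish"),
    ("bullish", "bearish", "bearish"),
    ("bearish", "bearish", "bearish"),
    ("bullish", "bearish", "bearish"),
]

patterns = vote_pattern_counts(votes)
supervised = per_label_accuracy(votes, truth)

# Stand-in for an unsupervised estimate: the supervised values with one entry perturbed.
estimate = dict(supervised)
estimate[(0, "bullish")] = 0.75
mae = mean_absolute_error(estimate, supervised)
```

In the workflow the overview describes, `estimate` would instead come from the unsupervised evaluation applied to the vote-pattern counts alone, and the ground truth would enter only through `per_label_accuracy` and the final error comparison.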

Artifacts

Tracked documentation and PDFs served directly from this folder.

PDF Files