Recovering LLM-Persona Accuracies from Unlabeled Votes

Catalog Row: 152

Citation Key: Friedman2026RecoveringLLMPersonaAccuracies152

Paper Folder: Available

DOI: 10.5281/zenodo.20498699

Platform availability

✅ Zenodo
✅ GitHub
⬜ arXiv
⬜ OSF
⬜ HuggingFace
⬜ Software Heritage
⬜ PyPI
✅ Full documentation

Overview

Extracted from the local paper documentation when available.

Algebraic (NTQR) evaluation infers how accurate a group of noisy classifiers was on a finite test using only their responses — no answer key. We test this end to end on real large language models. Three trader "personas" (optimistic, neutral, pessimistic), instantiated as system prompts, each make a binary bullish/bearish call on the same 64 market scenarios; we run the identical trio thr...

algebraic evaluation, NTQR, unsupervised evaluation, evaluation on unlabeled data, LLM-as-judge, error-independent evaluation, ensemble evaluability, constant classifier, AI safety warning light, reproducible research, answer-key-free recovery, local large language models

Use Notes

Concise findings and methods pulled from README/SKILL documentation.

Findings / Concepts

Algebraic (NTQR) evaluation infers how accurate a group of noisy classifiers was on a finite test using only their responses — no answer key.
We test this end to end on real large language models.
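
To make the idea of answer-key-free recovery concrete, here is a minimal Python sketch for three binary voters. It is an illustration only: it is not the NTQR package's API and not necessarily the algebra used in the paper. It uses a standard moment-matching argument, and it assumes (a) the three classifiers' errors are independent given the true label and (b) each classifier votes 1 (say, bullish) more often when the true label is 1 than when it is 0. The function names `pattern_frequencies` and `recover_accuracies` are hypothetical.

```python
from itertools import product
from math import sqrt

PATTERNS = list(product((0, 1), repeat=3))  # all 8 joint votes of three classifiers


def pattern_frequencies(votes):
    """Turn a list of (v1, v2, v3) votes (1 = bullish, 0 = bearish) into
    the relative frequency of each of the 8 joint vote patterns."""
    freq = {pat: 0.0 for pat in PATTERNS}
    for row in votes:
        freq[tuple(row)] += 1.0 / len(votes)
    return freq


def recover_accuracies(freq):
    """Moment-matching recovery under two ASSUMPTIONS: errors are independent
    given the true label, and every classifier votes 1 more often when the
    true label is 1. Returns (prevalence_of_1, [(acc_on_1, acc_on_0), ...])."""
    mean = [sum(f * pat[i] for pat, f in freq.items()) for i in range(3)]

    def central(indices):
        # Mixed central moment of the classifiers listed in `indices`.
        total = 0.0
        for pat, f in freq.items():
            term = f
            for i in indices:
                term *= pat[i] - mean[i]
            total += term
        return total

    c12, c13, c23 = central((0, 1)), central((0, 2)), central((1, 2))
    t = central((0, 1, 2))
    if min(c12, c13, c23) <= 0:
        raise ValueError("pairwise covariances must be positive; a constant or "
                         "anti-correlated classifier breaks this sketch")

    prod_c = c12 * c13 * c23
    q = prod_c / (t * t + 4.0 * prod_c)       # q = p * (1 - p)
    root = sqrt(max(0.0, 1.0 - 4.0 * q))      # |1 - 2p|
    p = (1.0 - root) / 2.0 if t > 0 else (1.0 + root) / 2.0

    # delta_i = P(vote 1 | label 1) - P(vote 1 | label 0), assumed positive
    delta = [sqrt(c12 * c13 / (c23 * q)),
             sqrt(c12 * c23 / (c13 * q)),
             sqrt(c13 * c23 / (c12 * q))]

    accs = []
    for i in range(3):
        vote1_given_1 = mean[i] + (1.0 - p) * delta[i]
        vote1_given_0 = mean[i] - p * delta[i]
        accs.append((vote1_given_1, 1.0 - vote1_given_0))
    return p, accs
```

In use, one would pass the observed votes through `pattern_frequencies` and hand the result to `recover_accuracies`. On a small finite test the independence assumption holds only approximately, so the recovered values can drift and may even fall outside [0, 1]; the sketch raises an error rather than guessing when a pairwise covariance is not positive.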

Methods / Techniques

Software pipeline design
Data-driven analysis

Citation

Plain-text citation for quick reuse.

Friedman, Daniel Ari. 2026. Recovering LLM-Persona Accuracies from Unlabeled Votes. Zenodo. DOI: 10.5281/zenodo.20498699. URL: https://doi.org/10.5281/zenodo.20498699.

