Active Inference · Paper · 2026

On-Policy Distillation as Active Inference in Finite Variational Models

Zenodo

Catalog Row: 171
Citation Key: Friedman2026PolicyDistillationAsActive171
Paper Folder: Available

Overview

Extracted from the local paper documentation when available.

Abstract: This paper formulates on-policy distillation as active inference in finite variational models, with exact claims only for declared objects and interpretive claims explicitly bounded outside them. In the construction, the intractable teacher policy plays the role of the generative model $p(o,s)$, the tractable student policy is the approximate posterior $q(s)$, and the per-token reverse-KL distillation loss is variational free energy up to the evidence constant, $F = D_{\mathrm{KL}}(q\,\|\,p(s\mid o)) - \log p(o)$, whose KL target is the teacher-induced posterior $p(s\mid o)\propto p(o,s)$. The title's "as" is therefore a scoped mathematical correspondence rather than the slogan OPD = Active Inference. Variational free energy names the realized-rollout distillation loss; expected free energy remains the planning-side objective by which the pymdp agent selects actions. On-policy

Keywords: on-policy distillation, active inference, self-distillation, privileged information, free energy principle, reverse KL divergence, pymdp, sophisticated inference
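
The free-energy identity quoted in the abstract, $F = D_{\mathrm{KL}}(q\,\|\,p(s\mid o)) - \log p(o)$, can be checked numerically on a small finite model. The sketch below is illustrative only and is not taken from the paper or from pymdp: the function names and the three-state toy numbers are hypothetical, and it assumes a single fixed observation $o$ with a finite hidden-state space. It computes $F$ directly as $\sum_s q(s)\,[\log q(s) - \log p(o,s)]$ and compares it with the KL-minus-log-evidence form.

```python
import math

def variational_free_energy(q, joint):
    """F = sum_s q(s) * [log q(s) - log p(o, s)] for one fixed observation o."""
    return sum(qs * (math.log(qs) - math.log(ps))
               for qs, ps in zip(q, joint) if qs > 0)

def kl_divergence(q, p):
    """Reverse KL as used in the abstract: D_KL(q || p)."""
    return sum(qs * (math.log(qs) - math.log(ps))
               for qs, ps in zip(q, p) if qs > 0)

# Hypothetical toy values: joint p(o, s) over three hidden states, o held fixed.
joint = [0.10, 0.25, 0.05]
evidence = sum(joint)                       # p(o)
posterior = [p / evidence for p in joint]   # p(s | o), proportional to p(o, s)

# Hypothetical "student" distribution q(s).
q = [0.2, 0.5, 0.3]

F_direct = variational_free_energy(q, joint)
F_decomposed = kl_divergence(q, posterior) - math.log(evidence)

# F evaluated at the exact posterior reduces to the evidence constant -log p(o).
F_at_posterior = variational_free_energy(posterior, joint)
```

Under these assumptions the two forms of $F$ agree to floating-point precision, and $F$ is bounded below by $-\log p(o)$, with equality when $q$ equals the teacher-induced posterior. Minimizing $F$ over $q$ is then the same as minimizing the reverse KL, since $\log p(o)$ does not depend on $q$.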

Use Notes

Concise findings and methods pulled from README/SKILL documentation.

Findings / Concepts
  • on-policy distillation
  • active inference
  • self-distillation
  • privileged information
  • free energy principle
Methods / Techniques
  • Not yet summarized.

Citation

Plain-text citation for quick reuse.

Friedman, Daniel Ari. 2026. On-Policy Distillation as Active Inference in Finite Variational Models. Zenodo.


Related in Active Inference

Other catalogued works in the same domain.