Active Inference Institute

Karl Friston "Active inference and deep temporal models" 23.09.19

2020-07-16 - 1h 44m 00s - 1092 recorded YouTube views in the cached snapshot.

Active Inference

Transcript

I spent the morning at the Kremlin and its museums, and I've never seen so many beautiful things in my life in one day before. So, to thank you, I'm going to show you my most beautiful formulae, my mathematical equations. Now, we're going to spend the next two hours deep in [unclear]. I'm joking. I have been told that not everybody here is a mathematician, so I will be showing you my favorite equations, but I'm using them as pictures to remind me what to say, and just to rehearse the formal structure of the arguments. This lecture is really an invitation for us to think about what the brain is doing, and I'm going to take the perspective that the brain is optimizing something. The question is: what is it optimizing? So

I'm going to ask you to step back from psychology or machine learning or education or artificial intelligence and just ask a simple question: what is it that we want to achieve? What quantity do we want to optimize? I'm going to offer an answer which is indeed borrowed from statistical physics, free energy, but it has a very different meaning and interpretation for us. So what I want to do is to walk through the interpretations and the meanings from the psychological, cognitive and sentient perspective. Further to that, if we understand the mathematical principles of higher brain function, or indeed just perception and action, then it should be possible to shine a new light on message passing in the brain, on the neuronal dynamics, on the anatomy of the brain, the functional anatomy, called

the computational anatomy. So much of my lecture will be in trying to rehearse the general idea in simple terms, but illustrating how far one can take the formal principles in understanding the empirical brain responses that a lot of us measure, say, for example, EEG or electromagnetic brain responses, or indeed just our choices, our decisions and our behavior. So the agenda is ambitious. I'm going to address a very deep and big question of what our brains do. I'll provide a potential answer, but use the answer to illustrate to you how far I can get in simulating and understanding quite high-level, sophisticated behaviors. I want to eventually end with an example of reading and language understanding and its brain response correlates. So that's

where we're going to try and go. If I take too long, you're going to stop me, and then we'll have a conversation, which is usually the best part of these sessions, the questions and answers. So I'm not going to talk for more than an hour, and I look forward to having a conversation about some of these notions. Now, have you all read this? This is what we're going to be talking about. I'm going to introduce the question at hand and then provide this answer in terms of something called active inference, closely related to things like active learning in machine learning or active vision in the visual sciences. This is putting perception in an enactive context: that we are actively perceiving, and we are in charge of the way that we palpate

or sample the world in order to optimize our perceptual synthesis. I'm going to show that mathematically that's just the same as gathering evidence for one's own existence, and I'll explain that connection in about five or six minutes. All of this self-evidencing, this active perception, this active inference, rests upon predicting the world, actually having a model of how the world works and how the world supplies our sensory organs with sensations and stimulation, or outcomes. That's known as a generative model: a model that generates sensory consequences from the causes out there which we want to understand. So I'm going to spend some time illustrating the kinds of generative models that we use to simulate behaviors, either in animal studies or in economic games, and the resulting or ensuing belief updating that we

can then look at as if we were electrophysiologists or brain mappers. That's from the principles to the process theory, so making a distinction between the principles that we're going to be talking about, active inference and self-evidencing, and the neuronal processes that are occurring in your head at the moment, which we can measure as neuroscientists and cognitive scientists. I'll briefly rehearse some empirical predictions. I'll go through this quite quickly because I want to get to the end, which is some of the most advanced applications of these ideas to understanding functional architectures of the brain, and that's the use of deep generative models that have a deep hierarchical structure, both in terms of abstraction but also in terms of time: exactly the sort of models you would need to generate language. And I'm going

to turn that on its head by saying these are the models that we use to understand language. Then I'm going to present some very simple simulations of reading and show that they produce the same electrophysiological responses that we see in classical paradigms like the mismatch negativity or P300 paradigm. So let me just start very abstractly and very simply. Imagine you are an owl: you are a bird of prey and you're hungry. So what are you going to do? [Music] What are you going to do? Search. Perfect. In that one simple answer, search, there is a whole range of hidden truths about what we are trying to optimize. So here's the owl searching, and there's the unhappy prey, about to be eaten by the searching owl. So clearly, if I'm hungry, the first

thing I'm going to do is to resolve my uncertainty about where the prey is, where here the mouse is. It's that notion of searching, of resolving uncertainty, that underwrites everything I'm going to talk about for the next 50 minutes. It's important because there are two ways you can write down (and here are the first of the formulas, but I repeat, please do not worry about the maths; they're just here to remind me about different ways of thinking about things that you can write down in computer code or indeed mathematically) the objectives for living, from the point of view of the owl, or for you, or for me. First of all, we could write down an expression for things that we do, control variables in engineering, which we'll call u at

this point in time, as maximizing the value of some state of the world if I did that thing. Then what I would be able to do is to create a policy π such that, for any given current state, if I apply this action I will move to the next state, and then I will maximize the value of being in that next state. So this is the notion of a value function of states giving rise to a state-action policy. It requires you to believe in and commit to the idea that for every state there is a label which tells you how valuable that state is, and then all you have to do is to choose the action that takes you from this state to the most valuable state. But that's not going

to work to explain searching, for the simple reason that if searching is all about reducing uncertainty, then we know immediately, because uncertainty is an attribute of a belief (it's not an attribute of a thing, it's an attribute of a belief about something), that the function we need to optimize has to be a function of a belief. And if the belief is about something, then that's a function of a function of something. Mathematically, that's called a functional. So what we actually need, the alternative way of doing this, is to say that our best action at this point in time maximizes a functional of beliefs about states, latent states in the world, if I did this action here. I'm going to describe this belief in terms of probability distributions,

that is, probabilities over different states. Again, don't worry about the math; just remember that Q is a probabilistic belief. Furthermore, the notion of searching tells you something else very important. It means that it matters whether I search for my prey and then eat it, or try to eat it and then search, which means that you can't just write down an objective function as a functional of beliefs: you have to define a policy, which is a sequence of actions in a particular order. So now what we have is the notion that there is a best policy that entails a sequence of actions in a particular order. In machine learning or control theory this is sequential policy optimization. For us, all it means is that we know that the best kind of policy is the one that maximizes the sum

of this functional, and I'll say now that this is the free energy functional of beliefs, to give it a name. So these two contrasting ways of writing down formally what things do, what living things do, emerge in many different guises in many different contexts. The notion that you can explain behavior in terms of optimizing a value function, that's the Bellman optimality principle. From this you'll find lots of examples: optimal control theory, dynamic programming, deep reinforcement learning, expected utility theory in economics, backwards induction and so on. The other approach, which is the approach that we are going to pursue, is based upon the principle of least action. Action here is just basically a time integral or a time average of an energy, and I've just said F is a free energy, so this path integral is an

action, and we're going to maximize that action so that we choose the optimum action. This is the free energy principle, also known as active inference. From this, hopefully you can see how things like artificial curiosity and intrinsic motivation in robotics emerge. We can also cast this in terms of Bayesian decision theory and, as I mentioned before, this is an aspect of sequential policy optimization. Now, I've deliberately contrasted classical value functions with free energy functionals to highlight the fundamental difference: these approaches are belief-based and these are not. But I'll actually show that they come back together again, so this becomes this when we remove uncertainty. I'll try to bring us back to expected utility theory in a few slides, but for the moment let's just focus on what this quantity is. And I'm

going to give you one answer, and I'm not going to motivate it. There is a deeper back story here from statistical physics and Bayesian mechanics, but I'm not going to worry about it. I'm just going to tell you what it is and then hopefully convince you it is a suitable objective function by a series of examples, which will be less [unclear]. So here's the basic idea. This quantity is known as a variational free energy. It's an information-theoretic quantity, quite closely related to something invented here, Kolmogorov complexity or algorithmic complexity. It's also known as an evidence lower bound in machine learning. In statistics it's known as log model evidence, also known as marginal likelihood. It's a very, very important quantity which you see in nearly every field. The fact that it's called Bayesian

model evidence gives me license to describe this optimization as self-evidencing, and we will see why that works. From a statistician's point of view, this negative variational free energy here, this log evidence, or evidence lower bound (ELBO) in machine learning, is just the (log) probability of getting these observations, o, at this point in time, given a model of how those outcomes were generated, and I'm going to denote that by m. From a statistician's point of view, you can always write this quantity, the log evidence, as accuracy minus complexity, or the negative log evidence as complexity minus accuracy. What that means is we're going to consider the brain as a statistical organ, an organ that's trying to make inferences in exactly the same way that you

would, as scientists, try to make inferences about differences between one group and another group using a t statistic or an analysis of covariance. The brain is doing exactly the same thing with its sensory data. It's trying to test different hypotheses, different beliefs, about how those sensory data were caused, and it's doing so by maximizing the model evidence, which means that it is trying to find the simplest, minimally complex explanation that provides an accurate account of the sensory data. That's going to be very important: the complexity part is very important. So it's not just finding an accurate account of the data; it has to be parsimonious and simple, in a sense. So this variational free energy is just the mathematical expression of this mixture of complexity and accuracy, and we imagine the brain

just organizes, learns, infers, passes messages, all in the service of minimizing its complexity minus accuracy, or maximizing accuracy while minimizing complexity. If that were the end, then that would be a perfectly suitable and formal account of perception. But what we're seeing here is how the brain goes back and actively samples the data that it can use to infer the causal structure in the outside world, and that's the really interesting part. So we've already said that we have to define the problem in terms of sequences of actions, or policies. What we're going to say is that we're going to select those policies that will minimize the expected free energy after performing that sequence of behaviors. So what that means is we're going to effectively choose policies that minimize

the complexity expected following an action and minimize accuracy, sorry, maximize accuracy, following an action. But notice that the outcomes are now random variables: they haven't yet occurred, they are in the future. So now we have to take an average over things that could happen in the future. So now we're talking about the average complexity, and that turns out to be risk. This is where the equation reminds me to provide the formal definition. Risk is the divergence between my beliefs about what will happen if I pursue this policy and what a priori I prefer. I'll say that again: risk is the divergence, or the difference, between what I think will happen if I do this and what I prefer, a priori, to happen. So here my beliefs about the sorts

of outcomes that I encounter define the sorts of outcomes that I expect to experience: being rich, happy, warm, having my temperature within the physiological range, all the things that make me me, and a good and happy me. They are my a priori, my prior beliefs about the outcomes that I will attain if I pursue this policy, and the goodness of the policy corresponds to minimizing, reducing, the difference between what I think is going to happen and my preferences. Mathematically, that's called risk, and we'll see another instance of that from the economic perspective in a moment. At the same time, I'm going to maximize my expected accuracy. So what would that look like? What does the expected accuracy look like if I haven't actually got the observations

at hand? Well, what it means is that I am going to preferentially choose policies that make the sensory data as unambiguous as possible. So, for example, if I walk into a dark room, I'm going to turn the light on, because that's a policy which means that I can unambiguously resolve the uncertainty about what is causing those sensory impressions: there's a table in front of me, there's a light over there, which I would not be able to see in an ambiguous sensory context. This is a little bit like the joke about the man who is drunk and is searching for his keys, and he's searching under the lamp post or the streetlight. Someone asks: what are you doing? I am searching for my keys. Did you drop

them there? No, I dropped them over there. So why are you searching here? Because I can't see over there. That's a perfectly Bayes-optimal response, and it reflects the fact that part of the drive for our good policies is to find those which minimize ambiguity or maximize expected accuracy. Then, once we've found a good policy, we select an action from that, the action changes states in the world out there, beyond that sensory blanket, and that will then supply new observations. Then we'll redo our perceptual synthesis again, finding the simplest accurate explanation for what's going on, use our beliefs about states of the world to roll out and simulate another future, another policy, select the policy that minimizes the risk and ambiguity, select the action, and so the perception-action or the action

perception cycle continues on and on and on, all in the service of minimizing risk and ambiguity, minimizing expected surprise, or expected free energy, which is just uncertainty in physics. The way you can remember this is by putting these two things together: it's just minimizing uncertainty about the future, where that uncertainty includes preferences, the things that I feel familiar with. So that's the basic story in terms of what this functional of beliefs that we want to optimize is. This is a horrible slide if you don't do maths, but again, please ignore the equations. I just wanted to show you how easy it is to take things away from these equations and end up with a formalism that people have been working with for centuries, or at least decades and decades. So if you are a mathematician

or a physicist, you will now recognize why this quantity is called free energy. There's basically an expected negative log probability here, which is called an entropy, and then this quantity here, an energy; if I put them together, it's basically the difference between an energy and an entropy. So it's the energy available to do work, hence free energy. But just by shifting these things around, or grouping them together in a different way, we can interpret it in terms of complexity and accuracy, as we've just described. So all I'm saying here is that there are different ways of interpreting these quantities, depending upon the words that you use, the constructs that you were taught and that you use in conversation with your colleagues. They're all equally viable. Another nice example of just

switching things around, moving this over here and this over there, means there's another interpretation of this uncertainty-minimizing aspect of good policies, or expected free energy, G here. We can actually carve it, or decompose it, into another pair of quantities called epistemic value and expected value. So let me show you how that works, or more precisely, let me show you how people have already been using these constructs, these quantities, in their work before. If you just focus on these two terms here, what this corresponds to is essentially the expected difference between beliefs about what's going on out there if I had some observation in the future, relative to the beliefs about states of the world without those observations. So what that means is this corresponds to the salience of the policy or

the action. It tells me the amount of uncertainty I am reducing, or the amount of information I would gain, if I looked over there as opposed to looking over here. This quantity has been used a lot in visual neuroscience to understand visual search and saliency maps in terms of the best place to go and sample the world from: it's the place that minimizes your uncertainty or maximizes your information gain, that has salience or epistemic affordance. It's also mathematically exactly the same as the mutual information, or the mutual predictability, or the shared variance, between the causes, states of the world, and the consequences, the outcomes that are generated by those states. So effectively what we're trying to do is to move and palpate our world, either visually with eye movements or literally with our

skin receptors, by feeling, for example, the layout of a new hotel room in the dark. You're testing hypotheses, you're feeling your way around, you're actively sampling those sensations that tell you: ah, no, this is a table, not a bed. But you have to have those hypotheses in mind in order to resolve uncertainty over the hypotheses that you are entertaining, and in doing that you are increasing the mutual information between what you feel and what caused those feelings. So that's a very important aspect of this uncertainty-reducing imperative, this free energy functional of beliefs. Let's make things a little bit simpler. So now I'm doing what I promised before: I'm going to get back to the value function. If you remember, before I said the difference between the value

function and the free energy functional is that one is belief-based and the other is not. I can actually convert the scheme into a value-based scheme by removing uncertainty. So the first source of uncertainty is basically ambiguity. I'm going to assume there are creatures out there that can see every hidden state of the world: there's no sensory noise, there are no hidden states in the world, and what I see with my sensory organs, my observations, my outcomes, are the states. What we are left with is just the divergence, or the difference, between the predicted and preferred outcomes: it's just our risk. So this is risk-sensitive control in economics, also known as KL control, because this is a KL or Kullback-Leibler divergence. So in optimal control theory this is known as KL control; in economics,

risk-sensitive control. What it tells us is that this risk-sensitive control is basically what is left if we remove ambiguity. But let's make the final move and actually take away, sorry, let's make the final move and take away the risk, having taken away the ambiguity. By taking away the risk, what I'm saying is that I am equally uncertain about what will happen whatever I do in the future. If I take that away, we end up with just this term here. So I'll remove this, and now I'm just left with this. So what is this? Well, it's just expected utility. This is what economists use to score the probability of choosing this particular policy in the absence of differential uncertainty, both in terms of

ambiguity but also in terms of risk. There's a deep history to expected utility theory and subjective value, both in economics and of course in reinforcement learning, based on exactly the same idea: there is some reward or loss function that can ignore uncertainty, and then you will see policy or action selection being completely described by this value function. So the purpose of that was really to illustrate how a general belief-based formulation of the thing that we are trying to optimize, namely to maximize the evidence, the Bayesian model evidence, for our models of the world, or to minimize our uncertainty through active palpation of that world, is a generalization of things that we have all been working with for a long time. You get to these special cases if you remove uncertainty, so

that the belief aspect shrinks down to the point where we talk about value functions. Just to make it very clear for those people who haven't come across information gain or epistemic value before, I want to give you an intuitive example of what it means to choose actions that resolve uncertainty even before you know what's going to happen. So imagine you're driving a car and you are looking around; it's night time and you don't have a [unclear]. You're stopped at a traffic light, and there is a filter on this traffic light: it could be pointing right or left. You can choose either to look over here, or you can choose to look exactly at the sign. Now, if you're driving along, you're going to have a

50-50 belief, a prior belief before looking over here, about whether the sign is pointing to the left or to the right. If you look over here, then you're not going to change that belief: it doesn't matter whether the sign is pointing to the right or to the left, looking over here won't resolve any uncertainty. You will have a 50-50 posterior belief about whether the sign is pointing in this direction or in that direction. So this is an example of a policy that has no epistemic value at all. Contrast that with a situation where you're looking directly at the sign, resolving your ambiguity and getting very precise sensory information. Now, if the sign is pointing in this direction, then you will be 100% certain it's pointing in this direction; if

the sign is pointing in that direction, then you will be 100% certain it's pointing in that direction. And you know that, and you've known that before you even made an eye movement. So that's basically what this KL divergence, this epistemic affordance, this salience, is: a fundamental epistemic imperative, uncertainty-reducing and self-evidencing. So that's all the hard work done. Now I'm just going to show some pretty examples and simulations to show what it looks like in simulations, and hopefully convince you that you've seen a lot of this phenomenology in papers, indeed possibly in your own research. I should say, I made a joke about the mathematics and the formulae before, and perhaps I should excuse why we use the mathematics so much. There's a simple reason, which is that if you really

want to understand something, as Richard Feynman said, you have to be able to build one. To be able to build a little creature that does the paradigms that we ask our subjects or our experimental animals to do, you need to be able to write down the software, and you need to be able to write down the mathematical formulae upon which the software is based. So that's our motivation: it's really to create little in silico creatures that we can expose to the same experimental paradigms or lesions as real creatures, and then see what this belief updating, this active inference, this self-evidencing, looks like. Then we'll see if we can see the same kind of empirical phenomenology in real creatures. So that's the excuse, or the motivation, for it, but it

does require you to commit to very particular and well-specified models, generative models, that are used by creatures, by living systems, to explain their paradigm or their world. A very general one (again, please ignore the equations; I'm going to walk you through the graphics of the model) is called a Markov decision process. With this single model we have modeled an entire range of different kinds of behavior, ranging from curiosity to problem solving. So many different kinds of paradigms can be modeled with this one very general generative model. I'm just going to briefly take you through one worked example, just to give you a feel for the simplicity, in case in your future work you want to use these kinds of models to simulate your paradigms.
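
[Editorial sketch] The traffic-sign example from a few minutes earlier can be worked through numerically. What follows is a minimal sketch and not the speaker's own code. It assumes two hidden states (the sign points left or right), two outcomes (seeing a left or a right arrow), a flat 50-50 prior, flat preferences over outcomes, and it models "looking away" as a completely uninformative 50-50 likelihood. All names (`A_SIGN`, `A_AWAY`, `PRIOR`, `PREFS`, `expected_free_energy` and so on) are hypothetical. It computes expected free energy as risk plus ambiguity and, separately, the epistemic value (expected information gain) of each policy.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kl(q, p):
    """KL divergence KL[q || p] in nats; assumes p > 0 wherever q > 0."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)

def predicted_outcomes(A, qs):
    """Q(o) = sum_s P(o|s) Q(s), where A[o][s] = P(o|s)."""
    return [sum(row[s] * qs[s] for s in range(len(qs))) for row in A]

def risk(A, qs, prefs):
    """Divergence between predicted outcomes and preferred outcomes."""
    return kl(predicted_outcomes(A, qs), prefs)

def ambiguity(A, qs):
    """Expected entropy of outcomes given states, averaged over Q(s)."""
    return sum(qs[s] * entropy([row[s] for row in A]) for s in range(len(qs)))

def expected_free_energy(A, qs, prefs):
    """G = risk + ambiguity (lower is better)."""
    return risk(A, qs, prefs) + ambiguity(A, qs)

def epistemic_value(A, qs):
    """Expected KL from prior to posterior beliefs about states."""
    gain = 0.0
    for o, po in enumerate(predicted_outcomes(A, qs)):
        if po > 0:
            posterior = [A[o][s] * qs[s] / po for s in range(len(qs))]
            gain += po * kl(posterior, qs)
    return gain

# Assumed toy numbers (not from the lecture slides).
PRIOR = [0.5, 0.5]                  # sign points left / right
PREFS = [0.5, 0.5]                  # no preference between outcomes
A_SIGN = [[1.0, 0.0], [0.0, 1.0]]   # looking at the sign: precise mapping
A_AWAY = [[0.5, 0.5], [0.5, 0.5]]   # looking away: uninformative mapping

for name, A in [("look at sign", A_SIGN), ("look away", A_AWAY)]:
    G = expected_free_energy(A, PRIOR, PREFS)
    info = epistemic_value(A, PRIOR)
    print(f"{name}: G = {G:.3f} nats, epistemic value = {info:.3f} nats")
```

Under these assumptions, looking at the sign has an epistemic value of ln 2 (about 0.693 nats) and zero expected free energy, while looking away has zero epistemic value and an expected free energy of ln 2, which is the sense in which the second policy "has no epistemic value at all". With flat preferences the risk is zero for both policies, so the whole difference is carried by ambiguity.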

So the idea is that we need to generate outcomes, things that a creature, or something like me, would be able to observe: see, hear, feel. These outcomes, we're going to say, are generated by hidden states of the world. They are hidden because they are not directly observable: we can only observe our sensations, we can't directly see the causes of those sensations. So they're often referred to as either latent or hidden states. And they have a narrative, they have dynamics: there are successions and transitions of hidden states. So if I'm in a particular state at this point in time, I will probably be in this state at the next point in time, and so on and so forth. So we have this cascade of hidden states, each state generating

an outcome as time ticks along. The mapping between the hidden states and the outcomes is just a likelihood mapping: it's the probability of getting this outcome given this particular state of the world. That's usually denoted by an A matrix. Now, clearly, the way that the world unfolds, the way these hidden states unfold, depends upon how I act upon it. It matters: the hidden state of where my eye is pointing determines what I actually see. So we imagine that some states of the world depend upon policies. In other words, the transitions between this state of the world and the state of the world at the next time, encoded by a probability transition matrix B, are a function of policy. And we've just said that policies are determined by our prior preferences, and we denote

the prior preferences by C, for cost function. The policies will have some precision, or confidence, associated with them, which we denote by gamma. I won't talk about that very much, but it is interesting because it looks very much like dopamine responses in your brain, and I'll show you a brief example of that later. Finally, I just have to specify beliefs about the initial states and a few hyperparameters that statistically parameterize it. And with that model I can model... Furthermore, I make some simplifying assumptions about my beliefs about all the unknowns, the key unknowns of course being the hidden states of the world generating outcomes and, crucially, the policies. So these are the two things I need to infer, to form good beliefs about, by minimizing, that is, by

optimizing that free energy, or that free energy bound. I'm also going to think about optimizing the confidence in my policies. But the important thing here: notice that I'm recasting behavior in terms of forming beliefs about what I could do, and then just selecting an action from the most likely policy, the policy that I think I am most likely to be pursuing at the moment. It's sometimes known as planning as inference, so it's casting action as a process of inference. So if you subscribe to this formulation, the choice or the selection of what to do next is an act of inference: it's inferring that this is the most likely thing that something like me would do in this belief state. So I'm going to be trying to resolve that kind of uncertainty over the [unclear]

or preferences. You can write down all of these particular parameterizations if you then just apply something called a mean field approximation and parameterize these beliefs. From my perspective the results are remarkably simple and, more importantly, look very similar to the dynamics and belief updating that have been seen in simplified versions of [unclear]. So remember, before I said there are just three things we don't know in this model: the states of the world, the policies π, and the precision or confidence placed in those policies. It turns out that the solutions that optimize that free energy functional can be expressed here as a nonlinear function of linear mixtures of beliefs about the past and the future, and observations now. This starts to look very

much like a very simple neural network model: a sigmoid activation function operating on linear mixtures of activities elsewhere in the brain. It's a very, very simple expression that now starts to provide a metaphor for neuronal responses to the past and the future. So written into this generative model is an elemental form of memory and prospection, postdiction and prediction: memory for the past and for the future. So there's a sense of time and progression implicit in this update. If we look at policy selection, it's simply a softmax function of this expected free energy, the goodness of a policy, and this is the classic softmax response rule used in economics and much of reinforcement learning. Interestingly, the confidence, one over gamma here, looks as though it's updated according to something that's very

similar to a reward prediction error, which takes us off in a very different direction, in the direction of dopamine and its relationship with reward prediction errors. But I want to move on to the different parameters of the model itself. Again it's very nice, because the update rules and solutions that optimize this free energy functional look exactly like that. So you have these [unclear] associated here; this is the D one, but if we were looking at the A one we would see the outcomes and the states coming together as a product, and we just accumulate evidence by building connection strengths that then decay again as a function of time. And then finally we have our action selection here, and with these very simple rules you can start to

engineer or propose a very crude or coarse functional anatomy. Observations come in, say to visual cortex; they are used to update beliefs about states in the world, say in the hippocampus; these beliefs are then used to evaluate the goodness of a policy in terms of the expected free energy and ambiguity, in the frontal cortex and the basal ganglia, the cortico-thalamic loops, where the confidence in these policies may be mediated by dopamine, say from the ventral tegmental area, and that then generates the next action, which changes the world; the world gives you the next observation, and so the cycle continues. That's a very crude, relatively simple understanding of the computational machinery that you get to. So I'll close with two examples: one very simple example of foraging in

a maze, and then we'll come to a brief survey of advances in models of this sort, trying to understand things like language comprehension and reading. So this is an example; I don't need to go through it in detail. In brief, what we have is a little rat or mouse in a T-maze, and it likes rewards, and it can make two moves, and it doesn't know whether the reward is on the right or on the left. What it also knows, though, is that there is an instructional cue at the bottom arm of the maze that tells it whether the reward is on the left or the right. So that presents an interesting choice for this rat: it can go to one arm, and once

it goes to one of the two rewarding arms it has to stay there, which means that it can go there and get two rewards 50 percent of the time, or it can go there and get nothing 50 percent of the time; or it can go down here to find out where the reward is and then get the reward with 100 percent probability, but for half the time. So the expected value is exactly the same, but by going to the instructional cue you can immediately reduce the uncertainty, which means that if this mouse was optimizing its expected free energy, minimizing its uncertainty, that is where it should go. We can simulate those equations on the previous slide with the generative model which I've written down here, appropriate for this paradigm.
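The arithmetic behind this choice can be sketched in a few lines. This is a minimal illustration rather than the simulation shown in the talk: the policy names, the assumption of one unit of reward per move spent at the rewarded arm, and the use of Shannon entropy as the uncertainty resolved by a fully reliable cue are all assumptions made for the example.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

# Assumed prior belief about where the reward is: (left, right).
prior = [0.5, 0.5]

# Hypothetical policy "direct": commit to the left arm on the first move and
# stay for both moves. It is rewarded on both moves with probability 0.5,
# and on neither move otherwise.
expected_reward_direct = prior[0] * 2 + prior[1] * 0

# Hypothetical policy "cue_first": spend the first move on the cue, then go
# to the indicated arm, which is rewarded with certainty for the one
# remaining move.
expected_reward_cue_first = 1.0 * 1

# Uncertainty about the reward location that is resolved before committing
# to an arm. The cue is assumed fully reliable, so it removes all of the
# prior entropy; going direct resolves nothing beforehand.
info_gain_direct = 0.0
info_gain_cue_first = entropy(prior)

print(expected_reward_direct, expected_reward_cue_first)
print(info_gain_direct, info_gain_cue_first)
```

Under these assumptions the two policies tie on expected reward, so any preference between them has to come from the information term.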

It should go and get the cue and then go and get its reward, and indeed that's what it does. What I'm showing here is behavior over 32 trials, in terms of where the reward was, whether it was on the left or the right, the policy that it chose, the outcome, the amount of reward, and beliefs about whether the reward is on the right or the left in terms of the initial states. What I've done here is, after switching the reward for the first couple of presentations, I then left the reward on the left-hand side, and I want to see what's going to happen. So initially, as we might anticipate, the mouse goes and finds the epistemic cue, resolves its uncertainty about what to do, and

then indulges in its risk-minimizing behavior: it chooses the pragmatic, preferred option by going straight to the reward, and is as happy as it could be. However, as time goes on it learns that in fact the reward is over there all the time, so now the epistemic value of that instructional cue gets less and less and less; it's resolving less and less uncertainty, because it's increasingly certain that the reward is on the left-hand side, due to its experience-dependent learning about these initial states through evidence accumulation under Hebbian-style plasticity. So at one point it changes its preferred policy and jumps to a pragmatic, exploitative policy. So we've got this natural progression from exploration to exploitation that is purely a reflection of the fact that we are using a belief-based functional.
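The progression from exploration to exploitation can be sketched as follows. This is an illustrative toy, not the scheme used in the talk: the scoring of each policy as expected reward plus information gain on a common scale, the count-based belief about reward location, the precision value `gamma`, and the choice of always placing the reward on the left are all assumptions of the sketch.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def softmax(values):
    """Numerically stable softmax."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def policy_scores(belief):
    """Assumed 'goodness' of each policy under the current belief."""
    # Go straight to the likelier arm and stay there for two moves.
    direct = 2 * max(belief)
    # One certain reward, plus the uncertainty a reliable cue would resolve.
    cue_first = 1.0 + entropy(belief)
    return {"cue_first": cue_first, "direct": direct}

def simulate(n_trials=32, gamma=4.0):
    """Return the most probable policy on each trial."""
    counts = [1.0, 1.0]  # accumulated evidence for reward on (left, right)
    choices = []
    for _ in range(n_trials):
        total = sum(counts)
        belief = [c / total for c in counts]
        scores = policy_scores(belief)
        names = list(scores)
        # Policy probabilities are a softmax of precision-weighted goodness.
        probs = softmax([gamma * scores[name] for name in names])
        choices.append(names[probs.index(max(probs))])
        counts[0] += 1.0  # reward is always on the left in this sketch
    return choices

choices = simulate()
print(choices)
```

As the counts concentrate on one side, the entropy term shrinks and the directly rewarding policy overtakes the cue-seeking one, without any explicit switch being programmed in.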

The goodness of the thing to choose depends upon my beliefs and my uncertainty, [unclear]; it depends upon the need for search, and of course in a very familiar environment there isn't any. This slide just summarizes that, and it's exactly the same point: basically, learning underwrites confident policy selection, and that confidence is reflected in this precision parameter. These are the trial-by-trial updates of this parameter here; it looks almost exactly like [unclear]. We can also look at the simulations of these updates during a particular trial: it sees a cue, it makes a move, it sees this, it makes another move, because as soon as it sees anything it has to iterate these equations in order to find the Bayes-optimal solution, the

self-evidencing solution, and that looks a lot like the event-related potentials in electrophysiological research. And interestingly, what happens is that there's less belief updating when it's more familiar with and confident about the environment: you get an attenuation of these responses but an increase in the confidence, because it knows exactly what's going to happen, and indeed what it expects to happen does indeed happen, and it goes and exploits its knowledge about where the reward is. So using this, for example, you can tell all sorts of stories. You can tell a story about the representation of the future and the past: this is the beginning of the trial, the first and second move, and of course, as these beliefs are updated, what were beliefs about the future consequences of action now become the memories

of outcomes. There's an interesting shift of temporal frames of reference, which means that what were predictions become postdictions; that becomes very interesting when you accumulate them from trial to trial. It also allows you to think about the responses to the things that you will ultimately choose, as opposed to things that you are not going to choose. There is a nice literature of empirical papers showing exactly this form of [unclear] divergence as time progresses, in terms of selective responses shown by these expectation-encoding simulated neuronal populations, which mirrors or reflects exactly the empirical results. And we can even plot these responses as a function of where the mouse is, and what emerges from this kind of architecture are place cells that sometimes are very

unambiguous, for example at the two rewarding locations, and sometimes a little bit more ambiguous. We can also perform simulated oddball experiments, mismatch negativity experiments. So these are the same results I showed you before, but I'm now telling a different story about them, using a different language, as if I were an electrophysiologist doing oddball paradigms. So what I'm going to do is look at when the mouse sees the same stimulus when it's familiar with it and when it's not familiar with it. It makes the same response and selects the same policy, so the only thing that's different between the perception and the action is the beliefs that it has accumulated through experience. And if I associate this with a standard stimulus and this with a novel or oddball stimulus, then indeed we can reproduce the phenomenology

of this mismatch negativity. We can do the same thing with those simulated green responses and show a classical phenomenon in single-unit electrophysiology of dopamine cells, namely a transfer of phasic responses from the rewarding cue, the unconditioned stimulus, to the instructional, epistemic cue, which you can think of here as the conditioned stimulus. So again nothing has changed, other than that I've told a slightly different story about the results that emerge from this simulated rat, and all of them lend a degree of construct validity to the overall thesis that everything is in the service of self-evidencing: maximizing evidence for my generative model of the world, and selecting actions that minimize uncertainty, namely this ambiguity. So I'm going to finish now with a very quick run-through of exactly the same technology and ideas that

apply to slightly more sophisticated generative models, of the sort that people might use to understand language and to generate language. Now I like this graphic; it's a nice graphic, because once you've written down formally what you think is driving message passing, belief updating and behavior, you can now just extend that formalism by generalizing it to hierarchical structures, and when you do that you start to see lots of emergent behaviors that now look a lot more like the kind of behaviors that psychologists study in human beings. So what we've done here is take our standard little MDP model, with states ticking over time, one, two, generating an outcome here through the likelihood matrix, with transitions encoded by the B matrix that depend on some policies pi, which are themselves informed by the

expected free energy G of those policies. What we've done is put another one of these on top. Crucially, it operates at a slower time scale, so the outcomes from the process at the higher hierarchical level now cause things that don't change on the faster time scale. There are lots of things we could have chosen: we could have determined the likelihood matrices, or we could have chosen the likelihood of a particular policy; we've actually chosen here just the initial state. It really means that the outcomes, from the point of view of the generative model at the higher level, are in play for the duration of the state transitions at the lower level. And you can imagine putting a faster level below this, and a faster and faster one, so you're writing in, you're

baking into your generative model not only a hierarchical depth of abstraction but also a deep diachronic or temporal depth, a separation of temporal scales over time. And of course that's what we need to understand language: I would have a representation of a sentence or a phrase at one level, and that's the same sentence or phrase from the beginning of the first phoneme to the end of the last phoneme; it's the same object. But at a faster time scale the current word will change, yet this current word is the same word from the beginning of the word's first time-frequency glide, or phoneme, possibly through to the last one. As we keep going lower and lower and lower, we now generate faster and faster dynamics using this kind of model.
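The separation of time scales can be sketched with a two-level generator in which the slow level only sets the starting point of the fast level. This is a minimal illustration under stated assumptions: the example sentence, the use of letters as a stand-in for phonemes, and the function name `generate` are hypothetical and do not come from the talk.

```python
def generate(sentence):
    """Return a trace of (slow_step, fast_step, word, letter) tuples.

    The slow (higher) level makes one transition per word. Each word hands
    the fast (lower) level its initial state, and the fast level then makes
    one transition per letter while the word itself stays fixed.
    """
    trace = []
    for slow_step, word in enumerate(sentence):   # slow time scale
        position = 0                              # initial state set from above
        while position < len(word):               # fast time scale
            trace.append((slow_step, position, word, word[position]))
            position += 1
    return trace

# Hypothetical example sentence, used only for illustration.
trace = generate(["the", "cat", "sat"])
for row in trace:
    print(row)
```

In the printed trace the word column changes three times while the letter column changes nine times, which is the sense in which the higher-level state is "in play" for the duration of the lower-level transitions.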

So you may be asking what these equations say, and that's all these equations say: they're just a hierarchical generalization of the first-level model we used for the rat into this deep diachronic structure. This is exactly the same model, and the reason I show this, and the reason I like this model, is that you can generate this graph automatically from this graph, and this graph is known as a factor graph. Now it may not mean very much to psychologists, but if you're a computer scientist and you want to design the message passing in the most efficient way, this is the design. So what this figure says is that if you can write down the form of your generative model, you have automatically written down the message passing graph, and at some level

a brain must be understood in terms of connections and passing messages over these connections. For those of you who are interested, these factor graphs are interesting because they place the variables on the edges and the probability distributions at the nodes; that's why they're called factor graphs, so the probability distributions are the factors of the [unclear] of the marginals. You can forget that if it wasn't interesting. What is interesting to remember is that you can generate these things automatically, and once you've done that you can start to make little brains in software, which could actually be analog systems or very-large-scale integrated silicon chips, for example, or classical computers, Turing computers, anyway. So once you write down the factor graph and you learn the architecture of the message passing, you can actually

go to the neural network and ask where the isomorphisms are in terms of the structure, the dynamics and the temporal scheduling of those messages in real brains. It's an interesting jigsaw problem to solve, but there are lots of immediate parallels that become evident when you just look at the factor graph and you look at textbook neuroanatomy. I've just sketched out some ideas here; we don't need to talk about this. The point is that there's a very interesting opportunity once you understand what has to be the computational anatomy. If you commit to this generative model approach and the self-evidencing formulation of belief updating and behavior, then you've got the necessary computational anatomy, and you've got the empirical neuroanatomy that we presume does this computation, so now you can

start to look for parallels and assign different roles to different parts of the brain. So for example, in this instance it looks as if the globus pallidus contains representations of policies, just on the basis of the connectivity in reference to that factor graph on the previous slide. But let me finish just by taking you through, conceptually, a generative model of a simulated agent that is doing a very simple form of pictographic reading. So here there are no letters, but there are little icons, and the position of these icons defines a particular word. Each word, if you like, comprises two icons; these could be seeds, a bird or a cat. If the cat is next to the bird, that means the bird will flee, so that's the

word flee. If the seeds are next to a bird, that means the word feed, where the bird can feed on the seeds. If, however, there's nothing next to the bird, so the seeds are down here in the diagonal corner, that just means wait. So it's a very simple little language that we've arbitrarily invented, and the reason for using this pictographic form is that the agent has to decide where to look: if it wants to read this word, to look at the letters in the word, it has to decide where to look. It can look over here, over here, over here, or over here, as denoted by the locations one, two, three and four. From the point of view of the generative model, what does that mean? These are

the states, the hidden states, that you would need to generate an outcome. So what would be essentially a sensory outcome would be either where I am looking, at positions one, two, three or four, or what I'm actually sampling or foveating at the time, which can be either nothing, seeds, a bird or a cat. But to generate those outcomes I have to know the configuration of these pictographic letters and where I am looking, and I've introduced another hidden state here, which is flipping, really like presenting words in upper or lower case. So with those three causes I can generate any particular outcome in a visual modality and in a proprioceptive modality, feeling where I am currently pointing my eyes, and because I have written down the generative model I can use that standard message passing scheme of

the previous slide to simulate inference. I can simulate what this sort of creature would do in terms of foraging for information, by trying to understand what word it is looking at. But what I really want to do is to do that with some deep temporal structure, some diachronic aspect: I want it to actually remember the words it sees, and, from the point of view of the generative model, to generate sentences or sequences of words, or, from the point of view of self-evidencing, of inverting the generative model, to recognize what this word is in the context of beliefs about the sentence. So to do that I now have to put together four words, or four different pages if you like, and these four words are basically sequences of these three words here, for example flee, wait, feed. You could classify

those with an even higher level, which we actually don't have, so there's nothing to show here. So now, if I know the sentence, say sequence one, flee, wait, feed, wait, and I know where we are in terms of which page we're looking at, I can generate the word; and having generated the word, if I also have information about where I am looking and whether it is flipped or not, I can generate the outcome. And if I can generate the outcome, that means I can invert the model given the outcomes to make inferences and have beliefs about the sentence. So I'm going to take that factor graph, that computational architecture, and this generative model, and then simulate reading in terms of where this agent looks to try and accumulate evidence and build posterior beliefs about

the sentence it is reading, and the actual sentence that it is reading is flee, wait, feed, wait. These are the expectations at the lowest level about what is actually there, and the key point made in these simulations is that you can have very precise beliefs about what you would see if you looked over there, even though you never actually looked there, and these precise beliefs come from this deep structure. So for example you can see that for the second word at no point does it actually sample a stimulus that is either seeds or a bird; it sees nothing on either sample, and yet it knows, because it knows what could possibly happen in terms of the alternative structures of the sentence, that the solution has to be

seeds up here and a bird up here, and indeed its posterior belief is hallucinating, effectively, in a very positive and Bayes-optimal way, the existence of these percepts even though it never looked there. And you see evidence of that if you look in detail at this sequence of eye movements: as they form beliefs, they resolve uncertainty; they respond to epistemic affordance and go and get the next sort of information that will resolve uncertainty about what this agent is looking at. I show the same results here in a different format, just to emphasize the separation of temporal scales. So these are beliefs at the higher level about which one of six sentences this synthetic agent was looking at, and notice that it's only right at the end that it actually resolves its uncertainty about the sentence.
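The late resolution of sentence-level uncertainty can be sketched as a word-by-word Bayesian update over candidate sentences. This is a hedged illustration only: apart from the sentence actually read (flee, wait, feed, wait), the five alternative sentences, the likelihood values `P_MATCH` and `P_MISMATCH`, and the function names are hypothetical and are not the set used in the simulations described.

```python
# Six candidate four-word sentences. Only the first is taken from the talk;
# the other five are invented so that three sentences share their first
# three words and differ only in the last.
SENTENCES = [
    ["flee", "wait", "feed", "wait"],
    ["flee", "wait", "feed", "flee"],
    ["flee", "wait", "feed", "feed"],
    ["wait", "wait", "wait", "feed"],
    ["wait", "flee", "wait", "feed"],
    ["feed", "wait", "flee", "wait"],
]

# Assumed likelihood of recognizing a word given the sentence's true word,
# over a three-word vocabulary (0.98 + 2 * 0.01 = 1).
P_MATCH, P_MISMATCH = 0.98, 0.01

def update(belief, position, word):
    """One Bayesian update of the belief over sentences given one word."""
    weighted = [
        b * (P_MATCH if sentence[position] == word else P_MISMATCH)
        for b, sentence in zip(belief, SENTENCES)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]

def read(words):
    """Return the posterior over sentences after each successive word."""
    belief = [1 / len(SENTENCES)] * len(SENTENCES)
    history = []
    for position, word in enumerate(words):
        belief = update(belief, position, word)
        history.append(belief)
    return history

history = read(SENTENCES[0])
for step, belief in enumerate(history, start=1):
    print(step, [round(b, 3) for b in belief])
```

With this hypothetical set, the first word narrows the belief to three sentences, and that three-way ambiguity persists until the final word is read.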

It resolves so late because these three sentences share everything apart from the last word. So at a glance, immediately, you know you must be looking at one of these sentences, because the first word is unique to these three sentences, but the last word is ambiguous. So this uncertainty evolves slowly and is maintained, only to be resolved by the last word. In contrast, the beliefs about what particular word I am currently looking at develop much, much more quickly: these converge to a particular posterior belief, and then that is evidence for the higher, sentence-based belief, and then we start again as we get a new outcome, a different visual impression. I've tried to indicate that in terms of this separation of temporal scales, dictated by, and only by, the belief updating. I just want to make

a point, the point again, that what you actually see in these synthetic creatures is very similar to what you see as an electrophysiologist. So for example pre-saccadic delay-period activity in the prefrontal cortex, shown here in raster format, looks very similar to the kind of responses observed in some of the recordings, shown here in terms of a bar chart. Furthermore, if we look at the deflections associated with the belief updating on a stimulus-by-stimulus basis, we see something that looks very much like the perisaccadic evoked potentials during active vision in monkeys. I just want to close by pursuing that, by focusing on the responses to stimuli towards the end of the sentence. I'm going to play a trick on this little creature: I'm going to introduce a violation, two violations of a different

sort, in the hope of reproducing canonical responses in cognitive neuroscience, namely a pre-attentive response like the mismatch negativity at about 100 270 milliseconds, and then a reorienting, novelty-like response, a later, more endogenous response sometimes referred to as a P300, associated with semantic violations, by changing stimuli so there's a surprising meaning to the word. So first I'm going to do that just by changing case: instead of presenting it in uppercase I'm going to present it in lowercase. So this is, if you like, the stimulus manipulation at the low level, and I've shown the results at the low level in blue without the manipulation, and then, plotting the difference here, it looks very much like the mismatch negativity. If I now do the same thing, but this time instead of just changing uppercase to lowercase I now actually change

the meaning of the word but use the same stimuli, now I've got a much more semantic, high-level violation, and now the thing that recognizes the surprise, the free energy here, is the first but also now the second, semantic level. But of course it's had to wait longer to get the evidence from the first level to be surprised, to do the belief updating, to respond to that surprise, which means that the difference waveforms are now expressed later in peristimulus time, in a regime that would correspond to the P300. So it's a very particular, specific example, but I use this just to mention how far you can get in understanding classical results that have been used for decades in cognitive neuroscience and electrophysiology, which can be understood in

terms of the computational architecture and message passing under the simple imperative to minimize uncertainty about the way the world works, and indeed how I work in that world. This conclusion, I think, is nicely summarized in relation to eye movements by Helmholtz, who was the father of many of the inferential interpretations of this scheme: each movement we make by which we alter the appearance of objects should be thought of as an experiment designed to test whether we have understood correctly the invariant relationships of the phenomena before us, that is, their existence in definite spatial relations. And with that, it only remains for me to thank the people whose ideas I have been talking about, and also to thank you for your attention. Thank you very much. So, to clarify something, can we

return to your first example with the owl? I have two questions related to this first example. The first one is: how can you describe in your theory the hunger of the owl, why it is hungry now, and why the owl has the belief that the prey should be caught? That's the first question. The other one is related, and maybe it's how I'm trying to resolve this problem: the question is how active inference relates to natural selection, because testing hypotheses is risky, and it's time and energy consuming. I mean that when you are resolving uncertainty you invest something in terms of your energy and your time and so on. What do you think about how natural selection shapes these prior beliefs to make them more

efficient in terms of survival, not only in terms of [unclear]? So, the first question: how do I account for context-specific drives, you know, what determines the best sort of behavior when I'm very hungry as opposed to when I've just eaten? So in this scheme [Applause] that would be a nice example of having a hierarchical generative model, where your prior preferences now become conditioned upon different states of being. So now we are entering the interesting world of active inference applied to homeostasis and interoception. [unclear] If I were to have a generative model in the brain that was able to recognize certain states, so that these were good explanations

for particular gut feelings, or particular inputs from, say, receptors measuring your blood sugar, and also my beliefs about when I last had something to eat, then if I had those representations I would be able to do two things. First of all, I would be able to contextualize my prior preferences about what I think I will be doing in half an hour, which could be being in a restaurant, or it could be continuing to work, or it could be continuing to socialize. So you can contextualize any behavior simply by conditioning a parameter of the model at one level on slowly changing, context-dependent states at the higher level, if you have a sufficiently deep model. And that leads to all sorts of interesting issues, you know, the distinction between homeostasis and allostasis.
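Conditioning prior preferences on a slowly changing higher-level state can be sketched as follows. Every number here is invented for illustration, and the names (`P_LOW_SUGAR`, `PREFERENCES`, `contextual_preferences`) are hypothetical; the sketch only shows the structure of the idea, with a belief about hunger inferred from an interoceptive signal and then used to mix context-specific preferences.

```python
# Assumed prior probability of the higher-level state "hungry".
P_HUNGRY_PRIOR = 0.5

# Assumed likelihood of sensing low blood sugar in each higher-level state.
P_LOW_SUGAR = {"hungry": 0.8, "sated": 0.2}

# Assumed prior preferences over where I expect to be in half an hour,
# conditioned on the higher-level state.
PREFERENCES = {
    "hungry": {"restaurant": 0.7, "working": 0.2, "socializing": 0.1},
    "sated": {"restaurant": 0.1, "working": 0.6, "socializing": 0.3},
}

def posterior_hungry(low_sugar):
    """Posterior belief in 'hungry' given one interoceptive observation."""
    like_h = P_LOW_SUGAR["hungry"] if low_sugar else 1 - P_LOW_SUGAR["hungry"]
    like_s = P_LOW_SUGAR["sated"] if low_sugar else 1 - P_LOW_SUGAR["sated"]
    joint_h = like_h * P_HUNGRY_PRIOR
    joint_s = like_s * (1 - P_HUNGRY_PRIOR)
    return joint_h / (joint_h + joint_s)

def contextual_preferences(low_sugar):
    """Mix the context-specific preferences by the belief about context."""
    p_h = posterior_hungry(low_sugar)
    return {
        outcome: p_h * PREFERENCES["hungry"][outcome]
        + (1 - p_h) * PREFERENCES["sated"][outcome]
        for outcome in PREFERENCES["hungry"]
    }

print(contextual_preferences(low_sugar=True))
print(contextual_preferences(low_sugar=False))
```

The lower level never needs to know about hunger directly; it simply receives a different set of preferences depending on what the higher level currently believes.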

And so I would imagine that the virus, unlike the owl, doesn't worry about whether it's hungry or not. Let us take an E. coli: the E. coli doesn't worry about whether it's hungry or not; it just climbs chemotactic gradients and chooses policies of a very trivial sort, which is just the very next thing to do, based upon fixed priors that never change, a homeostatic set point. That's very different from having another level above, which recognizes a different state that would then actually change the set point. At that point we would move from a homeostatic reflex to allostasis, where we now start to anticipate the consequences of our behavior for our homeostasis, so I will go and eat something before I need to have

a difficult autonomic reflex [unclear]. So I think it is a great question, because what it speaks to is that we're not dealing with simple generative models for a thermostat or a virus; we're dealing with very deeply structured and, I repeat, diachronic models, in the sense of a separation of temporal scales, which are the kinds of models which would be [unclear] and possibly even the bacteria. And then the question, [unclear], the model is just priors. So the priors would be of a formal or a structural sort: how many levels does my model have? So if you were doing machine learning and deep learning, let's say using a variational autoencoder, you have priors when you say yes, there are 12 hidden layers and

hidden layer 6 has 270 units; you know, all of these are forms of priors that you believe are fit for purpose and appropriate to the kind of data that you want to classify. And then of course you actually have the parametric priors, the actual prior expectations, [unclear], and everything that you would normally associate with priors. So by model I mean priors. Obviously there's a likelihood, but that's usually [unclear], because the likelihood just lives at the bottom of these models. So if we go to this graphic here, all of these are priors; the only time the likelihood becomes evident is right at the bottom here. So all the interesting structure in terms of its depth and conditioning and contextualization is in the prior structure and the actual quantitative parametric priors you apply to

all of these variables. So that brings me to your second question: where do they come from? And again you appeal to hierarchical inference, but now over a more extended time frame. So from the point of view of a statistician, the way that you would learn good priors for a particular environment, for a particular sort of data, is by [unclear] learning, for example, or structure learning, which uses Bayesian model selection. So now you are selecting the form of those models that maximizes model evidence averaged over time. And if you look for the physiological homologue of that, it could be things like mindfulness. I will be giving a lecture that demonstrates that point at some point this week, so there will be an opportunity to discuss this further.
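Selecting the form of a model by its evidence can be sketched with a standard textbook comparison, which is not taken from the talk. Two hypothetical models of a coin are scored on the same data: a rigid model with a fixed bias of 0.5, and a flexible model with a uniform Beta(1, 1) prior over the bias. The function names and the example counts are assumptions of the sketch.

```python
import math

def log_evidence_fair(heads, tails):
    """Log evidence of one observed sequence under a fixed bias of 0.5."""
    return (heads + tails) * math.log(0.5)

def log_evidence_flexible(heads, tails):
    """Log evidence under a uniform prior on the bias.

    The integral of p**heads * (1 - p)**tails over [0, 1] is the Beta
    function B(heads + 1, tails + 1), computed here with log-gamma.
    """
    return (
        math.lgamma(heads + 1)
        + math.lgamma(tails + 1)
        - math.lgamma(heads + tails + 2)
    )

def posterior_flexible(heads, tails):
    """Posterior probability of the flexible model, equal model priors."""
    la = log_evidence_fair(heads, tails)
    lb = log_evidence_flexible(heads, tails)
    m = max(la, lb)
    ea, eb = math.exp(la - m), math.exp(lb - m)
    return eb / (ea + eb)

print(posterior_flexible(9, 1))  # lopsided data
print(posterior_flexible(5, 5))  # balanced data
```

The flexible model wins when the data are lopsided, while the simpler model wins when the data are balanced, because evidence penalizes flexibility that the data do not need.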

But probably more interesting, I think, is that all of this takes place at an evolutionary time scale. So the way that mathematically you bring all of these things under the same principle is to interpret natural selection as Bayesian model selection, and that's actually very easy to do. Not only that: in the past 10 years people in theoretical biology have started to interpret the models or the dynamics of natural selection in terms of Bayesian filtering. So for example the replicator dynamics, or the replicator or Price equation, can be very easily shown, with some simplifying assumptions, to be Bayesian filters or Kalman filters. What you're saying is that evolution is just nature's way of doing Bayesian model selection, to change or to select

those models, those prior structures, as phenotypes that have the greatest evidence. What do you mean by evidence? The likelihood that they are there, [unclear] for that environment. So from the environment's point of view, it is testing the hypothesis that this is a good thing for me, the environment: this phenotype or this phenotype. And the evidence that it is a good fit for me, the environment, corresponds to adaptive fitness, scored by the time integral of the variational free energy, because that's the bound on the probability of that model being the right model given the experience provided by the environment. So the answer is that it all comes down to evolution: it's exactly the same process operating very, very slowly, and in that context

you wouldn't think about the phenotype as being the individual. Is it possible, in terms of your theory, to describe and explain the huge qualitative difference between humans and other animals? And is it there already, maybe, in this presentation of the theory? Yes, because I think it speaks very much to what we were just talking about. I think it's just the depth of the generative model that distinguishes the kinds of structures that you associate with being you, as opposed to, say, a dog, as opposed to, say, a fish. So just the distinction between single-cell organisms and multicellular organisms immediately speaks to some hierarchical aspect, a nesting aspect. But if you take that further and just think what kinds of hierarchical extensions would make you you, as

opposed to a bacterium, or being me as opposed to a bacterium, I think the answer is in this depth in time. I think that's basically it. I think it almost reduces to the very simple observation that you and I plan and the bacterium doesn't. We can argue about whether your dog does, but certainly the bacterium doing its chemotaxis does not; it does not plan to go to school or to buy birthday presents, whereas we do. So what do you need to plan? To plan is to select among various policies, which means you have to have a generative model of the future, of the consequences of those policies. So that's just simply a statement of the fact that we infer our world under generative models that

have a temporal depth; they have a horizon that goes beyond the present, which enables us to plan and to think about what would happen if I did that now. And as soon as you have that capacity you've now got to make a choice. So the bacterium doesn't have a choice about what it does, but you and I do, and I think that's a qualitative distinction, which again is just a structural prior, that distinguishes higher life from lower life. [unclear] For any particular econiche or environment there will be a free energy extremum. So viruses are great at fitting and inferring their environment, which is usually another person's cell, another cell. You'd be very poor at that: I'd have to compress you down

very very small you would not do very well inside a cell but the virus would not do very well at university so it's all in relation to um the nature of your sensory inputs from your environment at your time scale and once you acknowledge that there are lots of global minima that correspond to all sorts of different ways of being both in terms of species but also you and i within the species then you can see now an easy taxonomy in terms of hierarchical depth and sophistication and particularly temporal depth that i think comfortably separates us from other animals and other animals from plants uh i'd much like to put it like a little bit away from let me say a scientific point of view towards a big kind of uh theoretical future prediction

whatever do you think it is possible first question uh to fit the current state of psychological outlooks towards uh personal motivations and beliefs into the scheme that you're describing like saying that those uh beliefs that are like wrongly you know put into the person and they could also be kind of hidden states for the person which the person is not recognizing um and if so is it possible from your point of view to create like maybe some software which will be able to predict um uh the behavior of each kind of like certain person in certain conditions like giving kind of enormous power outside you know something i'll just um give a practical answer um and take your question as a psychiatrist because then it becomes very important in terms of understanding false beliefs

and the remediation of that false inference that may or may not be a good thing or a bad thing but certainly being able to do it and understand what you're doing is practically very relevant that could be psychological or it could be by drugs so the idea here is that um most psychiatric syndromes can be thought of as a form of false or aberrant inference so a hallucination would be basically believing or perceiving that something is there when there's no sensory evidence for it and a delusion is having a conviction in some state of affairs usually interpersonal for which there is no evidence that a normal person would accept as evidence and then things like anorexia and dysmorphophobia holding beliefs about myself which have no evidence when i look at myself in

a mirror i see myself as fat quite convinced by the thought that i am fat even when you can see that i am very very thin so you can certainly get an enormous variety of posterior beliefs i'd say that like a countable you could bring them to a certain number of fears like death hunger whatever just i mean that's why i'm asking so those i believe are not uncountable to me they can be brought to like maybe big but still a finite number of countable factors you know yes uh there will be mathematical reasons for that um so if that's the case if there is empirical evidence for aberrant belief updating then people have been focusing on how that could happen and it seems that the

most important synaptic mechanism that enables that sort of false inference is in the precision afforded to various sources of sensory evidence relative to priors so if you're a psychologist you think of that in terms of attention so now the game shifts from how we form our beliefs update our beliefs to how we attend to different things and we were talking just before about big data but the real problem is to select what you accredit or assign a precision that enables that sort of information to update your beliefs so i think in terms of understanding the mechanics of belief updating and intervening in that either therapeutically or for commercial reasons i think advertising for example it's probably going to be all about how you manipulate attention and what

you mean by attention and the physiological mechanisms of attention and also just gaining some high level control of attention to be able to mentalize saccadic suppression so this is the phenomenon when i move my eyes as i look around the room during the motion of the eyes i do not see the object i don't see the world sweep past all i see are the static samples that i then integrate into a coherent scene so that saccadic suppression is a very interesting example of temporarily ignoring or reducing the precision of the likelihood mapping here in a very very um temporally precise way which you cannot affect at least from higher in the hierarchy now imagine that you now had a connection in the generative model so you had you felt hard reason to

talk about contextualizing before say some higher level representation of myself or my self state at this point in time you can actually now then suspend that saccadic suppression so you could choose whether to see the world really when you move your eyes or not if you have to you can so there are ways of getting around it if you have a sufficiently deep generative model so just in terms of practically what you would be looking at i think you'll be looking at the processes that mediate the precision and the degree of message passing between these hierarchical levels that would look very much like what you treat as attention if you get control of that for yourself and possibly for a patient then you may well have an enormous therapeutic effect thank you i

wasn't talking like exactly about therapeutic effects like protein association system which is still being bad you know so can you give me a call um well it is same thing um like as you said advertisement right so it's not like kind of like changing human attitudes explaining that facts are bad or whatever you know uh not hypnotized that's right but just to predict what kind of uh disorder will react to this kind of like what will be the way that people with this or that kind of disorder not psychiatric like that like uh epithelial level you know um react to this certain message uh like the behavior non-humanized paths of that so the builders still die will they still like will they say no will they say yes i'm talking about kind

of a prognostication model not like improvement but just improvement of the management not improvement of the agents themselves this particular mathematical formulation hasn't really gone that far it's penetrated theoretical biology in terms of multi-agent games um niche construction and ethology so there are simulations of how multiple agents can each construct and constrain each other simulations at the level of cells organizing different patterns and coalitions but nobody to my knowledge has taken these variational free energy solutions to model markets or advertising or geopolitical events that provides a really exciting opportunity but to my knowledge it hasn't been done it's only been initially investigated in the past couple of years in ethology and in uh evolutionary psychology and in morphogenesis well thank you very much for an interesting presentation i would

like to ask several questions concerning markov processes uh first of all the principal feature of markov processes is that the future is determined by the past only via the present is this fact a result of the application or is it only a simplification to be able to solve mathematical problems associated with problem-solving the second question is on this slide uh there are two magnitudes uh entropy and energy entropy is non-dimensional [Music] energy so can i ask you as a physicist um so a really interesting question first of all to what extent um are the markovian independences or um markovian process assumptions implicit in the markov decision process do we think that these are the only kinds of generative models that are fit

for the universe in which we live i think technically um all of this maths inherits from the treatment of random dynamical systems cast as a langevin system in which case the markovian properties would be true and fundamental that's where you start so um my answer would be they are not a device or a matter of convenience they are actually baked in to the underlying premise [Applause] and that also enables me to um address the third question um what we are basically solving here are not ordinary differential equations that would follow from a stochastic differential equation that you can treat in langevin form they would be applied to the density dynamics for example uh an equation on the density dynamics like a schrödinger wave equation or a master equation so

much of this really all of these simulations are actually gradient descents on functionals where that's a function of a probability distribution very much like uh the fokker-planck equation so that's the belief aspect that we're moving away from gradient flows on functions of states or gradient flows on path functions of states but interpreting them now in terms of gradient flows on functionals of beliefs encoded by states it's a bit technical if you'd like to discuss that further by email i'll send you something that takes you through those arguments uh very precisely um the interesting thing um though is coming back to the first question because clearly this particular mdp is semi-markovian i have broken the markovian property because i've now got two time scales

in place here so in practice what happens with these immensely complicated itinerant attracting sets that we use as the mathematical model of self-organizing systems like biological systems when you summarize it in terms of a markovian-like process with uh discrete states and discrete time it looks like it has now acquired a semi-markovian aspect so it is interesting that from the purely markovian physics of it due to the complicated nature of the systems that we like to study when you find the free energy optimizing models they turn out to be hierarchical with that breaking of the simple markovian property so at that point i think that the discrete markov decision process does become a mathematical approximation to what is actually a continuous time markovian process that looks as

if it's lost its uh simple markovian property because it's just become so active so complicated so structured and so interesting from a biological perspective is that a reasonable answer
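
The two-time-scale point in this answer can be illustrated with a toy simulation. This is a minimal sketch and is not taken from the lecture. It assumes a two-state higher level that makes one Markov transition per epoch, where each epoch spans a fixed number of lower-level time steps. The names `simulate_slow_state`, `dwell_times`, `fast_per_slow` and `stay_prob` are hypothetical. Read on the fast clock, the higher-level state has dwell times that are always whole multiples of the epoch length. A first-order Markov chain on that clock would instead have geometric dwell times, so a dwell of a single step would be possible. That is one simple sense in which the summarized process looks semi-Markovian.

```python
import random


def simulate_slow_state(n_slow_steps, fast_per_slow, stay_prob, seed=0):
    """Return the higher-level (slow) state, sampled on the fast clock.

    Assumption for illustration: two slow states (0 and 1), one Markov
    transition per slow step, and fast_per_slow fast steps per slow step.
    """
    rng = random.Random(seed)
    state = 0
    trace = []
    for _ in range(n_slow_steps):
        # The slow state is held fixed for one whole epoch of fast steps.
        trace.extend([state] * fast_per_slow)
        # Markov transition on the slow clock: stay, or switch to the other state.
        if rng.random() >= stay_prob:
            state = 1 - state
    return trace


def dwell_times(trace):
    """Lengths of consecutive runs of the same state in the trace."""
    runs = []
    length = 1
    for prev, cur in zip(trace, trace[1:]):
        if cur == prev:
            length += 1
        else:
            runs.append(length)
            length = 1
    runs.append(length)
    return runs


trace = simulate_slow_state(n_slow_steps=200, fast_per_slow=4, stay_prob=0.5)
runs = dwell_times(trace)

# Every dwell time is a whole number of epochs, never a single fast step,
# which a geometric (first-order Markov) dwell-time distribution would allow.
print(sorted(set(runs)))
```

The fixed epoch length is the simplifying assumption doing the work here. With variable epoch lengths the dwell times would no longer be exact multiples, but they would still generally not be geometric on the fast clock.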