Difference between revisions of "Learning System"
DurkKingma (talk | contribs) |
Revision as of 10:59, 3 June 2006
This page reflects my/our idea about the Learning system. It consists of a few interdependent subsystems for estimation, inference and storage.
Glucose level estimation
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (glucose rise/fall) of all events since the last measurement.
The function f(t)
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category c is modelled by a function fc(t), and each concrete individual event type is modelled as a transformation of that function using parameters a and b: a*fc(b*t).
- Food intake. Usually has a positive glycemic effect. It appears to be modelled as:
<math>f(t) =
a \cdot \exp \left [ - \left ( \frac{t-b }{0.667 \cdot b} \right )^2 \right ] +
(a/2) \cdot \exp \left [ - \left ( \frac{t-2b}{0.667 \cdot b} \right )^2 \right ] + (a/4) \cdot \exp \left [ - \left ( \frac{t-3b}{0.667 \cdot b} \right )^2 \right ] </math>
- Insulin intake. Usually has a negative glycemic effect.
- Stress level.
- Time of the day, because glucose levels structurally differ during the day.
- Health status.
- Other event types.
[Add picture here]
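A minimal sketch of the food intake curve above could look as follows. It assumes that each term is a Gaussian-shaped bump (so the squared term enters the exponent with a negative sign, which keeps the response bounded) and that t and b are expressed in minutes. The names `food_response` and `to_interval` are hypothetical and only serve as illustration.

```python
import math


def food_response(t, a, b):
    """Sum of three Gaussian-shaped bumps peaking at b, 2b and 3b.

    Assumption: the exponent is negative, so each bump decays away
    from its peak. Amplitudes are a, a/2 and a/4.
    """
    width = 0.667 * b
    return sum(
        (a / 2 ** k) * math.exp(-(((t - (k + 1) * b) / width) ** 2))
        for k in range(3)
    )


def to_interval(t_minutes, step=15):
    """Map a real time in minutes to the index of its 15-minute interval."""
    return int(t_minutes // step)


if __name__ == "__main__":
    # Response of a hypothetical food with a=2 and b=60 minutes.
    for t in (0, 60, 120, 180, 240):
        print(to_interval(t), round(food_response(t, a=2.0, b=60.0), 3))
```

With these assumptions the curve is largest near t = b and fades out after 3b, which is the qualitative shape one would expect for a meal.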
g2 Estimation
As stated above, the estimate for a future moment in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let g1 at <math>t_{g1}</math> be the last glucose measurement, g2 at <math>t_{g2}</math> the glucose level to be estimated, and <math>(e_1,e_2,...,e_n)</math> the events that have influence on g2. <math>(f_1,f_2,...,f_n)</math> are the estimated functions of the events, and <math>(t_1,t_2,...,t_n)</math> are the (start) times of the events. Then the glucose prediction g2 at <math>t_{g2}</math> is:
<math>g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )</math>
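The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_g2` and the representation of events as (function, start time) pairs are assumptions, not part of the design described here.

```python
def estimate_g2(g1, t_g1, t_g2, events):
    """Estimate the glucose level at t_g2 from the measurement g1 at t_g1.

    events: iterable of (f_k, t_k) pairs, where f_k is the estimated
    response function of event k and t_k is its start time.
    """
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)


if __name__ == "__main__":
    # Hypothetical event: glucose rises 0.1 units per minute after the start.
    def ramp(t):
        return 0.1 * max(t, 0.0)

    print(estimate_g2(g1=5.0, t_g1=10.0, t_g2=30.0, events=[(ramp, 0.0)]))
```

Only the change in each response between the two measurement times is added, so whatever an event had already contributed at the time of g1 is not counted twice.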
On events
As mentioned above, an 'event' can be something like apple intake. Our definition is broader than that: events can also be composite. A composite event is a set or combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods. Eating certain food types nullifies the effect of other foods. Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.
Creation of a new event type
What needs to be done when a new event type is created, for example when a user eats something new or gets a new insulin therapy? The first thing the system needs to create is an a priori estimate of f(t). This is called the a priori function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better a priori fprior(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a sample, an a posteriori fpost(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better a posteriori fpost(t).
In other words:
- Better prior knowledge (carbohydrate count etc.) leads to a better fprior(t)
- A better fprior(t) leads to a better fpost(t)
- More samples lead to a better fpost(t)
- A good fpost(t) means it is close to the real f(t)
Significance of good fprior(t)
In our case, we will see that the samples are estimates too. Later on, we will conclude that better fpost(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality fprior(t)'s imply initially bad-quality fpost(t)'s, which in turn imply initially bad-quality samples, leading to slow initial progress of inference. This is important to know, because good-quality fprior(t)'s are vital to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, maybe from months to just weeks or days.
Generating an fprior(t) or its prior parameters
So what are the steps of creating fprior(t) for certain event types? For...
- Food intake, calculate the a and b parameters (for information about these parameters, see above). [Mapping of carbohydrate count to a and b parameters to be added]
- Insulin intake. [To do]
- Stress level. [To do]
- Time of the day. [To do]
- Health status. [To do]
- Other event types. [To do]
Attributes of event types
Summarizing what we have said above, each event type has the following attributes:
- An a priori function fprior(t)
- An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.
- An a posteriori function fpost(t)
The following section will describe the process of computation of fpost(t).
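As a rough illustration, the three attributes listed above could be held in a small data structure like the one below. The class name `EventType` and its field names are hypothetical. The sketch also assumes that fpost(t) simply starts out equal to fprior(t) until samples arrive.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class EventType:
    """Hypothetical container for the attributes of one event type."""

    name: str
    # A priori function fprior(t).
    f_prior: Callable[[float], float]
    # Initially empty set of samples, each a tuple (t, dg).
    samples: List[Tuple[float, float]] = field(default_factory=list)
    # A posteriori function fpost(t); assumed to start equal to fprior(t).
    f_post: Optional[Callable[[float], float]] = None

    def __post_init__(self):
        if self.f_post is None:
            self.f_post = self.f_prior


if __name__ == "__main__":
    apple = EventType(name="apple", f_prior=lambda t: 1.0)
    apple.samples.append((15.0, 0.8))
    print(apple.name, apple.samples, apple.f_post(15.0))
```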
Bayesian Inference
So how does the system calculate fpost(t)? And how are the samples created?
Statistical nature of the function f(t)
In the text above, we wrote about the glycemic response functions f(t), such as the fprior(t) and fpost(t) functions. Because we are using Bayesian inference, we must describe the problem in terms of statistics. From this viewpoint, one could say that at each t there is an estimated mean value and a variance indicating the expected error. This way we describe the function at each point in terms of a normal (Gaussian) distribution. So each point t doesn't map to just one value, but to two: mean <math>\mu</math> and variance <math>\sigma^2</math>, written as <math>\mu \pm \sigma^2</math>. We say that the variance is static, and the mean value μ is the variable to be estimated, or the unknown parameter θ.
This unknown parameter θ is exactly what we need to know for each f(t). It is the parameter for which we define the prior function, and each to [...]
To be clear: the fprior(t) function defines the prior mean values for an effect. Likewise, the fpost(t) function defines the posterior mean values of an effect, initially equal to fprior(t).
This way, using samples, we can use Bayesian inference to calculate a μpost for each t.
Learning System, ignite your engine!
Assume that, with the formulas described in the sections above, we are given a group of event types, each of which has an fprior(t). Additionally, each event type has an initially empty set of samples <math>\{s_1, s_2,...,s_n\}</math>.
Assume the last glucose measurement was g1 at tg1. Now we do a new glucose measurement g2 at tg2. The set of events that have impact on glucose level g2 is <math>\{e_1,e_2,...,e_n\}</math>, each having an event type with the attributes described above plus a multiplicity indicator. So e1 has a μe1 and a [...]
The first thing we calculate is the helper variable a:
<math>a = \frac{g2-(\mu_{e1}+\mu_{e2}+...)}{\sigma_1^2+\sigma_2^2+...}</math>
Bayesian inference
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which tells us the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.
Let's describe these events, their effects and their computations in terms of a statistics problem.
About events
Each single event <math>e_i</math> has three things.
1) A set of samples. Each sample is a tuple (Δt, Δg), so the set of samples can be represented in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].
2) A prior (a priori) function fe,prior(Δt) → Δg = μprior. This is the estimated mean effect of the event for each Δt, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from medicine information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance σprior². In statistical terms, the event effect at each moment in time has a normal distribution with:
<math>\sigma_e^2 = \mbox{some-static-value} \,</math>
<math>\mu_e = \theta \,</math>
This parameter θ is unknown, but it has a prior distribution with
<math>\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,</math>
<math>\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,</math>
3) A posterior (a posteriori) function fe,post(Δt) → Δg. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals, for example 15 minutes. So we have intervals ti with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals ti has a parameter θ with a prior distribution as explained above. The posterior distribution for ti is calculated as follows:
<math>\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}</math>
<math>\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}</math>
Here n is the number of samples in the interval and <math>\bar x_n</math> is their mean.
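For one time interval, the posterior update can be sketched as below. The sketch uses the textbook conjugate update for a normal mean with known variance, in which the sample variance σe² weights the prior mean and n times the prior variance weights the sample mean. The function name `posterior` and its argument names are hypothetical.

```python
def posterior(mu_prior, var_prior, var_e, samples):
    """Normal-normal update of the mean for a single time interval.

    mu_prior, var_prior: prior mean and variance of theta
    var_e: static (known) variance of the event effect
    samples: observed delta-g values that fall in this interval
    Returns (mu_post, var_post).
    """
    n = len(samples)
    if n == 0:
        # No evidence yet: the posterior equals the prior.
        return mu_prior, var_prior
    x_bar = sum(samples) / n
    denom = var_e + n * var_prior
    mu_post = (var_e * mu_prior + n * var_prior * x_bar) / denom
    var_post = (var_e * var_prior) / denom
    return mu_post, var_post


if __name__ == "__main__":
    # Prior mean 2 with variance 1, sample variance 1, one sample of 4.
    print(posterior(2.0, 1.0, 1.0, [4.0]))  # (3.0, 0.5)
```

With equal prior and sample variance, a single sample pulls the mean halfway towards the observation and halves the variance. More samples shrink the variance further.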
To assign an evidence value xi to each individual event ei, compute:
<math>x_i=\mu_{prior}+a*\sigma_i^2</math>
where
<math>a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}</math>
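The assignment of evidence can be sketched as follows. The sketch assumes that x_tot is the total observed glucose change and takes the μ in the formula for x_i to be the current mean estimate of event i. The name `assign_evidence` is hypothetical.

```python
def assign_evidence(x_tot, mus, variances):
    """Split the total observed effect x_tot over the individual events.

    mus: current mean estimate of each event's effect
    variances: variance of each event's effect
    The gap between x_tot and the summed means is shared out in
    proportion to each event's variance.
    """
    a = (x_tot - sum(mus)) / sum(variances)
    return [mu + a * var for mu, var in zip(mus, variances)]


if __name__ == "__main__":
    # Two events with means 3 and 5 and variances 1 and 3.
    # The observed total of 10 is 2 more than expected.
    print(assign_evidence(10.0, [3.0, 5.0], [1.0, 3.0]))  # [3.5, 6.5]
```

The individual evidence values always add up to x_tot, and the event with the larger variance absorbs the larger share of the surprise.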
So it comes down to some quite simple math. I'll make my explanation better when I have more time.