<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://wiki.aardrock.com/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=DurkKingma</id>
	<title>AardRock Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="http://wiki.aardrock.com/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=DurkKingma"/>
	<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/Special:Contributions/DurkKingma"/>
	<updated>2026-05-05T12:04:33Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.37.1</generator>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2572</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2572"/>
		<updated>2006-07-17T14:21:48Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
This page reflects my/our idea of the Learning System. It consists of a few interdependent subsystems for estimation, inference, and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level is estimated by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    \cdot \exp \left [ -\left ( \frac{t-b }{0.667 b} \right )^2 \right ] +&lt;br /&gt;
(a/2) \cdot \exp \left [ -\left ( \frac{t-2b}{0.667 b} \right )^2 \right ] +&lt;br /&gt;
(a/4) \cdot \exp \left [ -\left ( \frac{t-3b}{0.667 b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level and activities like driving, work, exercise, sleep&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* For gestational diabetes: the progress of the pregnancy, since pregnancy hormones decrease insulin sensitivity.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
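To make the food-response model concrete, here is a minimal Python sketch of the three-bump formula above (an illustration only: it assumes the exponents are negated so each term is a Gaussian bump, and the function name is made up):&lt;br /&gt;

```python
import math

def food_response(t, a, b):
    """Sketch of the three-bump food response: peaks near t=b, 2b, 3b
    with heights a, a/2, a/4 and width 0.667*b. Assumes the exponent is
    negative so each term is a Gaussian bump (the plotted shape)."""
    return sum(
        (a / 2 ** k) * math.exp(-((t - (k + 1) * b) / (0.667 * b)) ** 2)
        for k in range(3)
    )
```

For a=2, b=60 the response peaks around t=60 at roughly the value of a, with smaller echoes at t=120 and t=180.&lt;br /&gt;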
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
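The estimation formula above can be sketched directly in Python (a sketch; the representation of events as (function, start time) pairs is an assumption for illustration):&lt;br /&gt;

```python
def estimate_g2(g1, t_g1, t_g2, events):
    """Estimate glucose at t_g2 from the last measurement g1 at t_g1.

    events: list of (f_k, t_k) pairs, where f_k is the event's glycemic
    response function and t_k its start time. Implements
    g2 = g1 + sum_k [ f_k(t_g2 - t_k) - f_k(t_g1 - t_k) ].
    """
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)
```

Note that each term subtracts the response already realised at t&lt;sub&gt;g1&lt;/sub&gt;, so only the rise/fall since the last measurement is added.&lt;br /&gt;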
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
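The attribute list above could be mirrored in a small data structure, sketched here in Python (field names and the per-t layout are assumptions, not the project's actual types):&lt;br /&gt;

```python
from dataclasses import dataclass, field

@dataclass
class EventType:
    # a priori function: interval index t -> (mu_theta, sigma2_theta)
    f_prior: dict
    # initially empty set of samples, each a tuple (t, dg)
    samples: list = field(default_factory=list)
    # a posteriori function, same layout as f_prior
    f_post: dict = field(default_factory=dict)
```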
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about glycemic response functions f(t), such as f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t). Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value, and we assign some reasonable value to it, determined by the event type (e.g. 3 for food). The mean value &amp;amp;mu; is the variable to be estimated, i.e. the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and the only) thing we need to learn for each t; &amp;amp;theta; is what it's all about. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, whose formulas we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, using the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1, and we now take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; are shorthand for these posterior mean and variance values of event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...+\mu_n)}{\sigma_1^2+\sigma_2^2+...+\sigma_n^2}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Furthermore, it loops through the composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Since time t is measured in intervals, the chance that events happen at the same t is quite high. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user did a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This sample is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using our calculated helper variable ''a'', we do this with this simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i^2 \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these ''a posteriori'' variables are close to the ''a priori'' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
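Steps 1 and 2a together can be sketched as follows (an illustration; it reads the &amp;sigma; in the subsample formula as the posterior variance, as defined above, so the subsamples sum exactly to (g2-g1)):&lt;br /&gt;

```python
def split_composite_sample(g1, g2, posts):
    """Split the composite sample (g2 - g1) into per-event subsamples.

    posts: list of (mu_i, var_i) posterior values, one per event,
    each evaluated at t_g2 - t_i. Computes the helper variable
    a = ((g2-g1) - sum(mu_i)) / sum(var_i), then s_i = mu_i + a * var_i.
    """
    total = g2 - g1
    a = (total - sum(mu for mu, _ in posts)) / sum(v for _, v in posts)
    return [mu + a * v for mu, v in posts]
```

Events with a larger posterior variance absorb a larger share of the unexplained glucose change.&lt;br /&gt;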
(A possible improvement would be to store the composite sample together with the event types somewhere. Old composite samples could then be reused to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of event i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
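The conjugate update in 2c can be written down directly (a sketch for a single interval t; argument names are illustrative):&lt;br /&gt;

```python
def update_posterior(mu_prior, var_prior, var_event, samples):
    """Normal-normal conjugate update for theta at one interval t.

    var_event is the static event variance sigma_t^2; samples are the
    n subsample values observed at this t. Returns (mu_post, var_post).
    """
    n = len(samples)
    s_bar = sum(samples) / n
    denom = var_event + n * var_prior
    mu_post = (var_event * mu_prior + n * var_prior * s_bar) / denom
    var_post = (var_event * var_prior) / denom
    return mu_post, var_post
```

With more samples (larger n) the posterior mean moves toward the sample mean and the posterior variance shrinks, exactly as the bullet points in 'Creation of a new event type' predict.&lt;br /&gt;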
&lt;br /&gt;
==== 2d. Update posterior f(t) ====&lt;br /&gt;
&lt;br /&gt;
Now the function f(t), or its parameters, can be updated in a way similar to the methods in &amp;quot;Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters&amp;quot;. The posterior function should approximate the measured samples as well as possible.&lt;br /&gt;
&lt;br /&gt;
==== 2e. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2d for each i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2571</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2571"/>
		<updated>2006-07-17T12:50:29Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Preface ==&lt;br /&gt;
&lt;br /&gt;
In the second semester of the academic year 2005/2006 I participated in the course Software Project. The subject of my group was Cheetah - Artificial Pancreas. I chose this project since it came close to my interests, a mixture of computer science and biology. The 'Cheetah' project fitted nicely into this mixture, since it is software targeted at diabetics and their illness, in all its social and physiological aspects.&lt;br /&gt;
The goal of the Cheetah software was clear: improve the quality of life of diabetics, help them deal with their illness, and even have fun in the process.&lt;br /&gt;
The means to achieve this goal were more or less open: it had to be Open Source software, but we had to choose the language ourselves. We chose Java since we were quite familiar with the language and it is multi-platform.&lt;br /&gt;
The functionality of the software was specified by the &amp;quot;User Manual&amp;quot;, which was literally a user manual: descriptions of how the user should interact with the software, including descriptions of system output.&lt;br /&gt;
One of the described functionalities was a 'Self-learning system', which was only vaguely specified since the project manager was not an expert in the field. Talks with the manager yielded a description like &amp;quot;the learning system should learn about the user, and return advice about food or insulin intake&amp;quot;. The theory, design, and implementation were left completely to the responsible 'self-learning system' subteam of the development team. At the beginning of the project, this subteam consisted of one person, namely Just Boerlage. Halfway through the project, three months later, this person left the team with hardly any progress on the learning system, leaving a huge amount of work to be done. I decided to attack the problem, eventually resulting in this paper.&lt;br /&gt;
&lt;br /&gt;
This page reflects my/our idea of the Learning System. It consists of a few interdependent subsystems for estimation, inference, and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level is estimated by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    \cdot \exp \left [ -\left ( \frac{t-b }{0.667 b} \right )^2 \right ] +&lt;br /&gt;
(a/2) \cdot \exp \left [ -\left ( \frac{t-2b}{0.667 b} \right )^2 \right ] +&lt;br /&gt;
(a/4) \cdot \exp \left [ -\left ( \frac{t-3b}{0.667 b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level and activities like driving, work, exercise, sleep&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* For gestational diabetes: the progress of the pregnancy, since pregnancy hormones decrease insulin sensitivity.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about glycemic response functions f(t), such as f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t). Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value, and we assign some reasonable value to it, determined by the event type (e.g. 3 for food). The mean value &amp;amp;mu; is the variable to be estimated, i.e. the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and the only) thing we need to learn for each t; &amp;amp;theta; is what it's all about. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, whose formulas we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, using the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1, and we now take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; are shorthand for these posterior mean and variance values of event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...+\mu_n)}{\sigma_1^2+\sigma_2^2+...+\sigma_n^2}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Furthermore, it loops through the composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Since time t is measured in intervals, the chance that events happen at the same t is quite high. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user did a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This sample is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using our calculated helper variable ''a'', we do this with this simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
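The split can be sketched as follows (hypothetical Python; here &amp;lt;math&amp;gt;\sigma_i&amp;lt;/math&amp;gt; is read as the posterior variance, as in step 1 where &amp;lt;math&amp;gt;\mu_i \pm \sigma_i\!&amp;lt;/math&amp;gt; denotes the posterior mean and variance, which makes the subsamples add back up to the composite sample):&lt;br /&gt;

```python
def subsamples(g1, g2, means, variances):
    # s_i = mu_i + a * sigma_i, with the helper variable a from step 1
    a = ((g2 - g1) - sum(means)) / sum(variances)
    return [m + a * v for m, v in zip(means, variances)]
```

Each event receives its expected contribution plus a share of the unexplained residual proportional to its variance, so uncertain events absorb more of the surprise and the subsamples sum to (g2-g1).&lt;br /&gt;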
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(A possible improvement would be to store the composite sample together with the event types somewhere. Old composite samples could then be reused to compute even more likely posterior distributions. This whole routine could then be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of event i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
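A minimal sketch of this conjugate update (hypothetical Python; var_t is the static event variance, and the prior mean and variance come from f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) at the given t):&lt;br /&gt;

```python
def update_posterior(mu_prior, var_prior, var_t, samples):
    # Normal-normal conjugate update of theta for one time slot t
    n = len(samples)
    s_bar = sum(samples) / n
    mu_post = (var_t * mu_prior + n * var_prior * s_bar) / (var_t + n * var_prior)
    var_post = (var_t * var_prior) / (var_t + n * var_prior)
    return mu_post, var_post
```

With more samples, var_post shrinks and mu_post moves away from the prior mean towards the sample mean.&lt;br /&gt;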
&lt;br /&gt;
==== 2d. Update posterior f(t) ====&lt;br /&gt;
&lt;br /&gt;
Now the function f(t), or its parameters, can be updated similarly to the methods in &amp;quot;Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters&amp;quot;. The posterior function should approximate the measured samples as well as possible.&lt;br /&gt;
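Since the fitting method itself is left open above, one simple possibility is a least-squares grid search for the a and b parameters against the sample set (a sketch with a hypothetical category template f_c; not the method prescribed by this page):&lt;br /&gt;

```python
import math

def fit_ab(f_c, samples, a_grid, b_grid):
    # Pick (a, b) minimizing the squared error of a * f_c(b * t)
    # against the (t, dg) samples
    def sse(a, b):
        return sum((dg - a * f_c(b * t)) ** 2 for t, dg in samples)
    return min(((a, b) for a in a_grid for b in b_grid), key=lambda ab: sse(*ab))

# Hypothetical template: one Gaussian bump peaking at t = 1
template = lambda t: math.exp(-((t - 1.0) ** 2))
samples = [(k / 4.0, 2.0 * template(2.0 * k / 4.0)) for k in range(1, 9)]
best_a, best_b = fit_ab(template, samples, [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Here the samples were generated with a=2 and b=2, so the search recovers that pair exactly; with noisy real samples, a finer grid or a proper optimizer would be needed.&lt;br /&gt;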
&lt;br /&gt;
==== 2e. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat steps 2a-2d for each event i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Cheetah_Slim&amp;diff=2323</id>
		<title>Cheetah Slim</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Cheetah_Slim&amp;diff=2323"/>
		<updated>2006-06-13T21:42:39Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;As a diabetic, you deal with your glucose level every day. A glucose level that is too high causes damage to the nervous system, while a glucose level that is too low can put a person into a coma. Almost everything in your daily life influences your blood sugar level: food, sports, activities, stress, and medication.&lt;br /&gt;
&lt;br /&gt;
===The current situation for diabetics===&lt;br /&gt;
To get a grip on their blood sugar level, diabetics keep a diabetes diary every day. They can then try to find patterns in the relation between 'events' such as food intake, insulin intake, activities and so on, and the rise or fall of the glucose level. The doctor helps with finding patterns in the diabetes diary. Negative or positive consequences of events can then often be estimated.&lt;br /&gt;
This 'classic' system leaves much to be desired, however. For financial and practical reasons, diabetics cannot go to the doctor for consultation on a daily basis. As a diabetic you are then on your own when it comes to judging certain situations correctly. Especially as a 'beginning' diabetic this is still very difficult.&lt;br /&gt;
&lt;br /&gt;
===The help of Cheetah===&lt;br /&gt;
Through its unique strength, an automatic learning and advisory system, Cheetah can help diabetics keep their glucose level within bounds.&lt;br /&gt;
The Cheetah learning system…&lt;br /&gt;
* Continuously analyses your digital diabetes diary;&lt;br /&gt;
* Draws conclusions from it about the effect of food, activities, etc.&lt;br /&gt;
The Cheetah advisory system…&lt;br /&gt;
* Can predict your future blood sugar level;&lt;br /&gt;
* Gives you advice to keep your blood sugar level within bounds.&lt;br /&gt;
&lt;br /&gt;
===Text Martien===&lt;br /&gt;
Diabetics and caregivers discover patterns in the effects of food, stress, insulin dosage and exercise on the blood sugar level fairly easily and quickly.&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;p style=&amp;quot;background: #fff6cc;&amp;quot;&amp;gt;The future can often be predicted from the past. Software often has trouble predicting blood sugar values on the basis of the many and continuously changing data. And because the system does not grow along with its user, it quickly becomes outdated and worthless.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Therefore, use incremental self-learning and evolving subsystems that keep each other, and also the user, sharp. The evolutionary aspects provide continuous adaptation to changing circumstances. The system thus grows along with its users.&lt;br /&gt;
&lt;br /&gt;
===Grow along during the Cheetah Zomer 2006 launch===&lt;br /&gt;
Experience the first working self-learning prototype of Cheetah on 4 July 2006 during the [[Cheetah Zomer 2006]] launch. Better blood sugar levels and a healthier, longer life.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2317</id>
		<title>LogboekDurkKingma</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2317"/>
		<updated>2006-06-13T15:42:53Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Date            Time           Duration        Cumulative    Activity&lt;br /&gt;
 --------------- -------------- --------------- ------------- ----------------------------------------&lt;br /&gt;
 8 February      13-17          4               4             First meeting and division of tasks&lt;br /&gt;
 11 February      9-11          2               6             Reading up on Agile Development&lt;br /&gt;
 13 February     12-14          2               8             Investigated Cruise Control, only now due to illness.&lt;br /&gt;
                 15-17          2               10            Meeting&lt;br /&gt;
 14 February     15-17          2               12            XP Game&lt;br /&gt;
 16 February      9-12          3               15            Examined WiredReach fundamentals&lt;br /&gt;
 20 February     15-17          2               17            Meeting&lt;br /&gt;
                 20-22          2               19            Wrote out minutes / put them on the web&lt;br /&gt;
 21 February      9-15          6               25            Investigating WiredReach/Eclipse platform (with Martijn)&lt;br /&gt;
 22 February      9-10          1               26            Built wiki logbook&lt;br /&gt;
                 10-14          4               30            Investigating WiredReach/Eclipse platform&lt;br /&gt;
 25 February     12-18          6               36            Implemented WiredReach trial&lt;br /&gt;
 27 February      9-15          6               42            Compared JXTA/other protocols&lt;br /&gt;
                 15-17          2               44            Meeting&lt;br /&gt;
 28 February     15-17          4               45            AardRock meeting&lt;br /&gt;
 5 March          9-15          6               51            Research on WiredReach/JXTA&lt;br /&gt;
                 15-17          2               53            Meeting&lt;br /&gt;
 6 March         15-17          2               55            AardRock meeting&lt;br /&gt;
                 12-16          4               59            Lunch, meeting about implementation of the P2P part&lt;br /&gt;
 13 March        15-17          2               61            Meeting&lt;br /&gt;
 16 March        9-13           4               64            Working out/thinking through P2P communication data model&lt;br /&gt;
 21 March        15-17          2               67            AardRock meeting&lt;br /&gt;
 23 March        9-15           6               73            De-Eclipsed WiredReach, worked on RDF Black Box, examined Jena.&lt;br /&gt;
 27 March        10-17          3               76            Worked out RDF Black Box, meeting&lt;br /&gt;
 28 March        15-17          2               78            AardRock meeting&lt;br /&gt;
 18 April        14-17          3               81            Aardrock meeting / trial presentation&lt;br /&gt;
 19 April        15-17          2               83            Discussed advisory system with Just&lt;br /&gt;
 25 April        9-15           6               89            Neural network meeting / Cheetah advisory system&lt;br /&gt;
                 15-17          2               91            Cheetah meeting&lt;br /&gt;
 27 April        9-17           8               99            Bayesian network / neural network / research on algorithms&lt;br /&gt;
 28 April        9-11           2               101           Looked into other advisory software / diaries / mailed Harold de Valk&lt;br /&gt;
 28 April        11-12          1               102           Updated logbook&lt;br /&gt;
 29 April        12-14          2               104           Advisory system (AIDA, Neural Networks, Kernel Machines)&lt;br /&gt;
 30 April        12-15          3               107           Advisory system (studied AIDA, alternatives, sketches)&lt;br /&gt;
 1 May           12-14          2               109           Advisory system (AIDA, kernel machines)&lt;br /&gt;
 2 May           10-19          9               118           Advisory system (put on the wiki, made diagrams, discussed with various people)&lt;br /&gt;
 3 May           12-13          2               120           E-mails, went through AI book on fuzzy logic&lt;br /&gt;
 4 May           11-14          3               123           Went through AI book, read: http://www.fao.org/docrep/w8079e/w8079e00.htm&lt;br /&gt;
                 14-17          3               126           Further literature research on the effect of food/insulin/activities&lt;br /&gt;
 6 May           11-5           6               132           [[Condition Effect Learning]]&lt;br /&gt;
 7 May           5-6            1               133           Solved major condition effect learning problem.&lt;br /&gt;
                 11-15          4               136           Worked out the formula (&amp;quot;Kingma's Theorem&amp;quot;) and put it on the wiki.&lt;br /&gt;
                                                              Thought about better solutions to the condition effect learning problem.&lt;br /&gt;
 8 May           20-24          4               140           Received useful information about a patient (logbooks, medical dossier, etc.)&lt;br /&gt;
 9 May           10-16          6               146           Worked on statistical problem, talked with Giel about the GUI and with Peter de Waal about Bayesian Learning&lt;br /&gt;
 10 May          15-17          2               148           Aardrock meeting. Explained some theories.&lt;br /&gt;
 11 May          20-22          2               150           Formulated advisory limitations&lt;br /&gt;
 16 May          15-18          3               153           Aardrock meeting / Advisory System&lt;br /&gt;
 19 May &lt;br /&gt;
 20 May          12-17          5               158           Brushing up on statistics.&lt;br /&gt;
 21 May          14-20          6               164           Ditto. Reviewed (biased) estimators, marginalization, maximum likelihood functions, and Bayesian inference (further)&lt;br /&gt;
 22 May          14-15          1               165           Studied statistics.&lt;br /&gt;
 23 May          14-17+21-01    7               172           Solved statistical problem! See [[Condition Effect Learning]], more info to follow&lt;br /&gt;
 24 May          10-11          1               173           Looked up a number of small things&lt;br /&gt;
 27 May          12-18          6               179           Bayesian inference.&lt;br /&gt;
 30 May          13-18          5               184           Aardrock meeting + advisory system&lt;br /&gt;
 31 May          16-24          8               192           E-mailed, worked out formulas, adjustments to the wiki&lt;br /&gt;
 1 June          13-17          4               196           Ditto&lt;br /&gt;
 2 June          14-17          3               199           Worked on Learning System wiki page, mailed, etc.&lt;br /&gt;
 3 June          21-01          4               203           Finished Learning System wiki page and compiled an explanatory image&lt;br /&gt;
 11 June         11-14          3               206           Programming&lt;br /&gt;
 12 June         12-16          4               210           Programming&lt;br /&gt;
 13 June         16-1730        1.5             211.5         Programming&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=LogboekSjoerdVanKreel&amp;diff=2316</id>
		<title>LogboekSjoerdVanKreel</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=LogboekSjoerdVanKreel&amp;diff=2316"/>
		<updated>2006-06-13T15:39:37Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align:left;&amp;quot; border=&amp;quot;1px solid grey&amp;quot; cellspacing=0 &lt;br /&gt;
|- &lt;br /&gt;
||'''Date'''||'''Activity'''||'''From'''||'''To'''||'''Duration'''||'''Cumulative'''&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-08||Meeting||13:00||16:00||03:00||03:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-13||Meeting||15:00||17:00||02:00||05:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-14||Meeting||15:00||17:00||02:00||07:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-20||Meeting||15:00||17:00||02:00||09:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-20||Installing/testing Eclipse + SVN||20:00||20:45||00:45||09:45&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-20||Going through Java code conventions||20:45||21:30||00:45||10:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-21||Working on story update check||09:00||13:00||04:00||14:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-21||Working on story update check||13:30||15:00||01:30||16:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-21||Meeting||15:00||17:00||02:00||18:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-21||Working on story update check||12:30||17:30||05:00||21:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-21||Working on story update check||20:00||21:00||01:00||22:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-24||Adding design thought and agenda, editing minutes||17:30||18:30||01:00||23:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-26||Working on story update check||22:00||00:00||02:00||25:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-27||Working on story update check||12:00||15:00||03:00||28:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-27||Meeting||15:00||17:00||02:00||30:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-27||Working on story installation||21:30||23:30||02:00||32:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-28||Working on story installation||13:00||15:00||02:00||34:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-02-28||Meeting||15:00||17:00||02:00||36:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-01||Working on story update demo||11:00||17:30||06:30||42:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-02||Working on story update demo||19:30||22:30||03:00||45:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-03||Working on story update demo||11:00||19:00||08:00||53:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-06||Testing story update demo||13:00||15:00||02:00||55:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-06||Meeting||15:00||17:00||02:00||57:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-13||Reading user manual 1.0, thinking about RDF design||14:00||15:00||01:00||58:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-13||Meeting||15:00||17:00||02:00||60:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-20||Meeting &amp;amp; sorting out stories||15:15||16:30||01:15||61:45&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-20||Story 25: making a table of the story overview||17:45||18:15||00:30||62:15&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-21||Installing Subversion and performance review||12:00||15:15||3:15||65:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-21||Meeting||15:15||17:15||02:00||67:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-21||Installing Subversion||20:00||22:30||02:30||70:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-22||Installing Subversion||16:00||18:00||02:00||72:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-22||Installing Subversion||19:00||20:00||01:00||73:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-27||Going through Platonos||14:00||15:00||01:00||74:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-27||Meeting||15:00||16:15||01:15||75:15&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-27||Updates via Platonos||10:00||16:00||06:00||81:15&lt;br /&gt;
|- &lt;br /&gt;
||2006-03-27||Meeting||16:00||17:00||01:00||82:15&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-03||Updates via Platonos||12:00||15:00||03:00||85:15&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-03||Meeting||15:00||16:15||01:15||86:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-04||Updates via Platonos||10:00||15:30||05:30||92:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-04||Meeting||15:30||16:30||01:00||93:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-05||Updates via Platonos||11:00||17:00||06:00||99:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-07||Updates via Platonos||11:00||14:00||03:00||102:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-10||Updates via Platonos||10:30||15:30||05:00||107:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-10||Meeting||15:30||16:00||00:30||107:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-10||Updates via Platonos||16:00||17:00||01:00||108:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-11||Updates via Platonos||9:00||17:00||08:00||116:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-18||Meeting + story installation||14:00||17:00||03:00||119:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-19||story installation||16:00||19:30||03:30||123:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-24||story updates + interim presentations||11:00||17:00||06:00||129:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-25||story updates + meeting||10:30||17:00||06:30||135:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-26||including browser launcher in the Ant build||09:00||11:00||02:00||137:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-04-26||working on updates||13:00||15:00||02:00||139:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-01||publishing installer for v0.0.3||20:00||22:00||02:00||141:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-02||performance + presentation Giel||10:30||17:30||07:00||148:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-03||performance + structure||13:00||18:00||05:00||153:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-08||performance + structure||13:00||14:00||01:00||154:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-08||UI||11:00||17:30||06:30||161:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-11||UI||14:00||18:00||04:00||165:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-16||UI||11:30||17:30||06:00||172:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-22||UI/FilteredComboBox||16:00||20:00||04:00||176:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-23||UI/FilteredComboBox||11:00||18:00||07:00||183:00&lt;br /&gt;
|- &lt;br /&gt;
||2006-05-30||UI/FilteredComboBox, meeting||13:15||17:45||04:30||187:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-06-06||UI / presentation / meeting||11:30||17:30||06:00||193:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-06-12||UI||13:00||22:00||09:00||202:30&lt;br /&gt;
|- &lt;br /&gt;
||2006-06-13||UI||11:30||17:30||06:00||208:30&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2200</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2200"/>
		<updated>2006-06-05T11:17:43Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Glucose level estimation */ added activities like driving, work, exercise, sleep&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level is estimated by (1) taking the last glucose measurement and then (2) adding up the typical glycemic responses (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled, for example, as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
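As a sketch, this response can be evaluated directly (hypothetical Python; each exponential is read as a Gaussian bump, i.e. with a negative exponent, so the three terms are decaying peaks at t = b, 2b and 3b):&lt;br /&gt;

```python
import math

def food_response(t, a, b):
    # Sum of three Gaussian bumps with halving amplitude and width 0.667 * b
    bump = lambda peak, amp: amp * math.exp(-((t - peak) / (0.667 * b)) ** 2)
    return bump(b, a) + bump(2 * b, a / 2) + bump(3 * b, a / 4)
```

Here ''a'' scales the height of the response and ''b'' sets its timing.&lt;br /&gt;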
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level and activities like driving, work, exercise, sleep&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* For gestational (pregnancy) diabetes: progress of the pregnancy. Pregnancy hormones decrease insulin sensitivity.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that have influence on &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
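A sketch of this prediction (hypothetical Python; each event is represented as a pair of its response function f_k and start time t_k):&lt;br /&gt;

```python
def estimate_g2(g1, t_g1, t_g2, events):
    # g2 = g1 + sum over events of f_k(t_g2 - t_k) - f_k(t_g1 - t_k)
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# One event with a toy ramp response starting at t = 0
ramp = lambda dt: max(dt, 0.0)
g2 = estimate_g2(g1=5.0, t_g1=1.0, t_g2=3.0, events=[(ramp, 0.0)])
```

Subtracting f_k(t_g1 - t_k) removes the part of each event's response that is already reflected in the measurement g1.&lt;br /&gt;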
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, the term 'event' can denote things like an apple intake. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods: eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Less summation means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the 'bigger picture', this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progression of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, maybe from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps of creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for certain event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake: calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t = time and dg = delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the texts above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in terms of statistics. From this viewpoint, one could say that at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value, and we assign some reasonable value to it, defined by the event type (say 3 for food). The mean value &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) what we need to learn for each t; &amp;amp;theta; is what it is all about. Each event type has a whole line of &amp;amp;theta;'s, because it has one &amp;amp;theta; for each ''t''. To be able to compute &amp;amp;theta;, we need to see it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
And now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, which we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Given samples, Bayesian inference lets us compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This is explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, where each event type has an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement, with value g1, was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;. Now we take a new glucose measurement, with value g2, at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator.&lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i\!&amp;lt;/math&amp;gt; denote these posterior mean and variance values for event i (so &amp;lt;math&amp;gt;\sigma_i\!&amp;lt;/math&amp;gt; stands for the variance), and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
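As a minimal sketch (function and variable names and all numbers are illustrative assumptions; following the text above, sigma_i is read as the variance of event i's posterior), this computation looks like:

```python
# Hypothetical sketch of step 1: the helper variable a.
# Following the text, sigma_i denotes the *variance* of event i's posterior.

def helper_a(delta_g, mus, variances):
    """a = ((g2 - g1) - (mu_1 + mu_2 + ...)) / (sigma_1^2 + sigma_2^2 + ...)."""
    return (delta_g - sum(mus)) / sum(variances)

# Illustrative numbers: two events since the last measurement.
mus = [3.0, -1.0]       # posterior mean responses mu_i at t_g2 - t_ei
variances = [1.0, 3.0]  # event variances sigma_i^2
a = helper_a(4.0, mus, variances)
print(a)  # 0.5: the measured rise exceeds the expected sum by 2.0, over total variance 4.0
```

Intuitively, ''a'' measures how far the observed rise/fall deviates from the expected total response, normalised by the total variance.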
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. It also loops through composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Since time t is measured in discrete intervals, the chance that events coincide at the same t is quite high.&lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. It is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the helper variable ''a'' calculated above, we do this with a simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
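A hypothetical sketch of this split (numbers illustrative, sigma_i again read as the variance per the text); note that with these formulas the subsamples add back up to the composite sample (g2-g1):

```python
# Hypothetical sketch of step 2a: splitting the composite sample (g2 - g1)
# into one subsample per event, in proportion to each event's variance.

def subsamples(delta_g, mus, variances):
    a = (delta_g - sum(mus)) / sum(variances)  # helper variable from step 1
    return [mu + a * var for mu, var in zip(mus, variances)]

s = subsamples(4.0, [3.0, -1.0], [1.0, 3.0])
print(s)       # [3.5, 0.5]
print(sum(s))  # 4.0 -- the subsamples sum back to the composite sample
```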
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How closely it approaches the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' values, so it cannot be stressed enough that these prior values are important.&lt;br /&gt;
&lt;br /&gt;
(A possible improvement would be to store the supersample together with the event types somewhere. Old supersamples could then be reused to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of event i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
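These are the standard conjugate-update formulas for a normal mean with known observation variance. A small sketch with illustrative numbers and assumed names:

```python
# Hypothetical sketch of step 2c: normal-normal conjugate update of
# (mu_theta, sigma_theta^2) given n samples with static event variance sigma_t^2.

def update_posterior(mu_prior, var_prior, var_event, samples):
    n = len(samples)
    s_bar = sum(samples) / n                 # mean of the sample set
    denom = var_event + n * var_prior
    mu_post = (var_event * mu_prior + n * var_prior * s_bar) / denom
    var_post = (var_event * var_prior) / denom
    return mu_post, var_post

mu_post, var_post = update_posterior(2.0, 1.0, 3.0, [4.0, 4.0, 4.0])
print(mu_post, var_post)  # 3.0 0.5 -- the mean moves toward the samples, the variance shrinks
```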
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
&lt;br /&gt;
==== 2d. Update posterior f(x) ====&lt;br /&gt;
&lt;br /&gt;
Now the function f(x), or its parameters, can be updated with methods similar to those in &amp;quot;Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters&amp;quot;. The posterior function should approximate the measured samples as well as possible.&lt;br /&gt;
&lt;br /&gt;
==== 2e. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2d for each i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2199</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2199"/>
		<updated>2006-06-04T11:48:52Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* 2c. Update posterior values */ correction of formula&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the Learning System. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level estimate is produced by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can, for example, be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* For pregnancy diabetes: progress of the pregnancy. Pregnancy hormones decrease insulin sensitivity.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
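To make the food-response shape concrete, here is a hypothetical sketch of the formula above: three decaying Gaussian bumps at t = b, 2b, 3b with amplitudes a, a/2, a/4. The negative sign in the exponent is an assumption on our part (a Gaussian-shaped bump requires it), and the function name and numbers are illustrative:

```python
import math

# Hypothetical sketch of the food-intake response: three decaying Gaussian
# bumps centred at t = b, 2b, 3b. The negative exponent is assumed here so
# that each term is a bump rather than a growing exponential.
def food_response(t, a, b):
    width = 0.667 * b
    return sum((a / 2 ** k) * math.exp(-((t - (k + 1) * b) / width) ** 2)
               for k in range(3))

# Near the first bump (t = b) the response is close to the amplitude a.
peak = food_response(60, a=2.0, b=60)
```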
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for a future moment in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated response functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
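The prediction formula can be sketched as follows; the representation of events as (response function, start time) pairs and all names are assumptions for illustration:

```python
# Hypothetical sketch of the g2 estimate: the last measurement plus each
# event's response accumulated between t_g1 and t_g2.
def estimate_g2(g1, t_g1, t_g2, events):
    """events: list of (f_k, t_k) pairs, with f_k the event's estimated
    response function and t_k its (start) time."""
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# Illustrative event with a simple ramp response starting at t = 0:
ramp = lambda dt: 0.1 * max(dt, 0.0)
g2 = estimate_g2(5.0, 10.0, 40.0, [(ramp, 0.0)])  # 5.0 + (4.0 - 1.0)
```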
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a set or combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods: eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count. For insulin, this would be done by entering the medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progression of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t = time and dg = delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the texts above, we introduced the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we use Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the expected error, so we describe each function in terms of a normal (Gaussian) distribution. Each point ''t'' then maps not to one value but to two: the mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and the variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value to which we assign some reasonable constant per event type (for example, 3 for food). The mean &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) what we need to learn for each t. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
And now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type. Through Bayesian inference, with formulas we will explain shortly, we compute the posterior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Given samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This is explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, using the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1, and that we now take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator.&lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i\!&amp;lt;/math&amp;gt; denote these posterior mean and variance values for event i (so &amp;lt;math&amp;gt;\sigma_i\!&amp;lt;/math&amp;gt; stands for the variance), and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
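As a minimal sketch (function and variable names and all numbers are illustrative assumptions; following the text above, sigma_i is read as the variance of event i's posterior), this computation looks like:

```python
# Hypothetical sketch of step 1: the helper variable a.
# Following the text, sigma_i denotes the *variance* of event i's posterior.

def helper_a(delta_g, mus, variances):
    """a = ((g2 - g1) - (mu_1 + mu_2 + ...)) / (sigma_1^2 + sigma_2^2 + ...)."""
    return (delta_g - sum(mus)) / sum(variances)

# Illustrative numbers: two events since the last measurement.
mus = [3.0, -1.0]       # posterior mean responses mu_i at t_g2 - t_ei
variances = [1.0, 3.0]  # event variances sigma_i^2
a = helper_a(4.0, mus, variances)
print(a)  # 0.5: the measured rise exceeds the expected sum by 2.0, over total variance 4.0
```

Intuitively, ''a'' measures how far the observed rise/fall deviates from the expected total response, normalised by the total variance.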
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. It also loops through composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Since time t is measured in discrete intervals, the chance that events coincide at the same t is quite high.&lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. It is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the helper variable ''a'' calculated above, we do this with a simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
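A hypothetical sketch of this split (numbers illustrative, sigma_i again read as the variance per the text); note that with these formulas the subsamples add back up to the composite sample (g2-g1):

```python
# Hypothetical sketch of step 2a: splitting the composite sample (g2 - g1)
# into one subsample per event, in proportion to each event's variance.

def subsamples(delta_g, mus, variances):
    a = (delta_g - sum(mus)) / sum(variances)  # helper variable from step 1
    return [mu + a * var for mu, var in zip(mus, variances)]

s = subsamples(4.0, [3.0, -1.0], [1.0, 3.0])
print(s)       # [3.5, 0.5]
print(sum(s))  # 4.0 -- the subsamples sum back to the composite sample
```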
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How closely it approaches the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' values, so it cannot be stressed enough that these prior values are important.&lt;br /&gt;
&lt;br /&gt;
(A possible improvement would be to store the supersample together with the event types somewhere. Old supersamples could then be reused to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of event i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
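These are the standard conjugate-update formulas for a normal mean with known observation variance. A small sketch with illustrative numbers and assumed names:

```python
# Hypothetical sketch of step 2c: normal-normal conjugate update of
# (mu_theta, sigma_theta^2) given n samples with static event variance sigma_t^2.

def update_posterior(mu_prior, var_prior, var_event, samples):
    n = len(samples)
    s_bar = sum(samples) / n                 # mean of the sample set
    denom = var_event + n * var_prior
    mu_post = (var_event * mu_prior + n * var_prior * s_bar) / denom
    var_post = (var_event * var_prior) / denom
    return mu_post, var_post

mu_post, var_post = update_posterior(2.0, 1.0, 3.0, [4.0, 4.0, 4.0])
print(mu_post, var_post)  # 3.0 0.5 -- the mean moves toward the samples, the variance shrinks
```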
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
&lt;br /&gt;
==== 2d. Update posterior f(x) ====&lt;br /&gt;
&lt;br /&gt;
Now the function f(x), or its parameters, can be updated with methods similar to those in &amp;quot;Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters&amp;quot;. The posterior function should approximate the measured samples as well as possible.&lt;br /&gt;
&lt;br /&gt;
==== 2e. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2d for each i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2198</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2198"/>
		<updated>2006-06-04T10:06:49Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* 2d. Update posterior f(x) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the Learning System. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level estimate is produced by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can, for example, be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* For pregnancy diabetes: progress of the pregnancy. Pregnancy hormones decrease insulin sensitivity.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for a future moment in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated response functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a set or combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods: eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count. For insulin, this would be done by entering the medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progression of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t = time and dg = delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the texts above, we introduced the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we use Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the expected error, so we describe each function in terms of a normal (Gaussian) distribution. Each point ''t'' then maps not to one value but to two: the mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and the variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value to which we assign some reasonable constant per event type (for example, 3 for food). The mean &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) what we need to learn for each t. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
And now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type. Through Bayesian inference, with formulas we will explain shortly, we compute the posterior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Given samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This is explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, using the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1, and that we now take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator.&lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; are shorthand for these posterior mean and variance values of event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...+\mu_n)}{\sigma_1^2+\sigma_2^2+...+\sigma_n^2}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;, and also through composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Because time t is measured in discrete intervals, the chance that events fall in the same t is considerable.&lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. It is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So, to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the calculated helper variable ''a'', we do this with the simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i^2 \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these ''a posteriori'' variables are close to the ''a priori'' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
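Steps 1 and 2a can be sketched together in a short routine. This is our illustrative sketch, not code from the system; it uses the posterior variance in the subsample formula, which makes the subsamples sum exactly back to the composite sample:

```python
def split_composite_sample(s_tot, mus, sigma2s):
    """Divide a composite sample (g2 - g1) over its contributing events.

    mus[i] and sigma2s[i] are the posterior mean and variance of event i's
    glycemic response over the measurement window.
    """
    # step 1: helper variable a
    a = (s_tot - sum(mus)) / sum(sigma2s)
    # step 2a: most likely subsample per event
    return [mu + a * s2 for mu, s2 in zip(mus, sigma2s)]

# two events expected to raise glucose by 2.0 and 1.0; the measured rise is 5.0
subsamples = split_composite_sample(5.0, [2.0, 1.0], [4.0, 1.0])
print(subsamples)  # -> [3.6, 1.4]; the less certain event absorbs more surplus
```

Note how the event with the larger variance absorbs the larger share of the unexplained glucose rise, which matches the intent of weighting by uncertainty.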
&lt;br /&gt;
(A possible improvement would be to store the supersample together with the event types somewhere. Old supersamples could then be reused to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of event i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For a proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
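A sketch of the update in 2c, using the standard normal-normal conjugate formulas from DeGroot and Schervish (the variable names are ours):

```python
def update_posterior(mu_prior, sigma2_prior, sigma2_t, samples):
    """Conjugate update of a normal mean with known observation variance.

    sigma2_t is the static event variance; samples is the event's sample set.
    Returns the posterior mean and variance of theta at this time bin.
    """
    n = len(samples)
    s_bar = sum(samples) / n
    denom = sigma2_t + n * sigma2_prior
    mu_post = (sigma2_t * mu_prior + n * sigma2_prior * s_bar) / denom
    sigma2_post = (sigma2_t * sigma2_prior) / denom
    return mu_post, sigma2_post

mu, s2 = update_posterior(mu_prior=0.0, sigma2_prior=4.0, sigma2_t=4.0,
                          samples=[2.0, 2.0])
print(mu, s2)  # the mean moves toward the sample mean; the variance shrinks
```

With every extra sample the denominator grows, so the posterior variance shrinks and the estimate relies less on the prior, which is exactly the behaviour described above.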
&lt;br /&gt;
==== 2d. Update posterior f(t) ====&lt;br /&gt;
&lt;br /&gt;
Now the function f(t), or its parameters, can be updated similarly to the methods in &amp;quot;Generating of a fprior(t) or its prior parameters&amp;quot;. The posterior function should approximate the measured samples as closely as possible.&lt;br /&gt;
&lt;br /&gt;
==== 2e. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat steps 2a-2d for each event i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2196</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2196"/>
		<updated>2006-06-03T23:17:07Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* The function f(t) */  new event type: progress into pregnancy&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
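As a sketch of this food model, assuming the exponents are meant to be negative so that each term is a Gaussian bump peaking at b, 2b and 3b with halving amplitude:

```python
import math

def food_response(t, a, b):
    """Sum of three Gaussian bumps at t = b, 2b, 3b with halving amplitude.

    a scales the peak height, b sets the timing; the width is 0.667*b.
    """
    width = 0.667 * b
    return sum((a / 2 ** k) * math.exp(-((t - (k + 1) * b) / width) ** 2)
               for k in range(3))

# the response peaks near t = b (here 60 minutes) and tails off afterwards
print(food_response(60, a=2.0, b=60))   # about 2.1
```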
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* For gestational diabetes: progress of the pregnancy, since pregnancy hormones decrease insulin sensitivity.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
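This prediction formula translates directly into code; here f_k is any callable response curve for event k (the table lookup below is purely illustrative):

```python
def estimate_g2(g1, t_g1, t_g2, events):
    """Predict glucose at t_g2 from the last measurement g1 at t_g1.

    events is a list of (f_k, t_k) pairs: the event's response function and
    its (start) time. Each event contributes the part of its response that
    falls between the two measurement times.
    """
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# one event at t=0 whose response has reached +1.0 by t=30 and +2.5 by t=90
response = {30: 1.0, 90: 2.5}.get
g2 = estimate_g2(g1=5.0, t_g1=30, t_g2=90, events=[(response, 0)])
print(g2)  # -> 5.0 + (2.5 - 1.0) = 6.5
```

Subtracting f_k(t_g1 - t_k) matters: it removes the part of the event's response that was already reflected in the g1 measurement.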
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As stated above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types nullify the effect of others.&lt;br /&gt;
Another benefit of composite events is that they decrease the number of terms in the sum of g2_estimate (see above). Less summation means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the 'bigger picture', this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps of creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for certain event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake: calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above we introduced the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we use Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the expected error, so we describe each function in terms of a normal (Gaussian) distribution. Each point ''t'' then maps not to a single value but to two: the mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and the variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value to which we assign some reasonable constant per event type (for example 3 for food). The mean &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) what we need to learn for each t; each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we model it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type. Through Bayesian inference, explained below, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Given samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This is explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement had value g1 at time t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;, and that we now take a new measurement with value g2 at time t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator.&lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; denote these posterior mean and variance values for event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...+\mu_n)}{\sigma_1^2+\sigma_2^2+...+\sigma_n^2}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;, and also through the composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Since time t is measured in discrete intervals, the chance that events coincide at the same t is quite large. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. It is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. To add a sample to each single event, the system therefore needs to divide this composite sample into subsamples, one for each event. Using the helper variable ''a'' calculated above, this is done with the simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i^2 \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
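As an illustration only (the event means and variances below are made-up numbers, not from this page), steps 1 and 2a can be sketched in Python, allocating the residual of the measurement over the events in proportion to their variances:&lt;br /&gt;

```python
import math

# Hypothetical sketch of steps 1 and 2a: split the composite sample
# (g2 - g1) into one subsample per event. All numbers are illustrative.

def split_composite_sample(delta_g, means, variances):
    """Shift each event mean in proportion to its variance so the
    subsamples add back up to the measured rise/fall delta_g."""
    a = (delta_g - sum(means)) / sum(variances)  # helper variable a
    return [m + a * v for m, v in zip(means, variances)]

means = [2.0, -1.5, 0.5]     # assumed posterior means mu_i per event
variances = [3.0, 2.0, 1.0]  # assumed static variances sigma_i^2 per event
subsamples = split_composite_sample(4.0, means, variances)  # g2 - g1 = 4.0
assert math.isclose(sum(subsamples), 4.0)  # the split preserves the total
```

Allocating the residual in proportion to each event's variance is the maximum-likelihood split under the Gaussian assumptions above.&lt;br /&gt;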
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(A possible improvement would be to store the composite sample together with the event types somewhere. Old composite samples could then be reused to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For the proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
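A minimal sketch of this step, following the standard normal-normal conjugate update from DeGroot and Schervish; the prior and sample values are made up for illustration:&lt;br /&gt;

```python
# Hypothetical sketch of step 2c: conjugate-normal update of theta at one t.
# sigma_t2 is the static event variance; (mu_prior, var_prior) describe theta.

def update_posterior(mu_prior, var_prior, sigma_t2, samples):
    n = len(samples)
    s_bar = sum(samples) / n  # sample mean
    denom = sigma_t2 + n * var_prior
    mu_post = (sigma_t2 * mu_prior + n * var_prior * s_bar) / denom
    var_post = (sigma_t2 * var_prior) / denom
    return mu_post, var_post

# Illustrative numbers: prior N(1.0, 4.0), event variance 3.0, three subsamples.
mu_post, var_post = update_posterior(1.0, 4.0, 3.0, [2.0, 2.5, 1.5])
# With more samples, the posterior mean moves toward the sample mean
# and the posterior variance shrinks.
```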
&lt;br /&gt;
==== 2d. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2195</id>
		<title>LogboekDurkKingma</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2195"/>
		<updated>2006-06-03T22:37:46Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Date            Time           Duration        Cumulative    Activity&lt;br /&gt;
 --------------- -------------- --------------- ------------- ----------------------------------------&lt;br /&gt;
 8 February      13-17          4               4             First meeting and task division&lt;br /&gt;
 11 February      9-11          2               6             Reading up on Agile Development&lt;br /&gt;
 13 February     12-14          2               8             Investigated Cruise Control (delayed until now by illness).&lt;br /&gt;
                 15-17          2               10            Meeting&lt;br /&gt;
 14 February     15-17          2               12            XP Game&lt;br /&gt;
 16 February      9-12          3               15            Investigated WiredReach fundamentals&lt;br /&gt;
 20 February     15-17          2               17            Meeting&lt;br /&gt;
                 20-22          2               19            Wrote up minutes / put them on the web&lt;br /&gt;
 21 February      9-15          6               25            Researched the WiredReach/Eclipse platform (with Martijn)&lt;br /&gt;
 22 February      9-10          1               26            Built wiki logbook&lt;br /&gt;
                 10-14          4               30            Researched the WiredReach/Eclipse platform&lt;br /&gt;
 25 February     12-18          6               36            Implemented WiredReach trial&lt;br /&gt;
 27 February      9-15          6               42            Compared JXTA/other protocols&lt;br /&gt;
                 15-17          2               44            Meeting&lt;br /&gt;
 28 February     15-17          4               45            AardRock meeting&lt;br /&gt;
 5 March          9-15          6               51            Research on WiredReach/JXTA&lt;br /&gt;
                 15-17          2               53            Meeting&lt;br /&gt;
 6 March         15-17          2               55            AardRock meeting&lt;br /&gt;
                 12-16          4               59            Lunch, discussed implementation of the P2P part&lt;br /&gt;
 13 March        15-17          2               61            Meeting&lt;br /&gt;
 16 March        9-13           4               64            Worked out/designed the P2P communication data model&lt;br /&gt;
 21 March        15-17          2               67            AardRock meeting&lt;br /&gt;
 23 March        9-15           6               73            De-Eclipsed WiredReach, worked on RDF Black Box, investigated Jena.&lt;br /&gt;
 27 March        10-17          3               76            Worked out RDF Black Box, meeting &lt;br /&gt;
 28 March        15-17          2               78            AardRock meeting&lt;br /&gt;
 18 April        14-17          3               81            Aardrock meeting / trial presentation&lt;br /&gt;
 19 April        15-17          2               83            Discussed advisory system with Just&lt;br /&gt;
 25 April        9-15           6               89            Neural network meeting / Cheetah advisory system&lt;br /&gt;
                 15-17          2               91            Cheetah meeting&lt;br /&gt;
 27 April        9-17           8               99            Bayesian network / neural network / algorithm research&lt;br /&gt;
 28 April        9-11           2               101           Investigated other advisory software / diaries / mailed Harold de Valk&lt;br /&gt;
 28 April        11-12          1               102           Updated logbook&lt;br /&gt;
 29 April        12-14          2               104           Advisory system (AIDA, neural networks, kernel machines)&lt;br /&gt;
 30 April        12-15          3               107           Advisory system (studied AIDA, alternatives, sketches)&lt;br /&gt;
 1 May           12-14          2               109           Advisory system (AIDA, kernel machines)&lt;br /&gt;
 2 May           10-19          9               118           Advisory system (put on wiki, made diagrams, discussed with various people)&lt;br /&gt;
 3 May           12-13          2               120           Emails, went through AI book on fuzzy logic&lt;br /&gt;
 4 May           11-14          3               123           Went through AI book, read: http://www.fao.org/docrep/w8079e/w8079e00.htm&lt;br /&gt;
                 14-17          3               126           Further literature research on the effects of food/insulin/activities&lt;br /&gt;
 6 May           11-5           6               132           [[Condition Effect Learning]]&lt;br /&gt;
 7 May           5-6            1               133           Solved major condition effect learning problem.&lt;br /&gt;
                 11-15          4               136           Worked out the formula (&amp;quot;Kingma's Theorem&amp;quot;) and put it on the wiki.&lt;br /&gt;
                                                              Thought about better solutions to the condition effect learning problem.&lt;br /&gt;
 8 May           20-24          4               140           Received useful information about a patient (logbooks, medical dossier, etc.)&lt;br /&gt;
 9 May           10-16          6               146           Worked on statistical problem, talked with Giel about the GUI and Peter de Waal about Bayesian learning&lt;br /&gt;
 10 May          15-17          2               148           Aardrock meeting. Explained some theories.&lt;br /&gt;
 11 May          20-22          2               150           Formulated advisory limitations&lt;br /&gt;
 16 May          15-18          3               153           Aardrock meeting / advisory system&lt;br /&gt;
 19 May &lt;br /&gt;
 20 May          12-17          5               158           Brushing up on statistics.&lt;br /&gt;
 21 May          14-20          6               164           Idem. Reviewed (biased) estimators, marginalization, maximum likelihood functions, and Bayesian inference (further)&lt;br /&gt;
 22 May          14-15          1               165           Studied statistics.&lt;br /&gt;
 23 May          14-17+21-01    7               172           Solved the statistical problem! See [Condition Effect Learning], more info to follow&lt;br /&gt;
 24 May          10-11          1               173           Looked up a number of small things&lt;br /&gt;
 27 May          12-18          6               179           Bayesian inference.&lt;br /&gt;
 30 May          13-18          5               184           Aardrock meeting + advisory system&lt;br /&gt;
 31 May          16-24          8               192           Emailed, worked out formulas, made adjustments to the wiki&lt;br /&gt;
 1 June          13-17          4               196           Idem&lt;br /&gt;
 2 June          14-17          3               199           Worked on Learning System wiki page, emailed, etc.&lt;br /&gt;
 3 June          21-01          4               203           Finished Learning System wiki page and compiled an explanatory image&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=User:DurkKingma&amp;diff=2194</id>
		<title>User:DurkKingma</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=User:DurkKingma&amp;diff=2194"/>
		<updated>2006-06-03T22:34:58Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Logboek: [[LogboekDurkKingma]]&lt;br /&gt;
&lt;br /&gt;
==Personal Data==&lt;br /&gt;
&lt;br /&gt;
* Born May 30, 1983&lt;br /&gt;
* Address: Weerdsingel Oostzijde 1bis, 3514AA, Utrecht, The Netherlands&lt;br /&gt;
* Contact phone: +31641511577 &lt;br /&gt;
&lt;br /&gt;
==Education==&lt;br /&gt;
&lt;br /&gt;
* September 2002 until ...: Computer science at Utrecht University. Minors: Medical Technical Science &lt;br /&gt;
&lt;br /&gt;
==Extracurricular Interests==&lt;br /&gt;
&lt;br /&gt;
* 2006: Organizer of the 123rd Varsity (national student rowing championships)&lt;br /&gt;
* 2006: Organizer of the 1st Nacht van de Fooi (third-world fundraising)&lt;br /&gt;
* Sports (Rowing, Hockey, Golf, Hiking)&lt;br /&gt;
* Student life (USC/going out)&lt;br /&gt;
* Traveling&lt;br /&gt;
&lt;br /&gt;
==Practical Computing Skills==&lt;br /&gt;
&lt;br /&gt;
* Java&lt;br /&gt;
* (X)HTML, CSS&lt;br /&gt;
* PHP, SQL&lt;br /&gt;
* Haskell, Z80 Assembly, other 'educational' languages &lt;br /&gt;
&lt;br /&gt;
==Work Experience==&lt;br /&gt;
&lt;br /&gt;
* (2004) KPN ADSL&lt;br /&gt;
* (2005-...) Operator Groep Delft&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2193</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2193"/>
		<updated>2006-06-03T22:31:30Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: added glycemic response estimation image&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the Learning System. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
[[Image:Summing_events.png|thumb|right|Example of glucose level estimation.]]&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
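As a sketch only (the parameter values below are made up, and the exponents are taken as negative so that each term is a Gaussian-shaped bump that decays away from its peak), the food-intake model can be evaluated like this:&lt;br /&gt;

```python
import math

# Hypothetical sketch of the food-intake response: three Gaussian-shaped
# bumps at t = b, 2b, 3b with heights a, a/2, a/4 and width 0.667*b.

def food_response(t, a, b):
    width = 0.667 * b
    return sum(
        (a / 2 ** k) * math.exp(-((t - (k + 1) * b) / width) ** 2)
        for k in range(3)
    )

# Illustrative call: with a = 6.0 and b = 4.0 (intervals), the response
# near t = b is close to a, plus small contributions from the later bumps.
peak = food_response(4.0, 6.0, 4.0)
```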
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
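The estimation formula above can be sketched as follows; the two toy response functions are hypothetical stand-ins for the learned f&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt;, not the real models:&lt;br /&gt;

```python
# Hypothetical sketch of the g2 estimate: last measurement g1 plus the
# change of each event's response between t_g1 and t_g2.

def estimate_g2(g1, t_g1, t_g2, events):
    """events: list of (f_k, t_k) pairs; times are in 15-minute intervals."""
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# Toy responses: food raises glucose 0.5 per interval after its start,
# insulin lowers it 0.3 per interval after its start.
food = lambda t: 0.5 * max(t, 0)
insulin = lambda t: -0.3 * max(t, 0)
g2 = estimate_g2(5.0, 10, 14, [(food, 10), (insulin, 12)])  # g1 = 5.0
```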
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As mentioned above, an 'event' can be something like apple intake. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types in combination leads to a different glycemic response than the sum of the individual foods; certain food types nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Less summation means less uncertainty about the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, it would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better sample estimates. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t = time and dg = delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t). Because we use Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the expected error, so we describe each function as a normal (Gaussian) distribution. Each point ''t'' thus maps not to one value but to two: the mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and the variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value to which we assign some reasonable value defined by the event type (for example, 3 for food). The mean value &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) what we need to learn for each t. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
And now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type. Through Bayesian inference, explained below, we compute the posterior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement had value g1 at time t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;, and we now take a new glucose measurement with value g2 at time t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that affect glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; denote these posterior mean and variance values for event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...+\mu_n)}{\sigma_1^2+\sigma_2^2+...+\sigma_n^2}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;, and also through the composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Since time t is measured in discrete intervals, the chance that events coincide at the same t is quite large. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. It is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. To add a sample to each single event, the system therefore needs to divide this composite sample into subsamples, one for each event. Using the helper variable ''a'' calculated above, this is done with the simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i^2 \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
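As an illustration only (the event means and variances below are made-up numbers, not from this page), steps 1 and 2a can be sketched in Python, allocating the residual of the measurement over the events in proportion to their variances:&lt;br /&gt;

```python
import math

# Hypothetical sketch of steps 1 and 2a: split the composite sample
# (g2 - g1) into one subsample per event. All numbers are illustrative.

def split_composite_sample(delta_g, means, variances):
    """Shift each event mean in proportion to its variance so the
    subsamples add back up to the measured rise/fall delta_g."""
    a = (delta_g - sum(means)) / sum(variances)  # helper variable a
    return [m + a * v for m, v in zip(means, variances)]

means = [2.0, -1.5, 0.5]     # assumed posterior means mu_i per event
variances = [3.0, 2.0, 1.0]  # assumed static variances sigma_i^2 per event
subsamples = split_composite_sample(4.0, means, variances)  # g2 - g1 = 4.0
assert math.isclose(sum(subsamples), 4.0)  # the split preserves the total
```

Allocating the residual in proportion to each event's variance is the maximum-likelihood split under the Gaussian assumptions above.&lt;br /&gt;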
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(A possible improvement would be to store the composite sample together with the event types somewhere. Old composite samples could then be reused to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For the proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
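A minimal sketch of this step, following the standard normal-normal conjugate update from DeGroot and Schervish; the prior and sample values are made up for illustration:&lt;br /&gt;

```python
# Hypothetical sketch of step 2c: conjugate-normal update of theta at one t.
# sigma_t2 is the static event variance; (mu_prior, var_prior) describe theta.

def update_posterior(mu_prior, var_prior, sigma_t2, samples):
    n = len(samples)
    s_bar = sum(samples) / n  # sample mean
    denom = sigma_t2 + n * var_prior
    mu_post = (sigma_t2 * mu_prior + n * var_prior * s_bar) / denom
    var_post = (sigma_t2 * var_prior) / denom
    return mu_post, var_post

# Illustrative numbers: prior N(1.0, 4.0), event variance 3.0, three subsamples.
mu_post, var_post = update_posterior(1.0, 4.0, 3.0, [2.0, 2.5, 1.5])
# With more samples, the posterior mean moves toward the sample mean
# and the posterior variance shrinks.
```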
&lt;br /&gt;
==== 2d. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=File:Summing_events.png&amp;diff=2192</id>
		<title>File:Summing events.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=File:Summing_events.png&amp;diff=2192"/>
		<updated>2006-06-03T22:29:25Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: Example of summing events using the bayesian learning system.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Example of summing events using the bayesian learning system.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=User:DurkKingma&amp;diff=2191</id>
		<title>User:DurkKingma</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=User:DurkKingma&amp;diff=2191"/>
		<updated>2006-06-03T21:26:00Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Logboek: [[LogboekDurkKingma]]&lt;br /&gt;
&lt;br /&gt;
==Personal Data==&lt;br /&gt;
&lt;br /&gt;
* Born May 30, 1983&lt;br /&gt;
* Address: Weerdsingel Oostzijde 1bis, 3514AA, Utrecht&lt;br /&gt;
* Contact phone: +31641511577 &lt;br /&gt;
&lt;br /&gt;
==Education==&lt;br /&gt;
&lt;br /&gt;
* September 2002 until ...: Computer science at Utrecht University. Minors: Medical Technical Science &lt;br /&gt;
&lt;br /&gt;
==Hobbies / Extracurricular stuff==&lt;br /&gt;
&lt;br /&gt;
* 2006: Organizer of the 123rd Varsity (national student rowing championships)&lt;br /&gt;
* 2006: Organizer of the 1st Nacht van de Fooi (third-world fundraising)&lt;br /&gt;
* Sports (Rowing, Hockey, Golf, Hiking)&lt;br /&gt;
* Student life (USC/going out)&lt;br /&gt;
* Traveling&lt;br /&gt;
&lt;br /&gt;
==Skills==&lt;br /&gt;
&lt;br /&gt;
* Java&lt;br /&gt;
* (X)HTML, CSS&lt;br /&gt;
* PHP, SQL&lt;br /&gt;
* Haskell, Z80 Assembly, other 'educational' languages &lt;br /&gt;
&lt;br /&gt;
==Work Experience==&lt;br /&gt;
&lt;br /&gt;
* (2004) KPN ADSL&lt;br /&gt;
* (2005-...) Operator Groep Delft&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2190</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2190"/>
		<updated>2006-06-03T21:16:26Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Bayesian Inference */ more corrections on posterior formula&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the Learning System. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as a sum of decaying Gaussian bumps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
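As a sketch (in Python, with hypothetical names; the negative exponent needed for the bumps to decay and the 0.667*b width are read off the formula above, and t is in 15-minute intervals):&lt;br /&gt;

```python
import math

def food_response(t, a, b):
    # Sum of three decaying Gaussian bumps: peaks of height a, a/2, a/4
    # at times b, 2b, 3b, each with width 0.667*b.
    width = 0.667 * b
    return sum(
        (a / 2 ** k) * math.exp(-(((t - (k + 1) * b) / width) ** 2))
        for k in range(3)
    )

# At t = b the first bump peaks, so f(b) is approximately a.
value_at_peak = food_response(t=4, a=3.0, b=4)
```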
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As told above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, let &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; be the glucose level to be estimated, and let &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; be the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of these events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of each event. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
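A minimal sketch of this estimate in Python (names are illustrative, not from any actual implementation):&lt;br /&gt;

```python
def g2_estimate(g1, t_g1, t_g2, events):
    # events: list of (f_k, t_k) pairs, where f_k is the event's estimated
    # response function and t_k its start time.  Each term adds the part of
    # the response that falls between the last measurement and t_g2.
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# Toy example: one event whose response rises by 0.5 per interval after t_k.
def rise(t):
    return 0.5 * max(t, 0)

prediction = g2_estimate(g1=5.0, t_g1=0, t_g2=4, events=[(rise, 0)])
# rise(4) - rise(0) = 2.0, so prediction = 7.0
```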
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As told above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods. Eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another benefit of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty about the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t). This is called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for certain event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the text above, we wrote about glycemic response functions f(t), like the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value, and we assign some reasonable value to it, defined by the event type (for example 3 for food). The mean value &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and the only) thing we need to learn for each t. &amp;amp;theta; is what it is all about. And each event type has a whole series of &amp;amp;theta;'s, because it has one &amp;amp;theta; for each ''t''. To be able to compute &amp;amp;theta;, we need to treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
And now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, whose formulas we will explain shortly, we will compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1. Now we take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a posterior &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; is shorthand for these posterior mean and variance values for event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. It also loops through composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Time t is measured in intervals, so the chance that events happen at the same t is quite high. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This sample is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the calculated helper variable ''a'', we do this with the following simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these ''a posteriori'' variables are close to the ''a priori'' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
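Steps 1 and 2a can be sketched together as follows (a hypothetical helper; the text leaves the &amp;amp;sigma; in the subsample formula slightly ambiguous, so this sketch reads it as the variance, consistent with the denominator of ''a'' — an assumption on my part, which makes the subsamples sum exactly back to (g2-g1)):&lt;br /&gt;

```python
def split_composite_sample(g1, g2, posteriors):
    # posteriors: list of (mu_i, var_i), the posterior mean/variance of each
    # event's response at t_g2.  The residual (g2-g1) - sum(mu_i) is divided
    # over the events in proportion to their variance.
    residual = (g2 - g1) - sum(mu for mu, _ in posteriors)
    a = residual / sum(var for _, var in posteriors)
    return [mu + a * var for mu, var in posteriors]

# Two events expected to raise glucose by 1.0 each, but (g2-g1) = 3.0:
subsamples = split_composite_sample(5.0, 8.0, [(1.0, 2.0), (1.0, 1.0)])
# The extra 1.0 is allocated 2:1 by variance, and the subsamples sum to 3.0.
```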
&lt;br /&gt;
(An improvement might be to store the composite sample together with the event types somewhere. Old composite samples can then be reused to compute even more likely posterior distributions. This whole routine can then be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This is then used to update the posterior values of i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar s&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar s}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
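These are the standard conjugate-normal update formulas for the mean of a normal distribution with known variance (as in DeGroot and Schervish); a sketch in Python, with hypothetical names:&lt;br /&gt;

```python
def update_posterior(mu_prior, var_prior, samples, var_event):
    # Conjugate update for the mean of a normal distribution whose
    # observation variance var_event (the static event variance) is known.
    n = len(samples)
    s_bar = sum(samples) / n
    denom = var_event + n * var_prior
    mu_post = (var_event * mu_prior + n * var_prior * s_bar) / denom
    var_post = (var_event * var_prior) / denom
    return mu_post, var_post

# Prior belief 0 +/- 1; four samples with mean 2 and event variance 1
# pull the posterior mean most of the way toward the data.
mu_post, var_post = update_posterior(0.0, 1.0, [2.0, 2.0, 2.0, 2.0], 1.0)
# mu_post = 1.6, var_post = 0.2
```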
&lt;br /&gt;
==== 2d. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2189</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2189"/>
		<updated>2006-06-03T21:06:44Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; = */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give near-future glucose level estimates. The current glucose level estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as a sum of decaying Gaussian bumps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As told above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, let &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; be the glucose level to be estimated, and let &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; be the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of these events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of each event. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As told above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods. Eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another benefit of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty about the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t). This is called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for certain event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the text above, we wrote about glycemic response functions f(t), like the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value, and we assign some reasonable value to it, defined by the event type (for example 3 for food). The mean value &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and the only) thing we need to learn for each t. &amp;amp;theta; is what it is all about. And each event type has a whole series of &amp;amp;theta;'s, because it has one &amp;amp;theta; for each ''t''. To be able to compute &amp;amp;theta;, we need to treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
And now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, whose formulas we will explain shortly, we will compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1. Now we take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a posterior &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2\!&amp;lt;/math&amp;gt; is shorthand for these posterior mean and variance values for event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. It also loops through composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Time t is measured in intervals, so the chance that events happen at the same t is quite high. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This sample is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the calculated helper variable ''a'', we do this with the following simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \times \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these ''a posteriori'' variables are close to the ''a priori'' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(An improvement might be to store the composite sample together with the event types somewhere. Old composite samples can then be reused to compute even more likely posterior distributions. This whole routine can then be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to the event's sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This is then used to update the posterior values of i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar x_n&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar x_n}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
&lt;br /&gt;
==== 2d. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2188</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2188"/>
		<updated>2006-06-03T21:04:30Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Bayesian Inference */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give near-future glucose level estimates. The current glucose level estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as a sum of decaying Gaussian bumps:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
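The category model above can be sketched in Python (a hedged sketch: the function name is ours, and we read the exponent as negative so that each term is a Gaussian bump peaking near t = b, 2b, 3b):&lt;br /&gt;

```python
import math

def food_response(t, a, b):
    # Sketch of the food-category response: three Gaussian bumps with
    # amplitudes a, a/2, a/4, centred at t = b, 2b, 3b, width 0.667*b.
    # The function name and the negative exponent are our assumptions.
    return sum(
        (a / 2 ** k) * math.exp(-(((t - (k + 1) * b) / (0.667 * b)) ** 2))
        for k in range(3)
    )
```

The response peaks around t = b and decays through the later, smaller bumps.&lt;br /&gt;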
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
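The estimation formula above can be sketched as follows (a minimal sketch; representing events as (response function, start time) pairs is our assumption):&lt;br /&gt;

```python
def estimate_g2(g1, t_g1, t_g2, events):
    # g2 estimate: last measurement g1 plus, per event, the response
    # accumulated between t_g1 and t_g2.
    # `events` is a list of (f_k, t_k) pairs: the event's glycemic-response
    # function and its start time (hypothetical representation).
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)
```

For example, with a toy linear response f(t) = 0.5*t for a single event starting at t = 0, estimate_g2(5.0, 1, 3, [(lambda t: 0.5*t, 0)]) yields 6.0.&lt;br /&gt;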
== On events ==&lt;br /&gt;
&lt;br /&gt;
As stated above, an 'event' can be something like apple intake. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count. For insulin, this would be done by entering medication information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are themselves estimates. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL for fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, one could say that at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value to which we assign some reasonable value defined by the event type (say, 3 for food). The mean value &amp;amp;mu; is the variable to be estimated, i.e. the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and the only) thing we need to learn for each t; &amp;amp;theta; is what it is all about. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as normally distributed too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type. Through Bayesian inference, which we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was g1 at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;. Now we take a new glucose measurement g2 at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator.&lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i\!&amp;lt;/math&amp;gt; is shorthand for these posterior mean and variance values of event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step is looped through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Furthermore, it loops through the composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Time t is measured in intervals, so the chance that events happen at the same t is quite high.&lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This sample is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the helper variable ''a'' calculated above, we do this with this simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' variables, so it can't be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(Maybe an improvement would be to store the supersample in combination with the event types somewhere. Old supersamples can then be used again to compute even more likely posterior distributions. This whole routine can then be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
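Steps 1 and 2a together can be sketched as follows (assuming, following the shorthand note in step 1, that the sigma_i in s_i = mu_i + a*sigma_i stands for the posterior variance, so the subsamples add up exactly to the composite sample):&lt;br /&gt;

```python
def split_composite_sample(g1, g2, posteriors):
    # Step 1: helper variable a; step 2a: one subsample per event.
    # `posteriors` is a list of (mu_i, var_i) pairs (hypothetical structure).
    total = g2 - g1
    a = (total - sum(mu for mu, _ in posteriors)) / sum(v for _, v in posteriors)
    return [mu + a * v for mu, v in posteriors]
```

With this reading, the subsamples always sum to (g2 - g1), so no part of the observation is lost.&lt;br /&gt;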
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to its sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This is then used to update the posterior values of i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar x_n&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar x_n}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
&lt;br /&gt;
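The 2c update is the standard conjugate normal update with known observation variance, as in DeGroot and Schervish; a minimal sketch (function and parameter names are ours):&lt;br /&gt;

```python
def update_posterior(mu_prior, var_prior, samples, var_event):
    # Posterior mean/variance of theta after n samples, with known
    # observation variance var_event (the static sigma_t^2).
    n = len(samples)
    xbar = sum(samples) / n
    denom = var_event + n * var_prior
    mu_post = (var_event * mu_prior + n * var_prior * xbar) / denom
    var_post = (var_event * var_prior) / denom
    return mu_post, var_post
```

As n grows, the posterior mean is pulled from the prior mean towards the sample mean, and the posterior variance shrinks.&lt;br /&gt;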
==== 2d. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2187</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2187"/>
		<updated>2006-06-03T21:03:33Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Learning System, ignite your engine! */ corrections&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea of the Learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As stated above, an 'event' can be something like apple intake. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count. For insulin, this would be done by entering medication information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are themselves estimates. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL for fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, one could say that at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value to which we assign some reasonable value defined by the event type (say, 3 for food). The mean value &amp;amp;mu; is the variable to be estimated, i.e. the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and the only) thing we need to learn for each t; &amp;amp;theta; is what it is all about. Each event type has a whole series of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we treat it as normally distributed too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type. Through Bayesian inference, which we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t of each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n\}\!&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was g1 at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;. Now we take a new glucose measurement g2 at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator.&lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i\!&amp;lt;/math&amp;gt; is shorthand for these posterior mean and variance values of event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}\!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step is looped through all events i in &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. Furthermore, it loops through the composite events &amp;lt;math&amp;gt;\{e_a+e_b+e_c,e_d+e_e,...\}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Time t is measured in intervals, so the chance that events happen at the same t is quite high.&lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user took a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This sample is a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;\{e_1,e_2,...,e_n\}\!&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the helper variable ''a'' calculated above, we do this with this simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \sigma_i \!&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}\!&amp;lt;/math&amp;gt;, it is still an estimate. How close it comes to the 'real' value of &amp;lt;math&amp;gt;s_i\!&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2\!&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' variables, so it can't be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(Maybe an improvement would be to store the supersample in combination with the event types somewhere. Old supersamples can then be used again to compute even more likely posterior distributions. This whole routine can then be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we can add it to its sample set: &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;. This is then used to update the posterior values of i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar x_n&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;\{s_1,s_2,...,s_n,s_i\}\!&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar x_n}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
&lt;br /&gt;
==== 2d. Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2186</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2186"/>
		<updated>2006-06-03T20:56:05Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Bayesian Inference */ concluded last text.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea of the Learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
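The prediction formula above can be sketched directly; the event-list representation (pairs of a response function and a start time) is a hypothetical encoding, not prescribed by the text.&lt;br /&gt;

```python
def estimate_g2(g1, t_g1, t_g2, events):
    """Predict glucose at t_g2 from the last measurement g1 at t_g1.
    events: list of (f_k, t_k) pairs, where f_k is the glycemic
    response function of event k and t_k its start time. Each term
    adds the response accumulated between t_g1 and t_g2."""
    return g1 + sum(f_k(t_g2 - t_k) - f_k(t_g1 - t_k) for f_k, t_k in events)
```

With a toy linear response f(dt) = 0.1*dt starting at t=0, a measurement g1=5.0 at t=10 yields an estimate of 6.0 at t=20.&lt;br /&gt;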
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, the term 'event' can refer to things like apple intake. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or gets a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are themselves estimates. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
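The attributes listed above can be collected in a small record; this is a sketch with hypothetical field names.&lt;br /&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

@dataclass
class EventType:
    """One learnable event type: an a priori response curve, the
    collected samples, and the a posteriori curve inferred from them."""
    f_prior: Callable[[float], float]                  # a priori f_prior(t)
    samples: List[Tuple[float, float]] = field(default_factory=list)  # (t, dg)
    f_post: Optional[Callable[[float], float]] = None  # a posteriori f_post(t)
```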
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, one could say that at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. The variance &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is a static value, and we assign some reasonable value to it, defined by the event type (e.g. 3 for food). The mean value &amp;amp;mu; is the variable to be estimated, i.e. the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) the thing we need to learn for each t; &amp;amp;theta; is what it is all about. Each event type has a whole series of &amp;amp;theta;'s, because it has one &amp;amp;theta; for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, whose formulas we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;{s_1,s_2,...,s_n}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was g1 at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;. Now we do a new glucose measurement g2 at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event i has a &amp;lt;math&amp;gt;f_i(t_{g2}-t_{ei}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_i \pm \sigma_i^2&amp;lt;/math&amp;gt; denote these posterior mean and variance values for event i, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
&lt;br /&gt;
This step loops through all events i in &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. Furthermore, it loops through composite events &amp;lt;math&amp;gt;{e_a+e_b+e_c,e_d+e_e,...}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Time t is measured in intervals, so the chance that events happen at the same t is quite high. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; ====&lt;br /&gt;
&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; for event i. What is this? The user did a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;. This is actually a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the calculated helper variable ''a'', we do this with a simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;s_i = \mu_i + a \sigma_i^2&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, ask me.&lt;br /&gt;
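Steps 1 and 2a together distribute the observed composite sample over the contributing events; here is a minimal sketch, assuming the split uses the squared variances (so that the subsamples sum exactly to the observed change).&lt;br /&gt;

```python
def split_composite_sample(g_diff, means, variances):
    """Divide the observed glucose change g_diff = g2 - g1 into one
    subsample per event. Step 1 computes the shared helper a; step 2a
    shifts each event's posterior mean in proportion to its variance."""
    a = (g_diff - sum(means)) / sum(variances)
    return [mu + a * var for mu, var in zip(means, variances)]
```

By construction the subsamples add back up to g_diff.&lt;br /&gt;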
&lt;br /&gt;
While &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; is the most likely subsample of &amp;lt;math&amp;gt;s_{tot}&amp;lt;/math&amp;gt;, it is still an estimate. How closely it approximates the 'real' value of &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; depends on the precision of the posterior variables &amp;lt;math&amp;gt;\mu_{\theta,post} \pm \sigma_{\theta,post}^2&amp;lt;/math&amp;gt;. As written above, the initial values of these 'a posteriori' variables are close to the 'a priori' variables, so it cannot be stressed enough that these prior variables are important.&lt;br /&gt;
&lt;br /&gt;
(An improvement might be to store the composite sample together with the event types somewhere. Old composite samples could then be used again to compute even more likely posterior distributions, and this whole routine could be iterated over all glucose measurements.)&lt;br /&gt;
&lt;br /&gt;
==== 2b. Add &amp;lt;math&amp;gt;s_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt;'s sample set ====&lt;br /&gt;
&lt;br /&gt;
Now that we have a new subsample, we add it to the event's sample set: &amp;lt;math&amp;gt;{s_1,s_2,...,s_n,s_i}&amp;lt;/math&amp;gt;. This set is then used to update the posterior values of event i.&lt;br /&gt;
&lt;br /&gt;
==== 2c. Update posterior values ====&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;math&amp;gt;\sigma_t^2&amp;lt;/math&amp;gt; is the static event variance, and &amp;lt;math&amp;gt;\bar x_n&amp;lt;/math&amp;gt; is the mean value of &amp;lt;math&amp;gt;{s_1,s_2,...,s_n,s_i}&amp;lt;/math&amp;gt;, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_t^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2 \bar x_n}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_t^2\sigma_{\theta, prior}^2}{\sigma_t^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For proof, see Morris H. DeGroot and Mark J. Schervish, ''Probability and Statistics'', third edition, p. 330.&lt;br /&gt;
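Step 2c is the standard normal-normal conjugate update with known observation variance; a minimal sketch, with hypothetical parameter names:&lt;br /&gt;

```python
def normal_posterior(mu_prior, var_prior, var_obs, samples):
    """Posterior mean and variance of theta given n samples, with
    known observation variance var_obs (normal-normal conjugacy)."""
    n = len(samples)
    xbar = sum(samples) / n
    denom = var_obs + n * var_prior
    mu_post = (var_obs * mu_prior + n * var_prior * xbar) / denom
    var_post = (var_obs * var_prior) / denom
    return mu_post, var_post
```

Note that more samples shrink the posterior variance, as stated earlier: more samples lead to a better f_post(t).&lt;br /&gt;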
&lt;br /&gt;
==== Repeat ====&lt;br /&gt;
&lt;br /&gt;
Repeat 2a-2c for each i, and for all worthwhile composites.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2185</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2185"/>
		<updated>2006-06-03T20:15:38Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The most top-level functioning of the learning system is to give near future glucose level estimation. The current glucose level estimation is done by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since the last measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, the term 'event' can refer to things like apple intake. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; eating certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or gets a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbonhydrate count etc) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are themselves estimates. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating of a f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps of creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for certain event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' paramaters (for information about these parameters, see above). [Mapping of Carbonhydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(x)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, one could say that at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution. So each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. We say that the variance is static, and the mean value &amp;amp;mu; is the variable to be estimated, i.e. the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) the thing we need to learn for each t; &amp;amp;theta; is what it is all about. Each event type has a whole series of &amp;amp;theta;'s, because it has one &amp;amp;theta; for each ''t''. To be able to compute &amp;amp;theta;, we treat it as a normal distribution too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, whose formulas we will explain shortly, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;{s_1,s_2,...,s_n}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was g1 at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;. Now we do a new glucose measurement g2 at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event k has a &amp;lt;math&amp;gt;f_k(t_{g2}-t_{ek}) \to \mu_{\theta,post} \pm \sigma_{\theta,post}^2&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\mu_k \pm \sigma_k^2&amp;lt;/math&amp;gt; are these posterior mean and variance values for event k, and (g2-g1) is the measured glucose rise/fall, then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{(g2-g1)-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
This step loops through all events k in &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. Furthermore, it loops through composite events &amp;lt;math&amp;gt;{e_a+e_b+e_c,e_d+e_e,...}&amp;lt;/math&amp;gt;, which are combinations of events that happen at the same time. Time t is measured in intervals, so the chance that events happen at the same t is quite high. &lt;br /&gt;
&lt;br /&gt;
==== 2a. Calculate subsample for event k ====&lt;br /&gt;
Now we can calculate the subsample &amp;lt;math&amp;gt;x_k&amp;lt;/math&amp;gt; for event k. What is this? The user did a new glucose measurement, and the system sees the difference in glucose level (g2-g1): in statistical terms, (g2-g1) is our new sample &amp;lt;math&amp;gt;x_{tot}&amp;lt;/math&amp;gt;. This is actually a composite sample, because it is caused by the sum of all events &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. So to add a sample to each single event, the system needs to divide this composite sample into subsamples, one for each event. Using the calculated helper variable ''a'', we do this with a simple formula:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_k = \mu_k + a \sigma_k^2&amp;lt;/math&amp;gt;&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which estimates the effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrates). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three attributes.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples could be represented on a 2-dimensional area. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined by medicine information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into given time intervals, for example 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; you do:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2184</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2184"/>
		<updated>2006-06-03T19:53:46Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Bayesian Inference */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The most top-level functioning of the learning system is to give near future glucose level estimation. The current glucose level estimation is done by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since the last measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement and &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated. Let &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; be the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated response functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
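The g2 estimation formula above translates directly into Python; the event list and its linear response function below are illustrative stand-ins, not part of the system.&lt;br /&gt;

```python
def estimate_g2(g1, t_g1, t_g2, events):
    """events: list of (f, t_k) pairs, where f(dt) is the event's
    glycemic response function and t_k its start time. Implements
    g2 = g1 + sum_k [ f_k(t_g2 - t_k) - f_k(t_g1 - t_k) ]."""
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# Illustrative: one event starting at t=0 whose response rises by
# 0.5 per interval. Measurement g1=5.0 at t=2; estimate for t=6.
events = [(lambda dt: 0.5 * max(dt, 0), 0)]
g2 = estimate_g2(g1=5.0, t_g1=2, t_g2=6, events=events)  # 5.0 + (3.0 - 1.0) = 7.0
```

Note that each event contributes only the part of its response between t&lt;sub&gt;g1&lt;/sub&gt; and t&lt;sub&gt;g2&lt;/sub&gt;, since its effect up to t&lt;sub&gt;g1&lt;/sub&gt; is already contained in the measurement g1.&lt;br /&gt;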
== On events ==&lt;br /&gt;
&lt;br /&gt;
As stated above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types together leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of others.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer summed terms mean less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on carbohydrate count; for insulin, on entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better sample estimates. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL for fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, perhaps from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* An (initially empty) set of samples, each a tuple {t,dg} with t = time and dg = delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe each function in terms of a normal (Gaussian) distribution: each point ''t'' maps not to one value but to two, mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;f(t) = \mu_t \pm \sigma_t^2&amp;lt;/math&amp;gt;. We treat the variance as a fixed constant, while the mean value &amp;amp;mu; is the variable to be estimated: the unknown parameter &amp;amp;theta;. This unknown parameter &amp;amp;theta; is exactly (and only) what we need to learn for each t. Each event type has a whole line of &amp;amp;theta;'s, one for each ''t''. To be able to compute &amp;amp;theta;, we model it as normally distributed too: &amp;lt;math&amp;gt;\theta = \mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now we can use our prior and posterior functions. The f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior values &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type. Through Bayesian inference, explained below, we compute the posterior &amp;lt;math&amp;gt;\mu_\theta \pm \sigma_\theta^2&amp;lt;/math&amp;gt; for each t for each event type: the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function.&lt;br /&gt;
&lt;br /&gt;
Using samples, we can use Bayesian inference to compute &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t. This will be explained in the following section.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, using the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;{s_1,s_2,...,s_n}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement, with value g1, was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt;. Now we take a new glucose measurement with value g2 at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt;. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. Each event e&amp;lt;sub&amp;gt;k&amp;lt;/sub&amp;gt; has an event type with the attributes described above, a timestamp t&amp;lt;sub&amp;gt;ek&amp;lt;/sub&amp;gt;, and a multiplicity indicator. &lt;br /&gt;
&lt;br /&gt;
==== 1. Calculate helper variable a ====&lt;br /&gt;
The first thing we calculate is the helper variable ''a''. Each event k has an estimated response &amp;lt;math&amp;gt;f_k(t_{g2}-t_{ek}) = \mu_{ek} \pm \sigma_{ek}^2&amp;lt;/math&amp;gt;. Then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{g2-(\mu_{e1}+\mu_{e2}+...)}{\sigma_{e1}^2+\sigma_{e2}^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 2. Update event knowledge ====&lt;br /&gt;
This step loops through all events &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;. It also loops through composites, i.e. events that happen at the same time. Since time is measured in intervals, the chance of simultaneous events is rather high.&lt;br /&gt;
&lt;br /&gt;
===== 2a. =====&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate count). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three things.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented on a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount; for insulin, by medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into given time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence value x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
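A minimal Python sketch of the inference machinery in this section, assuming the standard conjugate update for a normal mean with known variance: the helper variable ''a'' spreads the observed total deviation over the events, and each event's posterior then absorbs the resulting evidence. All numbers below are illustrative.&lt;br /&gt;

```python
def assign_evidence(x_tot, priors):
    """priors: list of (mu, var) per event. Computes the helper
    variable a and returns the evidence x_i = mu_i + a * var_i
    assigned to each event."""
    a = (x_tot - sum(mu for mu, _ in priors)) / sum(v for _, v in priors)
    return [mu + a * v for mu, v in priors]

def posterior(mu_prior, var_prior, var_e, samples):
    """Conjugate normal update of the unknown mean theta, with known
    sampling variance var_e, from a list of observed samples."""
    n = len(samples)
    xbar = sum(samples) / n
    denom = var_e + n * var_prior
    var_post = (var_e * var_prior) / denom
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    return mu_post, var_post

# Two events expected to raise glucose by 1.0 and 2.0, but the measured
# total rise is 4.0: the extra 1.0 is split in proportion to variances.
x = assign_evidence(4.0, [(1.0, 0.5), (2.0, 0.5)])  # [1.5, 2.5]
mu, var = posterior(mu_prior=1.0, var_prior=1.0, var_e=0.5, samples=[1.5])
```

Each new measurement thus yields one evidence value per involved event, which is appended to that event's sample set and shrinks the posterior variance of its &amp;theta;.&lt;br /&gt;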
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2178</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2178"/>
		<updated>2006-06-02T15:24:10Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current glucose level estimate is produced by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As stated above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement and &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated. Let &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; be the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated response functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
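A minimal sketch of this estimator in Python; the ramp response below is a hypothetical event response used only to exercise the sum, not one of the modelled categories:&lt;br /&gt;

```python
def estimate_g2(g1, t_g1, t_g2, events):
    """g2 estimate per the formula above: the last measurement g1 plus, for each
    event (f_k, t_k), the response accumulated between t_g1 and t_g2.
    `events` is a list of (response_function, start_time) pairs."""
    return g1 + sum(f(t_g2 - t_k) - f(t_g1 - t_k) for f, t_k in events)

# Hypothetical event whose response rises at 0.1 units per minute after it starts:
ramp = lambda dt: 0.1 * max(dt, 0.0)
print(estimate_g2(g1=5.0, t_g1=0.0, t_g2=30.0, events=[(ramp, 0.0)]))  # 8.0
```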
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the 'bigger picture', this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, maybe from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
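The attributes listed above can be sketched as a small data structure (a sketch with illustrative names; f_prior/f_post map a time offset t to a glycemic response dg, and the posterior initially equals the prior):&lt;br /&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

@dataclass
class EventType:
    """One event type carrying the three attributes above (names illustrative)."""
    f_prior: Callable[[float], float]
    samples: List[Tuple[float, float]] = field(default_factory=list)  # {t, dg} tuples
    f_post: Optional[Callable[[float], float]] = None

    def __post_init__(self):
        # Before any samples arrive, the posterior equals the prior.
        if self.f_post is None:
            self.f_post = self.f_prior
```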
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we must describe the problem in statistical terms. From this viewpoint, at each t there is a ''mean'' estimated value and a variance indicating the mean error. This way we describe the function in terms of a normal (Gaussian) distribution: each point ''t'' doesn't map to just one value, but to two, mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;\mu \pm \sigma^2&amp;lt;/math&amp;gt;. We treat the variance as static, and the mean value &amp;amp;mu; as the variable to be estimated: the unknown parameter &amp;amp;theta;.&lt;br /&gt;
&lt;br /&gt;
This unknown parameter &amp;amp;theta; is exactly what we need to know for each f(t); it is the parameter for which we define the prior function.&lt;br /&gt;
&lt;br /&gt;
To be clear: the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) function defines the prior mean values for an effect. Likewise, the f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) function defines the posterior mean values of an effect, initially equal to f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
This way, using samples, we can use Bayesian inference to calculate a &amp;amp;mu;&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt; for each t.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described in the sections above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples &amp;lt;math&amp;gt;{s_1,s_2,...,s_n}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was taken at t&amp;lt;sub&amp;gt;g1&amp;lt;/sub&amp;gt; with value g1. Now we take a new glucose measurement at t&amp;lt;sub&amp;gt;g2&amp;lt;/sub&amp;gt; with value g2. The set of events that have an impact on glucose level g2 is &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;, each having an event type with the attributes described above plus a multiplicity indicator. So e&amp;lt;sub&amp;gt;1&amp;lt;/sub&amp;gt; has a &amp;amp;mu;&amp;lt;sub&amp;gt;e1&amp;lt;/sub&amp;gt; and a &amp;amp;sigma;&amp;lt;sub&amp;gt;e1&amp;lt;/sub&amp;gt;&amp;amp;sup2;.&lt;br /&gt;
&lt;br /&gt;
The first thing we calculate is the helper variable ''a'':&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{g2-(\mu_{e_1}+\mu_{e_2}+...)}{\sigma_{e_1}^2+\sigma_{e_2}^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations, in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from medicine information. If no prior function can be made, an effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\Delta t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into time intervals of a given size, for example 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
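A sketch of this update for one time interval, using the standard conjugate update for a normal mean with known measurement variance (variable names are illustrative):&lt;br /&gt;

```python
def posterior_update(mu_prior, var_prior, var_e, samples):
    """Conjugate update for a normal mean with known measurement variance var_e:
    n samples with mean x_bar pull the estimate from the prior mean towards
    the sample mean, and shrink the variance."""
    n = len(samples)
    if n == 0:
        return mu_prior, var_prior  # no evidence: posterior equals prior
    x_bar = sum(samples) / n
    denom = var_e + n * var_prior
    var_post = (var_e * var_prior) / denom
    mu_post = (var_e * mu_prior + n * var_prior * x_bar) / denom
    return mu_post, var_post

mu, var = posterior_update(mu_prior=1.0, var_prior=4.0, var_e=4.0, samples=[3.0])
print(mu, var)  # 2.0 2.0 — one sample pulls the mean halfway, halves the variance
```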
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence value x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
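The evidence-assignment step above can be sketched as follows: the total observed deviation is distributed over the contributing events in proportion to their variances, so the less certain events absorb more of the surprise (function and variable names are illustrative):&lt;br /&gt;

```python
def assign_evidence(x_tot, mus, variances):
    """Split a total observed effect x_tot over events, per the a and x_i
    formulas above: each event gets its prior mean plus a variance-weighted
    share of the difference between x_tot and the sum of prior means."""
    a = (x_tot - sum(mus)) / sum(variances)
    return [mu + a * var for mu, var in zip(mus, variances)]

# Two events expected to contribute 1.0 and 2.0; the measurement implies 6.0 total:
print(assign_evidence(6.0, [1.0, 2.0], [1.0, 2.0]))  # [2.0, 4.0]
```

Note that the assigned evidence values always sum back to x_tot, so no part of the observed deviation is lost.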
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2176</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2176"/>
		<updated>2006-06-02T14:25:35Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: \\Statistical nature of the function f(t) intermediate save&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The most top-level function of the learning system is to give near-future glucose level estimates. The current glucose level estimation is done by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since the last measurement.&lt;br /&gt;
&lt;br /&gt;
=== The function f(t) ===&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
=== g2 Estimation ===&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated response functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of each event. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the 'bigger picture', this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, maybe from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) or its prior parameters ===&lt;br /&gt;
So what are the steps for creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for the various event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
=== Attributes of event types ===&lt;br /&gt;
&lt;br /&gt;
Summarizing what we have said above, each event type has the following attributes:&lt;br /&gt;
* An a priori function f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A (initially empty) set of samples, each a tuple {t,dg} with t=time and dg=delta-g, the glycemic response.&lt;br /&gt;
* An a posteriori function f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
&lt;br /&gt;
The following section will describe the process of computation of f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
=== Statistical nature of the function f(t) ===&lt;br /&gt;
&lt;br /&gt;
In the sections above, we wrote about the glycemic response functions f(t), such as the f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) and f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions. Because we are using Bayesian inference, we describe the problem statistically. Each point ''t'' doesn't map to just one value, but to two: mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;, written as &amp;lt;math&amp;gt;\mu \pm \sigma^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Learning System, ignite your engine! ===&lt;br /&gt;
&lt;br /&gt;
Assume that, with the formulas described above, we are given a group of event types, each with an f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t). Additionally, each event type has an initially empty set of samples.&lt;br /&gt;
&lt;br /&gt;
Assume the last glucose measurement was measurement ''n'', and the measuring instrument gave a value of g&amp;lt;sub&amp;gt;n&amp;lt;/sub&amp;gt;. Assume there is some time in the near future &amp;lt;i&amp;gt;t&amp;lt;/i&amp;gt; at which the system or user wants to estimate the glucose level, which we will call g&amp;lt;sub&amp;gt;n+1&amp;lt;/sub&amp;gt;. The set of events that have an impact on the glucose level at &amp;lt;i&amp;gt;t&amp;lt;/i&amp;gt; is &amp;lt;math&amp;gt;{e_1,e_2,...,e_n}&amp;lt;/math&amp;gt;, each with an event type with the attributes described above.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations, in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from medicine information. If no prior function can be made, an effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\Delta t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into time intervals of a given size, for example 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence value x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2175</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2175"/>
		<updated>2006-06-02T13:22:40Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: added Learning System, ignite your engine!&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The most top-level function of the learning system is to give near-future glucose level estimates. The current glucose level estimation is done by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since the last measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
&lt;br /&gt;
* Stress level.&lt;br /&gt;
&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
&lt;br /&gt;
* Health status.&lt;br /&gt;
&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated response functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of each event. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on carbohydrate count. For insulin, this would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the 'bigger picture', this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progress of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, maybe from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Generating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) (or its initial parameters) ===&lt;br /&gt;
So what are the steps of creating f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) for certain event types? For...&lt;br /&gt;
&lt;br /&gt;
* Food intake, calculate the ''a'' and ''b'' parameters (for information about these parameters, see above). [Mapping of carbohydrate count to ''a'' and ''b'' parameters to be added]&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. [To do]&lt;br /&gt;
&lt;br /&gt;
* Stress level. [To do]&lt;br /&gt;
&lt;br /&gt;
* Time of the day. [To do]&lt;br /&gt;
&lt;br /&gt;
* Health status. [To do]&lt;br /&gt;
&lt;br /&gt;
* Other event types. [To do]&lt;br /&gt;
&lt;br /&gt;
The following section will explain how to compute an a posteriori function using glucose measurements.&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
So how does the system calculate f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)? And how are the samples created?&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into given time intervals, for example 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
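As a quick illustration, the per-interval update can be sketched in code. This is just a sketch, not project code; it assumes the standard normal-normal conjugate update, and all names (posterior, sigma2_e, etc.) are made up for the example.&lt;br /&gt;

```python
# Illustrative sketch of the posterior update for one 15-minute interval.
# sigma2_e: fixed sampling variance of the event effect (sigma_e^2 above);
# mu_prior, sigma2_prior: prior distribution of theta;
# samples: observed delta-g values falling in this interval.
def posterior(mu_prior, sigma2_prior, sigma2_e, samples):
    n = len(samples)
    if n == 0:
        return mu_prior, sigma2_prior            # no evidence yet: keep the prior
    xbar = sum(samples) / n                      # sample mean (x-bar above)
    denom = sigma2_e + n * sigma2_prior
    sigma2_post = sigma2_e * sigma2_prior / denom
    mu_post = (sigma2_e * mu_prior + n * sigma2_prior * xbar) / denom
    return mu_post, sigma2_post
```

With more samples, sigma2_post shrinks and mu_post moves toward the sample mean, which matches the intuition that more evidence gives a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;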
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence value x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
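A sketch of this evidence assignment (illustrative names, not project code): the total measured glucose change x_tot is split over the contributing events, with each event's share shifted from its prior mean in proportion to its variance.&lt;br /&gt;

```python
# Illustrative sketch: split one total observed glucose change x_tot over the
# events that contributed to it. mus/sigma2s are the prior means and variances
# of each event's effect at the measurement time (names are made up here).
def assign_evidence(x_tot, mus, sigma2s):
    a = (x_tot - sum(mus)) / sum(sigma2s)        # shared correction factor
    return [mu + a * s2 for mu, s2 in zip(mus, sigma2s)]
```

By construction the assigned x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; sum to x_tot, and events with larger prior uncertainty absorb more of the discrepancy.&lt;br /&gt;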
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2164</id>
		<title>LogboekDurkKingma</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2164"/>
		<updated>2006-06-02T10:30:14Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Date            Time           Duration        Cumulative    Activity&lt;br /&gt;
 --------------- -------------- --------------- ------------- ----------------------------------------&lt;br /&gt;
 8 February      13-17          4               4             First meeting and division of tasks&lt;br /&gt;
 11 February      9-11          2               6             Reading up on Agile Development&lt;br /&gt;
 13 February     12-14          2               8             Investigated Cruise Control, delayed until now by illness.&lt;br /&gt;
                 15-17          2               10            Meeting&lt;br /&gt;
 14 February     15-17          2               12            XP Game&lt;br /&gt;
 16 February      9-12          3               15            Investigated WiredReach fundamentals&lt;br /&gt;
 20 February     15-17          2               17            meeting&lt;br /&gt;
                 20-22          2               19            Wrote up minutes / Put them on the web&lt;br /&gt;
 21 February      9-15          6               25            Researched WiredReach/Eclipse platform (with Martijn)&lt;br /&gt;
 22 February      9-10          1               26            Built wiki logbook&lt;br /&gt;
                 10-14          4               30            Researched WiredReach/Eclipse platform&lt;br /&gt;
 25 February     12-18          6               36            Implemented WiredReach trial&lt;br /&gt;
 27 February      9-15          6               42            Compared JXTA/other protocols&lt;br /&gt;
                 15-17          2               44            Meeting&lt;br /&gt;
 28 February     15-17          4               45            AardRock meeting&lt;br /&gt;
 5 March          9-15          6               51            Research on WiredReach/JXTA&lt;br /&gt;
                 15-17          2               53            Meeting&lt;br /&gt;
 6 March         15-17          2               55            AardRock meeting&lt;br /&gt;
                 12-16          4               59            Lunch, meeting about implementation of the P2P part&lt;br /&gt;
 13 March        15-17          2               61            Meeting&lt;br /&gt;
 16 March        9-13           4               64            Worked out/thought through P2P communication data model&lt;br /&gt;
 21 March        15-17          2               67            AardRock meeting&lt;br /&gt;
 23 March        9-15           6               73            De-Eclipsed WiredReach, worked on RDF Black Box, investigated Jena.&lt;br /&gt;
 27 March        10-17          3               76            Worked out RDF Black Box, meeting&lt;br /&gt;
 28 March        15-17          2               78            AardRock meeting&lt;br /&gt;
 18 April        14-17          3               81            Aardrock meeting / Trial presentation&lt;br /&gt;
 19 April        15-17          2               83            Discussed advisory system with Just&lt;br /&gt;
 25 April        9-15           6               89            Neural network meeting / Advisory system cheetah&lt;br /&gt;
                 15-17          2               91            Cheetah meeting&lt;br /&gt;
 27 April        9-17           8               99            Bayesian network / neural network / algorithm research&lt;br /&gt;
 28 April        9-11           2               101           Looked into other advisory software / diaries / emailed Harold de Valk&lt;br /&gt;
 28 April        11-12          1               102           Updated logbook&lt;br /&gt;
 29 April        12-14          2               104           Advisory system (AIDA, Neural Networks, Kernel Machines)&lt;br /&gt;
 30 April        12-15          3               107           Advisory system (studied AIDA, alternatives, sketches)&lt;br /&gt;
 1 May           12-14          2               109           Advisory system (AIDA, kernel machines)&lt;br /&gt;
 2 May           10-19          9               118           Advisory system (put on wiki, made diagrams, discussed with various people)&lt;br /&gt;
 3 May           12-13          2               120           Emails, went through AI book on fuzzy logic&lt;br /&gt;
 4 May           11-14          3               123           Went through AI book, read: http://www.fao.org/docrep/w8079e/w8079e00.htm&lt;br /&gt;
                 14-17          3               126           Further literature research on the effect of food/insulin/activities&lt;br /&gt;
 6 May           11-5           6               132           [[Condition Effect Learning]]&lt;br /&gt;
 7 May           5-6            1               133           Solved major condition effect learning problem.&lt;br /&gt;
                 11-15          4               136           Worked out the formula (&amp;quot;Kingma's Theorem&amp;quot;) and put it on the wiki.&lt;br /&gt;
                                                              Thought about better solutions to the condition effect learning problem.&lt;br /&gt;
 8 May           20-24          4               140           Received useful information about a patient (logbooks, medical dossier, etc.)&lt;br /&gt;
 9 May           10-16          6               146           Worked on statistical problem, talked with Giel about the GUI and with Peter de Waal about Bayesian learning&lt;br /&gt;
 10 May          15-17          2               148           Aardrock meeting. Explained some theories.&lt;br /&gt;
 11 May          20-22          2               150           Formulated advisory limitations&lt;br /&gt;
 16 May          15-18          3               153           Aardrock meeting / Advisory System&lt;br /&gt;
 19 May &lt;br /&gt;
 20 May          12-17          5               158           Brushing up on statistics.&lt;br /&gt;
 21 May          14-20          6               164           Idem. Reviewed (biased) estimators, marginalization, maximum likelihood functions, and Bayesian inference (further)&lt;br /&gt;
 22 May          14-15          1               165           Researched statistics.&lt;br /&gt;
 23 May          14-17+21-01    7               172           Solved statistical problem! See [Condition Effect Learning], more info to follow&lt;br /&gt;
 24 May          10-11          1               173           Looked up a number of small things&lt;br /&gt;
 27 May          12-18          6               179           Bayesian inference.&lt;br /&gt;
 30 May          13-18          5               184           Aardrock meeting + advisory system&lt;br /&gt;
 31 May          16-24          8               192           Emailed, worked out formulas, adjustments to wiki&lt;br /&gt;
 1 June          13-17          4               196           Idem&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2163</id>
		<title>LogboekDurkKingma</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=LogboekDurkKingma&amp;diff=2163"/>
		<updated>2006-06-02T10:29:44Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Date            Time           Duration        Cumulative    Activity&lt;br /&gt;
 --------------- -------------- --------------- ------------- ----------------------------------------&lt;br /&gt;
 8 February      13-17          4               4             First meeting and division of tasks&lt;br /&gt;
 11 February      9-11          2               6             Reading up on Agile Development&lt;br /&gt;
 13 February     12-14          2               8             Investigated Cruise Control, delayed until now by illness.&lt;br /&gt;
                 15-17          2               10            Meeting&lt;br /&gt;
 14 February     15-17          2               12            XP Game&lt;br /&gt;
 16 February      9-12          3               15            Investigated WiredReach fundamentals&lt;br /&gt;
 20 February     15-17          2               17            meeting&lt;br /&gt;
                 20-22          2               19            Wrote up minutes / Put them on the web&lt;br /&gt;
 21 February      9-15          6               25            Researched WiredReach/Eclipse platform (with Martijn)&lt;br /&gt;
 22 February      9-10          1               26            Built wiki logbook&lt;br /&gt;
                 10-14          4               30            Researched WiredReach/Eclipse platform&lt;br /&gt;
 25 February     12-18          6               36            Implemented WiredReach trial&lt;br /&gt;
 27 February      9-15          6               42            Compared JXTA/other protocols&lt;br /&gt;
                 15-17          2               44            Meeting&lt;br /&gt;
 28 February     15-17          4               45            AardRock meeting&lt;br /&gt;
 5 March          9-15          6               51            Research on WiredReach/JXTA&lt;br /&gt;
                 15-17          2               53            Meeting&lt;br /&gt;
 6 March         15-17          2               55            AardRock meeting&lt;br /&gt;
                 12-16          4               59            Lunch, meeting about implementation of the P2P part&lt;br /&gt;
 13 March        15-17          2               61            Meeting&lt;br /&gt;
 16 March        9-13           4               64            Worked out/thought through P2P communication data model&lt;br /&gt;
 21 March        15-17          2               67            AardRock meeting&lt;br /&gt;
 23 March        9-15           6               73            De-Eclipsed WiredReach, worked on RDF Black Box, investigated Jena.&lt;br /&gt;
 27 March        10-17          3               76            Worked out RDF Black Box, meeting&lt;br /&gt;
 28 March        15-17          2               78            AardRock meeting&lt;br /&gt;
 18 April        14-17          3               81            Aardrock meeting / Trial presentation&lt;br /&gt;
 19 April        15-17          2               83            Discussed advisory system with Just&lt;br /&gt;
 25 April        9-15           6               89            Neural network meeting / Advisory system cheetah&lt;br /&gt;
                 15-17          2               91            Cheetah meeting&lt;br /&gt;
 27 April        9-17           8               99            Bayesian network / neural network / algorithm research&lt;br /&gt;
 28 April        9-11           2               101           Looked into other advisory software / diaries / emailed Harold de Valk&lt;br /&gt;
 28 April        11-12          1               102           Updated logbook&lt;br /&gt;
 29 April        12-14          2               104           Advisory system (AIDA, Neural Networks, Kernel Machines)&lt;br /&gt;
 30 April        12-15          3               107           Advisory system (studied AIDA, alternatives, sketches)&lt;br /&gt;
 1 May           12-14          2               109           Advisory system (AIDA, kernel machines)&lt;br /&gt;
 2 May           10-19          9               118           Advisory system (put on wiki, made diagrams, discussed with various people)&lt;br /&gt;
 3 May           12-13          2               120           Emails, went through AI book on fuzzy logic&lt;br /&gt;
 4 May           11-14          3               123           Went through AI book, read: http://www.fao.org/docrep/w8079e/w8079e00.htm&lt;br /&gt;
                 14-17          3               126           Further literature research on the effect of food/insulin/activities&lt;br /&gt;
 6 May           11-5           6               132           [[Condition Effect Learning]]&lt;br /&gt;
 7 May           5-6            1               133           Solved major condition effect learning problem.&lt;br /&gt;
                 11-15          4               136           Worked out the formula (&amp;quot;Kingma's Theorem&amp;quot;) and put it on the wiki.&lt;br /&gt;
                                                              Thought about better solutions to the condition effect learning problem.&lt;br /&gt;
 8 May           20-24          4               140           Received useful information about a patient (logbooks, medical dossier, etc.)&lt;br /&gt;
 9 May           10-16          6               146           Worked on statistical problem, talked with Giel about the GUI and with Peter de Waal about Bayesian learning&lt;br /&gt;
 10 May          15-17          2               148           Aardrock meeting. Explained some theories.&lt;br /&gt;
 11 May          20-22          2               150           Formulated advisory limitations&lt;br /&gt;
 16 May          15-18          3               153           Aardrock meeting / Advisory System&lt;br /&gt;
 19 May &lt;br /&gt;
 20 May          12-17          5               158           Brushing up on statistics.&lt;br /&gt;
 21 May          14-20          6               164           Idem. Reviewed (biased) estimators, marginalization, maximum likelihood functions, and Bayesian inference (further)&lt;br /&gt;
 22 May          14-15          1               165           Researched statistics.&lt;br /&gt;
 23 May          14-17+21-01    7               172           Solved statistical problem! See [Condition Effect Learning], more info to follow&lt;br /&gt;
 24 May          10-11          1               173           Looked up a number of small things&lt;br /&gt;
 27 May          12-18          6               179           Bayesian inference.&lt;br /&gt;
 30 May          13-18          5               184           Aardrock meeting + advisory system&lt;br /&gt;
 31 May          16-24          8               192           Emailed, worked out formulas, adjustments to wiki&lt;br /&gt;
 1 June          13-17          4               196           Idem&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2162</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2162"/>
		<updated>2006-06-02T10:02:08Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Significance of good a priori functions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give near-future glucose level estimates. The current glucose level estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since the last measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It can be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
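The food-intake category curve above can be sketched in code. This assumes the exponents are meant to be negative (three Gaussian-shaped bumps at t=b, 2b and 3b with halving amplitudes, so the response decays); the function name is illustrative.&lt;br /&gt;

```python
import math

# Illustrative sketch of the food-intake category curve a*f_c(b*t): three
# Gaussian-shaped bumps at t = b, 2b, 3b with amplitudes a, a/2, a/4.
# The minus sign in the exponent is an assumption (the response should decay).
def food_response(t, a, b):
    width = 0.667 * b
    return sum((a / 2 ** k) * math.exp(-((t - (k + 1) * b) / width) ** 2)
               for k in range(3))
```

The curve peaks near t=b and tails off, which matches a typical post-meal glucose rise.&lt;br /&gt;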
&lt;br /&gt;
As stated above, the estimate for a future moment in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; are the (start) times of each event. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
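The prediction formula above can be sketched as follows (illustrative names; each event is represented as a pair of its estimated response function f_k and start time t_k):&lt;br /&gt;

```python
# Illustrative sketch of the prediction: start from the last measurement g1 at
# time t_g1 and add each event's estimated response between t_g1 and t_g2.
def estimate_g2(g1, t_g1, t_g2, events):
    # events: list of (f, t_start) pairs, f a function of elapsed time
    return g1 + sum(f(t_g2 - t) - f(t_g1 - t) for f, t in events)
```

Subtracting f_k(t_g1 - t_k) matters: only the part of each event's response that occurs after the last measurement is added, since the rest is already reflected in g1.&lt;br /&gt;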
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As stated above, the term 'event' can refer to things like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a set or combination of multiple single events. Why use composite events? Because, for example, eating different food types combined leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of other foods.&lt;br /&gt;
Another benefit of composite events is that they decrease the number of events in the sum of g2_estimate (see above). Fewer terms in the sum means less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count, etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good a priori functions ===&lt;br /&gt;
In our case, we will see that the samples are estimates too. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that bad-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially bad-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially bad-quality samples, leading to initially slow progression of inference. &amp;lt;i&amp;gt;This is important to know, because quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Concretely, good a priori functions will decrease the startup time significantly, maybe from months to just weeks or days&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The following section will explain how to form an a posteriori function using glucose measurements.&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into given time intervals, for example 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence value x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2161</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2161"/>
		<updated>2006-06-02T10:01:16Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Creation of a new event type */ added &amp;quot;Significance of good a priori functions&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give near-future glucose level estimates. The current glucose level estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since the last measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
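The food-intake curve above is a sum of three Gaussian-shaped peaks at t = b, 2b and 3b with heights a, a/2 and a/4. A minimal Python sketch, assuming the exponents are meant to be negative so that each peak decays back toward zero (the function name is illustrative, not part of this page):

```python
import math

def food_response(t, a, b):
    """Sum of three Gaussian-shaped peaks at t = b, 2b, 3b with heights
    a, a/2, a/4 and a common width of 0.667*b (negative exponent assumed
    so each peak decays back toward zero)."""
    width = 0.667 * b
    return sum(
        (a / 2 ** k) * math.exp(-((t - (k + 1) * b) / width) ** 2)
        for k in range(3)
    )
```

With a = 10 and b = 1 the response has its main peak near t = 1, smaller peaks near t = 2 and t = 3, and fades out afterwards.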
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for a future moment in time is obtained by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
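The estimation formula above translates almost directly into code. A minimal Python sketch (the name estimate_glucose and the (f_k, t_k) pair representation are illustrative assumptions):

```python
def estimate_glucose(g1, t_g1, t_g2, events):
    """Estimate g2 at time t_g2: start from the last measurement g1 at
    t_g1 and add each event's net glycemic response between the two
    times. `events` is a list of (f_k, t_k) pairs, where f_k maps time
    elapsed since the event's start t_k to cumulative glucose rise/fall."""
    total = g1
    for f_k, t_k in events:
        total += f_k(t_g2 - t_k) - f_k(t_g1 - t_k)
    return total
```

For a single event whose response grows by 2 units per time step, a measurement of 5 at t = 1 yields an estimate of 9 at t = 3.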
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types together leads to a different glycemic response than the sum of the individual foods: certain food types nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they reduce the number of terms in the sum of g2_estimate (see above). Fewer terms mean less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on the entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
=== Significance of good a priori functions ===&lt;br /&gt;
In our case, we will see that the samples are themselves estimates. Later on, we will conclude that better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that poor-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially poor-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially poor-quality samples, leading to initially slow progression of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Put concretely, good a priori functions will decrease the startup time significantly.&amp;lt;/i&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The following section explains how to form an a posteriori function using glucose measurements.&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented as points in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here.]&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each interval representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
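In code, the per-interval update is the standard textbook normal-normal conjugate step. A minimal Python sketch (function and variable names are illustrative):

```python
def posterior(mu_prior, var_prior, var_e, samples):
    """Conjugate update for one 15-minute interval: combine the prior on
    theta (mean mu_prior, variance var_prior) with the n observed
    effects in `samples`, assumed drawn with known noise variance var_e."""
    n = len(samples)
    xbar = sum(samples) / n
    denom = var_e + n * var_prior
    var_post = var_e * var_prior / denom
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    return mu_post, var_post
```

More samples shrink the posterior variance toward zero and pull the posterior mean toward the sample mean, which is exactly the "more samples means a better f_post(t)" behaviour described earlier.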
&lt;br /&gt;
To assign a piece of evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
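The evidence assignment above spreads the part of the total observed change that the prior means do not explain over the concurrent events, in proportion to their variances. A minimal Python sketch (names are illustrative):

```python
def assign_evidence(x_tot, events):
    """Split a total observed glucose change x_tot over concurrent
    events. `events` is a list of (mu_prior, var) pairs; the shared
    factor a scales each event's share by its variance, so the assigned
    values always sum to x_tot."""
    a = (x_tot - sum(mu for mu, _ in events)) / sum(var for _, var in events)
    return [mu + a * var for mu, var in events]
```

Events whose effect is more uncertain (larger variance) absorb more of the unexplained change.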
So it comes down to fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2159</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2159"/>
		<updated>2006-06-02T09:59:21Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Creation of a new event type */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is produced by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for a future moment in time is obtained by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types together leads to a different glycemic response than the sum of the individual foods: certain food types nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they reduce the number of terms in the sum of g2_estimate (see above). Fewer terms mean less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on the entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). A sample is an observed value of f(t) at some t. More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t).&lt;br /&gt;
&lt;br /&gt;
In other words:&lt;br /&gt;
* Better prior knowledge (carbohydrate count etc.) leads to a better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A better f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) leads to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* More samples lead to a better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)&lt;br /&gt;
* A good f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) means it is close to the real f(t)&lt;br /&gt;
&lt;br /&gt;
(&amp;lt;b&amp;gt;important!&amp;lt;/b&amp;gt;) In our case, we will see that the samples are themselves estimates. Later on, we will conclude that:&lt;br /&gt;
* Better f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) functions lead to better estimates of samples. In the bigger picture, this means that poor-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s imply initially poor-quality f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t)'s, which in turn imply initially poor-quality samples, leading to initially slow progression of inference. &amp;lt;i&amp;gt;This is important to know, because good-quality f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t)'s are VITAL to fast initial inference. Put concretely, good a priori functions will decrease the startup time significantly.&amp;lt;/i&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The following section explains how to form an a posteriori function using glucose measurements.&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented as points in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here.]&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each interval representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign a piece of evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2155</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2155"/>
		<updated>2006-06-02T09:28:27Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Creation of a new event type */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is produced by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for a future moment in time is obtained by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;, &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; the estimated functions of those events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types together leads to a different glycemic response than the sum of the individual foods: certain food types nullify the effect of other foods.&lt;br /&gt;
Another advantage of composite events is that they reduce the number of terms in the sum of g2_estimate (see above). Fewer terms mean less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(t), called the ''a priori'' function. For food, this would be based on the carbohydrate count; for insulin, on the entered medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(t) means the system needs less training time to estimate the real function f(t). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t) is formed that estimates the real function f(t). More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(t). The following section explains how to form an a posteriori function using glucose measurements.&lt;br /&gt;
&lt;br /&gt;
== Bayesian inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented as points in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here.]&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each interval representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign a piece of evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2154</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2154"/>
		<updated>2006-06-02T09:27:31Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* On events */  add &amp;quot;Creation of a new event type&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is produced by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
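The food-intake curve above, read with a negative exponent so that each term is a Gaussian bump peaking at t = b, 2b, 3b, can be sketched as follows. This is an illustrative sketch; the function name and sample numbers are assumptions, not part of the system:

```python
# Sketch of the food-intake category curve: three Gaussian bumps at
# t = b, 2b, 3b with heights a, a/2, a/4 and width 0.667*b.
import math

def food_response(t, a, b):
    width = 0.667 * b
    return sum(
        (a / 2 ** k) * math.exp(-(((t - (k + 1) * b) / width) ** 2))
        for k in range(3)
    )

# At t = b the first bump dominates, so the value is close to a.
print(round(food_response(60, 2.0, 60.0), 3))  # 2.106
```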
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of these events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. The glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
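The prediction formula can be sketched in code. This is an illustrative sketch, not part of the system; the function name and the example response curve are assumptions:

```python
# Sketch of g2(t_g2) = g1 + sum_k [ f_k(t_g2 - t_k) - f_k(t_g1 - t_k) ]

def predict_g2(g1, t_g1, t_g2, events):
    """events: list of (f_k, t_k) pairs; f_k maps elapsed minutes since the
    event start t_k to a cumulative glucose rise/fall."""
    total = 0.0
    for f_k, t_k in events:
        total += f_k(t_g2 - t_k) - f_k(t_g1 - t_k)
    return g1 + total

# Example: one event whose cumulative response is 0.5 units per minute.
events = [(lambda dt: 0.5 * dt, 0)]
print(predict_g2(5.0, 0, 30, events))  # 20.0
```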
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types together leads to a different glycemic response than the sum of the individual foods; certain food types can nullify the effect of others.&lt;br /&gt;
Another advantage of composite events is that they reduce the number of events in the sum of g2_estimate (see above). Fewer summed terms mean less uncertainty in the estimate.&lt;br /&gt;
&lt;br /&gt;
=== Creation of a new event type ===&lt;br /&gt;
What needs to be done when a new event type is created, for example when a user eats something new or starts a new insulin therapy? The first thing the system needs to create is an ''a priori'' estimate of f(x), called the ''a priori'' function. For food, this would be based on carbohydrate count; for insulin, it would be done by entering medicine information. A better ''a priori'' f&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;(x) means the system needs less training time to estimate the real function f(x). When evidence arrives in the form of a ''sample'', an ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(x) is formed that estimates the real function f(x). More evidence/samples means a better ''a posteriori'' f&amp;lt;sub&amp;gt;post&amp;lt;/sub&amp;gt;(x). The following section explains how to form an a posteriori function using glucose measurements.&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), or sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three things.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
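The posterior here is the standard conjugate update for a normal mean with known observation variance. A minimal sketch, with assumed variable names:

```python
# Conjugate update for a normal mean theta with known observation
# variance var_e and normal prior N(mu_prior, var_prior).

def posterior(mu_prior, var_prior, var_e, samples):
    n = len(samples)
    xbar = sum(samples) / n
    denom = var_e + n * var_prior
    var_post = (var_e * var_prior) / denom
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    return mu_post, var_post

# Two samples pull the posterior mean from the prior toward the data,
# and shrink the variance.
print(posterior(0.0, 1.0, 1.0, [2.0, 2.0]))  # approximately (1.333, 0.333)
```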
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
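A hypothetical sketch of this allocation, assuming per-event prior means mu_i (one reading of the formula above): the total observed change x_tot is distributed over the contributing events, with more of the residual assigned to the more uncertain events.

```python
# Distribute a total observed glucose change x_tot over events:
#   a   = (x_tot - sum(mu_i)) / sum(var_i)
#   x_i = mu_i + a * var_i
# so that sum(x_i) equals x_tot exactly.

def allocate_evidence(x_tot, mus, variances):
    a = (x_tot - sum(mus)) / sum(variances)
    return [mu + a * var for mu, var in zip(mus, variances)]

print(allocate_evidence(10.0, [2.0, 3.0], [1.0, 4.0]))  # [3.0, 7.0]
```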
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2153</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2153"/>
		<updated>2006-06-02T09:06:41Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* On events */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of these events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. The glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. A composite event is a combination of multiple single events. Why use composite events? Because, for example, eating different food types together leads to a different glycemic response than the sum of the individual foods; some food types can nullify the effect of others.&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), or sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three things.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2149</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2149"/>
		<updated>2006-06-01T15:05:42Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: /* Glucose level estimation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to estimate near-future glucose levels. The current estimate is made by (1) taking the last glucose measurement, and then (2) adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of these events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. The glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== On events ==&lt;br /&gt;
&lt;br /&gt;
As noted above, an 'event' can be something like eating an apple. Our definition is broader than that: events can also be composite. -incomplete-&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), or sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three things.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2135</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2135"/>
		<updated>2006-06-01T09:53:32Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system. It consists of a few interdependent systems for functions of estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give glucose level predictions. The current estimate is made using the last glucose measurement and adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It appears to be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * \exp \left [ - \left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * \exp \left [ - \left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * \exp \left [ - \left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of the day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated response functions of these events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. The glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
...to be continued...&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), or sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three things.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2134</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2134"/>
		<updated>2006-06-01T09:52:21Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give glucose level predictions. The current glucose level estimate is obtained by taking the last glucose measurement and adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
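To make the a*f_c(b*t) transformation described above concrete, here is a minimal Python sketch (not part of the original page; the base curve and the parameter values are hypothetical):

```python
import math

def scaled(f_c, a, b):
    """Concrete event type modelled as a*f_c(b*t): parameter a scales the
    height of the category's base curve f_c, and b stretches its time axis."""
    return lambda t: a * f_c(b * t)

# Hypothetical base response curve for the 'food' category:
# rises from zero, peaks, then decays.
def base(t):
    return max(t, 0.0) * math.exp(-t)

small_meal = scaled(base, 0.5, 2.0)  # half the effect, twice as fast
```

Only the two scalars a and b need to be learned per concrete event type; the shape of the curve is shared by the whole category.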
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
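As an illustration of the prediction formula above, a minimal Python sketch (the `response` helper and the example events are hypothetical stand-ins, not from the page):

```python
import math

def response(a, b):
    """Hypothetical glycemic response of one event: a Gaussian-style bump
    of height a (mmol/L) peaking b minutes after the event starts."""
    def f(dt):
        if dt > 0:
            return a * math.exp(-((dt - b) / (0.667 * b)) ** 2)
        return 0.0  # no effect before the event has happened
    return f

def predict(g1, t_g1, t_g2, events):
    """g2 = g1 + sum over events of f_k(t_g2 - t_k) - f_k(t_g1 - t_k)."""
    g2 = g1
    for f_k, t_k in events:
        g2 += f_k(t_g2 - t_k) - f_k(t_g1 - t_k)
    return g2

# A meal at t=0 (rise peaking after 45 min) and an insulin shot at
# t=10 (fall peaking after 60 min); last measurement 5.6 at t=30.
events = [(response(2.0, 45), 0), (response(-3.0, 60), 10)]
g2 = predict(5.6, 30, 90, events)
```

Subtracting f_k(t_g1 - t_k) is what prevents double-counting the part of each event's response that is already reflected in the measurement g1.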
...to be continued...&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each moment, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after taking the samples into account. It is determined as follows. The samples are divided into fixed time intervals, for example of 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of &amp;amp;theta; for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
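The posterior update above is the standard normal-normal conjugate update for an unknown mean with known sampling variance; a minimal Python sketch under that reading (function and variable names are mine, not from the page):

```python
def posterior(mu_prior, var_prior, var_e, samples):
    """Conjugate update for a normal mean theta: prior N(mu_prior, var_prior),
    n samples observed with known sampling variance var_e."""
    n = len(samples)
    if n == 0:
        return mu_prior, var_prior  # no evidence yet: keep the prior
    xbar = sum(samples) / n
    denom = var_e + n * var_prior
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    var_post = var_e * var_prior / denom
    return mu_post, var_post

# Prior effect for one 15-minute interval: +2.0 mmol/L with variance 1.0;
# three samples observed with measurement variance 0.5.
mu, var = posterior(2.0, 1.0, 0.5, [2.6, 2.4, 2.9])
```

The posterior mean is pulled from the prior toward the sample mean, and the posterior variance shrinks as n grows.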
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
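The two formulas above distribute a measured total change x_tot over the concurrent events, giving higher-variance (less certain) events a larger share of the residual; a minimal Python sketch (the function name is mine):

```python
def assign_evidence(x_tot, mus, variances):
    """Split a measured total change x_tot over concurrent events:
    a = (x_tot - sum of prior means) / (sum of variances), and each
    event receives x_i = mu_i + a * var_i, so the x_i sum to x_tot."""
    a = (x_tot - sum(mus)) / sum(variances)
    return [mu + a * var for mu, var in zip(mus, variances)]

# Two overlapping events with prior means +2.0 and -1.0 mmol/L; the
# measured total change is 0.5, so the residual -0.5 is distributed
# in proportion to the variances 1.0 and 3.0.
xs = assign_evidence(0.5, [2.0, -1.0], [1.0, 3.0])
```

Each x_i can then be fed back as a sample for its event, closing the inference loop.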
So it comes down to some fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2132</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2132"/>
		<updated>2006-06-01T09:49:07Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: correction on the prediction formula&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give glucose level predictions. The current glucose level estimate is obtained by taking the last glucose measurement and adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;(e_1,e_2,...,e_n)&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. &amp;lt;math&amp;gt;(f_1,f_2,...,f_n)&amp;lt;/math&amp;gt; are the estimated functions of the events, and &amp;lt;math&amp;gt;(t_1,t_2,...,t_n)&amp;lt;/math&amp;gt; the (start) times of the events. Then the glucose prediction g2 at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate}(t_{g2}) = g1 + \sum_{k = 1}^n \left ( f_k(t_{g2}-t_k)-f_k(t_{g1}-t_k) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
...to be continued...&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each moment, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after taking the samples into account. It is determined as follows. The samples are divided into fixed time intervals, for example of 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of &amp;amp;theta; for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2131</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2131"/>
		<updated>2006-06-01T09:34:16Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: added &amp;lt;math&amp;gt;g2_{estimate} = g1 + \sum_{k = 0}^n{f_{e1}(t_{g2}-t_{e1})-f_{e1}(t_{g1}-t_{e1})}&amp;lt;/math&amp;gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give glucose level predictions. The current glucose level estimate is obtained by taking the last glucose measurement and adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
As described above, the estimate for future moments in time is made by taking the last glucose measurement and adding the sum of the glycemic responses of events. Let &amp;lt;i&amp;gt;g1&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g1}&amp;lt;/math&amp;gt; be the last glucose measurement, &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt; at &amp;lt;math&amp;gt;t_{g2}&amp;lt;/math&amp;gt; the glucose level to be estimated, and &amp;lt;math&amp;gt;e_1,e_2,...,e_n&amp;lt;/math&amp;gt; the events that influence &amp;lt;i&amp;gt;g2&amp;lt;/i&amp;gt;. Then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;g2_{estimate} = g1 + \sum_{k = 1}^n \left ( f_{e_k}(t_{g2}-t_{e_k})-f_{e_k}(t_{g1}-t_{e_k}) \right )&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
...to be continued...&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each moment, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after taking the samples into account. It is determined as follows. The samples are divided into fixed time intervals, for example of 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of &amp;amp;theta; for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2130</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2130"/>
		<updated>2006-06-01T08:57:24Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: New page to make stuff clearer&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give glucose level predictions. The current glucose level estimate is obtained by taking the last glucose measurement and adding up the typical glycemic response (''glucose rise/fall'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic response of each event is modelled as a glucose rise/fall as a function of time: f(t). Real time t is mapped to discrete intervals of 15 minutes. Event types are split into distinct categories (see below). For computational convenience, each event type category ''c'' is modelled by a function &amp;lt;i&amp;gt;f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(t)&amp;lt;/i&amp;gt;, and each concrete individual event type is modelled as a transformation of that function using parameters a and b: &amp;lt;i&amp;gt;a*f&amp;lt;sub&amp;gt;c&amp;lt;/sub&amp;gt;(b*t)&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect. It could be modelled as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;f(t) =&lt;br /&gt;
 a    * exp \left [ -\left ( \frac{t-b }{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/2) * exp \left [ -\left ( \frac{t-2b}{0.667*b} \right )^2 \right ] +&lt;br /&gt;
(a/4) * exp \left [ -\left ( \frac{t-3b}{0.667*b} \right )^2 \right ]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Stress level.&lt;br /&gt;
* Time of day, because glucose levels structurally differ during the day.&lt;br /&gt;
* Health status.&lt;br /&gt;
* Other event types.&lt;br /&gt;
&lt;br /&gt;
[Add picture here]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event at each moment, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine information. If no prior function can be made, the effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after taking the samples into account. It is determined as follows. The samples are divided into fixed time intervals, for example of 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i = (1, ..., n), each representing 15 minutes. Each interval t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of &amp;amp;theta; for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some fairly simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2129</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2129"/>
		<updated>2006-06-01T08:05:58Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our ideas about the learning system. It consists of a few interdependent subsystems for estimation, inference and storage.&lt;br /&gt;
&lt;br /&gt;
== Glucose level estimation ==&lt;br /&gt;
&lt;br /&gt;
The top-level function of the learning system is to give glucose level predictions. The current glucose level estimate is obtained by taking the last glucose measurement and adding up the glycemic effect (''glucose response'') of all events since that measurement.&lt;br /&gt;
&lt;br /&gt;
The glycemic effect of each event is modelled as a glucose rise/fall as a function of time: ''f(t)''.&lt;br /&gt;
&lt;br /&gt;
* Food intake. Usually has a positive glycemic effect.&lt;br /&gt;
* Insulin intake. Usually has a negative glycemic effect.&lt;br /&gt;
* Insulin&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, an effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
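This posterior update is the standard conjugate normal update for an unknown mean with known sampling variance &amp;amp;sigma;&amp;lt;sub&amp;gt;e&amp;lt;/sub&amp;gt;&amp;amp;sup2;. A minimal sketch of the computation (the function name posterior is illustrative):

```python
def posterior(mu_prior, var_prior, var_e, samples):
    """Posterior mean and variance of theta for one time interval,
    given n effect samples with known sampling variance var_e."""
    n = len(samples)
    xbar = sum(samples) / n               # sample mean (x-bar_n)
    denom = var_e + n * var_prior         # shared denominator
    var_post = var_e * var_prior / denom  # posterior variance of theta
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    return mu_post, var_post

# With prior mean 2.0, prior variance 1.0, sample variance 1.0 and three
# samples all equal to 3.0, the posterior mean is pulled towards 3.0:
mu, var = posterior(2.0, 1.0, 1.0, [3.0, 3.0, 3.0])  # → (2.75, 0.25)
```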
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
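The two formulas above distribute the total observed deviation x&amp;lt;sub&amp;gt;tot&amp;lt;/sub&amp;gt; over the contributing events in proportion to their variances, so the assigned x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; sum back to x&amp;lt;sub&amp;gt;tot&amp;lt;/sub&amp;gt;. A sketch (the name assign_evidence is hypothetical):

```python
def assign_evidence(x_tot, mus, variances):
    """Split a total observed effect x_tot into one evidence value per event.
    Events with larger variance (more uncertainty) absorb more of the
    unexplained difference x_tot - sum(mus)."""
    a = (x_tot - sum(mus)) / sum(variances)
    return [mu + a * var for mu, var in zip(mus, variances)]

# Two events with prior means 3 and 4 and variances 1 and 3; the total
# observed effect is 10, so the surplus of 3 is split in ratio 1:3.
print(assign_evidence(10.0, [3.0, 4.0], [1.0, 3.0]))  # → [3.75, 6.25]
```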
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2128</id>
		<title>Learning System</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Learning_System&amp;diff=2128"/>
		<updated>2006-06-01T07:53:01Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my/our idea about the Learning system.&lt;br /&gt;
&lt;br /&gt;
== 2. Bayesian Inference ==&lt;br /&gt;
&lt;br /&gt;
Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an ''a priori'' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations as a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be plotted in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, an effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2123</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2123"/>
		<updated>2006-05-30T19:00:45Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt; is now correct&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea about the Condition Effect Learning system. If you have comments, please don't delete text but add comments so I can respond. See this as a first draft, which can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system treats the effect as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet. So, the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications regarding learning about condition (e.g. food) effects. Since each human and each body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates the effect of the food on BG (Blood Glucose) levels. Food effect (response) varies between individuals and between days by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for these individual and temporal differences, I think it is generally a good idea to express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or expressed as a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning a range or distribution to them. There are several levels of expressing such variation in terms of a range or distribution.&lt;br /&gt;
I have worked out the first approach below, but not yet the second.&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; variables, which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So, CocaCola could be a type 1 condition because it has a certain effect.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information like carbohydrate (%), GI and such, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like conditions, such a group of conditions S also has a minimum and maximum cumulative effect. This effect is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proven yet, but the following seems to hold in all cases. It's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (''where (S-c) is set S minus condition c'')&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known,&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
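The theorem above is ordinary interval arithmetic: subtracting one condition's hard range from the range of the set's sum. A minimal sketch (the name remove_condition is illustrative):

```python
def remove_condition(s_min, s_max, c_hardmin, c_hardmax):
    """Range of the remaining set (S-c) once condition c is taken out of S."""
    return s_min - c_hardmax, s_max - c_hardmin

# S = {A, B} with A in [2, 5] and B in [7, 9], so S is in [9, 14].
# Removing A leaves an interval that safely contains B's true range [7, 9]:
print(remove_condition(9, 14, 2, 5))  # → (4, 12)
```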
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect). &lt;br /&gt;
Then, Cheetah calculates the difference (min and max) between BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br/&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the sum of the conditions still to be learned. These conditions all have their own effect range, c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted. The possible effect range for each condition can be deduced from the total effect and the hardmin and hardmax values of the other conditions. Using Kingma's Theorem, one can deduce the effect range of each individual condition.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning system. The user has taken a glass of apple juice A and a bread B. The user fills in a blood glucose measurement of 20.&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
The system calculates the effect range these two items must have had (with the certain-effect group here contributing CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 9 and CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 13):&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, The system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (this is under B's hardmin, so) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (this is under A's hardmin, so) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, (UE-A) is B and (UE-B) is A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above values indicate the possible current effect ranges of A and B. The system can use this to refine its knowledge about the overall effect ranges of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or similar statistic) of this list to use in effect prediction.&lt;br /&gt;
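The deduction in Example 1.1 can be reproduced in a short sketch: subtract the other condition's hard range from the UE range, then clamp to the condition's own hard limits (the clamping corresponds to the "this is under B's hardmin, so =&gt; 7" steps above). The function name deduce_range is hypothetical.

```python
def deduce_range(ue_min, ue_max, own_hard, other_hard):
    """Possible effect range for one of two uncertain conditions.

    own_hard / other_hard are (hardmin, hardmax) tuples; the result is
    (UE minus the other's hard range), clamped to the condition's own limits."""
    lo = max(ue_min - other_hard[1], own_hard[0])
    hi = min(ue_max - other_hard[0], own_hard[1])
    return lo, hi

# UE in [7, 11]; apple juice A has hard range [2, 5], bread B has [7, 9]:
print(deduce_range(7, 11, (2, 5), (7, 9)))  # A → (2, 4)
print(deduce_range(7, 11, (7, 9), (2, 5)))  # B → (7, 9)
```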
&lt;br /&gt;
In other cases, when there are more than 2 conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. Bayesian Inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. So events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an ''a priori'' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations as a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be plotted in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event, determined before any samples have arrived. For food, it could be determined from the carbohydrate amount; for insulin, from medicine information. If no prior function can be made, an effect is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2122</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2122"/>
		<updated>2006-05-30T18:57:30Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt; is now correct&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea about the Condition Effect Learning system. If you have comments, please don't delete text but add comments so I can respond. See this as a first draft, which can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system treats the effect as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet. So, the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications regarding learning about condition (e.g. food) effects. Since each human and each body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates the effect of the food on BG (Blood Glucose) levels. Food effect (response) varies between individuals and between days by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for these individual and temporal differences, I think it is generally a good idea to express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or expressed as a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning a range or distribution to them. There are several levels of expressing such variation in terms of a range or distribution.&lt;br /&gt;
I have worked out the first approach below, but not yet the second.&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; variables, which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So, CocaCola could be a type 1 condition because it has a certain effect.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information like carbohydrate (%), GI and such, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like conditions, such a group of conditions S also has a minimum and maximum cumulative effect. This effect is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proven yet, but the following seems to hold in all cases. It's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (''where (S-c) is set S minus condition c'')&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known,&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect). &lt;br /&gt;
Then, Cheetah calculates the difference (min and max) between BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br/&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the sum of the conditions still to be learned. These conditions all have their own effect range, c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted. The possible effect range for each condition can be deduced from the total effect and the hardmin and hardmax values of the other conditions. Using Kingma's Theorem, one can deduce the effect range of each individual condition.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning system. The user has taken a glass of apple juice A and a bread B. The user fills in a blood glucose measurement of 20.&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
The system calculates the effect range these two items must have had (with the certain-effect group here contributing CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 9 and CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 13):&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (this is below B's hardmin, so clamp it) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (this is below A's hardmin, so clamp it) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, (UE-A) is just B and (UE-B) is just A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The values above indicate the currently possible effect range of A and B. The system can use them to refine its knowledge of the overall effect range of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or a similar statistic) of this list to use in effect prediction.&lt;br /&gt;
&lt;br /&gt;
In other cases, when there are more than 2 conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
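As a concrete sketch of the computation above: the clamping of (UE-c) against the other conditions' hard bounds generalises naturally to N conditions. The following Python sketch is my own illustration; function and variable names are assumptions, not from this page.

```python
def narrow_ranges(ue_min, ue_max, hard):
    """Narrow each condition's hard effect range, given that the
    summed effect of all conditions must lie in [ue_min, ue_max].
    hard maps a condition name to its (hardmin, hardmax) tuple."""
    narrowed = {}
    for name, (lo, hi) in hard.items():
        # Summed hard bounds of all *other* conditions.
        others_min = sum(l for n, (l, h) in hard.items() if n != name)
        others_max = sum(h for n, (l, h) in hard.items() if n != name)
        # (UE - others) bounds this condition, clamped to its own hard range.
        narrowed[name] = (max(lo, ue_min - others_max),
                          min(hi, ue_max - others_min))
    return narrowed

# Example 1.1: apple juice A and bread B, with UE in [7, 11].
print(narrow_ranges(7, 11, {"A": (2, 5), "B": (7, 9)}))
# prints {'A': (2, 4), 'B': (7, 9)}
```

For two conditions this reproduces Example 1.1 exactly; with more conditions, each range is narrowed against the summed hard bounds of all the others in one pass.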
&lt;br /&gt;
== 2. Bayesian Inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional area. The sample set is initially empty; samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount; for insulin, by the medicine information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
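For illustration, this interval-wise update can be written out in a few lines. Below is a minimal Python sketch of the standard normal-normal conjugate update (function and variable names are mine; here n is the number of samples that fall in the interval):

```python
import statistics

def posterior(mu_prior, var_prior, var_e, samples):
    """Conjugate update for one time interval: theta ~ N(mu_prior, var_prior),
    each sample x_j ~ N(theta, var_e). Returns (mu_post, var_post)."""
    n = len(samples)
    if n == 0:
        return mu_prior, var_prior  # no evidence: posterior equals prior
    xbar = statistics.fmean(samples)
    denom = var_e + n * var_prior
    var_post = var_e * var_prior / denom
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    return mu_post, var_post

# e.g. prior N(3, 4), sample noise variance 1, three samples with mean 2.5:
mu, var = posterior(3.0, 4.0, 1.0, [2.0, 2.5, 3.0])
# mu = 33/13 (about 2.54), var = 4/13 (about 0.31)
```

As more samples arrive, the posterior variance shrinks and the posterior mean moves from the prior mean toward the sample mean.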
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, one does:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1+\sigma_2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
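A small Python sketch of this evidence assignment (names are mine). One caveat: as written above, a divides by a sum of sigmas while x_i adds a multiple of sigma squared; in this sketch I read both as variances, which makes the assigned evidences sum exactly to x_tot. That reading is my interpretation, not stated on the page.

```python
def assign_evidence(x_tot, mus, variances):
    """Split a total observed effect x_tot over events: each event i gets
    its prior mean mu_i plus a share of the surplus proportional to its
    variance, so more uncertain events absorb more of the surprise."""
    a = (x_tot - sum(mus)) / sum(variances)
    return [mu + a * var for mu, var in zip(mus, variances)]

# two events with prior means 3 and 8, variances 1 and 4, total effect 13:
xs = assign_evidence(13.0, [3.0, 8.0], [1.0, 4.0])
# xs is approximately [3.4, 9.6], and sums to x_tot
```

Each x_i can then be fed as a sample into that event's posterior update for the matching time interval.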
&lt;br /&gt;
So it comes down to some quite simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2043</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2043"/>
		<updated>2006-05-24T18:35:09Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea about the Condition Effect Learning system. If you have comments, please dont delete text but add comments so I can reflect. See this as a first draft, which can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system treats the effect as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet, so the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications regarding learning about condition (e.g. food) effects. Since each human and each body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates the effect of the food on BG (Blood Glucose) levels. Food effect (response) varies between individuals, and between days, by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for inter-individual and temporal differences, I think it is good to generally express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning such a range or distribution to them. There are two ways of expressing this variation; I solved the first one, but not yet the second:&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has variables X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So, CocaCola could be a type 1 condition, because its effect is certain.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information like carbohydrate (%), GI and such, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like conditions, such a group of conditions S also has a minimum and maximum cumulative effect, which is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proved yet, but the following seems to hold in all cases; it's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (''where (S-c) is set S minus condition c'')&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect). &lt;br /&gt;
Then, Cheetah calculates the difference (min and max) between the BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the summed effect of the conditions still to be learned. These conditions each have their own effect range, c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted as well. The possible effect range of each condition can be deduced from the total effect range and the hardmin and hardmax values of the other conditions, using Kingma's Theorem.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning procedure. The user has taken a glass of apple juice A and a slice of bread B, and fills in a blood glucose measurement of 20.&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
Suppose the predicted range of the certain-effect group is CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 9 and CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 13. The system calculates the effect range these two items must have had:&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (this is below B's hardmin, so clamp it) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (this is below A's hardmin, so clamp it) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, (UE-A) is just B and (UE-B) is just A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The values above indicate the currently possible effect range of A and B. The system can use them to refine its knowledge of the overall effect range of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or a similar statistic) of this list to use in effect prediction.&lt;br /&gt;
&lt;br /&gt;
In other cases, when there are more than 2 conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. Bayesian Inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional area. The sample set is initially empty; samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount; for insulin, by the medicine information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, one does:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1+\sigma_2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2042</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2042"/>
		<updated>2006-05-24T18:32:53Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea about the Condition Effect Learning system. If you have comments, please don't delete text but add comments instead, so I can respond. See this as a first draft, which can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system treats the effect as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet, so the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications regarding learning about condition (e.g. food) effects. Since each human and each body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates the effect of the food on BG (Blood Glucose) levels. Food effect (response) varies between individuals, and between days, by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for inter-individual and temporal differences, I think it is good to generally express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning such a range or distribution to them. There are two ways of expressing this variation; I solved the first one, but not yet the second:&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has variables X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So, CocaCola could be a type 1 condition, because its effect is certain.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information like carbohydrate (%), GI and such, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like conditions, such a group of conditions S also has a minimum and maximum cumulative effect, which is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proved yet, but the following seems to hold in all cases; it's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (''where (S-c) is set S minus condition c'')&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect). &lt;br /&gt;
Then, Cheetah calculates the difference (min and max) between the BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the summed effect of the conditions still to be learned. These conditions each have their own effect range, c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted as well. The possible effect range of each condition can be deduced from the total effect range and the hardmin and hardmax values of the other conditions, using Kingma's Theorem.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning procedure. The user has taken a glass of apple juice A and a slice of bread B, and fills in a blood glucose measurement of 20.&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
Suppose the predicted range of the certain-effect group is CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 9 and CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 13. The system calculates the effect range these two items must have had:&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (this is below B's hardmin, so clamp it) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (this is below A's hardmin, so clamp it) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case (UE-A) is simply B and (UE-B) is simply A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above values indicate the possible current effect range of A and B. The system can use this to refine its knowledge about the overall effect range of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or a similar statistic) of this list to use in effect prediction.&lt;br /&gt;
&lt;br /&gt;
In other cases, when there are more than two conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. Bayesian Inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented as points in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined by medicine information. If no prior function can be made, an event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a mean parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of &amp;amp;theta; for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{i,prior}+a\sigma_i&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1+\sigma_2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2041</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2041"/>
		<updated>2006-05-24T18:32:20Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea of the Condition Effect Learning system. If you have comments, please don't delete text; add comments instead so I can respond. See this as a first draft that can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system treats it as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet. So the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications regarding learning about condition (e.g. food) effects. Since each human and each body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates its effect on BG (Blood Glucose) levels. Food effect (response) varies between individuals and between days by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for inter-individual and temporal differences, I think it is a good idea to express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning such a range or distribution. There are two levels of expressing this variation; I solved the first one, but not yet the second.&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; variables, which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So CocaCola could be a type 1 condition, because it has a certain effect.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information like carbohydrate (%), GI and the like, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like individual conditions, such a group of conditions S also has a minimum and maximum cumulative effect, which is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proved yet, but it seems to hold in all cases, and it's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (where (S-c) is set S minus condition c)&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect).&lt;br /&gt;
Then Cheetah calculates the difference (min and max) between BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
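As a small illustration, the two formulas above can be sketched in Python (function and variable names are mine, purely illustrative):

```python
# Range of the uncertain-effect group UE implied by a BG measurement,
# given the predicted certain-effect (CE) group range.
def ue_range(bg_measurement, ce_min, ce_max):
    ue_min = bg_measurement - ce_max
    ue_max = bg_measurement - ce_min
    return ue_min, ue_max
```

For the numbers of Example 1.1 below, ue_range(20, 9, 13) gives (7, 11).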
&lt;br /&gt;
UE is the range of the summed effect of the conditions still to be learned. Each of these conditions has its own effect range, c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; to c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted as well. The possible effect range of each condition can be deduced from the total effect range and the hardmin and hardmax values of the other conditions: using Kingma's Theorem, one can deduce the effect range of each individual condition.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning system. The user has taken a glass of apple juice A and a slice of bread B, both uncertain conditions, and fills in a blood glucose measurement of 20 (the certain-effect group CE is predicted to lie between 9 and 13).&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
The system calculates the effect range these two items must have had:&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (below B's hardmin, so clamp up) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (below A's hardmin, so clamp up) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case (UE-A) is simply B and (UE-B) is simply A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
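The worked example above can be reproduced with a short Python sketch (all names are mine and purely illustrative, not Cheetah code): each condition's learned range is UE minus the other condition's hard bounds, clamped to the condition's own hard bounds.

```python
# Narrow one condition's effect range with Kingma's Theorem, then clamp
# it to the condition's own hard bounds.
def learned_range(ue, other_hard, own_hard):
    ue_min, ue_max = ue
    other_min, other_max = other_hard
    own_min, own_max = own_hard
    lo = max(ue_min - other_max, own_min)  # never below the own hardmin
    hi = min(ue_max - other_min, own_max)  # never above the own hardmax
    return lo, hi

ue = (7, 11)                                 # UE range from the measurement
a_range = learned_range(ue, (7, 9), (2, 5))  # remove B's hard bounds: (2, 4)
b_range = learned_range(ue, (2, 5), (7, 9))  # remove A's hard bounds: (7, 9)
```

The clamping step is what keeps a learned range from ever leaving the condition's hard bounds, even when the other condition explains most of UE.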
&lt;br /&gt;
The above values indicate the possible current effect range of A and B. The system can use this to refine its knowledge about the overall effect range of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or a similar statistic) of this list to use in effect prediction.&lt;br /&gt;
&lt;br /&gt;
In other cases, when there are more than two conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. Bayesian Inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which describes the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented as points in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here].&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined by medicine information. If no prior function can be made, an event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but it has a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a mean parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of &amp;amp;theta; for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
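This interval update is the standard conjugate update for the mean of a normal distribution with known observation variance. A minimal sketch (illustrative names, not Cheetah code):

```python
# Posterior for the mean theta of one interval, given n samples of the
# event's effect, a prior (mu_prior, var_prior) and known effect variance var_e.
def posterior(mu_prior, var_prior, var_e, samples):
    n = len(samples)
    xbar = sum(samples) / n
    denom = var_e + n * var_prior
    var_post = var_e * var_prior / denom
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    return mu_post, var_post
```

With more samples (larger n), the posterior mean moves toward the sample mean and the posterior variance shrinks, which matches the intuition that the estimating curve improves as measurements accumulate.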
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you compute:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{i,prior}+a\sigma_i&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1+\sigma_2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
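One consistent reading of this evidence-assignment step, assuming the residual (x_tot minus the summed prior means) is meant to be distributed in proportion to each event's sigma so that the assigned evidences sum to x_tot, is:

```python
# Split the total observed effect x_tot over the contributing events,
# proportionally to each event's prior standard deviation.
def assign_evidence(x_tot, mus, sigmas):
    a = (x_tot - sum(mus)) / sum(sigmas)
    return [mu + a * sigma for mu, sigma in zip(mus, sigmas)]
```

By construction the returned values sum to x_tot, so the whole residual is accounted for; events with a larger prior spread absorb more of it.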
&lt;br /&gt;
The prior means used here are derived from the estimated curve, and so are the prior variances. Alternatively, when there is enough evidence at that particular moment, the system can use just those values.&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll improve this explanation when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2040</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2040"/>
		<updated>2006-05-24T18:30:43Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea of the Condition Effect Learning system. If you have comments, please don't delete text; add comments instead so I can respond. See this as a first draft that can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system treats it as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet. So the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications regarding learning about condition (e.g. food) effects. Since each human and each body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates its effect on BG (Blood Glucose) levels. Food effect (response) varies between individuals and between days by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for inter-individual and temporal differences, I think it is a good idea to express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning such a range or distribution. There are two levels of expressing this variation; I solved the first one, but not yet the second.&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; variables, which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So CocaCola could be a type 1 condition, because it has a certain effect.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information like carbohydrate (%), GI and the like, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like individual conditions, such a group of conditions S also has a minimum and maximum cumulative effect, which is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proved yet, but it seems to hold in all cases, and it's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (where (S-c) is set S minus condition c)&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect).&lt;br /&gt;
Then Cheetah calculates the difference (min and max) between BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the summed effect of the conditions still to be learned. Each of these conditions has its own effect range, c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; to c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted as well. The possible effect range of each condition can be deduced from the total effect range and the hardmin and hardmax values of the other conditions: using Kingma's Theorem, one can deduce the effect range of each individual condition.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning system. The user has taken a glass of apple juice A and a slice of bread B, both uncertain conditions, and fills in a blood glucose measurement of 20 (the certain-effect group CE is predicted to lie between 9 and 13).&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
The system calculates the effect range these two items must have had:&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (below B's hardmin, so clamp up) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (below A's hardmin, so clamp up) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, (UE-A) is B and (UE-B) is A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above values indicate the possible current effect range of A and B. The system can use them to refine its knowledge of the overall effect range of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or a similar statistic) of this list to use in effect prediction.&lt;br /&gt;
&lt;br /&gt;
In other cases, when there are more than two conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. The Bayesian inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which tells us the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here.]&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine's product information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but is assigned a prior distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
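This update is the standard normal-normal posterior, and can be sketched in a few lines of Python (a minimal illustration of the formulas above; the helper name is mine, not code from Cheetah):

```python
# Minimal sketch of the posterior update above; the helper name is
# hypothetical, not from any Cheetah codebase. sigma_e2 is the fixed
# sampling variance of the event effect, (mu_prior, sigma_prior2) the
# prior on theta, and samples the evidences in one 15-minute interval.

def posterior(mu_prior, sigma_prior2, sigma_e2, samples):
    n = len(samples)
    if n == 0:
        return mu_prior, sigma_prior2       # no evidence: keep the prior
    xbar = sum(samples) / n                 # sample mean of the evidences
    denom = sigma_e2 + n * sigma_prior2
    mu_post = (sigma_e2 * mu_prior + n * sigma_prior2 * xbar) / denom
    sigma_post2 = sigma_e2 * sigma_prior2 / denom
    return mu_post, sigma_post2
```

With prior mean 0, prior variance 1, sampling variance 1 and a single evidence of 2.0, this gives a posterior mean of 1.0 and variance 0.5, halfway between prior and evidence.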
&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you do:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_1^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;a = \frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The prior means used here are derived from the estimated curve, and the prior variances are derived from it as well. Alternatively, when there are enough evidences at that particular moment, the system can use just those values.&lt;br /&gt;
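As a sketch, the apportioning step could look like this in Python (a hypothetical helper of my own, not existing Cheetah code; it splits the total observed effect over the concurrent events in proportion to their prior variances, so the assigned evidences sum to the total):

```python
# Hypothetical sketch of the evidence-assignment step above; names are
# mine, not from any existing codebase.

def apportion(x_tot, priors):
    """priors: one (mu_prior, sigma2_prior) pair per concurrent event.
    Returns one evidence value x_i per event; the x_i sum to x_tot."""
    a = (x_tot - sum(mu for mu, _ in priors)) / sum(s2 for _, s2 in priors)
    # Each event gets its prior mean plus a share of the residual,
    # proportional to its prior variance.
    return [mu + a * s2 for mu, s2 in priors]
```

For example, with a total effect of 10 and priors (3, 1) and (5, 1), the residual of 2 is split evenly, giving evidences 4 and 6.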
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2039</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2039"/>
		<updated>2006-05-24T18:26:29Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea about the Condition Effect Learning system. If you have comments, please don't delete text; add comments instead so I can respond. See this as a first draft, which can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system will treat the effect as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet. So the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications in learning about condition (e.g. food) effects. Since every human body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates its effect on BG (Blood Glucose) levels; the food response varies between individuals and between days by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for individual and temporal differences, I think it is best to express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning such a range or distribution to them. There are two ways of expressing such variation; I solved the first one, but not yet the second.&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; variables, which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So CocaCola could be a type 1 condition, because it has a certain effect.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information such as carbohydrate percentage and GI, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like conditions, such a group of conditions S also has a minimum and maximum cumulative effect. This effect is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proved yet, but it seems to hold in all cases; it's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (where (S-c) is set S minus condition c)&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect). &lt;br /&gt;
Then, Cheetah calculates the difference (min and max) between BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the sum of the conditions to be learned. These conditions each have their own effect range, from c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; to c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted too. The possible effect range of each condition can be deduced from the total effect range and the hardmin and hardmax values of the other conditions. Using Kingma's Theorem, one can deduce the effect range of each individual condition.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning system. The user has taken a glass of apple juice (A) and a slice of bread (B), and fills in a blood glucose measurement of 20; the predicted effect of the certain-effect group CE ranges from 9 to 13.&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
The system calculates the effect range these two items must have had:&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (below B's hardmin, so raise it) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (below A's hardmin, so raise it) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, (UE-A) is B and (UE-B) is A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
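The whole deduction in this example fits in a few lines of Python; here is a sketch of Kingma's Theorem plus the clamping step (the function name is mine, not from any Cheetah codebase):

```python
# Illustration of the deduction above: apply Kingma's Theorem to get each
# condition's possible range, then clamp it to the condition's hard range.
# The helper name is hypothetical, not from any Cheetah codebase.

def deduce_ranges(ue_min, ue_max, hard):
    """hard maps each condition name to its (hardmin, hardmax) tuple."""
    result = {}
    for c, (c_lo, c_hi) in hard.items():
        # Kingma's Theorem: subtract the other conditions' hard ranges from UE
        rest_lo = sum(lo for k, (lo, _) in hard.items() if k != c)
        rest_hi = sum(hi for k, (_, hi) in hard.items() if k != c)
        lo = ue_min - rest_hi
        hi = ue_max - rest_lo
        # Clamp to the condition's own hard range
        result[c] = (max(lo, c_lo), min(hi, c_hi))
    return result

# Reproduces Example 1.1:
print(deduce_ranges(7, 11, {'A': (2, 5), 'B': (7, 9)}))
# {'A': (2, 4), 'B': (7, 9)}
```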
The above values indicate the possible current effect range of A and B. The system can use them to refine its knowledge of the overall effect range of these conditions. This could be done by giving each condition a list of these calculated effect ranges; an algorithm then takes a mean (or a similar statistic) of this list to use in effect prediction.&lt;br /&gt;
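One possible realization of the 'list of calculated effect ranges' idea (an assumption on my part, since the text leaves the exact statistic open) is to average the collected minima and maxima:

```python
# Hypothetical sketch: average a condition's deduced (min, max) ranges,
# collected over past measurements, into one predicted effect range.

def mean_range(ranges):
    """ranges: non-empty list of (min, max) tuples deduced so far."""
    n = len(ranges)
    lo = sum(r[0] for r in ranges) / n
    hi = sum(r[1] for r in ranges) / n
    return lo, hi
```

For example, deduced ranges (2, 4) and (4, 6) average to a predicted range of (3.0, 5.0).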
&lt;br /&gt;
In other cases, when there are more than two conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. The Bayesian inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X) and sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which tells us the estimated effect over time on the blood glucose level 'g'. This a priori curve is assigned before any measurements have been made (example: the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) A set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a 2-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here.]&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine's product information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but is assigned a prior distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\triangle t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into fixed time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution for t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Each interval has an a priori distribution as explained above. Some intervals have one or more collected samples; for these, the sample mean is computed.&lt;br /&gt;
(unfinished)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
How does it do that? Suppose it gets 'evidence' that at &amp;amp;Delta;T the effect is &amp;amp;Delta;BG. It then adds that evidence to its set of all evidences. These evidences are mapped to certain time intervals, e.g. of 15 minutes. Some intervals then have a couple of evidences, some only one or none. At each interval, the a posteriori effect is computed. The math is as follows.&lt;br /&gt;
&lt;br /&gt;
The estimated effect has a normal distribution with sampling variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;; its unknown mean has a prior distribution with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;v^2&amp;lt;/math&amp;gt;. The set of evidence values is e&amp;lt;sub&amp;gt;1&amp;lt;/sub&amp;gt;, ..., e&amp;lt;sub&amp;gt;n&amp;lt;/sub&amp;gt;.&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A posteriori mean[further explanation to be added]: &lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{post}=\frac{\sigma^2\mu+nv^2\bar x_n}{\sigma^2+nv^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A posteriori variance:&lt;br /&gt;
&amp;lt;math&amp;gt;v_{post}^2=\frac{\sigma^2v^2}{\sigma^2+nv^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\bar x_n&amp;lt;/math&amp;gt; denotes the mean of all evidences.&lt;br /&gt;
&lt;br /&gt;
To assign an evidence x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you do: &lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_{prior}+a*\sigma_1^2&amp;lt;/math&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where ''a'' is &lt;br /&gt;
&amp;lt;math&amp;gt;\frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The prior means used here are derived from the estimated curve, and the prior variances are derived from it as well. Alternatively, when there are enough evidences at that particular moment, the system can use just those values.&lt;br /&gt;
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
	<entry>
		<id>http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2038</id>
		<title>Condition Effect Learning</title>
		<link rel="alternate" type="text/html" href="http://wiki.aardrock.com/index.php?title=Condition_Effect_Learning&amp;diff=2038"/>
		<updated>2006-05-24T18:24:10Z</updated>

		<summary type="html">&lt;p&gt;DurkKingma: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page reflects my idea about the Condition Effect Learning system. If you have comments, please don't delete text; add comments instead so I can respond. See this as a first draft, which can be used as a basis for the Cheetah condition effect learning system.&lt;br /&gt;
&lt;br /&gt;
As said in [[Advisory System]], Cheetah needs a system that learns about the effect of certain conditions. Conditions are variables that have an effect on blood glucose levels. As you can read in [[Advisory System]], conditions can be classified as follows:&lt;br /&gt;
# Certain conditions: the effect is known and fixed, so the learning system will treat the effect as certain.&lt;br /&gt;
# Uncertain conditions: the effect is not certain or not known yet. So the effect is a prediction made by the system. The learning system tries to improve the prediction by looking at the past.&lt;br /&gt;
&lt;br /&gt;
There are some complications in learning about condition (e.g. food) effects. Since every human body is different, conditions don't always have a fixed, certain effect. Food, for instance, has a GL (Glycemic Load) that indicates its effect on BG (Blood Glucose) levels; the food response varies between individuals and between days by as much as 20%. Likewise, the effect of insulin and activities varies between people and over time. Therefore, to account for individual and temporal differences, I think it is best to express a condition's effect as a range instead of a single number. Such a range could be a probability distribution, or a (minimum, maximum) tuple. For Cheetah, 'learning' about conditions means assigning such a range or distribution to them. There are two ways of expressing such variation; I solved the first one, but not yet the second.&lt;br /&gt;
# Learning a condition's effect by assigning it a minimal and maximal effect value. For example, a minimum and maximum BG effect.&lt;br /&gt;
# Learning a condition's effect by seeing it as a normally distributed random variable.&lt;br /&gt;
&lt;br /&gt;
== 1. Learning a condition's effect by assigning it a minimal and maximal effect value ==&lt;br /&gt;
With this system, every condition X has X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; variables, which tell the system the hard minimum and maximum of its effect. For example, [http://www.ajcn.org/cgi/content/full/76/1/5/T1 research] has pointed out that a can of Coca-Cola has a Glycemic Load (GL) of minimally 14 and maximally 16. If we see the GL as the food's effect, we can assign CocaCola&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;=14 and CocaCola&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;=16. So CocaCola could be a type 1 condition, because it has a certain effect.&lt;br /&gt;
When a user adds a new food type X to the database, they can add information such as carbohydrate percentage and GI, to help the system determine X&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and X&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So, all conditions have a minimum and a maximum effect.&amp;lt;br/&amp;gt;&lt;br /&gt;
Suppose c is a condition, then:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min}, c_{max} \in \mathbb{R}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;c_{min} \le c_{max}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also, conditions can be contained in a group (set) of conditions S:&amp;lt;br/&amp;gt;&lt;br /&gt;
S = {c1, c2, ...}&amp;lt;br/&amp;gt;&lt;br /&gt;
Like conditions, such a group of conditions S also has a minimum and maximum cumulative effect. This effect is the sum of the effects of its contained conditions:&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = c1&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + c2&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; + ...&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Using intuition, I came up with the following theorem. I have to give it a name, so let's call it ''Kingma's Theorem'' :). It's not formally proved yet, but it seems to hold in all cases; it's actually quite logical.&amp;lt;br/&amp;gt;&lt;br /&gt;
If condition c is part of set S (where (S-c) is set S minus condition c)&amp;lt;br/&amp;gt;&lt;br /&gt;
and S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; and S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
and c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; and c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; are known&amp;lt;br/&amp;gt;&lt;br /&gt;
then it can be said that:&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
(S-c)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = S&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Let's assume that at each blood glucose (BG) measurement, Cheetah starts its learning system. Cheetah looks at all conditions that could have an effect, and puts all type 1 conditions into group CE (certain effect) and all type 2 conditions into group UE (uncertain effect). &lt;br /&gt;
Then, Cheetah calculates the difference (min and max) between BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; and the predicted CE group effect. This difference should equal the UE group effect. So:&amp;lt;br&amp;gt;&lt;br /&gt;
BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt; + CE&amp;lt;sub&amp;gt;real&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
So:&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
UE is the range of the sum of the conditions to be learned. These conditions each have their own effect range, from c&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; to c&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt;. But because the sum of effects (UE) is restricted, each individual condition's effect must be somewhat more restricted too. The possible effect range of each condition can be deduced from the total effect range and the hardmin and hardmax values of the other conditions. Using Kingma's Theorem, one can deduce the effect range of each individual condition.&lt;br /&gt;
&lt;br /&gt;
'''Example 1.1'''&lt;br /&gt;
&lt;br /&gt;
Let's look at a simple example of how the system would learn. Say the user adds a glucose measurement entry and the system starts its learning system. The user has taken a glass of apple juice (A) and a slice of bread (B), and fills in a blood glucose measurement of 20; the predicted effect of the certain-effect group CE ranges from 9 to 13.&lt;br /&gt;
&lt;br /&gt;
The system knows a priori (from its database):&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 5&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 9&lt;br /&gt;
&lt;br /&gt;
The system calculates the effect range these two items must have had:&lt;br /&gt;
&lt;br /&gt;
UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 20 - 13 = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = BG&amp;lt;sub&amp;gt;measurement&amp;lt;/sub&amp;gt; - CE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 20 - 9 = 11&lt;br /&gt;
&lt;br /&gt;
Using Kingma's Theorem, the system then calculates the current A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt;, B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; values:&lt;br /&gt;
&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 5 = 2 (this is below B's hardmin, so raise it) =&amp;gt; 7&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - A&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 2 = 9 (not above B's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmax&amp;lt;/sub&amp;gt; = 7 - 9 = -2 (this is below A's hardmin, so raise it) =&amp;gt; 2&amp;lt;br/&amp;gt;&lt;br /&gt;
(UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = UE&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; - B&amp;lt;sub&amp;gt;hardmin&amp;lt;/sub&amp;gt; = 11 - 7 = 4 (not above A's hardmax, so keep it)&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case, (UE-A) is B and (UE-B) is A, so the system has already calculated everything it needs:&lt;br /&gt;
&lt;br /&gt;
A&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 2&amp;lt;br/&amp;gt;&lt;br /&gt;
A&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-B)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 4&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;min&amp;lt;/sub&amp;gt; = 7&amp;lt;br/&amp;gt;&lt;br /&gt;
B&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = (UE-A)&amp;lt;sub&amp;gt;max&amp;lt;/sub&amp;gt; = 9&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The values above indicate the possible current effect ranges of A and B. The system can use these to refine its knowledge about the overall effect ranges of these conditions. This could be done by keeping, for each condition, a list of the calculated effect ranges; an algorithm then takes a mean (or a similar aggregate) of this list to use in effect prediction.&lt;br /&gt;
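The two-condition calculation of Example 1.1 can be sketched in code. This is a minimal sketch (Python, with hypothetical names), generalised to any number of conditions by subtracting the other conditions' combined hard bounds from the UE range and clamping to each condition's own hard bounds:

```python
def effect_ranges(hard, ue_min, ue_max):
    """Deduce each condition's possible effect range, given its hard
    bounds and the restricted range [ue_min, ue_max] of the total
    effect UE, as in the worked example above."""
    result = {}
    for name, (lo, hi) in hard.items():
        # Combined hard bounds of all *other* conditions.
        others_min = sum(l for n, (l, _) in hard.items() if n != name)
        others_max = sum(h for n, (_, h) in hard.items() if n != name)
        # Subtract the others from the UE range, then clamp to this
        # condition's own hard bounds (the "raise it / keep it" steps).
        result[name] = (max(lo, ue_min - others_max),
                        min(hi, ue_max - others_min))
    return result

# Values from Example 1.1: apple juice A in [2, 5], bread B in [7, 9],
# UE in [7, 11].
ranges = effect_ranges({"A": (2, 5), "B": (7, 9)}, ue_min=7, ue_max=11)
```

This reproduces the example's result: A in [2, 4] and B in [7, 9].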
&lt;br /&gt;
In other cases, when there are more than two conditions, the system iterates down a few levels to calculate the individual possible condition effect ranges. I will add such an example when I have time.&lt;br /&gt;
&lt;br /&gt;
== 2. Bayesian Inference way ==&lt;br /&gt;
&lt;br /&gt;
Let's drop the term 'condition' and use the more intuitive term 'event' instead. Events are things like food intake (e.g. one glass of lemonade), insulin intake (one unit of type X), sports (half an hour of running), but also current health status, stress level, etc. Before the system learns anything, each event is assigned an 'a priori' estimating curve, which tells us the estimated effect on the blood glucose level 'g' over time. This a priori curve is assigned before any measurements have been made (for example, the a priori curve for food could be based on its known carbohydrate content). In short, the learning system uses the blood glucose measurements to update and improve the estimating curve.&lt;br /&gt;
&lt;br /&gt;
Let's describe these events, their effects and their computations in terms of a statistics problem.&lt;br /&gt;
&lt;br /&gt;
=== About events ===&lt;br /&gt;
Each single event &amp;lt;math&amp;gt;e_i&amp;lt;/math&amp;gt; has three components.&lt;br /&gt;
&lt;br /&gt;
1) Firstly: a set of samples. Each sample is a tuple &amp;lt;b&amp;gt;(&amp;amp;Delta;t, &amp;amp;Delta;g)&amp;lt;/b&amp;gt;, so the set of samples can be represented in a two-dimensional plane. The sample set is initially empty, and samples are added through Bayesian inference (explained below). [For extra clarity, image to be added here.]&lt;br /&gt;
&lt;br /&gt;
2) A prior (''a priori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,prior&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g = &amp;amp;mu;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;lt;/b&amp;gt;. This is the estimated mean effect of the event for each &amp;amp;Delta;t, determined before any samples have arrived. For food, it could be determined by looking at the carbohydrate amount. For insulin, it could be determined from the medicine information. If no prior function can be made, the event is assigned a default prior function. The prior function also has a pre-determined variance &amp;amp;sigma;&amp;lt;sub&amp;gt;prior&amp;lt;/sub&amp;gt;&amp;amp;sup2;. In statistical terms, the event effect at each moment in time has a normal distribution with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_e^2 = \mbox{some-static-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_e = \theta \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This parameter &amp;amp;theta; is unknown, but is assigned a prior distribution with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta ,prior}^2 = \mbox{some-a-priori-value} \,&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, prior} = f_{e,prior}(\Delta t) \,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
3) A posterior (''a posteriori'') function &amp;lt;b&amp;gt;f&amp;lt;sub&amp;gt;e,post&amp;lt;/sub&amp;gt;(&amp;amp;Delta;t) &amp;amp;rarr; &amp;amp;Delta;g&amp;lt;/b&amp;gt;. This is the estimated effect of the event after looking at the samples. It is determined as follows. The samples are divided into given time intervals of, for example, 15 minutes. So we have intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; with i=(1, ..., n), each interval representing 15 minutes. Each of these intervals t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; has a parameter &amp;amp;theta; with a prior distribution as explained above. The posterior distribution of t&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; is calculated as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\sigma_{\theta, post}^2=\frac{\sigma_e^2\sigma_{\theta, prior}^2}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{\theta, post}=\frac{\sigma_e^2\mu_{\theta, prior}+n\sigma_{\theta, prior}^2\bar x_n}{\sigma_e^2+n\sigma_{\theta, prior}^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Each interval has an a priori distribution as explained above. Some intervals have one or more collected samples; for these, the sample mean is computed.&lt;br /&gt;
(unfinished)&lt;br /&gt;
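The binning step described above can be sketched as follows. This is an illustrative Python sketch (hypothetical names and sample values): it groups (&amp;Delta;t, &amp;Delta;g) samples into 15-minute intervals and computes the per-interval sample count and sample mean needed for the posterior update:

```python
from collections import defaultdict

INTERVAL = 15  # minutes per interval, as in the text

def bin_samples(samples):
    """Group (delta_t, delta_g) samples into 15-minute intervals.
    Returns, per interval index, a tuple (n, sample_mean)."""
    bins = defaultdict(list)
    for dt, dg in samples:
        bins[int(dt // INTERVAL)].append(dg)
    return {i: (len(gs), sum(gs) / len(gs)) for i, gs in bins.items()}

# Hypothetical samples: (minutes since the event, observed glucose effect).
stats = bin_samples([(5, 1.0), (10, 2.0), (20, 3.0)])
```

Here the first two samples fall in interval 0 (0&ndash;15 min) and the third in interval 1 (15&ndash;30 min); intervals with no samples simply keep their prior.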
&lt;br /&gt;
&lt;br /&gt;
How does it do that? Suppose it gets 'evidence' that at &amp;amp;Delta;T, the effect is &amp;amp;Delta;BG. It then adds that evidence to its set of all evidence. The evidence values are then mapped to certain time intervals, e.g. of 15 minutes. Some intervals then have a couple of evidence values, some only one or none. At each interval, the a posteriori effect is computed. The math is as follows.&lt;br /&gt;
&lt;br /&gt;
The estimated effect has a normal distribution with mean=M and variance=V. The set of evidence values is e&amp;lt;sub&amp;gt;1&amp;lt;/sub&amp;gt;, ..., e&amp;lt;sub&amp;gt;n&amp;lt;/sub&amp;gt;.&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A posteriori mean[further explanation to be added]: &lt;br /&gt;
&amp;lt;math&amp;gt;\mu_{post}=\frac{\sigma^2\mu+nv^2\bar x_n}{\sigma^2+nv^2}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A posteriori variance:&lt;br /&gt;
&amp;lt;math&amp;gt;v_{post}^2=\frac{\sigma^2v^2}{\sigma^2+nv^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\bar x&amp;lt;/math&amp;gt; denotes the mean of all evidence values.&lt;br /&gt;
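The posterior mean and variance formulas above are the standard normal-normal conjugate update; a minimal sketch in Python (hypothetical function and parameter names, illustrative numbers):

```python
def posterior(mu_prior, var_prior, var_e, evidences):
    """Conjugate normal-normal update: combine the prior N(mu_prior,
    var_prior) on the effect with n evidence values observed with
    known variance var_e, per the formulas in the text."""
    n = len(evidences)
    xbar = sum(evidences) / n          # mean of all evidence values
    denom = var_e + n * var_prior
    mu_post = (var_e * mu_prior + n * var_prior * xbar) / denom
    var_post = (var_e * var_prior) / denom
    return mu_post, var_post

# One evidence value of 4.0 pulls the prior mean 2.0 halfway toward it
# (prior and evidence variances are equal here), and shrinks the variance.
mu, var = posterior(mu_prior=2.0, var_prior=1.0, var_e=1.0, evidences=[4.0])
```

Note that as n grows, the posterior mean is dominated by the evidence mean and the posterior variance shrinks toward zero, which is the desired learning behaviour.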
&lt;br /&gt;
To assign an evidence value x&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt; to each individual event e&amp;lt;sub&amp;gt;i&amp;lt;/sub&amp;gt;, you compute: &lt;br /&gt;
&amp;lt;math&amp;gt;x_i=\mu_i+a\sigma_i^2&amp;lt;/math&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where ''a'' is &lt;br /&gt;
&amp;lt;math&amp;gt;\frac{x_{tot}-(\mu_1+\mu_2+...)}{\sigma_1^2+\sigma_2^2+...}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The prior means used here are derived from the estimated curve, and the prior variances are also derived from it. Alternatively, when there is enough evidence at that particular moment, the system can use just those values.&lt;br /&gt;
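The apportionment above splits the total observed effect over the simultaneous events, shifting each prior mean in proportion to its prior variance (the least certain events absorb most of the surprise). A sketch under that reading, with hypothetical names and illustrative priors:

```python
def apportion(x_tot, mus, variances):
    """Split a total observed effect x_tot into one evidence value per
    event: x_i = mu_i + a * sigma_i^2, with a chosen so the evidence
    values sum back to x_tot."""
    a = (x_tot - sum(mus)) / sum(variances)
    return [m + a * v for m, v in zip(mus, variances)]

# Hypothetical priors for two simultaneous events: means 3 and 8,
# variances 1 and 4; the total observed effect is 13 (2 above the
# summed prior means, so the residual of 2 is split 1:4).
xs = apportion(13.0, mus=[3.0, 8.0], variances=[1.0, 4.0])
```

By construction the assigned evidence values sum to the observed total, so each event's evidence can then feed its own posterior update as above.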
&lt;br /&gt;
So it comes down to some quite simple math. I'll make my explanation better when I have more time.&lt;/div&gt;</summary>
		<author><name>DurkKingma</name></author>
	</entry>
</feed>