scaffold kalman filter implementation here

notation:

x_(k) : n x 1 : true system state that we want to estimate
F(k)  : n x n : state transition matrix at time t(k)
w_(k) : n x 1 : system noise,  with mean 0, covariance Q(k)
Q(k)  : n x n : covariance of systen noise

v_(k) : m x 1 : observation errors at time t(k)
R(k)  : m x m : observation error covariance matrix at time t(k)

x(k)  : n x 1 : state vector estimate for time t(k)
P(k)  : n x n : covariance matrix for x(k)
z(k)  : m x 1 : observation vector at time t(k)
H(k)  : m x n : observation matrix at time t(k)

K(k)  :       : kalman gain - measures information gain from observation z(k)
                relative to prior P(k | k-1)

use shorthand:
  xT, F(k)T   for transpose(x), transpose(F(k))
  x(k+1 | k)  for "estimate x(k+1) given information known at t(k)"
  
A. Linear Kalman Filter
-----------------------

1. system model:
     x_(k+1) = F(k).x_(k) + w_(k),    w_(k) ~ N(0,Q)

   i.e. expected behavior of system from t(k) -> t(k+1),
   absent system noise, is given by linear transformation F(k)

   ofc model is not directly observable,
   since we don't know x_(k) or w_(k),
   instead we will be estimating it.

2. measurement model:
     z(k+1) = H(k+1).x_(k+1) + v_(k),   v_(k) ~ N(0, R)

3. prior:
     x(0), P(0)

   must be supplied as initial input

4. pre-estimate for t(k+1) system state:
   (before incorporating z(k+1) and accounting for system noise)
     x(k+1|k) := F(k).x(k)

   in other words propagate t(k) estimate to t(k+1),
   using F(k)
   
5. pre-estimate for t(k+1) estimate covariance
   (before incorporating z(k+1) and accounting for system noise)
     P(k+1|k) := F(k).P(k).F(k)T + Q(k)

6. kalman gain matrix
     K(k+1) := P(k+1|k).H(k)T.inverse(H(k).P(k+1|k).H(k)T + R(k))

   note that the matrix-to-be-inverted in the gain expression
   is symmetric and positive-definite (so can use cholesky decomposition)

7. innovation: difference between actual observation vector and
   observation predicted from state estimate x(k+1|k):
     z(k+1) - H(k+1).x(k+1|k)

8. corrected state estimate for t(k+1):
     x(k+1) := x(k+1|k) + K(k+1)[z(k+1) - H(k+1).x(k+1|k)]

9. correct state covariance for t(k+1):
     P(k+1) := [I - K(k+1).H(k+1)].P(k+1|k)
     

B. Extended Kalman Filter
-------------------------
Must write in continuous form,  to deal with non-linearities

notation:

x_(t)      : n x 1 : true system state that we want to estimate,
                     continuously evolves with t
f(x_(t),t) : n x 1 : expected derivative (d/dt)(x_(t))
h(x_(t),t) : m x 1 : observation function
x(k)       : n x 1 : state vector estimate for x(t) at time t=t(k)
F(x(t),t)  : n x n : jacobian of f(x_(t),t) evaluated at x_(t)=x(t)
H(x(t),t)  : m x n : jacobian of h(x_(t),t) evaluated at x_(t)=x(t)

1. system model:
    (d/dt)x_(t) = f(x_(t),t) + w(t),
   where
    w(t) is a white noise with

      / t2
      |
      |   w(t).dt   ~   N(0, Q(t1,t2))
      |
      / t1

   editorial:
   note that we exclude more general model
     (d/dt)x_(t) = f(x_(t),t)) + G(x_(t),t).w(t)
   because would be unable to assume RHS mean-square integrable.
   The form chosen effectively band-limits the frequencies at which
   w(t) acts,   since it's independent of x_(t).  This allows mean-square
   rules to be applied as in linear kalman filter

2. measurement model:
     z(k+1) = h(x_(t(k+1)), t(k+1)) + v(k),  v(k) ~ N(0, Rk)

3. prior:
     x(0), P(0)

    (same as for linear kalman filter)

(3a). state estimate propagation model:
    (d/dt)x(t) = f(x(t),t)

   editorial:
   this step is *sketchy*:  x(t) is intended to be minimum-variance
   estimate for system state,  but that property is lost when solving
   differential equation (d/dt)x(t) = f(x(t),t) with non-linear f.

(3b). covariance estimate propagation model:
    (d/dt)P(t) = F(x(t),t).P(t) + P(t).F(x(t),t)T + Q(t)

4. pre-estimate (extrapolation) for t(k+1) system state:

   can use first order approx:
    x(k+1|k) = x(k) + F(x(tk),tk).(t(k+1) - t(k))
   or multiple timesteps if t(k+1) - t(k) is too large

   i.e. approximately solving (3a)

5. pre-estimate (extrapolation) for t(k+1) error covariance:

    P(k+1|k) = P(k) + [F(x(tk),tk).P(k) + P(k)F(x(tk),tk)T + Q(tk)].(t(k+1) - t(k))
   or multiple timesteps if t(k+1) - t(k) is too large

   i.e. approximately solving (3b)

6. kalman gain matrix
    K(k+1) = P(k+1|k)
              .H(k+1)T(x(k+1|k))
	      .inverse[H(k+1)(x(k+1|k)).P(k+1|k).H(k+1)T(x(k+1|k)) + R(k+1)]

7. innovation:
    z(k+1) - h(x(k+1|k), t(k+1))

8. corrected state estimate for t(k+1):
    x(k+1) = x(k+1|k) + K(k+1).(z(k+1) - h(k+1)(x(k+1|k))

9. corrected error covariance for t(k+1):
    P(k+1) = [I - K(k+1).H(k+1)(x(k+1|k))].P(k+1|k)

   editorial:
   This step is *sketchy*, to the extend h() is non-linear,  linearizing around
   x(k+1|k) may give poor estimate of state covariance

10. linearization (1st order term for taylor series of f() around x(t))
  jacobian:
    F(x(t),t) = (df/dx_) evaluated at x_(t)=x(t)

    H(k+1)(x(k+1|k)) = (dh/dx_) evaluated at x_(t)=x(k+1|k)

