Modeling Systems in State Space

We have no real business writing a page about state space techniques because so many others have done it better. Our hope here is to provide some background locally so that people who haven't heard about all this stuff before can follow what we're doing.

Modeling Systems in State Space

Modeling Systems the Old Fashioned Way
State Space Representations
Discrete Time Formulations
Other System Representations

Modeling Systems the Old Fashioned Way

Sometimes when people talk about state space techniques, they refer to them as "Modern", implying that techniques developed earlier are old fashioned. That implication is really baseless. State space is just another notational device, with all the usual advantages and disadvantages. Sometimes state space representations are very convenient; when they are not, consider using something else.

Many physical systems can be described using differential equations. For example, consider an idealized system consisting of a point mass moving in one dimension subject to an externally applied force. The governing equation is

#!latex (-)
  $$\displaystyle (1)\quad
    F = m a
$$

Where (F) is the force, (m) the mass, and (a) the acceleration.

To describe this system completely, the position (p) and velocity (v) must be related to the acceleration

#!latex (-)
  \begin{eqnarray*}
  (2)\quad
  a &=& v'\\
  v &=& p'\\
  \end{eqnarray*}

For a one dimensional point mass, the complete state of the system can be described by two parameters, position and velocity. The acceleration is determined by the force, which we have defined to be externally determined. With these ideas in mind the system description can be re-written in a suggestive form

#!latex (-)
  \begin{eqnarray*}
  (3)\quad
  p' &=& v\\
  v' &=& a\\
  \end{eqnarray*}

In this form, everything on the left of an equal sign is the first derivative of a `state' variable. The right hand sides can be considered as a set of functions whose arguments are the state variables and the external inputs. Formally, we can write

#!latex (-)
  \begin{eqnarray*}
  (4)\quad
  x_1' &=& f_1[x_1, x_2,\ \ldots\ , x_n, u_1, u_2,\ \ldots\ , u_m]\\
  x_2' &=& f_2[x_1, x_2,\ \ldots\ , x_n, u_1, u_2,\ \ldots\ , u_m]\\
       &\vdots\\
  x_n' &=& f_n[x_1, x_2,\ \ldots\ , x_n, u_1, u_2,\ \ldots\ , u_m]\\
  \end{eqnarray*}

Where the (x_i) are the state variables, the (f_i) are arbitrary single valued functions, and the (u_i) are external inputs to the system.

Equation (4) is an example of a `state space representation'. Its necessary properties are first order derivatives of all the state variables on the left, and functions of the state variables and external inputs on the right. This arrangement can produce a very regular structure, one that is suitable for a matrix formulation.

In general, the functions (f_i) in (4) may be time varying. Time varying functions complicate the analysis somewhat, but don't alter the fundamental underpinnings of state space analysis, so in the interest of simplifying things, functions that do not change with time are assumed throughout.
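
As a rough illustration of the general form (4), the point mass of (3) can be coded as a function that returns the state derivatives and is then stepped forward in time. This is only a sketch: the names `point_mass_derivatives` and `euler_step`, the step size, and the choice of forward Euler integration are all assumptions made for the example, not something the derivation above requires.

```python
def point_mass_derivatives(x, u):
    """Right hand side of (3): x = [p, v], u = [a]. Returns [p', v']."""
    p, v = x
    (a,) = u
    return [v, a]


def euler_step(f, x, u, dt):
    """Advance the state one step with forward Euler (an approximation)."""
    dx = f(x, u)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


# Start at rest at the origin and apply a constant acceleration of 2.
state = [0.0, 0.0]
dt = 0.001
for _ in range(1000):  # integrate from t = 0 to t = 1
    state = euler_step(point_mass_derivatives, state, [2.0], dt)

position, velocity = state
```

With this step size the position comes out slightly below the exact value of 1, because forward Euler holds the velocity constant across each step.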

State Space Representations

If the functions (f_i) in (4) are linear functions, then the whole system of equations can be represented as a set of sums

#!latex (-)
  \begin{eqnarray*}
  (5)\quad
  x_1' &=& \sum_{i=1}^n {{c_1}_i\ x_i} + \sum_{j=1}^m {{g_1}_j\ u_j}\\
  x_2' &=& \sum_{i=1}^n {{c_2}_i\ x_i} + \sum_{j=1}^m {{g_2}_j\ u_j}\\
       &\vdots\\
  x_n' &=& \sum_{i=1}^n {{c_n}_i\ x_i} + \sum_{j=1}^m {{g_n}_j\ u_j}\\
  \end{eqnarray*}

The sums can be written compactly as a matrix equation

#!latex (-)
  $$\displaystyle (6)\quad
    \bf{x}' = \bf{F} \bf{x} + \bf{G} \bf{u}
$$

Where (F = {c_ki}) is an (n×n) matrix, (G = {g_kj}) is an (n×m) matrix, and (k) runs from 1 to (n).

A very reasonable question is, "How useful is this?" After all, not every dynamic system can be described by differential equations. Even if differential equations can be used, often the equations are not linear, or the derivatives used are not of first order. How can (6) be used in those cases?

Informally, equation (6) can be stated in English, "The current change in the system state (x') depends in part on the current system state (x), and in part on the external influence (u)." Clearly this description could apply to a large variety of physical systems.

Formally, any system which is continuous and linear can be represented in the form given by equation (6). Equations with derivatives of nth-order can be transformed into n coupled equations of 1st order. (See for example phase-variable canonical form. More or less this works by including the higher order derivatives as state variables.)
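
As a small worked illustration (the third order equation and its coefficients (a_0), (a_1), (a_2) are hypothetical, chosen only for this example), take (y''' + a_2 y'' + a_1 y' + a_0 y = u) and choose the state variables (x_1 = y), (x_2 = y'), (x_3 = y''). The first two state equations are just the definitions of (x_2) and (x_3), and the third is the original equation solved for (y''')

#!latex (-)
  $$\displaystyle
    \pmatrix{x_1 \cr x_2 \cr x_3}' = \pmatrix{0 & 1 & 0 \cr 0 & 0 & 1 \cr -a_0 & -a_1 & -a_2}\pmatrix{x_1 \cr x_2 \cr x_3}+\pmatrix{0 \cr 0 \cr 1}u
$$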

In practice, equation (6) can be used with non-linear systems too. Although non-linear systems can't be represented exactly in a linear matrix equation, they present no problem in the general state space formulation given by (4). However, matrix techniques are sufficiently attractive that non-linear systems are often linearized to allow the matrix formulation to be applied, approximately, to them as well.
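
One common way to linearize is to evaluate the partial derivatives of the functions in (4) at an operating point and use them as the entries of (F) and (G). The sketch below does this numerically. The pendulum model, its constants, and the helper names `pendulum` and `jacobian` are hypothetical and serve only to illustrate the idea; central differences are just one of several ways to get the derivatives.

```python
import math

GRAVITY = 9.81  # m/s^2, assumed value
LENGTH = 1.0    # m, assumed value


def pendulum(x, u):
    """Non-linear model in the form of (4): x = [angle, angular rate]."""
    theta, omega = x
    return [omega, -(GRAVITY / LENGTH) * math.sin(theta) + u[0]]


def jacobian(f, x0, u0, wrt, h=1e-6):
    """Central-difference Jacobian of f with respect to x ("x") or u ("u")."""
    base = x0 if wrt == "x" else u0
    n_rows = len(f(x0, u0))
    cols = []
    for k in range(len(base)):
        plus = list(base)
        minus = list(base)
        plus[k] += h
        minus[k] -= h
        if wrt == "x":
            f_plus, f_minus = f(plus, u0), f(minus, u0)
        else:
            f_plus, f_minus = f(x0, plus), f(x0, minus)
        cols.append([(f_plus[r] - f_minus[r]) / (2 * h) for r in range(n_rows)])
    # cols[k][r] holds d f_r / d base_k; transpose into row-major form.
    return [[cols[k][r] for k in range(len(base))] for r in range(n_rows)]


# Linearize about the hanging equilibrium with no input.
F = jacobian(pendulum, [0.0, 0.0], [0.0], wrt="x")
G = jacobian(pendulum, [0.0, 0.0], [0.0], wrt="u")
```

Near this operating point the result approximates (F = [[0, 1], [-g/l, 0]]) and (G = [[0], [1]]), which is only valid for small angles.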

The previous example of a one dimensional accelerated mass can be put into matrix form like this

#!latex (-)
  $$\displaystyle (7)\quad
    \pmatrix{p \cr v}' = \pmatrix{0 & 1 \cr 0 & 0}\pmatrix{p \cr v}+\pmatrix{0 \cr 1}a
$$

Expanding out equation (7) reproduces exactly the system given in (3).
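
The expansion can also be checked numerically. The following is a minimal sketch using plain Python lists; the helper names `mat_vec` and `state_derivative` and the test values are assumptions for the example.

```python
def mat_vec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def state_derivative(F, G, x, u):
    """Evaluate x' = F x + G u as in (6)."""
    Fx = mat_vec(F, x)
    Gu = mat_vec(G, u)
    return [a + b for a, b in zip(Fx, Gu)]


# System and input matrices from (7).
F = [[0.0, 1.0], [0.0, 0.0]]
G = [[0.0], [1.0]]

p, v, a = 3.0, -1.5, 0.25  # arbitrary test values
derivative = state_derivative(F, G, [p, v], [a])
# derivative is [p', v'], which by (3) should equal [v, a]
```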

In (7) the state variables are position (p) and velocity (v). It's worth noting that the choice of state variables is not unique. Clearly, any other set of state variables which can be solved for the original state variables will also work. As suggested by (6), any invertible linear combination of state variables will work. The situation is equivalent to having multiple basis sets for sub-spaces in linear algebra.
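
To make the last point concrete, let (M) be any invertible (n×n) matrix (the symbol (M) is introduced here only for illustration) and define a new state vector (z = M x). Multiplying (6) by (M) and substituting (x = M⁻¹ z) gives a system of the same form with different matrices

#!latex (-)
  $$\displaystyle
    \bf{z}\rm' = \bf{M}\rm \bf{x}\rm' = \bf{M}\rm (\bf{F}\rm \bf{x}\rm + \bf{G}\rm \bf{u}\rm) = (\bf{M}\rm \bf{F}\rm \bf{M}\rm^{-1}) \bf{z}\rm + (\bf{M}\rm \bf{G}\rm) \bf{u}\rm
$$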

In complex situations, picking the state variables can be tricky. The first problem is finding a good system model. In a real-world system there are always many possible factors that influence the system. In some cases it is worth the extra burden to incorporate some of these factors into the system model. Doing so increases the development time and computational load, so often it's better to model some of the influences as process noise, or to just ignore them in the model. Likewise, some system inputs may have subtle relations to the system state. The interdependence could be modeled by state variables, but again, at increasing cost.

A set of state variables is theoretically minimal if together they are sufficient to describe every aspect of the system, but elimination of any variable, singly or in combination, leaves at least some of the system unobservable. Knowing this, it seems like picking good state variables means picking some orthogonal combination of spanning state variables, particularly if the variables chosen are easily related to available measurements. For linear (or linearized) systems this idea is "mostly true". However, it's still possible to imagine a linear system where some of the possible state variables are so sensitive to small variations that the accuracy of the system model will be low, while these same variables may be expressible as functions of other possible state variables which are relatively insensitive to small variations. In this case choosing the latter as state variables should result in more accurate system modeling.

Here are some other pathological situations where the choice of state variables may be difficult.

A system where a needed output is difficult to compute from potential state variables. It may be possible to find other state variables that can be used to compute the output more easily. Sometimes this will result in a set of state variables which is not minimal. If the computations resulting from a minimal set are sufficiently complex, the non-minimal set may result in a lower overall system modeling burden.
When the only available measurements of a system involve combinations of variables considered to be representative of the system state. Again this situation can suggest a non-minimal set of state variables to match the sensor outputs. Another possibility is to do fairly complex pre-computation on the measurements to produce a de-tangled set of measurements to feed into the state model. (Sometimes significant non-linearities can be removed this way too.)
If the system to be modeled is not well understood, its model can be derived observationally by formal or informal techniques. Usually the model which is derived is of very low order compared to the actual system. Sometimes it is discovered that seemingly unrelated state variables are actually tightly dependent. Dimensional analysis may help when analyzing this type of system.

Discrete Time Formulations

For many purposes, writing the differential equations for a dynamical system is not sufficient. For state space equations, the values of the state variables through time are usually desired.

Recall the linear form of the state space equations given in (6)

#!latex (-)
  $$\displaystyle (8)\quad
    \bf{x}' = \bf{F} \bf{x} + \bf{G} \bf{u}
$$

If you ignore the external inputs and the fact that this is a matrix equation, you're left with a linear homogeneous first order differential equation

#!latex (-)
  $$\displaystyle (9)\quad
    x' - F x = 0
$$

The solution is well known (and can be verified by direct substitution)

#!latex (-)
  $$\displaystyle (10)\quad
    x[t] = e^{F t} x[0]
$$

Since the matrix equation (8) is linear, it can be solved exactly the same way. We define the state transition matrix (Φ) to be the matrix exponential:

#!latex (-)
  $$\displaystyle (11)\quad
    \bf\Phi\rm[t] \equiv e^{\bf{F}\rm t}
$$

And the derivative of (Φ) is

#!latex (-)
  $$\displaystyle (12)\quad
    \bf\Phi\rm'[t] = \bf{F}\rm e^{\bf{F}\rm t} = \bf{F \Phi}\rm[t]
$$

Now we can write a solution to (8)

#!latex (-)
  $$\displaystyle (13)\quad
    \bf{x}\rm[t] = \bf\Phi\rm[t] \bf{x}\rm[0] + \int_0^t \bf\Phi\rm[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau
$$

Where we have assumed that the system evolves beginning at time (t = 0).

The 1st term in (13) is the homogeneous solution to (8), exactly analogous to (10); the 2nd term is the particular solution. To motivate the 2nd term, consider that (F x) and (G u) in (8) have exactly the same influence on (x'). To find the influence of a past input applied at time (τ) on the state at the present time (t), take the effect at time (τ), namely (G[τ] u[τ]), and propagate it forward to time (t) by multiplying by (Φ[t-τ]). The cumulative influence of all inputs from time zero to (t) is just the integral given by the 2nd term. If this is too hand-wavy for you, set the particular solution equal to the 2nd term and take the derivative

#!latex (-)
  \begin{eqnarray*}
  (14)\quad
    &(\bf{x}\rm_{particular}[t])' = (\int_0^t \bf\Phi\rm[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau)' = \int_0^t \bf\Phi\rm'[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau\ +\ \bf\Phi\rm[0] \bf{G}\rm[t] \bf{u}\rm[t] \\
&= \bf{F}\rm \int_0^t \bf\Phi\rm[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau\ +\ \bf{G}\rm[t] \bf{u}\rm[t] = \bf{F}\rm \cdot \bf{x}\rm_{particular}[t] + \bf{G}\rm[t] \bf{u}\rm[t]
  \end{eqnarray*}

Where (12) and the fact that (Φ[0]) is an identity matrix have been used. Since the integral in (14) involves time in the upper limit, the time derivative must be taken according to Leibniz' rule. (See mathworld or wikipedia).

An equation for the discrete time evolution of the system can be derived from the continuous time solution (13).

For the time sequence { T₀, T₁, ... , T_i } with (T₀ = 0), the discrete solution is

#!latex (-)
  $$\displaystyle (15)\quad
    \bf{x}\rm[T_i] = \bf\Phi\rm[T_i] \bf{x}\rm[0] + \int_0^{T_i} \bf\Phi\rm[T_i-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau
$$

This can be written in an incremental form

#!latex (-)
  $$\displaystyle (16)\quad
    \bf{x}\rm_i = \bf\Phi\rm_i \bf{x}\rm_{i-1} + \int_{T_{i-1}}^{T_i} \bf\Phi\rm[T_i-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau
$$

Where (Φ_i) is the state transition matrix from time (T_{i - 1}) to (T_i).

If in (16), we assume (u) is constant over a time step (T), that the time steps are all equal, and that (Φ) and (G) are constant, then we can write the simplified form

#!latex (-)
  $$\displaystyle (17)\quad
    \bf{x}\rm_{i} = \bf\Phi\rm_{i} \bf{x}\rm_{i-1} + \left( \int_0^T \bf\Phi[\rm T-\tau]\, d\tau \right) \bf G u\rm_{i}
$$

To make this a little more clear, refer once more to the example equation of a one dimensional accelerated mass, which is repeated here

#!latex (-)
  $$\displaystyle (18)\quad
    \pmatrix{p \cr v}' = \pmatrix{0 & 1 \cr 0 & 0}\pmatrix{p \cr v}+\pmatrix{0 \cr 1}a
$$

This example satisfies all the assumptions of (17) provided acceleration (a) is constant over each time step. Since this is the usual assumption for a sampled data system, let's assume it here. Identify the system and input matrices as

#!latex (-)
  $$\displaystyle (19)\quad
    \bf{F}\rm = \pmatrix{0 & 1 \cr 0 & 0},\ \bf{G}\rm = \pmatrix{0 \cr 1}
$$

To calculate (Φ) the matrix exponential referred to in (11) must be found. In analogy to the scalar case, the exponential can be found from the series

#!latex (-)
  $$\displaystyle (20)\quad
    \bf\Phi\rm[t] \equiv e^{\bf{F}\rm t} = \bf{I}\rm + \bf{F}\rm t + \frac{\bf{F}\rm^2 t^2}{2} + \frac{\bf{F}\rm^3 t^3}{6} +\ \ldots\ + \frac{\bf{F}\rm^k t^k}{k!} +\ \ldots
$$

In this case the exponential is easy to compute because (F²) and all higher powers are the (2×2) zero matrix, therefore

#!latex (-)
  $$\displaystyle (21)\quad
    \bf\Phi\rm = \pmatrix{1 & \rm T \cr 0 & 1}
$$
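
The series (20) can be summed directly to confirm (21). The sketch below truncates the series after a fixed number of terms; the function names `mat_mul` and `expm_series`, the number of terms, and the value of (T) are assumptions for the example. For a general (F) a truncated series can converge slowly or lose accuracy, so treat this as an illustration rather than a recommended way to compute a matrix exponential.

```python
def mat_mul(A, B):
    """Multiply two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def expm_series(F, t, terms=10):
    """Sum the first `terms` terms of the series (20) for exp(F t)."""
    n = len(F)
    # k = 0 term: the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # Build F^k t^k / k! from the previous term.
        term = mat_mul(term, F)
        term = [[value * t / k for value in row] for row in term]
        result = [[r + s for r, s in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]
    return result


F = [[0.0, 1.0], [0.0, 0.0]]
T = 0.5
Phi = expm_series(F, T)
# Every term from k = 2 onward is zero here, so Phi is [[1, T], [0, 1]].
```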

Now using (17) the discrete solution for the example can be written

#!latex (-)
  $$\displaystyle (22)\quad
    \pmatrix{p_i \cr v_i} = \pmatrix{1 & \rm T \cr 0 & 1} \pmatrix{p_{i-1} \cr v_{i-1}} + \int_0^{\rm T} \pmatrix{1 & \rm T-\tau \cr 0 & 1}\, d\tau \cdot \pmatrix{0 \cr 1} a = \pmatrix{1 & \rm T \cr 0 & 1} \pmatrix{p_{i-1} \cr v_{i-1}} + \pmatrix{\rm T & \rm T^2/2 \cr 0 & \rm T} \pmatrix{0 \cr 1} a
$$

Or in scalar form

#!latex (-)
  \begin{eqnarray*}
  (23)\quad
  p_i &=& p_{i-1} + \rm T \it v_{i-1} + a_i\ \rm T^2/2 \\
  v_i &=& v_{i-1} + a_i\ \rm T
  \end{eqnarray*}
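
Because the acceleration is held constant over each step, (23) should reproduce the familiar constant-acceleration formulas exactly, apart from rounding. A minimal sketch of that check follows; the function name `step`, the initial conditions, the step size, and the acceleration value are all assumptions for the example.

```python
def step(p, v, a, T):
    """One step of (23) with acceleration a held constant over the step."""
    p_next = p + T * v + a * T**2 / 2
    v_next = v + a * T
    return p_next, v_next


T = 0.1              # time step, assumed
p0, v0 = 1.0, 2.0    # initial position and velocity, assumed
a = -9.81            # constant acceleration, assumed

p, v = p0, v0
n_steps = 50
for _ in range(n_steps):
    p, v = step(p, v, a, T)

# Closed-form result for constant acceleration over the same interval.
elapsed = n_steps * T
p_exact = p0 + v0 * elapsed + a * elapsed**2 / 2
v_exact = v0 + a * elapsed
```

In a sampled data system (a) would normally change from step to step, in which case each call to `step` would be given the value that applies over that step.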

Other System Representations

As suggested above, state space representations are more or less directly related to the differential equations describing the dynamic system. In traditional control theory, as opposed to "modern" state space methods, transfer functions are often used. Since either description refers to the same system, state space representations can be transformed into transfer functions and vice versa, subject to some limitations.

The transfer function can be found from a state space representation fairly easily.

Recall the linear form of the state space representation (6)

#!latex (-)
  $$\displaystyle (24)\quad
    \bf{x}' = \bf{F} \bf{x} + \bf{G} \bf{u}
$$

If we let the system outputs be defined as (y[t]), we can introduce the output matrix (H) such that

#!latex (-)
  $$\displaystyle (25)\quad
    \bf{y}\rm[t] = \bf{H}\rm[t] \bf{x}\rm[t]
$$

Take the Laplace transform of (24)

#!latex (-)
  $$\displaystyle (26)\quad
    \rm s\ \bf{X}\rm[s]-\bf{x}\rm[0] = \bf{F} \bf{X}\rm[s] + \bf{G} \bf{U}\rm[s]
$$

Solve for (X[s])

#!latex (-)
  $$\displaystyle (27)\quad
    \bf{X}\rm[s] = (\bf{I}\rm\ s -\bf{F}\rm)^{-1} (\bf{G} \bf{U}\rm[s] + \bf{x}\rm[0])
$$

If it is assumed that the initial conditions are zero, multiply by (H) and divide by (U) to get the transfer function from the state space representation

#!latex (-)
  $$\displaystyle (28)\quad
    \bf{Y}\rm[s]\bf{U}\rm^{-1}[s] = \bf{H}\rm(\bf{I}\rm\ s -\bf{F}\rm)^{-1} \bf{G}
$$
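
For the accelerated mass example, (28) can be evaluated numerically at a test value of (s). The sketch below assumes that position is the only output, so that (H = (1 0)); that choice, the helper names `inverse_2x2` and `transfer_function`, and the test point are assumptions for the example. With this (H) the result should be the double integrator (1/s²).

```python
def inverse_2x2(M):
    """Inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def transfer_function(F, G, H, s):
    """Evaluate H (I s - F)^-1 G at s for a two-state, one-input, one-output system."""
    sI_minus_F = [[s - F[0][0], -F[0][1]],
                  [-F[1][0], s - F[1][1]]]
    inv = inverse_2x2(sI_minus_F)
    inv_G = [inv[0][0] * G[0] + inv[0][1] * G[1],
             inv[1][0] * G[0] + inv[1][1] * G[1]]
    return H[0] * inv_G[0] + H[1] * inv_G[1]


F = [[0.0, 1.0], [0.0, 0.0]]
G = [0.0, 1.0]   # input column for the accelerated mass
H = [1.0, 0.0]   # assumed: position is the measured output

s = complex(0.5, 2.0)  # arbitrary test point
value = transfer_function(F, G, H, s)
expected = 1 / s**2
```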

Comparing equation (27) to (13) it can be seen that the role of the state transition matrix (Φ) in (27) is performed by the first factor, that is

#!latex (-)
  $$\displaystyle (29)\quad
    \bf\Phi\rm[s] = (\bf{I}\rm\ s -\bf{F}\rm)^{-1}
$$

So as an alternative method of computation, (Φ[t]) can be found from the inverse Laplace transform

#!latex (-)
  $$\displaystyle (30)\quad
    \bf\Phi\rm[t] = \mathcal{L}^{-1}[(\bf{I}\rm\ s -\bf{F}\rm)^{-1}]
$$
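
As a check, apply this to the accelerated mass example, where (F) has rows (0 1) and (0 0). Inverting the (2×2) matrix and transforming each entry separately, using the standard pairs (1/s → 1) and (1/s² → t), gives

#!latex (-)
  $$\displaystyle
    (\bf{I}\rm\ s -\bf{F}\rm)^{-1} = \pmatrix{s & -1 \cr 0 & s}^{-1} = \pmatrix{1/s & 1/s^2 \cr 0 & 1/s}, \quad \bf\Phi\rm[t] = \pmatrix{1 & t \cr 0 & 1}
$$

Setting (t = T) gives the same matrix that the series (20) produces for this (F).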