
Modeling Systems in State Space

We have no real business writing a page about state space techniques because so many others have done it better. Our hope here is to provide some background locally so that people who haven't heard about all this stuff before can follow what we're doing.

Modeling Systems in State Space

  1. Modeling Systems the Old Fashioned Way
  2. State Space Representations
  3. Discrete Time Formulations
  4. Other System Representations

Modeling Systems the Old Fashioned Way

Sometimes when people talk about state space techniques, they refer to them as "Modern", implying that techniques developed earlier are old fashioned. That implication is really baseless. State space is just another notational device, with all the usual advantages and disadvantages. Sometimes state space representations are very convenient; when they are not, consider using something else.

Many physical systems can be described using differential equations. For example, consider an idealized system consisting of a point mass moving in one dimension subject to an externally applied force. The governing equation is

(1)\quad\
  F = m a

Where (F) is the force, (m) the mass, and (a) the acceleration.

To completely describe this system the position (p) and velocity (v) can be related to acceleration

(2)\quad\
  \begin{array}{lcl}\
    a &=& v'\\\
    v &=& p'\
  \end{array}

For a one dimensional point mass, the complete state of the system can be described by two parameters, position and velocity. The acceleration is determined by the force, which we have defined to be externally determined. With these ideas in mind the system description can be re-written in a suggestive form

(3)\quad\
  \begin{array}{lcl}\
    p' &=& v\\\
    v' &=& a\
  \end{array}

In this form, everything on the left of an equal sign is the first derivative of a `state' variable. The right hand sides can be considered as a set of functions whose arguments are the state variables and the external inputs. Formally, we can write

(4)\quad\
  \begin{array}{lcl}\
    x_1' &=& f_1[x_1, x_2,\ \ldots\ , x_n, u_1, u_2,\ \ldots\ , u_m]\\
    x_2' &=& f_2[x_1, x_2,\ \ldots\ , x_n, u_1, u_2,\ \ldots\ , u_m]\\
     &\vdots\\
    x_n' &=& f_n[x_1, x_2,\ \ldots\ , x_n, u_1, u_2,\ \ldots\ , u_m]\\
  \end{array}

Where the (xi) are the state variables, the (fi) are arbitrary single valued functions, and the (ui) are external inputs to the system.

Equation (4) is an example of a `state space representation'. By definition the state space form has first order derivatives of all the state variables on the left, and functions of the state variables and external inputs on the right. This arrangement can produce a very regular structure, one that is suitable for a matrix formulation.
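
As a minimal sketch of form (4), the point mass of (3) can be written as a single function that returns the derivative of every state variable. The names `point_mass_derivs` and `euler_step`, the numbers used, and the choice of a simple forward Euler step are all assumptions made for illustration; they are not part of the derivation above.

```python
def point_mass_derivs(x, u):
    """Right-hand side of (3) written in the form of (4): x = [p, v], u = [a]."""
    p, v = x
    (a,) = u
    return [v, a]  # p' = v, v' = a


def euler_step(f, x, u, dt):
    """Advance the state by one forward Euler step (an approximation)."""
    dx = f(x, u)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


x = [0.0, 1.0]            # assumed start: p = 0, v = 1
for _ in range(1000):     # integrate to t = 1 with constant a = 2
    x = euler_step(point_mass_derivs, x, [2.0], 0.001)
```

Any system that fits (4) can be passed to the same stepping function, which is the regular structure the text refers to.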

In general, the functions (f) in (4) may be time varying. Time varying functions complicate the analysis somewhat, but don't alter the fundamental underpinnings of state space analysis, so in the interest of simplifying things, time-invariant functions are assumed throughout.


State Space Representations

If the functions (fi) in (4) are linear functions, then the whole system of equations can be represented as a set of sums

(5)\quad\
  \begin{array}{lcl}\
    x_1' &=& \sum_{i=1}^n {{c_1}_i\ x_i} + \sum_{j=1}^m {{g_1}_j\ u_j}\\
    x_2' &=& \sum_{i=1}^n {{c_2}_i\ x_i} + \sum_{j=1}^m {{g_2}_j\ u_j}\\
     &\vdots\\
    x_n' &=& \sum_{i=1}^n {{c_n}_i\ x_i} + \sum_{j=1}^m {{g_n}_j\ u_j}\\
  \end{array}

The sums can be written compactly in matrix form

(6)\quad\
  \bf{x}' = \bf{F} \bf{x} + \bf{G} \bf{u}

Where (F = {cki}), (G = {gkj}), and (k) runs from 1 to (n).

A very reasonable question is, "How useful is this?" After all, not every dynamic system can be described by differential equations. Even if differential equations can be used, often the equations are not linear, or the derivatives used are not of first order. How can (6) be used in those cases?

Informally, equation (6) can be stated in English, "The current change in the system state (x') depends in part on the current system state (x), and in part on the external influence (u)." Clearly this description could apply to a large variety of physical systems.

Formally, any system which is continuous and linear can be represented in the form given by equation (6). Equations with derivatives of nth-order can be transformed into n coupled equations of 1st order. (See for example phase-variable canonical form. More or less this works by including the higher order derivatives as state variables.)
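
The reduction of an nth-order equation to n first-order equations can be sketched as follows. This assumes a single-input equation of the form y^(n) + a_{n-1} y^(n-1) + ... + a_0 y = u and takes the state to be y and its first n-1 derivatives; the function name `phase_variable_form` and the example coefficients are hypothetical.

```python
import numpy as np


def phase_variable_form(coeffs):
    """Build F and G for y^(n) + a_{n-1} y^(n-1) + ... + a_0 y = u.

    coeffs = [a_0, a_1, ..., a_{n-1}]; the state is [y, y', ..., y^(n-1)].
    """
    n = len(coeffs)
    F = np.zeros((n, n))
    F[:-1, 1:] = np.eye(n - 1)                    # x_k' = x_{k+1}
    F[-1, :] = -np.asarray(coeffs, dtype=float)   # last row comes from the ODE itself
    G = np.zeros((n, 1))
    G[-1, 0] = 1.0                                # the input drives the highest derivative
    return F, G


# Hypothetical example: y'' + 3 y' + 2 y = u
F, G = phase_variable_form([2.0, 3.0])
```

Only the last row of (F) carries information about the particular equation; the rows above it just state that each state variable is the derivative of the previous one.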

In practice, equation (6) can be used with non-linear systems too. Although non-linear systems can't be represented exactly in a linear matrix equation, they present no problem in the general state space formulation given by (4). However matrix techniques are sufficiently attractive that non-linear systems are often linearized to allow the matrix formulation to be applied, approximately, to them as well <link???>.
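
One common way to linearize is to take the partial derivatives of the functions in (4) about an operating point. The sketch below does this numerically with central differences. The pendulum model, the constant 9.81, and the names `pendulum` and `linearize` are assumptions chosen for illustration, not something defined elsewhere on this page.

```python
import numpy as np


def pendulum(x, u):
    """Hypothetical non-linear system in the form of (4): x = [angle, rate]."""
    theta, omega = x
    return np.array([omega, -9.81 * np.sin(theta) + u[0]])


def linearize(f, x0, u0, eps=1e-6):
    """Approximate F = df/dx and G = df/du at (x0, u0) by central differences."""
    x0 = np.asarray(x0, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    n, m = x0.size, u0.size
    F = np.zeros((n, n))
    G = np.zeros((n, m))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        F[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m)
        du[j] = eps
        G[:, j] = (f(x0, u0 + du) - f(x0, u0 - du)) / (2 * eps)
    return F, G


# Linearize about the hanging rest position with no input
F, G = linearize(pendulum, [0.0, 0.0], [0.0])
```

The resulting (F) and (G) describe the system only near the chosen operating point, which is the approximation the text refers to.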

The previous example of a one dimensional accelerated mass can be put into matrix form like this

(7)\quad\
  \begin{pmatrix} p \cr v \end{pmatrix}' =\
  \begin{pmatrix} 0 & 1 \cr 0 & 0 \end{pmatrix}\
  \begin{pmatrix} p \cr v \end{pmatrix} +\
  \begin{pmatrix} 0 \cr 1 \end{pmatrix}\
  a

Expanding out equation (7) reproduces exactly the system given in (3).
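
The expansion can also be checked numerically. This is a small sketch using NumPy; the function name `state_derivative` and the sample values of (p), (v), and (a) are arbitrary choices for illustration.

```python
import numpy as np

F = np.array([[0.0, 1.0],
              [0.0, 0.0]])   # system matrix from (7)
G = np.array([[0.0],
              [1.0]])        # input matrix from (7)


def state_derivative(x, u):
    """Evaluate x' = F x + G u for column vectors x and u."""
    return F @ x + G @ u


x = np.array([[5.0], [3.0]])     # p = 5, v = 3
u = np.array([[2.0]])            # a = 2
x_dot = state_derivative(x, u)   # first row is p' = v, second row is v' = a
```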

In (7) the state variables are position (p) and velocity (v). It's worth noting that the choice of state variables is not unique. Clearly, any other set of state variables which can be solved for the original state variables will also work. As suggested by (6), any invertible linear combination of state variables will work. The situation is equivalent to having multiple basis sets for sub-spaces in linear algebra.
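
A change of state variables can be sketched for the point mass as follows. Here the new variables are z = T x for an invertible matrix T, so that z' = (T F T^-1) z + (T G) u. The particular T below, and the names `F_z` and `G_z`, are hypothetical choices for illustration.

```python
import numpy as np

F = np.array([[0.0, 1.0], [0.0, 0.0]])
G = np.array([[0.0], [1.0]])

# Hypothetical new state variables z = T x: z1 = p + v, z2 = p - v
T = np.array([[1.0, 1.0],
              [1.0, -1.0]])
T_inv = np.linalg.inv(T)

F_z = T @ F @ T_inv   # system matrix in the new variables
G_z = T @ G           # input matrix in the new variables

x = np.array([[5.0], [3.0]])
u = np.array([[2.0]])
z_dot_direct = F_z @ (T @ x) + G_z @ u   # derivative computed in z coordinates
z_dot_mapped = T @ (F @ x + G @ u)       # derivative computed in x, then mapped to z
```

Both routes give the same derivative, so either set of variables describes the same system.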

In complex situations, picking the state variables can be tricky. The first problem is finding a good system model. In a real-world system there are always many possible factors that influence the system. In some cases it is worth the extra burden to incorporate some of these factors into the system model. Doing so increases the development time and computational load, so often it's better to model some of the influences as process noise, or to just ignore them in the model. Likewise, some system inputs may have subtle relations to the system state. The interdependence could be modeled by state variables, but again, at increasing cost.

A set of state variables is theoretically minimal if together they are sufficient to describe every aspect of the system, but elimination of any variable, singly or in combination, leaves at least some of the system unobservable. Knowing this, it seems like picking good state variables means picking some orthogonal combination of spanning state variables, particularly if the variables chosen are easily related to available measurements. For linear (or linearized) systems this idea is "mostly true". However, it's still possible to imagine a linear system where some of the possible state variables are so sensitive to small variations that the accuracy of the system model will be low, while these same variables may be expressible as functions of other possible state variables which are relatively insensitive to small variations. In this case choosing the latter as state variables should result in more accurate system modeling.

Here are some other pathological situations where the choice of state variables may be difficult.


Discrete Time Formulations

For many purposes, writing the differential equations for a dynamical system is not sufficient. For state space equations the value of the state variables through time is usually desired.

Recall the linear form of the state space equations given in (6)

(8)\quad\
  \bf{x}' = \bf{F} \bf{x} + \bf{G} \bf{u}

Ignoring the external inputs and the fact that this is a matrix equation, equation (8) reduces to a linear homogeneous first order differential equation

(9)\quad\
  x' - F x = 0

The solution is well known (and can be verified by direct substitution)

(10)\quad\
  x[t] = e^{F t}\, x[0]

Since the matrix equation (8) is linear, it can be solved exactly the same way. We define the state transition matrix (Φ) to be the matrix exponential:

(11)\quad\
  \bf\Phi\rm[t] \equiv e^{\bf{F}\rm t}

And the derivative of (Φ) is

(12)\quad\
  \bf\Phi\rm'[t] = \bf{F}\rm e^{\bf{F}\rm t} = \bf{F \Phi}\rm[t]

Now we can write a solution to (8)

(13)\quad\
  \bf{x}\rm[t] =\
  \bf\Phi\rm[t] \bf{x}\rm[0] +\
    \int_0^t \bf\Phi\rm[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau

Where we have assumed that the system evolves beginning at time (t = 0)

The 1st term in (13) is the homogeneous solution to (8) exactly analogous to (10), the 2nd term is the particular solution. To motivate the 2nd term, consider that (F x) and (G u) in (8) have exactly the same influence on (x'). To find the current influence of a past input applied at time (τ), on the present time (t), take the effect at time (τ), namely (G[τ] u[τ]) and propagate it forward to time (t) by multiplying by (Φ[t-τ]). The cumulative influence of all inputs from time zero to (t) is just the integral given by the 2nd term. If this is too hand-wavy for you, set the particular solution equal to the 2nd term and take the derivative

(14)\quad\
  \begin{array}{lcl}\
    (\bf{x}\rm_{particular}[t])' =\
      (\int_0^t \bf\Phi\rm[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau)'\\\
    = \int_0^t \bf\Phi\rm'[t-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau\ +\
      \bf\Phi\rm[0] \bf{G}\rm[t] \bf{u}\rm[t]\\\
    = \bf{F}\rm \int_0^t \bf\Phi\rm[t-\tau]\
      \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau\ +\
      \bf{G}\rm[t] \bf{u}\rm[t]\\\
    = \bf{F}\rm \cdot \bf{x}\rm_{particular}[t] +\
      \bf{G}\rm[t] \bf{u}\rm[t]\
  \end{array}

Where the fact that (Φ[0]) is an identity matrix has been used. Since the integral in (14) involves time in the upper limit, the time derivative must be taken according to Leibniz' rule (see MathWorld or Wikipedia).
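
For readers who prefer a numerical check, the scalar analogue of (13) can be evaluated with a simple quadrature and compared against the closed form for a constant input. This is only a sketch: the function name `solution_13_scalar`, the midpoint rule, and the numbers used are assumptions for illustration.

```python
import math


def solution_13_scalar(f, g, u, x0, t, steps=2000):
    """Evaluate the scalar analogue of (13) using a midpoint-rule integral."""

    def phi(s):
        return math.exp(f * s)   # scalar state transition "matrix"

    dtau = t / steps
    integral = 0.0
    for k in range(steps):
        tau = (k + 0.5) * dtau
        integral += phi(t - tau) * g * u(tau) * dtau
    return phi(t) * x0 + integral


# Hypothetical numbers: x' = -2 x + 3 u with constant input u = 1, x[0] = 4, t = 1.5
x_num = solution_13_scalar(-2.0, 3.0, lambda tau: 1.0, x0=4.0, t=1.5)

# Closed form for constant u: e^{f t} x0 + (g u / f) (e^{f t} - 1)
x_exact = math.exp(-3.0) * 4.0 + (3.0 / -2.0) * (math.exp(-3.0) - 1.0)
```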

An equation for the discrete time evolution of the system can be derived from the continuous time solution (13).

For the time sequence { T0, T1, ... , Ti }, the discrete solution is

(15)\quad\
  \bf{x}\rm[T_i] =\
  \bf\Phi\rm[T_i] \bf{x}\rm[0] +\
    \int_0^{T_i} \bf\Phi\rm[T_i-\tau] \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau\

This can be written in an incremental form

(16)\quad\
  \bf{x}\rm_i =\
  \bf\Phi\rm_i \bf{x}\rm_{i-1} +\
    \int_{T_{i-1}}^{T_i} \bf\Phi\rm[T_i-\tau]\
    \bf{G}\rm[\tau] \bf{u}\rm[\tau]\, d\tau

Where (Φi) is the state transition matrix from time (T_{i-1}) to (T_i).

If in (16), we assume (u) is constant over a time step (T), that the time steps are all equal, and that (Φ) and (G) are constant, then we can write the simplified form

(17)\quad\
  \bf{x}\rm_{i} =\
  \bf\Phi\rm_{i} \bf{x}\rm_{i-1} +\
  \left( \int_0^T \bf\Phi[\rm T-\tau]\, d\tau \right) \bf G u\rm_{i}
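
Under the assumptions of (17), both the transition matrix and the integral can be computed from power series in (F). The sketch below does that with a truncated series. The function name `discretize`, the name `Gamma` for the integral multiplied by (G), the number of terms, and the first-order example are all assumptions made for illustration.

```python
import numpy as np


def discretize(F, G, T, terms=20):
    """Return (Phi, Gamma) so that x_i = Phi x_{i-1} + Gamma u_i, as in (17).

    Phi   = sum over k of F^k T^k / k!
    Gamma = (sum over k of F^k T^(k+1) / (k+1)!) G,
            which is the integral of Phi[T - tau] over one step, times G.
    """
    n = F.shape[0]
    Phi = np.zeros((n, n))
    integral = np.zeros((n, n))
    term = np.eye(n)                      # holds F^k T^k / k!
    for k in range(terms):
        Phi += term
        integral += term * T / (k + 1)    # F^k T^(k+1) / (k+1)!
        term = term @ F * T / (k + 1)     # advance to F^(k+1) T^(k+1) / (k+1)!
    return Phi, integral @ G


# Hypothetical first-order lag: x' = -2 x + 3 u, sampled every 0.1 time units
Phi, Gamma = discretize(np.array([[-2.0]]), np.array([[3.0]]), T=0.1)
```

A fixed number of series terms is adequate only when the entries of (F T) are small; that limit is a simplification of this sketch.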

To make this a little more clear, refer once more to the example equation of a one dimensional accelerated mass, which is repeated here

(18)\quad\
  \begin{pmatrix} p \cr v \end{pmatrix}' =\
  \begin{pmatrix} 0 & 1 \cr 0 & 0 \end{pmatrix}\
    \begin{pmatrix} p \cr v \end{pmatrix} +\
    \begin{pmatrix} 0 \cr 1 \end{pmatrix} a

This example satisfies all the assumptions of (17) provided acceleration (a) is constant over each time step. Since this is the usual assumption for a sampled data system, let's assume it here. Identify the system and input matrices as

(19)\quad\
\bf{F}\rm = \begin{pmatrix} 0 & 1 \cr 0 & 0 \end{pmatrix},\
  \bf{G}\rm = \begin{pmatrix} 0 \cr 1 \end{pmatrix}

To calculate (Φ) the matrix exponential referred to in (11) must be found. In analogy to the scalar case, the exponential can be found from the series

(20)\quad\
  \bf\Phi\rm[t] \equiv e^{\bf{F}\rm t} =\
  \bf{I}\rm + \bf{F}\rm t + \frac{\bf{F}\rm^2 t^2}{2} +\
    \frac{\bf{F}\rm^3 t^3}{6} +\ \ldots\ + \frac{\bf{F}\rm^k t^k}{k!} +\ \ldots\

In this case the exponential is easy to compute because (F^2) and all higher powers are the (2×2) zero matrix, therefore

(21)\quad\
  \bf\Phi\rm = \begin{pmatrix} 1 & \rm T \cr 0 & 1 \end{pmatrix}
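
The termination of the series can be confirmed numerically. The sketch below checks that the square of (F) is zero and compares the two-term result with a longer truncated series. The helper name `expm_series`, the number of terms, and the time step value are assumptions for illustration.

```python
import numpy as np

F = np.array([[0.0, 1.0],
              [0.0, 0.0]])
T = 0.5                              # hypothetical time step

F_squared = F @ F                    # zero matrix, so the series stops after two terms
Phi_exact = np.eye(2) + F * T        # I + F T, as in (21)


def expm_series(A, terms=20):
    """Approximate e^A by truncating the power series in (20)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # A^k / k!
        result = result + term
    return result


Phi_series = expm_series(F * T)      # agrees with Phi_exact for this F
```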

Now using (17) the discrete solution for the example can be written

(22)\quad\
\begin{array}{lcl}\
  \begin{pmatrix} p_i \cr v_i \end{pmatrix} &=&\
  \begin{pmatrix} 1 & \rm T \cr 0 & 1 \end{pmatrix}\
  \begin{pmatrix} p_{i-1} \cr v_{i-1} \end{pmatrix} +\
    \int_0^{\rm T} \begin{pmatrix} 1 & \rm T-\tau \cr 0 & 1 \end{pmatrix}\,\
    d\tau \cdot\
    \begin{pmatrix} 0 \cr 1 \end{pmatrix} a\\\
   &=&\
    \begin{pmatrix} 1 & \rm T \cr 0 & 1 \end{pmatrix}\
    \begin{pmatrix} p_{i-1} \cr\ v_{i-1} \end{pmatrix} +\
    \begin{pmatrix} \rm T & \rm T^2/2 \cr 0 & \rm T \end{pmatrix}\
    \begin{pmatrix} 0 \cr 1 \end{pmatrix} a
\end{array}

Or in scalar form

(23)\quad\
  \begin{array}{lcl}\
    p_i &=& p_{i-1} + \rm T \it v_{i-1} + a_i\ \rm T^2/2 \\\
    v_i &=& v_{i-1} + a_i\ \rm T\
  \end{array}
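
The scalar update in (23) translates directly into a short loop. In this sketch the function name `step`, the initial conditions, the acceleration, and the time step are hypothetical values chosen so the result is easy to compare with the familiar p = p0 + v0 t + a t^2/2.

```python
def step(p, v, a, T):
    """One update of (23), assuming a is constant over the step of length T."""
    p_next = p + T * v + a * T**2 / 2
    v_next = v + a * T
    return p_next, v_next


p, v = 0.0, 1.0          # hypothetical initial position and velocity
a, T = 2.0, 0.1          # hypothetical constant acceleration and time step
for _ in range(10):      # advance to t = 1.0
    p, v = step(p, v, a, T)
```

Because the acceleration really is constant over each step here, the discrete update introduces no discretization error; only floating point rounding remains.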


Other System Representations

As suggested above, state space representations are more or less directly related to the differential equations describing the dynamic system. In traditional control theory, as opposed to "modern" state space methods, transfer functions are often used. Since either description refers to the same system, state space representations can be transformed into transfer functions and vice versa, subject to some limitations.

The transfer function can be found from a state space representation fairly easily.

Recall the linear form of the state space representation (6)

(24)\quad\
  \bf{x}' = \bf{F} \bf{x} + \bf{G} \bf{u}

Let the system outputs be (y[t]), and introduce the output matrix (H) such that

(25)\quad\
  \bf{y}\rm[t] = \bf{H}\rm[t] \bf{x}\rm[t]

Take the Laplace transform of (24)

(26)\quad\
  \rm s\ \bf{X}\rm[s]-\bf{x}\rm[0] = \bf{F}\ \bf{X}\rm[s] + \bf{G}\ \bf{U}\rm[s]\

Solve for (X[s])

(27)\quad\
  \bf{X}\rm[s] =\
  (\bf{I}\rm\ s -\bf{F}\rm)^{-1} (\bf{G} \bf{U}\rm[s] + \bf{x}\rm[0])

If it is assumed that the initial conditions are zero, multiply by (H) and divide by (U) to get the transfer function from the state space representation

(28)\quad\
  \bf{Y}\rm[s]\bf{U}\rm^{-1}[s] =\
  \bf{H}\rm(\bf{I}\rm\ s -\bf{F}\rm)^{-1} \bf{G}
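
As a sketch, (28) can be evaluated numerically at a chosen value of (s). The example reuses the point mass of (7) and assumes that the only output is position, so the output matrix H = (1 0) is an assumption, as is the function name `transfer_function`.

```python
import numpy as np


def transfer_function(F, G, H, s):
    """Evaluate H (I s - F)^{-1} G from (28) at a single value of s."""
    n = F.shape[0]
    # Solving the linear system avoids forming the inverse explicitly.
    return H @ np.linalg.solve(s * np.eye(n) - F, G)


F = np.array([[0.0, 1.0], [0.0, 0.0]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])      # assumed output: position only

value = transfer_function(F, G, H, s=2.0)   # compare with 1/s^2 for a double integrator
```

With these matrices the result matches 1/s^2, the transfer function from acceleration to position.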

Comparing equation (27) to (13) it can be seen that the role of the state transition matrix (Φ) in (27) is performed by the first factor, that is

(29)\quad\
  \bf\Phi\rm[s] = (\bf{I}\rm\ s -\bf{F}\rm)^{-1}

So as an alternative method of computation, (Φ[t]) can be found from the inverse Laplace transform

(30)\quad\
  \bf\Phi\rm[t] = \mathcal{L}^{-1}[(\bf{I}\rm\ s -\bf{F}\rm)^{-1}]
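
As a worked check, applying (30) to the point mass system matrix from (19) should reproduce the transition matrix found from the series. The steps below use only the standard transform pairs 1/s ↔ 1 and 1/s^2 ↔ t.

```latex
(\bf{I}\rm\ s - \bf{F}\rm)^{-1}
  = \begin{pmatrix} s & -1 \cr 0 & s \end{pmatrix}^{-1}
  = \begin{pmatrix} 1/s & 1/s^2 \cr 0 & 1/s \end{pmatrix}

\bf\Phi\rm[t]
  = \mathcal{L}^{-1}\left[
      \begin{pmatrix} 1/s & 1/s^2 \cr 0 & 1/s \end{pmatrix}
    \right]
  = \begin{pmatrix} 1 & t \cr 0 & 1 \end{pmatrix}
```

Setting (t = T) gives the same matrix as (21).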