I’ve been spending a lot of time working with inertial measurement units recently and am discovering the surprising amount of mathematics that goes into using data from accelerometers and gyroscopes to get the orientation of an object in 3D space. The story begins with me trying to integrate an angular velocity vector (in 3D) to get the orientation of an object. Angular velocity is a vector but common representations of orientation (like Euler Angles) are not. So getting the orientation is *not* as simple as doing $\int_0^t \vec{\omega} \mathscr{d}t$.

Rotations in two dimensions are really easy because there is only one plane in which you can do the rotation. The only thing you can change is the center of rotation. The orientation of an object in 2 dimensions can be given by a single angle $\phi$ measured relative to the origin of the coordinate system. The angular velocity of an object in 2 dimensions can also be represented by a single number $\omega$ which is defined as the rate of change of the angle $\phi$. This gives us the basic formula we learn in high school physics. $$\omega = \frac{d\phi}{dt} = \frac{v_{\bot}}{r}$$

Rotations in 3D are much more complicated. In 3 dimensions, the plane of rotation can be any among an infinite number of possibilities. The origin or “center” of rotation now becomes an *axis* of rotation which is represented by a vector. The plane of rotation is perpendicular to the axis of rotation. This is a result of the Euler Rotation Theorem

Before I can start integrating angular velocity I need to choose a representation for the orientation. There are several different conventions when it comes to representing the orientation of an object in 3D. The most popular approach is to use a set of three angles called Euler Angles. This approach was developed by the mathematician Leonhard Euler (pronounced oy-ler not you-ler). These set of three angles specify three successive rotations around three different axes in a very specific order. The order is important because rotations in 3D have an important property: they do not *commute*. This means that if I do rotation 1 before 2 I end up with a different orientation than if I do 2 before 1. Euler angles usually specify rotations in the ZXZ order. This means that the first axis of rotation is the current Z-axis. The second axis of rotation is the new X-axis *after* the first rotation. The final axis of rotation is the new Z-axis after the first and second rotations. A closely related approach is to use what’s called Tait-Bryan angles. Tait-Bryan angles are called yaw, pitch and roll and are commonly used when talking about the orientation of aircraft. The angles represent successive rotations around the body fixed X, Y and Z axes. Both Euler angles and Tait-Bryan angles however, suffer from something called Gimbal Lock. This is when for a certain value of one of the rotation angles, a degree of freedom is lost and two rotations “collapse” into a single rotation. The Wikipedia page linked has some nice visualizations and a short mathematical explanation (using rotation matrices) for why this happens.

Using Quaternions to represent rotations is a way to avoid the Gimbal Lock problem. Quaternions are so useful for representing orientations that most Kalman Filters that need to track 3D orientations use them instead of Euler Angles. So I settled on using quaternions. When I first started working with quaternions I found them a little difficult to understand. So I thought of writing an article about the path I took to understand and use quaternions for “integrating” angular velocity.

From all my time working with mathematics, I’ve come to realize that mathematical ideas exist on a spectrum. At one end is math that is focused on the practical application. Math that is very close to the real world (as experienced by humans) and can be understood easily by making analogies to things that humans encounter in everyday life. This is the type of math that is often applied to solve everyday problems like settling bills or building bridges. At the other end of the spectrum is the type of math that is pure abstraction. This type of abstract math is often used to generalize specific results in practical math to other problems or to come up with new insights that can pave the way for new solutions to a problem. Quaternions are a little more towards the abstract end of the spectrum and can be difficult to get an intuition for. Sometimes though there are ideas that you *can’t* get an intuition for. Working with such math is mostly a matter of getting used to it. With abstract math, the best way I’ve found to get used to it is to learn the basic definition and keep applying it to problems until you get the hang of it. The understanding forms in the brain automatically as you get the hang of it.

## Quaternion Math

So without further ado, I’ll talk about what quaternions are and how they behave.

### Representation

A quaternion $q$ can be represented as a tuple of 4 numbers: $$q = \begin{bmatrix} w & \ x &\ y & \ z \end{bmatrix} = \begin{bmatrix} w & \ \vec{v}\end{bmatrix} = w + x\mathrm{i} + y\mathrm{j} + z\mathrm{k} $$

where the $w$ is the *scalar* part and the $\vec{v}$ is the *vector* part.

Two binary operations are defined for quaternions: addition $+$ and quaternion multiplication $\otimes$.

### Addition

Addition is defined as the component-wise sum just like for a 4D vector. The sum is commutative (order is not important) and associative (grouping is not important).

$$ q_1 + q_2 = \begin{bmatrix} w_1 + w_2 & \ x_1 + x_2 & \ y_1 + y_2 & \ z_1 + z_2 \end{bmatrix} = q_2 + q_1$$

### Multiplication

Quaternion multiplication is defined in multiple ways but the formula that I find the easiest to remember is: $$ q_1 \otimes q_2 = \begin{bmatrix} w_1 w_2 - \vec{v}_1\cdot\vec{v}_2 & \ w_1\vec{v}_2 + w_2\vec{v}_1 + \vec{v}_1\times\vec{v}_2 \end{bmatrix} $$

The multiplication is *non-commutative* ($q_1 \otimes q_2 \neq q_2 \otimes q_1$ ) and distributive over the sum $(q_1 \otimes (q_2 + q_3) = q_1 \otimes q_2 + q_1 \otimes q_3 )$.

### Norm, Conjugate and Inverse

It’s also useful to define the norm of a quaternion as: $$ ||q|| = \sqrt{w^2 + x^2 + y^2 + z^2} $$

a conjugate quaternion $ q^* $ that satisfies the property $ q \otimes q^* = ||q||^2 $ as $$ q^* = \begin{bmatrix} w & \ -x & \ -y & \ -z \end{bmatrix} $$

and a quaternion inverse $q^{-1}$ with the property $q \otimes q^{-1} = \begin{bmatrix} 1 \ 0 \ 0 \ 0 \end{bmatrix}$ as $$ q^{-1} = \frac{q^*}{||q||^2} $$

## Using Quaternions for Rotation

Now that the behaviour of quaternions are established, there is the question of how to use them to represent 3D rotation. From Euler’s Rotation Theorem it is clear that rotations have 3 degrees of freedom. But quaternions as 4 tuples have 4 degrees of freedom. So an additional constraint needs to be imposed to use them to represent rotations. This is done by requiring that the quaternions are *unit* quaternions: $||q|| = 1$. Unit quaternions are also called *versors*. There are many diagrams and visualizations that attempt to make understanding the unit quaternion more intuitive but I found that for me, most were misleading. I found it the most useful to think of quaternions as just an abstract object with the properties that I’ve mentioned and *not* trying to have any picture in mind. The moment I left behind the crutch of visualization and forced myself to accept and think about quaternions as they are, everything fell into place.

To rotate a 3D vector $\vec{r}$ by a versor $q$ an operation called *conjugation* is used.
$$ \vec{r}’ = q \otimes \begin{bmatrix} 0 & \ \vec{r} \end{bmatrix} \otimes q^{*} $$

If I find the formula quite mysterious (as I did when I first saw it), it’s helpful to use one of the other methods (euler angles, axis angle) to rotate a vector and verify that the result is the same.

It’s possible to compose two rotations represented by two quaternions $q_1$ and $q_2$ by multiplying the two quaternions together $q_2 \otimes q_1$. This product represents the rotation $q_1$ applied before $q_2$.

### Quaternions, Rotation Matrices and the Rotation Group $SO(3)$

Earlier, I mentioned that rotations in 3D have certain properties (like non-commutativity) that implies that they don’t belong to a vector space (and therefore can’t be represented by one). If I want to be exact when talking about rotations, I have to consider them as a *group*. A group is yet another mathematical abstraction. Abstractly, think of them as a set of objects that follow certain rules. Remember that groups are another concept that leans towards the abstract end of the mathematical spectrum. They *can* be applied to rotations but they were invented by mathematicians to study more general ideas. So what rules does a group follow?

A group is a set $G$ along with some *binary* operation $\cdot$ (takes two elements of the set and gives a third). The elements of the set follow these rules:

- Closure: If $a,b \in G$ then $a\cdot b \in G$.
- Associativity: If $a,b,c \in G$ then $a\cdot ( b \cdot c) = (a \cdot b) \cdot c $.
- Identity: There’s some element of the set $e$ which has the property $a \cdot e = e\cdot a = a$.
- Inverse: If $a \in G$ there’s some element $a^* $ such that $a \cdot a^* = a^* \cdot a = e$.

If you compare this to 3D rotations, you can see that the set of 3D rotations are an example of a group under the binary operation of composition (doing one rotation after another)!

- Closure: If I compose two rotations it forms another rotation. It doesn’t suddenly become a translation or scaling or shear. There’s no way to combine rotations to do any of these other operations.
- Associativity: If I do three rotations, it doesn’t matter which two I compose first.
- Identity: There’s an identity rotation ($0^\circ$ of rotation around any axis).
- And there’s an inverse rotation (rotating by negative of the angle around the same axis.)

Quaternions under multiplications *also* satisfy all these properties.

If I convert Euler angle rotations to rotation matrices and compare them with quaternions, the parallels between them are very clear.

Group Property | Rotation Matrix $R_i \in G$ | Quaternion $q_i \in \mathbb{H}_R$ |
---|---|---|

Closure | $R_1 \cdot R_2 \in G$ | $q_1 \otimes q_2 \in \mathbb{H}_R$ |

Associativity | $R_1 \cdot (R_2 \cdot R_3) = (R_1 \cdot R_2) \cdot R_3$ | $q_1 \otimes (q_2 \otimes q_3) = (q_1 \otimes q_2) \otimes q_3$ |

Identity | $I$ | $1$ |

Inverse | $R_i^{-1} = R_i^{T} $ | $q_i^{-1}$ |

The closure and associativity properties rotation matrices can be easily seen as a consequence of the fact that rotation matrices are orthogonal matrices.

With this knowledge of the rules that rotations follow, it’s clear why it’s silly to think of the Euler angles as a vector. They’re a set of related numbers but vectors they are not! It’s also clear why integrating the angular velocity vector over time does not directly give the orientation in an easily usable form.

### Representing quaternion rotation as a matrix

Linearity of operations are an important property that both engineers and mathematicians like to take advantage of. It can result in useful simplifications of results. So it’s useful (later in this article) to ask the question: Is quaternion multiplication a linear operation? The simplest way to find out is to write out the result of a general quaternion multiplication operation and check if it’s possible to represent it as a matrix multiplication.

The symbolic result of multiplying two quaternions $q_1$ and $q_2$ is: $$ q_1 \otimes q_2 = \left[\begin{matrix}w_1 w_2 - x_1 x_2 - y_1 y_2 - z_1 z_2\\ w_1 x_2 + w_2 x_1 + y_1 z_2 - y_2 z_1\\ w_1 y_2 + w_2 y_1 - x_1 z_2 + x_2 z_1\\ w_1 z_2 + w_2 z_1 + x_1 y_2 - x_2 y_1\end{matrix}\right] $$

From inspection, this can be written as a matrix product: $$ q_1 \otimes q_2 = \left[\begin{matrix}w_2 & -x_2 & -y_2 & -z_2\\ x_2 & w_2 & z_2 & - y_2\\ y_2 & - z_2 & w_2 & x_2\\ z_2 & y_2 & - x_2 & w_2\end{matrix}\right] \begin{bmatrix} w_1 \\ x_1 \\ y_1 \\ z_1 \end{bmatrix} $$

And if I change the order of multiplication: $$ q_2 \otimes q_1 = \left[\begin{matrix}w_2 & - x_2 & - y_2 & - z_2 \\ x_2 & w_2 & - z_2 & y_2 \\ y_2 & z_2 & w_2 & - x_2 \\ z_2 & - y_2 & x_2 & w_2\end{matrix}\right] \begin{bmatrix} w_1 \\ x_1 \\ y_1 \\ z_1 \end{bmatrix} $$

## Integrating Angular Velocity

To properly integrate angular velocity to get a quaternion, I need to find a relationship a quaternion and the angular velocity - or more precisely - a differential equation that relates the time derivative of the quaternion $\dot{q}$ and the angular velocity vector $\vec{\omega}$.

A natural place to start is the original definition of the angular velocity from physical law. If I imagine a vector of constant $\vec{s}(t)$ length stretching out from the origin undergoing rotation with the instantaneous angular velocity $\vec{\omega}(t)$ I can find the velocity at the tip of this vector by taking it’s derivative. $$ \frac{\mathrm{d}\vec{s}}{\mathrm{d}t} = \vec{\omega} \times \vec{s} $$

Since the angular velocity is perpendicular to the vector $\vec{s}$ (driving their dot product $\vec{\omega} \cdot \vec{s}$ to zero) the equation can also be written in quaternion form as: $$ \frac{\mathrm{d}\vec{s}}{\mathrm{d}t} = \vec{\omega} \otimes \vec{s} $$

Now I imagine that this instantaneous vector is represented by a quaternion $q$ rotation from a constant vector $\vec{s}_0$. $$\begin{align} \vec{s} &= q \otimes \vec{s}_0 \otimes q^* \\ \frac{\mathrm{d}\vec{s}}{\mathrm{d}t} &= \frac{\mathrm{d}}{\mathrm{d}t} \left[ q \otimes \vec{s}_0 \otimes q^* \right] \end{align}$$

So how do I take that nasty looking derivative on the side? Well it turns out that the product rule of derivatives that is valid in basic calculus is also perfectly valid for quaternion multiplication! I converted the quaternion product into a matrix multiplication and spent some time converting the derivatives of the product to the result from applying the product rule. The only thing to watch out for is the non commutativity of the multiplication. The order of the quaternion product shouldn’t be changed when applying the product rule.

$$ \frac{\mathrm{d}}{\mathrm{d}t} \left[ q \otimes \vec{s}_0 \otimes q^* \right] = \dot{q} \otimes \vec{s}_0 \otimes q^* + q \otimes \vec{s}_0 \otimes \dot{q^*} $$

There’s a term in this equation - the derivative of the conjugate - that’ll cause some trouble. It is possible to eliminate this using other methods but the simplest way is to directly find a relationship between $\dot{q}$ and $\dot{q^*}$. The easy way to do this is to take the derivative of the product of a quaternion and it’s conjugate which we know to be 1:

$$\begin{align} \frac{\mathrm{d}}{\mathrm{d}t} (q \otimes q^* ) &= \frac{\mathrm{d}}{\mathrm{d}t} 1 \\ \dot{q} \otimes q^* + q \otimes \dot{q^*} &= 0 \end{align} $$

That gives the relationship

$$ \dot{q^* } = -q^* \otimes \dot{q} \otimes q^* $$

Expressing $s_0$ in terms of $s$, substituting $\dot{q^* } $ and doing a little algebra: $$ \frac{\mathrm{d}}{\mathrm{d}t} \left[ q \otimes \vec{s}_0 \otimes q^* \right] = \dot{q} \otimes q^* \otimes \vec{s} - \vec{s} \otimes \dot{q} \otimes q^* = \vec{\omega} \otimes \vec{s} $$

### The Quaternion Commutator

The expression $ \dot{q} \otimes q^* \otimes \vec{s} - \vec{s} \otimes \dot{q} \otimes q^* $ is in the form $p \otimes q - q \otimes p$ which is defined as a *commutator* operation written as $[p, q]$. Going through the algebra of this operation and simplifying:

$$ [q_1, q_2] = \begin{bmatrix} 0 \ 2(\vec{v}_1 \times \vec{v}_2) \end{bmatrix} $$

Interestingly, if both $q_1$ and $q_2$ are *pure quaternions* (They do not have a scalar part) then the quaternion commutator and the product are related:

$$[\vec{q}_1, \vec{q}_2] = 2(q_1 \times q_2) = 2(q_1 \otimes q_2) $$

### The Product $\dot{q} q^* $ is a Pure Quaternion

$$\begin{align} \dot{q} \otimes q^* + q \otimes \dot{q^* } &= 0 \\ \dot{q} \otimes q^* &= -(\dot{q} \otimes q^* )^* \end{align} $$

Saying that a $q = -q^* $ is the same as saying that $w = -w$ which means that the scalar part of the quaternion is zero.

### The Differential Equation

So finally, I can extract the quaternion differential equation:

$$ \begin{align} \dot{q} \otimes q^* \otimes \vec{s} - \vec{s} \otimes \dot{q} \otimes q^* &= \vec{\omega} \otimes \vec{s} \\ \left[\dot{q} \otimes q^* , \vec{s}\right] &= \vec{\omega} \otimes \vec{s} \\ 2\dot{q} \otimes q^* \otimes \vec{s} &= \vec{\omega} \otimes \vec{s} \\ \dot{q} &= \frac{1}{2} \vec{\omega} \otimes q \end{align} $$

In this equation the $\vec{\omega}$ is the angular velocity in the global fixed frame. In many situations it’s more useful to have an equation in terms of the angular velocity as measured by a reference frame fixed to the moving body - like when it is measured using a gyrscope. The angular velocity in this frame is the global angular velocity rotated into the body frame $\vec{\omega}’ = q^* \otimes\vec{\omega}\otimes q $. Replacing $\vec{\omega}$ gives the more useful differential equation:

$$ \dot{q} = \frac{1}{2} q \otimes \vec{\omega} $$

### Integration

To solve this differential equation is to be able to integrate it. Normal differential equations are difficult enough. How does one solve a differential equation with a *quaternion multiplication* in it? This is where the linearity of the quaternion multiplication becomes very useful. I am also going to make a (reasonable) assumption - that the angular velocity is constant over a time $\Delta t$. Then I can rewrite the differential equation in a well known form:

$$ \begin{bmatrix} \dot{w} \\ \dot{x} \\ \dot{y} \\ \dot{z} \end{bmatrix} = \left[\begin{matrix}0 & - \omega_x & - \omega_y & - \omega_z \\ \omega_x & 0 & \omega_z & - \omega_y \\ \omega_y & - \omega_z & 0 & \omega_x \\ \omega_z & \omega_y & - \omega_x & 0 \end{matrix}\right] \cdot \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} $$

This is an ODE in the form $\dot{q} = Aq$ where A is the big matrix. The solution to this differential equation then is:

$$ q(t) = e^{A(t - t_0 )} q_0 $$

### Quaternion Exponential

Interestingly, if I define the quaternion exponential in the same way as the matrix exponential (using its Taylor Series representation), I get a quaternion equivalent formula.

$$ \begin{align} \exp{(q)} &= e^{w}e^{\vec{v}} \\ &= e^{w}\left(\sum_0^\infty \frac{\vec{v}^k}{k!}\right) \\ &= e^{w}\left(\cos{|\vec{v}|} + \frac{\vec{v}}{|\vec{v}|} \sin{|\vec{v}|}\right) \end{align} $$

Using the quaternion exponential, the solution to the differential equation can be expressed in quaternion form as:

$$ q(t) = \exp{\left(\frac{1}{2}\vec{\omega}\Delta t\right)} \otimes q_0 $$

## Summing Up

I started out with the simple sounding task of integrating angular velocity and in trying to solve it, traversed through a several different areas of mathematics and learned a lot along the way before coming to the final solution. However, my description here is far from complete. The equation above only holds if the angular velocity is constant over a time period. This means its a “first order” model. Dropping this assumption gives $n^{th}$ order models for integration. There are also intricate details that I’m only beginning to understand. For instance, I’m reading about how rotations are a special type of group called Lie Groups where the group is also a differentiable manifold (yet another interesting abstract mathematical object). The space of angular velocity forms what is called a Lie Algebra on the group. And the quaternion exponential function which most texts I refer to seem to pull out of thin air is actually related to a more general idea called an exponential map which maps general Lie Algebras to Lie Groups.

Despite it’s incompleteness however, it is the minimum that I needed to understand to be able to actually implement in code the integration of a quaternion - i.e use quaternions practically for integrating data coming in from an inertial measurement unit. I hope that others trying to understand quaternions and their role in representing 3D rotations will find this article useful. Some of the references below go deeper into the nature of quaternions and how to use them for useful things like tracking orientations.

## References

- Boyle, Michael. “The integration of angular velocity.” Advances in Applied Clifford Algebras (2016): 1-30.
- https://en.wikipedia.org/wiki/Rotation_group_SO(3)
- https://en.wikipedia.org/wiki/Quaternion
- Sola, Joan. “Quaternion kinematics for the error-state KF.” (2015).
- https://math.stackexchange.com/questions/1030737/exponential-function-of-quaternion-derivation