Introduction
If you ever worked on a project involving 3D rotations, you are probably familiar with the Rodrigues transformation that lets you convert between the axis-angle and the \(3\times3\) matrix representations of a rotation operation. In this post, I would like to discuss a numerical issue that comes up when dealing with the derivatives of its inverse transformation.
Jacobians
We can think about the Rodrigues transformation as a map \( \mathcal{T} \) taking rotation vectors \( \boldsymbol{\omega} \) to rotation matrices \( \boldsymbol{R} \). This map is between 3-dimensional objects and 9-dimensional ones and can be written in components as \( \boldsymbol{R}_{ij} = \mathcal{T}_{ij}(\boldsymbol{\omega})\). Since changes in rotation vectors propagate smoothly to changes in the corresponding rotation matrices, we can talk about the Jacobian \( \frac{\partial\mathcal{T}}{\partial\boldsymbol{\omega}} \) of \( \mathcal{T} \) that describes how a small change \( \delta\boldsymbol{\omega} \) in the former becomes a small change in the latter via \( \delta\boldsymbol{R}_{ij} = \sum_{k}\frac{\partial\mathcal{T}_{ij}}{\partial\boldsymbol{\omega}_{k}} \delta\boldsymbol{\omega_{k}} \). In the language of Lie theory, \( \mathcal{T} \) is the exponential map of the \( SO(3) \) group.
The map \( \mathcal{T} \) is, of course, invertible because we can associate an axis and an angle of rotation to every rotation matrix. Therefore, the map \( \mathcal{T}^{-1}: SO(3) \rightarrow \mathbb{R}^3 \) is well defined as it is called the logarithm map. We now may ask a similar question to the one before: how do changes in the rotations matrix propagate through \( \mathcal{T}^{-1}\) to changes in rotation vectors? This one is actually tricky! The problem with defining the analogous jacobian \( \frac{\partial\mathcal{T}^{-1}}{\partial\boldsymbol{R}} \) is that in calculating the partial derivatives, we let one coordinate vary while fixing all the rest. However, the rotation matrices \( \boldsymbol{R} \) are subject to the constraint \( \boldsymbol{R}^T \boldsymbol{R} = \boldsymbol{I} \) that simply doesn’t permit that kind of variation! Nevertheless, the analytical expression for \( \mathcal{T}^{-1} \) is still well-defined even if we treat the components of \( \boldsymbol{R} \) as independent so we can indeed obtain an expression for the Jacobian.
The famous OpenCV library provides the implementation of both \( \mathcal{T} \) and \( \mathcal{T}^{-1} \), which we are rightfully going to call Exp and Log from now on, via the function cv2.Rodrigues(). In Python, it would look something like this:
import cv2
import numpy as np
from math import sin, cos, radians
angle = 180
# The exponential map and its jacobian
w = radians(angle) * np.asarray([0, 0, 1])
R, dRdw = cv2.Rodrigues(w)
# The logarithm map and its jacobian
c = cos(radians(angle))
s = sin(radians(angle))
R = np.asarray([[c, -s, 0], [s, c, 0], [0, 0, 1]])
w, dwdR = cv2.Rodrigues(R)
A a minor remark, note that dwdR will be of shape \( 3 \times 9\), where the 9 stands for the row-wise concatenation of the entries of \( R \), namely \( R_{11}, R_{12}, R_{13}, R_{21}, R_{22}, R_{23}, R_{31}, R_{32}, R_{33} \). Similarly, dRdw will be of shape \( 9 \times 3\).
Singularities
The fact that the jacobian of the logarithmic map was born in sin has, in fact, left a mark. What happens when the rotation angle approaches 180? If you run the code above with angle = 180, you will get the following result for dwdR:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
OK, but what happens if we change the angle to 179.99? Surprisingly, the result is
[[ 0.00000000e+00 0.00000000e+00 -5.15662013e+07]
[ 0.00000000e+00 0.00000000e+00 -8.99950005e+03]
[ 0.00000000e+00 8.99950005e+03 0.00000000e+00]
[ 0.00000000e+00 0.00000000e+00 8.99950005e+03]
[ 0.00000000e+00 0.00000000e+00 -5.15662013e+07]
[-8.99950005e+03 0.00000000e+00 0.00000000e+00]
[ 0.00000000e+00 -8.99950005e+03 0.00000000e+00]
[ 8.99950005e+03 0.00000000e+00 0.00000000e+00]
[ 0.00000000e+00 0.00000000e+00 -5.15662013e+07]]
This clearly seems wrong! How can a tiny change in the rotation vector produce such a big change in the Jacobian? The mistake lies, in fact, in the previous result. When the rotation angle is 180 degrees, the Jacobian actually diverges, so the result should have been a matrix of NaN-s of Inf-s instead of zeros!
To understand why this is so, let’s think about what happens as we fix the rotation axis \( \boldsymbol{n} \) and let the angle \( \theta \) approach 180°. If it is slightly below 180°, the logarithm will return a vector pointing in the direction \( \boldsymbol{n} \) with magnitude \( \theta \). However, the moment \( \theta \) goes above 180°, the returned vector will flip its direction to \( -\boldsymbol{n} \) and its magnitude will be \( 2\pi – \theta \)! Therefore, the logarithm is discontinuous at matrices representing rotations by 180°, making its Jacobian explode.
To see this mathematically, look at the formulas for \( \theta \) and \( \boldsymbol{n} \):
$$ \DeclareMathOperator{\tr}{tr}\cos(\theta) = \frac{\tr(\boldsymbol{R})-1}{2}$$
$$ \boldsymbol{n} = \frac{1}{2\sin(\theta)} \begin{pmatrix} R_{32}-R_{23} \\ R_{13}-R_{31} \\ R_{21}-R_{12} \end{pmatrix}$$
When \( \theta \) approaches 180°, the matrix \(\boldsymbol{R}\) approaches its own inverse, meaning \( \boldsymbol{R} \approx \boldsymbol{R}^T \). This makes the expression for \( \boldsymbol{n} \) to be of the form “\( \frac{0}{0} \)” that, although not exploding, produces all the discontinuous mess in \( \boldsymbol{n} \). However, the moment we allow ourselves to differentiate with respect to the components of \(\boldsymbol{R}\) by treating them as independent, we expose the singularity at the denominator making the expression divergent.
Conclusion
The discussed issue might cause problems when optimizing over the space of rotations. For example, in robotic navigation, one may wish to minimize with respect to some \( \boldsymbol{\omega} \)-s expressions involving sums of \( \DeclareMathOperator{\Log}{Log} \DeclareMathOperator{\Exp}{Exp} \| \Log(\boldsymbol{P} \Exp(\boldsymbol{\omega}) \boldsymbol{Q}) \|_2 \). It is tempting to try to derive the Jacobian of the expression above using the chain rule, which will involve using the Jacobian of the logarithm (as done here, for example). This approach, however, might run into numerical difficulties if the rotation angle of \( \boldsymbol{P} \boldsymbol{Q} \) approaches 180°. I believe a better approach is to do “by-the-book” Lie theory calculations, as beautifully explained here.