Change Coordinates!

March 11, 2010 at 12:45 (math.SG, physics.class-ph)


Hello, I am Greg Graviton. This is our new blog, and this is my first post, which is about the usefulness of changes of coordinates.

My goal is to write about Hamiltonian mechanics and symplectic geometry, but we need some prerequisites in differential geometry for that. Hence, I’m planning a series of introductory posts, beginning with this one, which tackles the key idea of change of coordinates and coordinate independence.

This is a very general and useful concept to have in your toolbox; for instance, it subsumes eigenvalues and it’s the very foundation of differential geometry.

Now, I’m not going to write a standard textbook introduction to differential geometry here. Rather, I would like to present it as a small mathematical gem: lightweight and delightful, with a focus on examples and key ideas instead of systematic textbook development, for the latter all too often obscures the former.

Solving ordinary differential equations (ODEs)

Two masses connected by a spring

As a first example, consider two masses connected by a spring. For simplicity, we assume that the masses are equal, m_1=m_2=m, and that they can only move in the horizontal direction. Denote the positions of the masses by x_1 and x_2.


By Newton’s laws, the equations of motion are

m\,\ddot{x}_1 = F_{12}  = k (x_2 - x_1 - L) \\ m\,\ddot{x}_2 = -F_{12} = - k (x_2 - x_1 - L)

where k is the spring constant, L is the rest length of the spring, and F_{12} is the force that the second mass exerts on the first mass via the spring.

So, now, uh, how do we solve this? I mean, single equations like m\ddot{x}=g are bad enough when the right hand side is too complicated. But what about systems like this, where the equations are coupled, i.e. where x_1 appears in the right hand side of \ddot{x}_2 and vice versa?

The key idea is that x_1 and x_2 are not the right quantities to talk about; we should perform a change of coordinates to quantities that greatly simplify the equations of motion. For instance, let us consider

\begin{array}{rcll} x &=& \frac12 (x_1+x_2) & \text{ the \emph{center of mass}, and} \\ d &=& (x_2-x_1) & \text{ the \emph{relative displacement}} \end{array}

Clearly, the coordinates x and d are just as good as the coordinates x_1 and x_2 to describe the motion of the two masses, because we can express the latter in terms of the former

x_1 = x-\frac 12 d \\ x_2 = x+\frac 12 d

But more importantly, these new quantities now fulfill the differential equations

m \ddot{x} = 0 \\ m \ddot{d} = - 2k (d - L)

which are no longer coupled and can be solved independently! For instance, the solution to the first is clearly x=vt+x_0, which just means that the center of mass moves with constant velocity. You can calculate d yourself, and then recover x_1 and x_2 from x and d.
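If you would like to check your result for d, here is a minimal sketch in Python. It uses the fact that the equation for d is a harmonic oscillator around d=L with angular frequency \sqrt{2k/m}, transforms back to x_1 and x_2, and compares against a direct numerical integration of the original coupled equations. The parameter values, the initial conditions and the function names exact and simulate are made up for illustration; the integrator is a plain velocity Verlet scheme, just one reasonable choice among many.

```python
import math

# Hypothetical parameters and initial data, chosen only for illustration.
m, k, L = 1.0, 3.0, 2.0
x1_0, x2_0 = 0.0, 2.5      # initial positions
v1_0, v2_0 = 0.2, -0.1     # initial velocities

def exact(t):
    """Solve in the decoupled coordinates x, d and transform back."""
    # Center of mass: m x'' = 0, so x moves with constant velocity.
    x = 0.5 * (x1_0 + x2_0) + 0.5 * (v1_0 + v2_0) * t
    # Relative displacement: m d'' = -2k (d - L), a harmonic oscillator around L.
    omega = math.sqrt(2.0 * k / m)
    d0, w0 = x2_0 - x1_0, v2_0 - v1_0
    d = L + (d0 - L) * math.cos(omega * t) + (w0 / omega) * math.sin(omega * t)
    return x - 0.5 * d, x + 0.5 * d

def simulate(t_end, steps=100_000):
    """Integrate the original coupled equations with velocity Verlet."""
    dt = t_end / steps
    x1, x2, v1, v2 = x1_0, x2_0, v1_0, v2_0
    f = k * (x2 - x1 - L)          # force on mass 1; mass 2 feels -f
    for _ in range(steps):
        v1 += 0.5 * dt * f / m
        v2 -= 0.5 * dt * f / m
        x1 += dt * v1
        x2 += dt * v2
        f = k * (x2 - x1 - L)
        v1 += 0.5 * dt * f / m
        v2 -= 0.5 * dt * f / m
    return x1, x2

exact_x1, exact_x2 = exact(5.0)
num_x1, num_x2 = simulate(5.0)
# Differences between the two approaches; both should be very small.
print(abs(exact_x1 - num_x1), abs(exact_x2 - num_x2))
```

The point of the comparison is that the numerical integrator never hears about x and d, yet it agrees with the solution obtained in the new coordinates.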


Let’s ramp up the difficulty a bit and consider a general linear system of differential equations

\frac{d}{dt} \vec x(t) = A \vec x(t),

essentially given by a matrix A that does not depend on the time t.

How to solve that? Previously, the center of mass was a very useful coordinate, but since A can be anything now, it probably won’t help.

The idea here is to simply consider all possible changes of coordinates at once and pick the one that works best. More precisely, we make the ansatz of a linear change of coordinates given by a matrix B:

\vec y = B \vec x

and hope that we can find a good B that simplifies the problem. Of course, B has to be invertible, so that \vec x = B^{-1} \vec y. In the new coordinates \vec y, the equation reads

\frac{d}{dt} \vec y(t) = BAB^{-1} \vec y(t) ,

so we want BAB^{-1} to become really simple, for instance a diagonal matrix. And that’s exactly what eigenvalues are for! Namely, there is this awesome theorem that for every matrix A, we can find a coordinate change B (over the complex numbers) such that BAB^{-1} is (almost) a diagonal matrix, the Jordan normal form, with the eigenvalues being the diagonal entries.

Hence, if you know how to calculate eigenvalues, you can solve differential equations.
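To make this concrete, here is a small sketch for a 2×2 example in plain Python. The matrix A and the initial value x0 are hypothetical, picked so that the eigenvalues are real and distinct (-1 and -2); the code assumes exactly that situation and also that the upper right entry of A is nonzero. For anything bigger you would rather call a library routine such as numpy.linalg.eig than do the algebra by hand.

```python
import math

# Hypothetical example matrix with real, distinct eigenvalues (-1 and -2).
A = [[0.0, 1.0],
     [-2.0, -3.0]]
x0 = [1.0, 0.0]

# Eigenvalues of a 2x2 matrix from trace and determinant.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)   # assumes a positive discriminant
lam1, lam2 = (tr + disc) / 2.0, (tr - disc) / 2.0

# Eigenvectors v = (A01, lam - A00); valid here because A[0][1] != 0.
v1 = [A[0][1], lam1 - A[0][0]]
v2 = [A[0][1], lam2 - A[0][0]]

# B^{-1} has the eigenvectors as columns (called P here); invert by hand to get B.
P = [[v1[0], v2[0]],
     [v1[1], v2[1]]]
detP = P[0][0] * P[1][1] - P[0][1] * P[1][0]
B = [[ P[1][1] / detP, -P[0][1] / detP],
     [-P[1][0] / detP,  P[0][0] / detP]]

def solve(t):
    """Return x(t) by solving the decoupled equations y_i' = lam_i * y_i."""
    # Change coordinates: y(0) = B x(0).
    y0 = [B[0][0] * x0[0] + B[0][1] * x0[1],
          B[1][0] * x0[0] + B[1][1] * x0[1]]
    # In the new coordinates each component evolves on its own.
    y = [math.exp(lam1 * t) * y0[0], math.exp(lam2 * t) * y0[1]]
    # Change back: x(t) = B^{-1} y(t).
    return [P[0][0] * y[0] + P[0][1] * y[1],
            P[1][0] * y[0] + P[1][1] * y[1]]

print(solve(1.0))
```

For this particular A and x0, the first component should agree with 2e^{-t} - e^{-2t}, which you can verify by plugging it into the equation.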

Variation of constants

Let’s be even more general and consider an inhomogeneous linear system of differential equations where the coefficients may now depend on t:

\frac{d}{dt} \vec x(t) = A(t) \vec x(t) + \vec b(t)

How to solve that? Well, in general, it won’t be possible to calculate an analytic solution. But suppose that we can somehow solve the corresponding homogeneous problem

\frac{d}{dt} \vec y(t) = A(t) \vec y(t)

(Homogeneous means that the sum of two solutions is again a solution, and that we can multiply a solution by a constant c to get another one.) If you know your ODEs well, you know that the solutions to the homogeneous problem are always linear combinations

\vec y (t) = c_1 \vec y_1(t) + c_2 \vec y_2(t) + \dots + c_n \vec y_n(t)

of a set of n fundamental solutions \vec y_1(t),\dots,\vec y_n(t). You probably also know that the inhomogeneous problem can be solved with a trick called the variation of constants, which says that the ansatz

\vec x (t) = c_1(t) \vec y_1(t) + c_2(t) \vec y_2(t) + \dots + c_n(t) \vec y_n(t)

that replaces the constants c_i\in\mathbb{R} with functions c_i\in\mathcal{C}^1(\mathbb{R}) will do the trick.

I never understood how this mysterious procedure of varying things that are supposed to be constant works, until I realized that it’s actually a change of coordinates! I mean, we want to calculate \vec x(t) as a linear combination of the basis vectors \vec e_i

\vec x(t) = x_1(t) \vec e_1 + x_2(t) \vec e_2 + \dots + x_n(t) \vec e_n

But why should there be anything special about the standard basis vectors \vec e_i? They’re just one basis of the vector space \mathbb{R}^n out of many, and not particularly well suited to our problem.

In contrast, the \vec y_i(t) form a much more natural basis for our problem! That’s why we should change coordinates and aim to calculate the solution \vec x(t) in terms of this new basis (which varies with t), and this is exactly what the variation of constants is doing.
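For the record, here is how the calculation goes in the new coordinates; take it as a sketch. The notation Y(t) for the matrix whose columns are the fundamental solutions and the starting time t_0 are my own additions. Plugging the ansatz into the inhomogeneous equation and using \dot{\vec y}_i = A(t) \vec y_i gives

```latex
% Assumed notation: Y(t) is the matrix with columns \vec y_1(t), ..., \vec y_n(t).
\begin{align*}
\frac{d}{dt}\vec x(t)
  &= \sum_{i=1}^n \dot c_i(t)\, \vec y_i(t) + \sum_{i=1}^n c_i(t)\, \dot{\vec y}_i(t)
   = \sum_{i=1}^n \dot c_i(t)\, \vec y_i(t) + A(t)\, \vec x(t), \\
\text{so}\quad \vec b(t) &= \sum_{i=1}^n \dot c_i(t)\, \vec y_i(t) = Y(t)\, \dot{\vec c}(t), \\
\vec c(t) &= \vec c(t_0) + \int_{t_0}^{t} Y(s)^{-1}\, \vec b(s)\, ds .
\end{align*}
```

In other words, in the coordinates c_i the differential equation has no c on the right hand side anymore, and solving it is reduced to an integration. The matrix Y(s) is invertible because the fundamental solutions are linearly independent.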


Calculating integrals is another prime example of the power of coordinate changes. If you have ever calculated a multi-dimensional integral, you probably know about coordinate systems different from the usual Cartesian one, like polar or spherical coordinates.

For instance, consider the area of the unit circle. In Cartesian coordinates x,y, a straightforward integration is pretty difficult

A = \int_{-1}^{1} \int_{-\sqrt{1-x^2}}^{+\sqrt{1-x^2}} 1\, dy\, dx = \int_{-1}^{1} 2\sqrt{1-x^2}\, dx = ?

But that’s the fault of the Cartesian coordinates; they simply don’t reflect the symmetries of a circle, as you can see from the picture.


In contrast, polar coordinates are much more suitable and give

A = \int_0^1 \int_0^{2\pi} 1\cdot r\,d\phi\,dr = \int_0^1 2\pi r\,dr = \pi

immediately, with the caveat that the area element is now r\,d\phi\,dr and not d\phi\,dr.
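If you don’t trust the factor r, a quick numerical check may help. The sketch below approximates both one-dimensional integrals with a simple midpoint rule: the Cartesian one with x running over the full range from -1 to 1, and the polar one with r from 0 to 1. The helper midpoint and the number of subintervals are my own choices for illustration, not a recommendation for serious numerics.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Midpoint rule for the integral of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Cartesian: the inner y-integral gives the chord length 2*sqrt(1 - x^2).
area_cartesian = midpoint(lambda x: 2.0 * math.sqrt(1.0 - x * x), -1.0, 1.0)

# Polar: the inner phi-integral gives 2*pi*r; the factor r comes from the area element.
area_polar = midpoint(lambda r: 2.0 * math.pi * r, 0.0, 1.0)

print(area_cartesian, area_polar, math.pi)
```

Both numbers should come out close to \pi. Dropping the factor r in the polar integrand would give 2\pi instead, which is a nice way to convince yourself that the area element matters.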



I hope that these examples have demonstrated the usefulness of changes of coordinates, a technique which is, in a sense, pervasive in mathematics and physics.

Now, to foreshadow future blog posts: if differential equations and integration are best solved by changing coordinates, how about describing these problems in a coordinate-independent fashion, i.e. without mentioning coordinates in the first place, so that we are free to pick a good set of coordinates later on?

This question is one of the starting points for differential geometry and the mathematics of manifolds. For instance, the coordinate-independent formulation of ordinary differential equations is given by vector fields, and the so-called differential forms are the key ingredient to coordinate-free integrals.

This geometric approach is also taken by V.I. Arnold’s marvelous book “Ordinary differential equations”, which for example also mentions the interpretation of the variation of constants as a coordinate change. Highly recommended!

Next time, I intend to write about vector fields and their coordinate-free description.



February 6, 2010 at 13:10 (Uncategorized)


We, Greg Graviton and Julio Junction, would like to welcome you to our new blog “Quantum Marmalade”. With this blog, we want to share what we’re learning, or struggling to learn, about physics and the associated mathematics.

Surprisingly, and unlike famous mathematicians, physicists don’t seem to have picked up blogging yet, at least when it comes to physics not aimed at the general public. We had no choice but to start our own blog.

No inauguration of a blog about physics is complete without the most famous formula of physics, courtesy of Albert Einstein:

E = \sqrt{m_0^2 + \vec p^2} = m_0 + \frac12 \frac{\vec p^2}{m_0} + \mathcal{O}\left(\frac{\vec p^4}{m_0^3}\right)

You can’t miss that… 😉
