Suppose we have a coordinate frame $K$ in $d$ dimensions, where $d$ will typically be 4 for relativistic spacetime (with the 0th coordinate equal to $ct$ as usual) or 3 for just the spatial part. To simplify our notation, we will use roman characters such as $i, j, k$ for the three-vector spatial-only part of a four-vector, and greek characters such as $\mu, \nu$ for the entire four-vector. Recall that repeated indices imply summation, over e.g. $i = 1, 2, 3$ or $\mu = 0, 1, 2, 3$; the roman/greek distinction therefore serves de facto to restrict the summation range.
Now suppose that we wish to transform to a new coordinate frame $K'$. At this time we place very few restrictions on this transformation. The transformation might, therefore, translate, rotate, rescale or otherwise alter the original coordinate description. As we do this, our description of physical quantities expressed in the old coordinates must systematically change to a description in the new coordinates, since the actual physical situation being described is not altered by the change in coordinate frames. All that is altered is our point of view.
Our first observation is that it may not be possible to describe our physical quantities in the new frame if the transformation is completely general. For example, if the dimension of $K'$ were different from that of $K$ (either larger or smaller), we might well be unable to represent some of the physics that involved the missing coordinate, or we might have a certain degree of arbitrariness associated with a new coordinate added on. A second possible problem involves regions of the two coordinate frames that cannot be made to correspond - if there is a patch of the $K$ frame that simply does not map into a corresponding patch of the $K'$ frame, we cannot expect to correctly describe in the new frame any physics that depends on coordinates inside that patch.
These are not irrelevant mathematical issues to the physicist. A perpetual open question in physics is whether or not any parts of it involve additional variables. Those variables might just be ``parameters'' that can take on some range of values; they might be supported only within spacetime scales that are too small to be directly observed (leaving us to infer what happens in these microscale ``patches'' from observations made on the macroscale); they might be macroscopic domains over which frame transformations are singular (think ``black holes''); or they might be actual extra dimensions - hidden variables, if you like - in which interactions and structure can occur that are visible to us in our four dimensional spacetime only in projection. With no a priori reason to include or exclude any of these possibilities, the wise scientist must be prepared to believe or disbelieve them all and to include them in the ``mix'' of possible explanations for otherwise difficult to understand phenomena.
However, our purposes here are more humble. We only want to be able to describe the relatively mundane coordinate transformations that do not involve singularities, unmatched patches, or additional or missing coordinate dimensions. We will therefore require that our coordinate transformations be one-to-one - each point in the spacetime frame $K$ corresponds to one and only one point in the spacetime frame $K'$ - and onto - no missing or extra patches in the $K'$ frame. This suffices to make the transformations invertible. There are two very general classes of transformation that satisfy these requirements. In one of them, the new coordinates can be reached by means of a parametric transformation of the original ones, where the parameters can be continuously varied from a set of 0 values that describe ``no transformation''. In the other, this is not the case.
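As a concrete illustration of the two classes, here is a minimal sketch that assumes a rotation about the $z$ axis as the example of the first kind and spatial inversion as the example of the second; both choices, and the helper names `rotation_z` and `det3`, are illustrative assumptions rather than anything fixed by the discussion above. The rotation reduces to the identity when its parameter is zero and keeps determinant $+1$ for every angle. The inversion has determinant $-1$, so no continuous path of invertible transformations connects it to the identity.

```python
import math

def rotation_z(theta):
    """3x3 matrix rotating the spatial coordinates by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Spatial inversion (an assumed example of the second class):
# every spatial coordinate changes sign.
INVERSION = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]

# Sample the rotation parameter from 0 up to roughly 2*pi.
rotation_dets = [det3(rotation_z(k * 0.1)) for k in range(63)]

# theta = 0 is "no transformation".
starts_at_identity = rotation_z(0.0) == IDENTITY
inversion_det = det3(INVERSION)
```

Because the determinant is a continuous function of the matrix entries and must never vanish for an invertible transformation, it cannot change sign along a continuous path that starts at the identity.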
For the moment, let's stick to the first kind, and start our discussion by looking at our friends the coordinates themselves. By definition, the untransformed coordinates of an inertial reference frame are contravariant vectors. We symbolize contravariant components (not just 4-vectors - this discussion applies to tensor quantities on all manifolds on the patch of coordinates that is locally flat around a point) with superscript indices:
\begin{equation}
x_{\rm contravariant} = (x^0, x^1, x^2, x^3, \ldots)
\end{equation}
We are not going to discuss manifolds, curved spaces, tangent or cotangent bundles (much), although we will still use a few of these terms in a way that is hopefully clear in context. I encourage you to explore the references above to find discussions that extend into these areas. Note that I'm using a non-bold $x$ to stand for a four-vector, which is pretty awful, but which is also very common.
Now let us define a mapping between a point (event) $x$ in the frame $K$ and the same point $x'$ described in the $K'$ frame. $x$ in $K$ consists of a set of four scalar numbers, its frame coordinates, and we need to transform these four numbers into four new numbers in $K'$. From the discussion above, we want this mapping to be a continuous function in both directions. That is:
\begin{eqnarray}
x^{0'} & = & x^{0'}(x^0, x^1, x^2, \ldots) \\
x^{1'} & = & x^{1'}(x^0, x^1, x^2, \ldots) \\
x^{2'} & = & x^{2'}(x^0, x^1, x^2, \ldots) \\
 & \vdots &
\end{eqnarray}
and
\begin{eqnarray}
x^0 & = & x^0(x^{0'}, x^{1'}, x^{2'}, \ldots) \\
x^1 & = & x^1(x^{0'}, x^{1'}, x^{2'}, \ldots) \\
x^2 & = & x^2(x^{0'}, x^{1'}, x^{2'}, \ldots) \\
 & \vdots &
\end{eqnarray}
have to both exist and be well behaved (continuously differentiable and so on). In the most general case, the coordinates have to be linearly independent and span the $K$ or $K'$ frames but are not necessarily orthonormal. We'll go ahead and work with orthonormal coordinate bases, however, which is fine since non-orthonormal bases can always be orthogonalized with Gram-Schmidt and normalized anyway.
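To make the pair of maps concrete, here is a minimal sketch that assumes one particular transformation: a Lorentz boost along the $x^1$ axis, restricted to the two coordinates $(x^0, x^1)$. The boost, the value $\beta = 0.6$, and the function names are illustrative assumptions; any smooth invertible map would serve equally well. The forward function plays the role of $x^{i'}(x^0, x^1, \ldots)$ and the inverse function the role of $x^i(x^{0'}, x^{1'}, \ldots)$.

```python
import math

def boost(event, beta):
    """Forward map: (x0, x1) in the unprimed frame -> (x0', x1') in the
    primed frame, which moves with velocity beta*c along x1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    x0, x1 = event
    return (gamma * (x0 - beta * x1), gamma * (x1 - beta * x0))

def inverse_boost(event_prime, beta):
    """Inverse map: the same boost with beta replaced by -beta."""
    return boost(event_prime, -beta)

event = (2.0, 1.0)                            # (x0, x1), with x0 = ct
event_prime = boost(event, 0.6)               # approximately (1.75, -0.25)
round_trip = inverse_boost(event_prime, 0.6)  # recovers the original event
```

Applying the forward map and then the inverse returns the original coordinates up to floating point rounding, which is the practical meaning of the requirement that both sets of functions exist.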
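Given this formal transformation, we can write the following relation using the chain rule and definition of derivative:
\begin{eqnarray}
dx^{0'} & = & \frac{\partial x^{0'}}{\partial x^0} dx^0 + \frac{\partial x^{0'}}{\partial x^1} dx^1 + \frac{\partial x^{0'}}{\partial x^2} dx^2 + \ldots \\
dx^{1'} & = & \frac{\partial x^{1'}}{\partial x^0} dx^0 + \frac{\partial x^{1'}}{\partial x^1} dx^1 + \frac{\partial x^{1'}}{\partial x^2} dx^2 + \ldots \\
dx^{2'} & = & \frac{\partial x^{2'}}{\partial x^0} dx^0 + \frac{\partial x^{2'}}{\partial x^1} dx^1 + \frac{\partial x^{2'}}{\partial x^2} dx^2 + \ldots \\
 & \vdots &
\end{eqnarray}
where again, superscripts stand for indices and not powers in this context. We can write this in a tensor-matrix form:
\begin{equation}
\left( \begin{array}{c} dx^{0'} \\ dx^{1'} \\ dx^{2'} \\ \vdots \end{array} \right) =
\left( \begin{array}{cccc}
\frac{\partial x^{0'}}{\partial x^0} & \frac{\partial x^{0'}}{\partial x^1} & \frac{\partial x^{0'}}{\partial x^2} & \ldots \\
\frac{\partial x^{1'}}{\partial x^0} & \frac{\partial x^{1'}}{\partial x^1} & \frac{\partial x^{1'}}{\partial x^2} & \ldots \\
\frac{\partial x^{2'}}{\partial x^0} & \frac{\partial x^{2'}}{\partial x^1} & \frac{\partial x^{2'}}{\partial x^2} & \ldots \\
\vdots & \vdots & \vdots & \ddots
\end{array} \right)
\left( \begin{array}{c} dx^0 \\ dx^1 \\ dx^2 \\ \vdots \end{array} \right)
\end{equation}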
The determinant of the matrix above is called the Jacobian of the transformation and must not be zero (so that the transformation is invertible). This matrix defines the differential transformation between the coordinates in the $K$ and $K'$ frames, given the invertible maps defined above. All first rank tensors that transform like the coordinates, that is to say according to this transformation matrix linking the two coordinate systems, are said to be contravariant vectors; the coordinate vectors themselves are obviously contravariant by this construction.
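The matrix of partial derivatives can also be estimated numerically. The following is a minimal sketch, assuming the transformation is a Lorentz boost along $x^1$ with $\beta = 0.6$ (an illustrative choice) and using central finite differences for the derivatives; the helper names `jacobian` and `det2` are hypothetical. For this boost the determinant works out to $\gamma^2 (1 - \beta^2) = 1$, so it is safely nonzero.

```python
import math

def boost(x, beta=0.6):
    """Assumed example map: Lorentz boost of (x0, x1) along x1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [gamma * (x[0] - beta * x[1]), gamma * (x[1] - beta * x[0])]

def jacobian(f, x, h=1e-6):
    """Estimate J[i][j] = d f_i / d x_j at the point x by central differences."""
    n = len(x)
    columns = []
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        # Column j holds the derivatives of every output with respect to x_j.
        columns.append([(p - m) / (2.0 * h) for p, m in zip(f_plus, f_minus)])
    # Transpose so that the row index labels the primed coordinate.
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

J = jacobian(boost, [2.0, 1.0])
jacobian_det = det2(J)
```

Because the boost is linear, its matrix of partial derivatives is the same at every point; for a nonlinear transformation the same routine would return a matrix that depends on where it is evaluated.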
We can significantly compress this expression using Einsteinian summation:
\begin{equation}
dx^{i'} = \frac{\partial x^{i'}}{\partial x^j} dx^j
\end{equation}
In this compact notation we can write the definition of an arbitrary contravariant vector as being one that transforms according to:
\begin{equation}
A^{i'} = \frac{\partial x^{i'}}{\partial x^j} A^j
\end{equation}
There, that was easy!
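The summation over the repeated index $j$ can be spelled out as an explicit loop. The sketch below again assumes a Lorentz boost along $x^1$ with $\beta = 0.6$ as the transformation, and takes the displacement between two events as the vector $A^j$; the function names and the sample events are illustrative assumptions. It checks that transforming the displacement with the matrix of partial derivatives agrees with transforming the two events separately and then subtracting.

```python
import math

def boost(x, beta):
    """Assumed example map: Lorentz boost of (x0, x1) along x1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [gamma * (x[0] - beta * x[1]), gamma * (x[1] - beta * x[0])]

def boost_jacobian(beta):
    """Matrix of partial derivatives d x^{i'} / d x^j for the boost above."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta],
            [-gamma * beta, gamma]]

def transform_contravariant(jac, a):
    """A^{i'} = (d x^{i'} / d x^j) A^j, with the sum over j written out."""
    return [sum(jac[i][j] * a[j] for j in range(len(a)))
            for i in range(len(jac))]

beta = 0.6
event_a = [2.0, 1.0]
event_b = [5.0, 3.0]

# Displacement between the two events in the unprimed frame.
displacement = [b - a for a, b in zip(event_a, event_b)]

# Route 1: transform the displacement as a contravariant vector.
displacement_prime = transform_contravariant(boost_jacobian(beta), displacement)

# Route 2: transform each event, then subtract in the primed frame.
direct = [b - a for a, b in zip(boost(event_a, beta), boost(event_b, beta))]
```

The two routes agree exactly here because the boost is linear; for a nonlinear transformation they would agree only to first order in the separation between the two events, which is why the defining relation is written for the differentials $dx^j$.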