Notes on the Origin of the General Theory of Relativity

I gladly accede to the request that I should say something about the history of my own scientific work. Not that I have an exaggerated notion of the importance of my own efforts, but to write the history of other men’s work demands a degree of absorption in other people’s ideas which is much more in the line of the trained historian; to throw light on one’s own earlier thinking appears incomparably easier. Here one has an immense advantage over everybody else, and one ought not to leave the opportunity unused out of modesty.

When, by the special theory of relativity I had arrived at the equivalence of all so-called inertial systems for the formulation of natural laws (1905), the question whether there was not a further equivalence of co-ordinate systems followed naturally, to say the least of it. To put it in another way, if only a relative meaning can be attached to the concept of velocity, ought we nevertheless to persevere in treating acceleration as an absolute concept?

From the purely kinematic point of view there was no doubt about the relativity of all motions whatever; but physically speaking, the inertial system seemed to occupy a privileged position, which made the use of co-ordinate systems moving in other ways appear artificial.

I was of course acquainted with Mach’s view, according to which it appeared conceivable that what inertial resistance counteracts is not acceleration as such but acceleration with respect to the masses of the other bodies existing in the world. There was something fascinating about this idea to me, but it provided no workable basis for a new theory.

I first came a step nearer to the solution of the problem when I attempted to deal with the law of gravity within the framework of the special theory of relativity. Like most writers at the time, I tried to frame a field-law for gravitation, since it was no longer possible, at least in any natural way, to introduce direct action at a distance owing to the abolition of the notion of absolute simultaneity.

The simplest thing was, of course, to retain the Laplacian scalar potential of gravity, and to complete the equation of Poisson in an obvious way by a term differentiated as to time in such a way that the special theory of relativity was satisfied. The law of motion of the mass point in a gravitational field had also to be adapted to the special theory of relativity. The path was not so unmistakably marked out here, since the inert mass of a body might depend on the gravitational potential. In fact this was to be expected on account of the principle of the inertia of energy.
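In modern notation, which does not appear in the text and is supplied here only as an illustrative sketch, the step just described might be written as follows. The symbols (φ for the scalar potential, ρ for the mass density taken as a suitable scalar source, G for Newton's constant, c for the speed of light) and the sign convention are assumptions made for this illustration.

```latex
\begin{align}
  % Poisson's equation for the Newtonian scalar potential (assumed notation)
  \nabla^{2}\varphi &= 4\pi G\,\rho \\
  % The "obvious" completion: a second time derivative turns the Laplacian
  % into the Lorentz-invariant d'Alembertian operator
  \nabla^{2}\varphi - \frac{1}{c^{2}}\,\frac{\partial^{2}\varphi}{\partial t^{2}} &= 4\pi G\,\rho
\end{align}
```

For a static field the time derivative vanishes and the second equation reduces to the first.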

These investigations, however, led to a result which raised my strong suspicions. According to classical mechanics the vertical acceleration of a body in the vertical gravitational field is independent of the horizontal component of velocity. Hence in such a gravitational field the vertical acceleration of a mechanical system or of its center of gravity works out independently of its internal kinetic energy. But in the theory I advanced the acceleration of a falling body was not independent of the horizontal velocity or the internal energy of a system.

This did not fit in with the old experimental fact that all bodies have the same acceleration in a gravitational field. This law, which may also be formulated as the law of the equality of inertial and gravitational mass, was now brought home to me in all its significance. I was in the highest degree amazed at its persistence and guessed that in it must lie the key to a deeper understanding of inertia and gravitation. I had no serious doubts about its strict validity even without knowing the results of the admirable experiments of Eötvös, which—if my memory is right—I only came to know later. I now abandoned as inadequate the attempt to treat the problem of gravitation, in the manner outlined above, within the framework of the special theory of relativity. It clearly failed to do justice to the most fundamental property of gravitation. The principle of the equality of inertial and gravitational mass could now be formulated quite clearly as follows:—In a homogeneous gravitational field all motions take place in the same way as in the absence of a gravitational field in relation to a uniformly accelerated co-ordinate system. If this principle held good for any events whatever (the “principle of equivalence”), this was an indication that the principle of relativity needed to be extended to co-ordinate systems in non-uniform motion with respect to each other, if we were to reach an easy and natural theory of the gravitational fields. Such reflections kept me busy from 1908 to 1911, and I attempted to draw special conclusions from them, of which I do not propose to speak here. For the moment the one important thing was the discovery that a reasonable theory of gravitation could only be hoped for from an extension of the principle of relativity.

What was needed, therefore, was to frame a theory whose equations kept their form in the case of non-linear transformations of the co-ordinates. Whether this was to apply to absolutely any (continuous) transformations of co-ordinates or only to certain ones, I could not for the moment say.

I soon saw that bringing in non-linear transformations, as the principle of equivalence demanded, was inevitably fatal to the simple physical interpretation of the co-ordinates, i.e., that it could no longer be required that differentials of co-ordinates should signify direct results of measurement with ideal scales or clocks. I was much bothered by this piece of knowledge, for it took me a long time to see what co-ordinates in general really meant in physics. I did not find the way out of this dilemma till 1912, and then it came to me as a result of the following consideration:

A new formulation of the law of inertia had to be found which, in the absence of a real “gravitational field” and with an inertial system used as the co-ordinate system, passed over into Galileo’s formula for the principle of inertia. The latter amounts to this: a material point which is acted on by no force will be represented in four-dimensional space by a straight line, that is to say by a line that is as short as possible or, more correctly, an extremal line. This concept presupposes that of the length of a linear element, that is to say, a metric. In the special theory of relativity, as Minkowski had shown, this metric was a quasi-Euclidean one, i.e., the square of the “length” ds of the linear element was a definite quadratic function of the differentials of the co-ordinates.
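As a sketch in modern notation (an assumption of this illustration, not the notation of the text), the quasi-Euclidean metric and the extremal property of the force-free path can be written as below. The signature convention (+, −, −, −) and the symbols t, x, y, z and c are chosen here for illustration only.

```latex
\begin{align}
  % Minkowski's line element in an inertial system (assumed signature +,-,-,-)
  ds^{2} &= c^{2}\,dt^{2} - dx^{2} - dy^{2} - dz^{2} \\
  % Law of inertia: the world line of a force-free point is an extremal
  \delta \int ds &= 0
\end{align}
```

With constant coefficients, as here, the extremal lines are straight lines in the four-dimensional space, which corresponds to uniform rectilinear motion.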

If other co-ordinates are introduced by means of a non-linear transformation, ds² remains a homogeneous function of the differentials of the co-ordinates, but the co-efficients of this function (gμν) cease to be constant and become certain functions of the co-ordinates. In mathematical terms this means that physical (four-dimensional) space has a Riemannian metric. The time-like extremal lines of this metric furnish the law of motion of a material point which is acted on by no force apart from the forces of gravity. The co-efficients (gμν) of this metric at the same time describe the gravitational field with reference to the co-ordinate system selected. A natural formulation of the principle of equivalence had thus been found, the extension of which to any gravitational field whatever formed a perfectly natural hypothesis.
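The statements of this paragraph can be sketched in modern index notation, which is an assumption of this illustration rather than the author's own presentation. Here ξ^α denote inertial co-ordinates, x^μ the new co-ordinates, η_{αβ} the constant coefficients of the special theory, and Γ the Christoffel symbols; summation over repeated indices is implied.

```latex
\begin{align}
  % Line element in arbitrary co-ordinates: still quadratic in the differentials
  ds^{2} &= g_{\mu\nu}(x)\,dx^{\mu}\,dx^{\nu} \\
  % Coefficients obtained from the constant ones by a non-linear transformation
  g_{\mu\nu}(x) &= \eta_{\alpha\beta}\,
      \frac{\partial \xi^{\alpha}}{\partial x^{\mu}}\,
      \frac{\partial \xi^{\beta}}{\partial x^{\nu}} \\
  % Extremal (geodesic) lines: law of motion of a point subject only to gravity
  \frac{d^{2}x^{\mu}}{ds^{2}}
      + \Gamma^{\mu}_{\alpha\beta}\,\frac{dx^{\alpha}}{ds}\,\frac{dx^{\beta}}{ds} &= 0 \\
  % Christoffel symbols built from first derivatives of the metric
  \Gamma^{\mu}_{\alpha\beta} &= \tfrac{1}{2}\,g^{\mu\lambda}
      \left( \partial_{\alpha} g_{\lambda\beta}
           + \partial_{\beta} g_{\lambda\alpha}
           - \partial_{\lambda} g_{\alpha\beta} \right)
\end{align}
```

When the g_{μν} are constant, the Christoffel symbols vanish and the third equation reduces to straight-line motion, in keeping with the law of inertia of the special theory.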

The solution of the above-mentioned dilemma was therefore as follows:—A physical significance attaches not to the differentials of the co-ordinates but only to the Riemannian metric co-ordinated with them. A workable basis had now been found for the general theory of relativity. Two further problems remained to be solved, however.

(1) If a field-law is given in the terminology of the special theory of relativity, how can it be transferred to the case of a Riemannian metric?

(2) What are the differential laws which determine the Riemannian metric (i.e., gμν) itself?

I worked on these problems from 1912 to 1914 together with my friend Grossmann. We found that the mathematical methods for solving problem (1) lay ready to our hands in the infinitesimal differential calculus of Ricci and Levi-Civita.

As for problem (2), its solution obviously needed invariant differential systems of the second order taken from gμν. We soon saw that these had already been established by Riemann (the tensor of curvature). We had already considered the right field-equations for gravitation for two years before the publication of the general theory of relativity, but we were unable to see how they could be used in physics. On the contrary I felt sure that they could not do justice to experience. Moreover I believed that I could show on general considerations that a law of gravitation invariant in relation to any transformation of co-ordinates whatever was inconsistent with the principle of causation. These were errors of thought which cost me two years of excessively hard work, until I finally recognized them as such at the end of 1915 and succeeded in linking the question up with the facts of astronomical experience, after which I ruefully returned to the Riemannian curvature.

In the light of knowledge attained, the happy achievement seems almost a matter of course, and any intelligent student can grasp it without too much trouble. But the years of anxious searching in the dark, with their intense longing, their alternations of confidence and exhaustion, and the final emergence into the light;—only those who have experienced it can understand that.