Chapter 9: The Carathéodory principle | Lectures on Thermodynamics and Statistical Mechanics

9. The Carathéodory principle

The formulation of the second law of thermodynamics used the concept of heat engines, at least indirectly. But the law is very general, and one could ask whether there is another formulation which does not invoke heat engines but still leads to the notion of absolute temperature and to the principle that entropy cannot spontaneously decrease. Such a version of the second law is obtained in an axiomatization of thermodynamics due to C. Carathéodory.

9.1 Mathematical Preliminaries

We will start with a theorem on differential forms which is needed to formulate Carathéodory’s version of the second law.

Before proving Carathéodory’s theorem, we will need the following result.

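Theorem 9.1.1 - Integrating factor theorem. Let $A = A_i\,dx^i$ denote a differential one-form. If $A \wedge dA = 0$, then at least locally, one can find an integrating factor for $A$; i.e., there exist functions $T$ and $\phi$ such that $A = T\,d\phi$.

The proof of this result is most easily done inductively in the dimension of the space. First we consider the two-dimensional case, so that $i = 1, 2$. In this case the condition $A \wedge dA = 0$ is vacuous. Write $A = A_1\,dx^1 + A_2\,dx^2$. We make a coordinate transformation to $\lambda$, $\phi$, where

$$\frac{dx^1}{d\lambda} = -f A_2, \qquad \frac{dx^2}{d\lambda} = f A_1 \tag{9.1}$$

where $f$ is an arbitrary function which can be chosen in any convenient way. This equation shows that

$$A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} = 0 \tag{9.2}$$

Equations (9.1) define a set of nonintersecting trajectories, $\lambda$ being the parameter along the trajectory. We choose $\phi$ as the coordinate on transverse sections of the flow generated by (9.1). Making the coordinate transformation from $x^1$, $x^2$ to $\lambda$, $\phi$, we can now write the one-form $A$ as

$$A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_i \frac{\partial x^i}{\partial \phi} \right) d\phi = \tau\, d\phi, \qquad \tau = A_i \frac{\partial x^i}{\partial \phi} \tag{9.3}$$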
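As a concrete check of the two-dimensional statement, the following sketch takes the one-form $A = -y\,dx + x\,dy$ (an example chosen here, not one from the text) and verifies numerically that $\tau = x^2 + y^2$ and $\phi = \arctan(y/x)$ satisfy $A = \tau\,d\phi$. The function names and sample points are illustrative assumptions.

```python
import math

def A(x, y):
    """Components (A_1, A_2) of the example one-form A = -y dx + x dy."""
    return (-y, x)

def phi(x, y):
    """Candidate potential: the polar angle (sample points keep x > 0)."""
    return math.atan2(y, x)

def tau(x, y):
    """Candidate integrating factor: r^2 = x^2 + y^2."""
    return x * x + y * y

def grad(f, x, y, h=1e-6):
    """Central-difference gradient of f at (x, y)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def max_residual(points):
    """Largest |A_i - tau * d_i(phi)| over the sample points."""
    worst = 0.0
    for x, y in points:
        a1, a2 = A(x, y)
        g1, g2 = grad(phi, x, y)
        t = tau(x, y)
        worst = max(worst, abs(a1 - t * g1), abs(a2 - t * g2))
    return worst

points = [(1.0, 0.5), (0.3, 2.0), (1.5, 1.5), (2.0, 0.1)]
print(max_residual(points))  # small, limited by finite-difference error
```

Here the trajectories of (9.1) are the rays through the origin, and the polar angle labels the transverse sections, which is why it serves as $\phi$ for this particular example.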

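This proves the theorem for two dimensions. In three dimensions, we have

$$A = A_1\,dx^1 + A_2\,dx^2 + A_3\,dx^3 \tag{9.4}$$

The strategy is to start by determining $\tau$, $\phi$ for the $A_1$, $A_2$ subsystem. We choose the new coordinates as $\lambda$, $\phi$, $x^3$ and impose (9.1). Solving these, we will find $x^1$ and $x^2$ as functions of $\lambda$ and $x^3$. The trajectories will also depend on the starting points, which may be taken as points on the transverse section and hence labeled by $\phi$. Thus we get

$$x^1 = x^1(\lambda, \phi, x^3), \qquad x^2 = x^2(\lambda, \phi, x^3) \tag{9.5}$$

The one-form $A$ in (9.4) now becomes

$$A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi + \left( A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) dx^3 = \tau\,d\phi + \tilde{A}_3\,dx^3, \qquad \tilde{A}_3 = A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \tag{9.6}$$

We now consider imposing the equation $A \wedge dA = 0$,

$$A \wedge dA = \left[ A_\lambda (\partial_\phi \tilde{A}_3 - \partial_3 A_\phi) + A_\phi (\partial_3 A_\lambda - \partial_\lambda \tilde{A}_3) + \tilde{A}_3 (\partial_\lambda A_\phi - \partial_\phi A_\lambda) \right] d\lambda \wedge d\phi \wedge dx^3 = 0 \tag{9.7}$$

Since $A_\lambda = 0$ and $A_\phi = \tau$ from (9.6), this equation becomes

$$\tilde{A}_3 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_3}{\partial \lambda} = 0 \tag{9.8}$$

Writing $\tilde{A}_3 = \tau h$, this becomes

$$-\tau^2 \frac{\partial h}{\partial \lambda} = 0 \tag{9.9}$$

Since $\tau$ is not identically zero for us, we get $\partial h / \partial \lambda = 0$ and, going back to (9.6), we can write

$$A = \tau \left[ d\phi + h(\phi, x^3)\,dx^3 \right] \tag{9.10}$$

The quantity in the square brackets is a one-form on the two-dimensional space defined by $\phi$, $x^3$. For this we can use the two-dimensional result and write it as $\tilde{\tau}\,d\tilde{\phi}$, so that

$$A = \tau \left[ d\phi + h\,dx^3 \right] = \tau \tilde{\tau}\,d\tilde{\phi} \equiv T\,d\Phi, \qquad T = \tau\tilde{\tau}, \quad \Phi = \tilde{\phi} \tag{9.11}$$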

This proves the theorem for the three-dimensional case.
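In three dimensions the condition $A \wedge dA = 0$ is the statement that $\vec{A} \cdot (\nabla \times \vec{A}) = 0$. The sketch below evaluates this quantity by finite differences for two example one-forms chosen here for illustration (neither is from the text): $A = T\,d\phi$ with $T = 1 + x^2$, $\phi = xy + z$, which has an integrating factor by construction, and $A = dz - y\,dx$, which does not.

```python
def curl_dot(A, p, h=1e-5):
    """Return A . (curl A) at the point p = (x, y, z) via central differences."""
    def d(i, j):
        # Partial derivative of the component A_i with respect to coordinate j.
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (A(*plus)[i] - A(*minus)[i]) / (2 * h)

    a = A(*p)
    curl = (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return sum(ai * ci for ai, ci in zip(a, curl))

def integrable(x, y, z):
    # A = T dphi with T = 1 + x^2 and phi = x*y + z (illustrative choice).
    T = 1 + x * x
    return (T * y, T * x, T)

def contact(x, y, z):
    # A = dz - y dx (illustrative choice with A ^ dA != 0).
    return (-y, 0.0, 1.0)

p = (0.7, -0.4, 1.2)
print(curl_dot(integrable, p))  # vanishes up to finite-difference error
print(curl_dot(contact, p))     # equals 1 for this one-form
```

For the second one-form the result is nonzero everywhere, so no choice of $T$ and $\Phi$ can bring it to the form (9.11).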

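The extension to four dimensions follows a similar pattern. The solutions to (9.1) become

$$x^1 = x^1(\lambda, \phi, x^3, x^4), \qquad x^2 = x^2(\lambda, \phi, x^3, x^4) \tag{9.12}$$

so that we can bring $A$ to the form

$$A = \tau\,d\phi + \tilde{A}_3\,dx^3 + \tilde{A}_4\,dx^4, \qquad \tilde{A}_3 = A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3}, \quad \tilde{A}_4 = A_4 + A_1 \frac{\partial x^1}{\partial x^4} + A_2 \frac{\partial x^2}{\partial x^4} \tag{9.13}$$

We now turn to imposing the condition $A \wedge dA = 0$. In local coordinates this becomes

$$A_\alpha (\partial_\mu A_\nu - \partial_\nu A_\mu) + A_\mu (\partial_\nu A_\alpha - \partial_\alpha A_\nu) + A_\nu (\partial_\alpha A_\mu - \partial_\mu A_\alpha) = 0 \tag{9.14}$$

There are four independent conditions here corresponding to $(\alpha, \mu, \nu) = (\lambda, \phi, 3)$, $(\lambda, \phi, 4)$, $(\lambda, 3, 4)$, $(\phi, 3, 4)$. Using $A_\lambda = 0$ and $A_\phi = \tau$, these four equations become

$$\tilde{A}_3 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_3}{\partial \lambda} = 0 \tag{9.15}$$

$$\tilde{A}_4 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_4}{\partial \lambda} = 0 \tag{9.16}$$

$$\tilde{A}_4 \frac{\partial \tilde{A}_3}{\partial \lambda} - \tilde{A}_3 \frac{\partial \tilde{A}_4}{\partial \lambda} = 0 \tag{9.17}$$

$$\tau \left( \frac{\partial \tilde{A}_4}{\partial x^3} - \frac{\partial \tilde{A}_3}{\partial x^4} \right) + \tilde{A}_3 \left( \frac{\partial \tau}{\partial x^4} - \frac{\partial \tilde{A}_4}{\partial \phi} \right) + \tilde{A}_4 \left( \frac{\partial \tilde{A}_3}{\partial \phi} - \frac{\partial \tau}{\partial x^3} \right) = 0 \tag{9.18}$$

Again, we introduce $h$ and $g$ by $\tilde{A}_3 = \tau h$, $\tilde{A}_4 = \tau g$. Then equations (9.15) and (9.16) become

$$\frac{\partial h}{\partial \lambda} = 0, \qquad \frac{\partial g}{\partial \lambda} = 0 \tag{9.19}$$

Equation (9.17) is then identically satisfied. The last equation, namely (9.18), simplifies to

$$\frac{\partial g}{\partial x^3} - \frac{\partial h}{\partial x^4} + g \frac{\partial h}{\partial \phi} - h \frac{\partial g}{\partial \phi} = 0 \tag{9.20}$$

Using these results, (9.13) becomes

$$A = \tau \left[ d\phi + h\,dx^3 + g\,dx^4 \right] \tag{9.21}$$

The quantity in the square brackets is a one-form on the three-dimensional space of $\phi$, $x^3$, $x^4$, and we can use the previous result to find an integrating factor for it. The condition for the existence of an integrating factor for $d\phi + h\,dx^3 + g\,dx^4$ is precisely (9.20). Thus if we have (9.20), we can write $d\phi + h\,dx^3 + g\,dx^4$ as $t\,ds$ for some functions $t$ and $s$, so that $A$ finally takes the form $A = T\,dS$ with $T = \tau t$ and $S = s$. Thus the theorem is proved for four dimensions. The procedure can be extended to higher dimensions recursively, establishing the theorem for all dimensions.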

Now we turn to the basic theorem needed for the Carathéodory formulation. Consider an $n$-dimensional manifold $M$ with a one-form $A$ on it. A solution curve to $A$ is defined by $A = 0$ along the curve. Explicitly, the curve may be taken as given by a set of functions $x^i = \xi^i(t)$, where $t$ is the parameter along the curve and

$$A_i \frac{dx^i}{dt} = A_i\,\dot{\xi}^i = 0 \tag{9.22}$$

In other words, the tangent vector to the curve is orthogonal to $A_i$. The curve therefore lies on an $(n-1)$-dimensional surface. Two points, say, $P$ and $P'$ on $M$ are said to be $A$-accessible if there is a solution curve which contains $P$ and $P'$. Carathéodory’s theorem is the following:

Theorem 9.1.2 - Carathéodory’s theorem.

If in the neighborhood of a point $P$ there are $A$-inaccessible points, then $A$ admits an integrating factor; i.e., $A = T\,dS$, where $T$ and $S$ are well-defined functions in the neighborhood of $P$.

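The proof of the theorem involves a reductio ad absurdum argument which constructs paths connecting $P$ to any other point in the neighborhood. (This proof is due to H.A. Buchdahl, Proc. Camb. Phil. Soc. 76, 529 (1979).) For this, define

$$C_{ijk} = A_i (\partial_j A_k - \partial_k A_j) + A_j (\partial_k A_i - \partial_i A_k) + A_k (\partial_i A_j - \partial_j A_i) \tag{9.23}$$

Now consider a point $P'$ near $P$. We have a displacement vector $\epsilon\eta^i$ for the coordinates of $P'$ (from $P$). $\eta^i$ can in general have a component along $A_i$ and some components orthogonal to $A_i$. The idea is to solve for these from the equation $A = 0$. Let $\xi^i(t)$ be a path which begins and ends at $P$, i.e., $\xi^i(0) = \xi^i(1) = x^i_P$, $0 \le t \le 1$, and which is orthogonal to $A_i$. Thus it is a solution curve. Any closed curve starting at $P$ and lying in the $(n-1)$-dimensional space orthogonal to $A_i$ can be chosen. Consider now a nearby path given by $x^i(t) = \xi^i(t) + \epsilon\eta^i(t)$. This will also be a solution curve if $A_i(\xi + \epsilon\eta)\,(\dot{\xi}^i + \epsilon\dot{\eta}^i) = 0$. Expanding to first order in $\epsilon$, this is equivalent to

$$A_i\,\dot{\eta}^i + \dot{\xi}^i\,(\partial_j A_i)\,\eta^j = 0 \tag{9.24}$$

where we also used $A_i\dot{\xi}^i = 0$. We may choose $\dot{\xi}^i$ to be of the form $\dot{\xi}^i = f^{ij} A_j$, where $f^{ij}$ is antisymmetric, to be consistent with $A_i\dot{\xi}^i = 0$. We can find quantities $f^{ij}$ such that this is true; in any case, it is sufficient to show one path which makes $P'$ accessible. So we may consider $\dot{\xi}^i$'s of this form. Thus (9.24) becomes

$$A_i\,\dot{\eta}^i + \eta^j (\partial_j A_i) f^{ik} A_k = 0 \tag{9.25}$$

This is one equation for the $n$ components of the displacement $\eta^i$. We can choose the $n-1$ components of $\eta^i$ which are orthogonal to $A_i$ as we like and view this equation as determining the remaining component, the one along $A_i$. So we rewrite this equation as an equation for $A \cdot \eta = A_i \eta^i$ as follows.

$$\frac{d}{dt}(A \cdot \eta) = \dot{A}_i\,\eta^i + A_i\,\dot{\eta}^i = \eta^i \dot{\xi}^j (\partial_j A_i - \partial_i A_j) = -\eta^i (\partial_i A_j - \partial_j A_i) f^{jk} A_k \tag{9.26}$$

This can be rewritten as

$$\frac{d}{dt}(A \cdot \eta) - F\,(A \cdot \eta) = -\frac{1}{2}\, C_{ijk}\,\eta^i f^{jk} \tag{9.27}$$

where $F = \frac{1}{2} f^{ij} (\partial_i A_j - \partial_j A_i)$. The important point is that we can choose $f^{ij}$, along with a coordinate transformation if needed, such that $C_{ijk} f^{jk}$ has no component along $A_i$. For this, notice that

$$A^i C_{ijk} f^{jk} = 2 A^2 F + 2 A^i (\partial_i A_j - \partial_j A_i) f^{jk} A_k \tag{9.28}$$

where $A^i = \delta^{ij} A_j$ and $A^2 = A^i A_i$. There are $\frac{1}{2} n(n-1)$ components for $f^{ij}$, for which we have one equation if we set $A^i C_{ijk} f^{jk}$ to zero. We can always find a solution; in fact, there are many solutions. Making this choice, $C_{ijk} f^{jk}$ has no component along $A_i$, so the components of $\eta$ on the right hand side of (9.27) are orthogonal to $A_i$. As mentioned earlier, there is a lot of freedom in how these components of $\eta$ are chosen. Once they are chosen, we can integrate (9.27) to get $A \cdot \eta$, the component along $A_i$. Integrating (9.27), we get

$$A \cdot \eta(1) = -\frac{1}{2} \int_0^1 dt\, \exp\left( \int_t^1 F\,dt' \right) C_{ijk}\,\eta^i f^{jk} \tag{9.29}$$

We have chosen $\eta^i(0) = 0$. It is important that the right hand side of (9.27) does not involve $A \cdot \eta$ for us to be able to integrate like this. We choose all components of $\eta$ orthogonal to $A_i$ to be such that

$$\epsilon\,\eta^i(1) = \text{coordinates of } P' \text{ (relative to } P\text{) orthogonal to } A_i \tag{9.30}$$

We then choose $f^{jk}$, if needed by scaling it, such that $\epsilon\, A \cdot \eta(1)$ from (9.29) gives $A_i (x^i_{P'} - x^i_P)$, the component of the displacement along $A_i$. We have thus shown that we can always access $P'$ along a solution curve. The only case where the argument would fail is when $C_{ijk} = 0$. In this case, $A \cdot \eta(1)$ as calculated is zero and we have no guarantee of matching the component of the displacement of $P'$ along the direction of $A_i$. Thus if there are inaccessible points in the neighborhood of $P$, then we must have $C_{ijk} = 0$, which is the condition $A \wedge dA = 0$ in components. In this case, by the previous theorem, $A$ admits an integrating factor and we can write $A = T\,dS$ for some functions $T$ and $S$ in the neighborhood of $P$. This completes the proof of the Carathéodory theorem.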
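The role of closed loops in this argument can be illustrated with a small numerical sketch. It assumes the three-dimensional example $A = dz - y\,dx$ (chosen here, not taken from the text), for which $C_{ijk} \neq 0$. A curve whose projection is a circle in the $(x, y)$ plane is lifted by $dz = y\,dx$, so it is a solution curve at every step; the net change in $z$ after one loop is $\oint y\,dx = -\pi r^2$, so points displaced along the direction of $A$ are accessible.

```python
import math

def lift_around_circle(r, steps=20000):
    """Integrate dz = y dx once around the circle x = r cos t, y = r sin t.

    Each step satisfies A = dz - y dx = 0, so the lifted curve is a solution
    curve. Returns the net change in z when the (x, y) projection closes.
    """
    z = 0.0
    dt = 2 * math.pi / steps
    for k in range(steps):
        t_mid = (k + 0.5) * dt              # midpoint rule
        y = r * math.sin(t_mid)
        dx = -r * math.sin(t_mid) * dt      # dx = -r sin(t) dt
        z += y * dx
    return z

for r in (0.5, 1.0, 2.0):
    # Compare the numerical lift with the enclosed-area value -pi r^2.
    print(r, lift_around_circle(r), -math.pi * r * r)
```

Varying the radius (and the orientation of the loop) tunes the displacement in $z$, which is the freedom the proof exploits when scaling $f^{jk}$. For a one-form with $C_{ijk} = 0$ the same construction returns to the starting value.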

9.2 Carathéodory statement of the second law

The statement of the second law due to Carathéodory is:

Carathéodory Principle: In the neighborhood of any equilibrium state of a physical system with any number of thermodynamic coordinates, there exist states which are inaccessible by adiabatic processes.

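The adiabatic processes can be quite general, not necessarily quasi-static. It is easy to see that this leads immediately to the notion of absolute temperature and entropy. This has been discussed in a concise and elegant manner in Chandrasekhar’s book on stellar structure. We briefly repeat his argument for completeness. For simplicity, consider a gas characterized by pressure $p$, volume $V$, and (empirical) temperature $t$, only two of which are needed to specify the thermodynamic state, the third being given by an equation of state. Since these are the only variables, $dQ$ has an integrating factor and we may write

$$dQ = \tau\,d\sigma \tag{9.31}$$

where $\sigma$ and $\tau$ will be functions of the variables $p$, $V$, $t$. The power of Carathéodory’s formulation becomes clear when we consider two such systems brought into thermal contact and allowed to come to equilibrium. We then have a common temperature $t$, and the thermodynamic variables can now be taken as $V_1$, $V_2$, $t$ (or $t$ and one variable from each of $(p_1, V_1)$, $(p_2, V_2)$). We also have $dQ = dQ_1 + dQ_2$. The number of variables is now three; nevertheless, the Carathéodory principle tells us that we can write

$$\tau\,d\sigma = \tau_1\,d\sigma_1 + \tau_2\,d\sigma_2 \tag{9.32}$$

We now choose $t$, $\sigma_1$, $\sigma_2$ as the independent variables. Equation (9.32) then leads to

$$\frac{\partial \sigma}{\partial \sigma_1} = \frac{\tau_1}{\tau}, \qquad \frac{\partial \sigma}{\partial \sigma_2} = \frac{\tau_2}{\tau}, \qquad \frac{\partial \sigma}{\partial t} = 0 \tag{9.33}$$

The last of these equations tells us that $\sigma$ is only a function of $\sigma_1$ and $\sigma_2$, $\sigma = \sigma(\sigma_1, \sigma_2)$. Further, since $\sigma$ is a well-defined function of the various variables, derivatives on $\sigma$ commute and so

$$\frac{\partial}{\partial t} \frac{\partial \sigma}{\partial \sigma_1} - \frac{\partial}{\partial \sigma_1} \frac{\partial \sigma}{\partial t} = 0 \tag{9.34}$$

with a similar relation for derivatives with respect to $\sigma_2$ as well. Thus we have the result

$$\frac{\partial}{\partial t} \left( \frac{\tau_1}{\tau} \right) = 0 = \frac{\partial}{\partial t} \left( \frac{\tau_2}{\tau} \right) \tag{9.35}$$

Equivalently, we can write

$$\frac{1}{\tau_1} \frac{\partial \tau_1}{\partial t} = \frac{1}{\tau_2} \frac{\partial \tau_2}{\partial t} = \frac{1}{\tau} \frac{\partial \tau}{\partial t} \tag{9.36}$$

This shows that the combination $(1/\tau)(\partial \tau / \partial t)$ is independent of the system and is a universal function of the common variable $t$. Taking this function as $g(t)$ and integrating, we get

$$\tau = C\,\Sigma(\sigma_1, \sigma_2)\, \exp\left( \int g(t)\,dt \right), \qquad \tau_1 = C\,\Sigma_1(\sigma_1)\, \exp\left( \int g(t)\,dt \right), \qquad \tau_2 = C\,\Sigma_2(\sigma_2)\, \exp\left( \int g(t)\,dt \right) \tag{9.37}$$

The $\tau$’s are determined up to a function of the $\sigma$’s; we take this arbitrariness as $C\,\Sigma$, where $C$ is a constant and $\Sigma$ is a function of the $\sigma$’s involved. We can now define the absolute temperature as

$$T \equiv C \exp\left( \int g(t)\,dt \right) \tag{9.38}$$

Notice that, in the case under consideration, $T_1 = T_2 = T$, as expected for equilibrium. This gives $dQ_1 = T\,\Sigma_1\,d\sigma_1$, $dQ_2 = T\,\Sigma_2\,d\sigma_2$, and $dQ = T\,\Sigma\,d\sigma$. The relation $dQ = dQ_1 + dQ_2$ now reduces to

$$\Sigma\,d\sigma = \Sigma_1\,d\sigma_1 + \Sigma_2\,d\sigma_2 \tag{9.39}$$

In the two-dimensional space with coordinates $\sigma_1$, $\sigma_2$, the vector $(\Sigma_1, \Sigma_2)$ has vanishing curl, i.e., $\partial \Sigma_2 / \partial \sigma_1 - \partial \Sigma_1 / \partial \sigma_2 = 0$, since $\Sigma_1$ only depends on $\sigma_1$ and similarly for $\Sigma_2$. Thus (9.39) shows that $\Sigma\,d\sigma$ is a perfect differential. This means that there exists a function $S$ such that $\Sigma\,d\sigma = dS$; this also means that $\Sigma$ can depend on $\sigma_1$ and $\sigma_2$ only through the combination $\sigma(\sigma_1, \sigma_2)$. Thus finally we have

$$dQ = T\,dS \tag{9.40}$$

In this way, the Carathéodory principle leads to the definition of entropy $S$.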
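A simple numerical illustration of the statement that $1/T$ is an integrating factor for $dQ$ can be given for one mole of an ideal gas, with $dQ = C_V\,dT + (RT/V)\,dV$. The ideal gas, the value $C_V = \frac{3}{2}R$, the end states, and the two paths are assumptions made for this sketch and are not part of the argument above. The integral of $dQ$ between the same two states depends on the path, while the integral of $dQ/T$ does not.

```python
import math

R = 8.314       # gas constant in J/(mol K)
CV = 1.5 * R    # assumed heat capacity (monatomic ideal gas)

TA, VA = 300.0, 1.0   # illustrative initial state (T, V)
TB, VB = 400.0, 2.0   # illustrative final state (T, V)

def path1(s):
    """Heat at constant volume VA, then expand at constant temperature TB."""
    if s < 0.5:
        return (TA + (TB - TA) * 2 * s, VA)
    return (TB, VA + (VB - VA) * (2 * s - 1))

def path2(s):
    """Expand at constant temperature TA, then heat at constant volume VB."""
    if s < 0.5:
        return (TA, VA + (VB - VA) * 2 * s)
    return (TA + (TB - TA) * (2 * s - 1), VB)

def integrate(path, weight, steps=20000):
    """Sum weight(T) * dQ along path(s), 0 <= s <= 1, using midpoint values.

    steps is even so that the corner at s = 0.5 falls on a grid point.
    """
    total = 0.0
    for k in range(steps):
        T0, V0 = path(k / steps)
        T1, V1 = path((k + 1) / steps)
        Tm, Vm = 0.5 * (T0 + T1), 0.5 * (V0 + V1)
        dQ = CV * (T1 - T0) + (R * Tm / Vm) * (V1 - V0)
        total += weight(Tm) * dQ
    return total

Q1 = integrate(path1, lambda T: 1.0)
Q2 = integrate(path2, lambda T: 1.0)
S1 = integrate(path1, lambda T: 1.0 / T)
S2 = integrate(path2, lambda T: 1.0 / T)

print("heat along the two paths:   ", Q1, Q2)   # path dependent
print("entropy change along both:  ", S1, S2)   # path independent
```

For these paths the common value of the second integral is $C_V \ln(T_B/T_A) + R \ln(V_B/V_A)$, the familiar ideal-gas entropy change.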

One can also see how this leads to the principle of increase of entropy. For this, consider a system with $n$ thermodynamic variables. The entropy $S$ will be a function of these. We can alternatively choose $n-1$ of the given variables and the entropy $S$ to characterize states of the system. Now we ask the question: Given a state $A$, can we find a path which takes us via adiabatic processes to another state $C$? It is useful to visualize this in a diagram, with $S$ as one of the axes, as in Fig. 9.1. We show one of the other axes, but there could be many. To get to $C$, we can start from $A$ and go along a quasi-static reversible adiabatic to $B$ and then, via some nonquasi-static process such as stirring, mixing, etc., get to $C$, keeping the system in adiabatic isolation. This second process can be irreversible. The idea is that the first part does not change the entropy, but brings the other variables to their desired final values. Then we move to the required value of $S$ by some irreversible process. As shown, $S_C > S_B = S_A$. Suppose the second process can also decrease the entropy in some cases, so that we can go from $B$ to a state $C'$ of lower entropy by some similar process. Then we see that all states close to $B$ are accessible. Starting from any point, we can move along the surface of constant $S$ to get to the desired values of the variables other than $S$, and then jump to the required value of $S$ by the second process. This contradicts the Carathéodory principle. Thus, if we postulate this principle, then we have to conclude that in all irreversible processes in adiabatic isolation the entropy has to either decrease or increase; we cannot have it increase in some processes and decrease in some other processes. So $S$ should be either a nondecreasing quantity or a nonincreasing quantity. The choice of the sign of the absolute temperature, via the choice of the sign of the constant $C$ in (9.38), is related to which case we choose for entropy. The conventional choice, of course, is to take $T \ge 0$ and entropy to be nondecreasing. In other words