9. The Carathéodory principle
The formulation of the second law of thermodynamics used the concept of heat engines, at least indirectly. But the law is very general, and one could ask whether there is another formulation which does not invoke heat engines but still leads to the notion of absolute temperature and the principle that entropy cannot spontaneously decrease. Such a version of the second law is obtained in an axiomatization of thermodynamics due to C. Carathéodory.
9.1 Mathematical Preliminaries
We will start with a theorem on differential forms which is needed to formulate Carathéodory's version of the second law.
Before proving Carathéodory's theorem, we will need the following result.
Theorem 9.1.1 - Integrating factor theorem. Let $A = A_i\,dx^i$ denote a differential one-form. If $A \wedge dA = 0$, then at least locally, one can find an integrating factor for $A$; i.e., there exist functions $T$ and $\phi$ such that
$$A = T\,d\phi .$$
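The proof of this result is most easily done inductively in the dimension of the space. First we consider the two-dimensional case, so that $i = 1, 2$. In this case the condition $A \wedge dA = 0$ is vacuous, since there are no three-forms in two dimensions. Write $A = A_1\,dx^1 + A_2\,dx^2$. We make a coordinate transformation to $\lambda$, $\phi$, where
$$\frac{dx^1}{d\lambda} = -f A_2, \qquad \frac{dx^2}{d\lambda} = f A_1, \tag{9.1}$$
where $f$ is an arbitrary function which can be chosen in any convenient way. This equation shows that
$$A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} = 0 .$$
Equations (9.1) define a set of nonintersecting trajectories, $\lambda$ being the parameter along the trajectory. We choose $\phi$ as the coordinate on transverse sections of the flow generated by (9.1). Making the coordinate transformation from $x^1$, $x^2$ to $\lambda$, $\phi$, we can now write the one-form $A$ as
$$A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_i \frac{\partial x^i}{\partial \phi} \right) d\phi = T\,d\phi, \qquad T = A_i \frac{\partial x^i}{\partial \phi} .$$
This proves the theorem for two dimensions. In three dimensions, we have
$$A = A_1\,dx^1 + A_2\,dx^2 + A_3\,dx^3 . \tag{9.4}$$
The strategy is to start by determining $T$, $\phi$ for the $A_1$, $A_2$ subsystem. We choose the new coordinates as $\lambda$, $\phi$, $x^3$ and impose (9.1). Solving these, we will find $x^1$ and $x^2$ as functions of $\lambda$ and $x^3$. The trajectories will also depend on the starting points, which may be taken as points on the transverse section and hence labeled by $\phi$. Thus we get
$$x^1 = x^1(\lambda, \phi, x^3), \qquad x^2 = x^2(\lambda, \phi, x^3) .$$
The one-form $A$ in (9.4) now becomes
$$A = T\,d\phi + \tilde{A}_3\,dx^3, \qquad T = A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi}, \qquad \tilde{A}_3 = A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3}, \tag{9.6}$$
the coefficient of $d\lambda$ being zero because of (9.1). We now consider imposing the equation $A \wedge dA = 0$, which in the coordinates $\lambda$, $\phi$, $x^3$ reads
$$A_\lambda (\partial_\phi A_3 - \partial_3 A_\phi) + A_\phi (\partial_3 A_\lambda - \partial_\lambda A_3) + A_3 (\partial_\lambda A_\phi - \partial_\phi A_\lambda) = 0 .$$
Since $A_\lambda = 0$, $A_\phi = T$ and $A_3 = \tilde{A}_3$ from (9.6), this equation becomes
$$\tilde{A}_3\,\frac{\partial T}{\partial \lambda} - T\,\frac{\partial \tilde{A}_3}{\partial \lambda} = - T^2\,\frac{\partial}{\partial \lambda}\!\left( \frac{\tilde{A}_3}{T} \right) = 0 .$$
Since $T$ is not identically zero for us, we get $\tilde{A}_3 = T\,h(\phi, x^3)$, with $h$ independent of $\lambda$, and, going back to (9.6), we can write
$$A = T \left[ d\phi + h(\phi, x^3)\,dx^3 \right] .$$
The quantity in the square brackets is a one-form on the two-dimensional space defined by $\phi$, $x^3$. For this we can use the two-dimensional result and write it as $\tilde{T}\,d\tilde{\phi}$, so that
$$A = T\,\tilde{T}\,d\tilde{\phi} .$$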
This proves the theorem for the three-dimensional case.
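As a concrete illustration (our own example, not taken from the argument above), consider the one-form $A = y\,dx - x\,dy + (x^2+y^2)\,dz$ on three-dimensional space away from the $z$-axis. The worked check below verifies $A \wedge dA = 0$ and exhibits an integrating factor; the angle $\theta = \arctan(y/x)$ is an assumed local coordinate introduced only for this sketch.

```latex
\begin{align*}
A &= y\,dx - x\,dy + (x^2+y^2)\,dz, \\
dA &= -2\,dx\wedge dy + 2x\,dx\wedge dz + 2y\,dy\wedge dz, \\
A\wedge dA &= \bigl[\,2y^2 + 2x^2 - 2(x^2+y^2)\,\bigr]\,dx\wedge dy\wedge dz = 0, \\
d\theta &= \frac{x\,dy - y\,dx}{x^2+y^2}
  \quad\Longrightarrow\quad
  A = (x^2+y^2)\,\bigl(dz - d\theta\bigr) = T\,d\phi, \\
T &= x^2+y^2, \qquad \phi = z - \theta .
\end{align*}
```

Since $\theta$ is multivalued, $\phi$ is defined only locally here, which is consistent with the "at least locally" qualification in the theorem.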
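The extension to four dimensions follows a similar pattern. The solutions to (9.1) become
$$x^1 = x^1(\lambda, \phi, x^3, x^4), \qquad x^2 = x^2(\lambda, \phi, x^3, x^4),$$
so that we can bring $A$ to the form
$$A = T\,d\phi + \tilde{A}_3\,dx^3 + \tilde{A}_4\,dx^4, \qquad \tilde{A}_a = A_a + A_1 \frac{\partial x^1}{\partial x^a} + A_2 \frac{\partial x^2}{\partial x^a} \quad (a = 3, 4). \tag{9.13}$$
We now turn to imposing the condition $A \wedge dA = 0$. In local coordinates this becomes
$$A_\alpha (\partial_\mu A_\nu - \partial_\nu A_\mu) + A_\mu (\partial_\nu A_\alpha - \partial_\alpha A_\nu) + A_\nu (\partial_\alpha A_\mu - \partial_\mu A_\alpha) = 0 .$$
There are four independent conditions here corresponding to $(\alpha, \mu, \nu) = (\lambda, \phi, 3)$, $(\lambda, \phi, 4)$, $(\lambda, 3, 4)$, $(\phi, 3, 4)$. Using $A_\lambda = 0$ and $A_\phi = T$, these four equations become
$$\tilde{A}_3\,\partial_\lambda T - T\,\partial_\lambda \tilde{A}_3 = 0, \tag{9.15}$$
$$\tilde{A}_4\,\partial_\lambda T - T\,\partial_\lambda \tilde{A}_4 = 0, \tag{9.16}$$
$$\tilde{A}_4\,\partial_\lambda \tilde{A}_3 - \tilde{A}_3\,\partial_\lambda \tilde{A}_4 = 0, \tag{9.17}$$
$$T (\partial_3 \tilde{A}_4 - \partial_4 \tilde{A}_3) + \tilde{A}_3 (\partial_4 T - \partial_\phi \tilde{A}_4) + \tilde{A}_4 (\partial_\phi \tilde{A}_3 - \partial_3 T) = 0 . \tag{9.18}$$
Again, we introduce $h$ and $g$ by $\tilde{A}_3 = T\,h$, $\tilde{A}_4 = T\,g$. Then equations (9.15) and (9.16) become
$$\frac{\partial h}{\partial \lambda} = 0, \qquad \frac{\partial g}{\partial \lambda} = 0 .$$
Equation (9.17) is then identically satisfied. The last equation, namely, (9.18), simplifies to
$$\partial_3 g - \partial_4 h + g\,\partial_\phi h - h\,\partial_\phi g = 0 . \tag{9.20}$$
Using these results (9.13) becomes
$$A = T \left[ d\phi + h\,dx^3 + g\,dx^4 \right] .$$
The quantity in the square brackets is a one-form on the three-dimensional space of $\phi$, $x^3$, $x^4$ and we can use the previous result for an integrating factor for this. The condition for the existence of an integrating factor for $\tilde{A} = d\phi + h\,dx^3 + g\,dx^4$ is precisely (9.20). Thus if we have (9.20), we can write $\tilde{A}$ as $\tilde{T}\,d\tilde{\phi}$ for some functions $\tilde{T}$ and $\tilde{\phi}$, so that finally $A$ takes the form $A = T\,\tilde{T}\,d\tilde{\phi}$. Thus the theorem is proved for four dimensions. The procedure can be extended to higher dimensions recursively, establishing the theorem for all dimensions.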
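The component form of the condition $A \wedge dA = 0$ can be checked numerically in any dimension. The following is a minimal sketch, assuming central finite differences are accurate enough for the polynomial examples used; the helper names `partial` and `max_C` and the three sample one-forms are our own illustrative choices, not part of the text.

```python
from itertools import combinations

def partial(A, x, i, j, h=1e-5):
    """Central-difference estimate of dA_j/dx^i at the point x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (A(xp)[j] - A(xm)[j]) / (2 * h)

def max_C(A, x):
    """Largest |A_i F_jk + A_j F_ki + A_k F_ij| over all index triples,
    with F_ij = d_i A_j - d_j A_i. It vanishes when A ^ dA = 0 at x."""
    n = len(x)
    a = A(x)
    F = [[partial(A, x, i, j) - partial(A, x, j, i) for j in range(n)]
         for i in range(n)]
    worst = 0.0
    for i, j, k in combinations(range(n), 3):
        c = a[i] * F[j][k] + a[j] * F[k][i] + a[k] * F[i][j]
        worst = max(worst, abs(c))
    return worst

def integrable(x):
    """A = y dx - x dy + (x^2 + y^2) dz, which has an integrating factor."""
    X, Y, Z = x
    return [Y, -X, X * X + Y * Y]

def contact(x):
    """A = dz - y dx, the standard example with A ^ dA nonzero."""
    X, Y, Z = x
    return [-Y, 0.0, 1.0]

def integrable4(x):
    """Four-dimensional A = T dphi with T = 1 + w^2 and phi = x*y + z*w."""
    X, Y, Z, W = x
    T = 1.0 + W * W
    return [T * Y, T * X, T * W, T * Z]

if __name__ == "__main__":
    print(max_C(integrable, [0.3, -0.7, 1.1]))        # close to zero
    print(max_C(contact, [0.3, -0.7, 1.1]))           # close to one
    print(max_C(integrable4, [0.3, -0.7, 1.1, 0.5]))  # close to zero
```

A vanishing result at sampled points is only evidence for integrability, not a proof; a clearly nonzero value, as for the contact form, does rule out an integrating factor near that point.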
Now we turn to the basic theorem needed for the Carathéodory formulation. Consider an $n$-dimensional manifold $M$ with a one-form $A$ on it. A solution curve to $A$ is defined by $A = 0$ along the curve. Explicitly, the curve may be taken as given by a set of functions $x^i = \xi^i(\tau)$, where $\tau$ is the parameter along the curve and
$$A_i\,\frac{d\xi^i}{d\tau} = 0 .$$
In other words, the tangent vector to the curve is orthogonal to $A_i$. The curve therefore lies on an $(n-1)$-dimensional surface. Two points, say, $P$ and $P'$ on $M$ are said to be $A$-accessible if there is a solution curve which contains $P$ and $P'$. Carathéodory's theorem is the following:
Theorem 9.1.2 - Carathéodory's theorem.
If in the neighborhood of a point $P$ there are $A$-inaccessible points, then $A$ admits an integrating factor; i.e.,
$$A = T\,dS,$$
where $T$ and $S$ are well-defined functions in the neighborhood.
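The proof of the theorem involves a reductio ad absurdum argument which constructs paths connecting $P$ to any other point in the neighborhood. (This proof is due to H.A. Buchdahl, Proc. Camb. Phil. Soc. 76, 529 (1979).) For this, define
$$C_{ijk} = A_i (\partial_j A_k - \partial_k A_j) + A_j (\partial_k A_i - \partial_i A_k) + A_k (\partial_i A_j - \partial_j A_i) .$$
These are the components of $A \wedge dA$, so that $A \wedge dA = 0$ is equivalent to $C_{ijk} = 0$. Now consider a point $P'$ near $P$. We have a displacement vector $\epsilon \eta^i$ for the coordinates of $P'$ (from $P$). $\eta^i$ can in general have a component along $A_i$ and some components orthogonal to $A_i$. The idea is to solve for these from the equation $A = 0$. Let $\xi^i(\tau)$ be a path which begins and ends at $P$, i.e., $\xi^i(0) = \xi^i(1) = 0$, $0 \le \tau \le 1$, and which is orthogonal to $A_i$. Thus it is a solution curve. Any closed curve starting at $P$ and lying in the $(n-1)$-dimensional space orthogonal to $A_i$ can be chosen. Consider now a nearby path given by $x^i(\tau) = \xi^i(\tau) + \epsilon \eta^i(\tau)$. This will also be a solution curve if $A_i(\xi + \epsilon\eta)\,(\dot{\xi}^i + \epsilon \dot{\eta}^i) = 0$. Expanding to first order in $\epsilon$, this is equivalent to
$$A_i\,\dot{\eta}^i + \dot{\xi}^i\,(\partial_j A_i)\,\eta^j = 0, \tag{9.24}$$
where we also used $A_i \dot{\xi}^i = 0$. We may choose $\dot{\xi}^i$ to be of the form $\dot{\xi}^i = f^{ij} A_j$, where $f^{ij}$ is antisymmetric, to be consistent with $A_i \dot{\xi}^i = 0$. We can find quantities $f^{ij}$ such that this is true; in any case, it is sufficient to show one path which makes $P'$ accessible. So we may consider $\dot{\xi}^i$'s of this form. Thus (9.24) becomes
$$A_i\,\dot{\eta}^i + \eta^j (\partial_j A_i)\,f^{ik} A_k = 0 .$$
This is one equation for the $n$ components of the displacement $\eta^i$. We can choose the $n-1$ components of $\eta^i$ which are orthogonal to $A_i$ as we like and view this equation as determining the remaining component, the one along $A_i$. So we rewrite this equation as an equation for $A_i \eta^i$ as follows.
$$\frac{d}{d\tau}(A_i \eta^i) = \dot{A}_i\,\eta^i + A_i\,\dot{\eta}^i = \eta^i (\partial_j A_i) f^{jk} A_k - \eta^i (\partial_i A_j) f^{jk} A_k = - \eta^i f^{jk} (\partial_i A_j - \partial_j A_i) A_k .$$
This can be rewritten as
$$\frac{d}{d\tau}(A \cdot \eta) - F\,(A \cdot \eta) = - \frac{1}{2}\,C_{ijk}\,\eta^i f^{jk}, \tag{9.27}$$
where $A \cdot \eta = A_i \eta^i$ and $F = \frac{1}{2} f^{ij} (\partial_i A_j - \partial_j A_i)$. The important point is that we can choose $f^{ij}$, along with a coordinate transformation if needed, such that $C_{ijk} f^{jk}$ has no component along $A_i$. For this, notice that
$$A_i\,C_{ijk} f^{jk} = A^2 f^{jk} (\partial_j A_k - \partial_k A_j) + 2 A_i (\partial_i A_j - \partial_j A_i) f^{jk} A_k,$$
where $A^2 = A_i A_i$. There are $n(n-1)/2$ components for $f^{ij}$, for which we have one equation if we set $A_i C_{ijk} f^{jk}$ to zero. We can always find a solution; in fact, there are many solutions. Making this choice, $C_{ijk} f^{jk}$ has no component along $A_i$, so the components of $\eta$ on the right hand side of (9.27) are orthogonal to $A_i$. As mentioned earlier, there is a lot of freedom in how these components of $\eta$ are chosen. Once they are chosen, we can integrate (9.27) to get $A \cdot \eta$, the component along $A_i$. Integrating (9.27), we get
$$A \cdot \eta\,(1) = - \frac{1}{2} \int_0^1 d\tau\, \exp\!\left( \int_\tau^1 d\tau'\,F(\tau') \right) C_{ijk}\,\eta^i f^{jk} . \tag{9.30}$$
We have chosen $\eta^i(0) = 0$. It is important that the right hand side of (9.27) does not involve $A \cdot \eta$ for us to be able to integrate like this. We choose all components of $\eta$ orthogonal to $A_i$ to be such that
$$\epsilon\,\eta^i_\perp(1) = (x_{P'} - x_P)^i_\perp,$$
where $\perp$ denotes the part orthogonal to $A_i$. We then choose $f^{jk}$, if needed by scaling it, such that $\epsilon\,A \cdot \eta\,(1)$ in (9.30) gives the component of the displacement of $P'$ along $A_i$. We have thus shown that we can always access $P'$ along a solution curve. The only case where the argument would fail is when $C_{ijk} = 0$. In this case, $A \cdot \eta\,(1)$ as calculated is zero and we have no guarantee of matching the component of the displacement of $P'$ along the direction of $A_i$. Thus if there are inaccessible points in the neighborhood of $P$, then we must have $C_{ijk} = 0$, i.e., $A \wedge dA = 0$. In this case, by the previous theorem, $A$ admits an integrating factor and we can write $A = T\,dS$ for some functions $T$ and $S$ in the neighborhood of $P$. This completes the proof of the Carathéodory theorem.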
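The content of the theorem can be illustrated with a small numerical sketch. It is not the construction used in the proof, only an example under our own assumptions: we compare the one-form $dz - y\,dx$, which has no integrating factor, with the exact form $dz - x\,dx - y\,dy$, and follow solution curves whose projection onto the $(x, y)$ plane is a closed square. The function names below are hypothetical.

```python
def lift(path, dz_along_segment, z0=0.0):
    """Follow a polygonal path in the (x, y) plane while staying on a
    solution curve (A = 0), and return the final value of z."""
    z = z0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        z += dz_along_segment(x0, y0, x1, y1)
    return z

def dz_contact(x0, y0, x1, y1):
    """A = dz - y dx = 0 gives dz = y dx. On a straight segment y is linear
    in the path parameter, so the trapezoid rule is exact."""
    return 0.5 * (y0 + y1) * (x1 - x0)

def dz_exact(x0, y0, x1, y1):
    """A = dz - x dx - y dy = 0 gives z = (x^2 + y^2)/2 + constant."""
    return 0.5 * (x1 * x1 - x0 * x0) + 0.5 * (y1 * y1 - y0 * y0)

def square(side):
    """Counterclockwise square loop starting and ending at the origin."""
    return [(0.0, 0.0), (side, 0.0), (side, side), (0.0, side), (0.0, 0.0)]

if __name__ == "__main__":
    loop = square(0.5)
    # Non-integrable case: the loop ends at a different height, so points
    # directly above or below the starting point are accessible.
    print(lift(loop, dz_contact), lift(loop[::-1], dz_contact))
    # Integrable case: every closed loop returns to the starting height.
    print(lift(loop, dz_exact))
```

For the first form the net change in $z$ is minus the enclosed area for a counterclockwise loop, so by choosing the size and orientation of the loop any nearby height can be reached. For the exact form the solution curves are confined to the surface $z - (x^2+y^2)/2 = \text{constant}$, and points off this surface are inaccessible.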
9.2 Carathéodory statement of the second law
The statement of the second law due to Carathéodory is:
Carathéodory Principle: In the neighborhood of any equilibrium state of a physical system with any number of thermodynamic coordinates, there exist states which are inaccessible by adiabatic processes.
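The adiabatic processes can be quite general, not necessarily quasi-static. It is easy to see that this leads immediately to the notion of absolute temperature and entropy. This has been discussed in a concise and elegant manner in Chandrasekhar's book on stellar structure. We briefly repeat his argument for completeness. For simplicity, consider a gas characterized by pressure $P$ and volume $V$, and (empirical) temperature $t$, only two of which are adequate to specify the thermodynamic state, the third being given by an equation of state. Since these are the only variables, $dQ$ has an integrating factor and we may write
$$dQ = \tau\,d\sigma,$$
where $\tau$ and $\sigma$ will be functions of the variables $P$, $V$, $t$. The power of Carathéodory's formulation becomes clear when we consider two such systems brought into thermal contact and allowed to come to equilibrium. We then have a common temperature $t$ and the thermodynamic variables can now be taken as $V_1$, $V_2$, $t$ (or $t$ and one variable from each of $(P_1, V_1)$, $(P_2, V_2)$). We also have $dQ = dQ_1 + dQ_2$. The number of variables is now three; nevertheless, the Carathéodory principle tells us that we can write
$$\tau\,d\sigma = \tau_1\,d\sigma_1 + \tau_2\,d\sigma_2 . \tag{9.32}$$
We now choose $t$, $\sigma_1$, $\sigma_2$ as the independent variables. Equation (9.32) then leads to
$$\frac{\partial \sigma}{\partial \sigma_1} = \frac{\tau_1}{\tau}, \qquad \frac{\partial \sigma}{\partial \sigma_2} = \frac{\tau_2}{\tau}, \qquad \frac{\partial \sigma}{\partial t} = 0 .$$
The last of these equations tells us that $\sigma$ is only a function of $\sigma_1$ and $\sigma_2$, $\sigma = \sigma(\sigma_1, \sigma_2)$. Further, since $\sigma$ is a well-defined function of the various variables, derivatives on $\sigma$ commute and so
$$\frac{\partial}{\partial t} \frac{\partial \sigma}{\partial \sigma_1} - \frac{\partial}{\partial \sigma_1} \frac{\partial \sigma}{\partial t} = 0,$$
with a similar relation for derivatives with respect to $\sigma_2$ as well. Thus we have the result
$$\frac{\partial}{\partial t} \left( \frac{\tau_1}{\tau} \right) = 0, \qquad \frac{\partial}{\partial t} \left( \frac{\tau_2}{\tau} \right) = 0 .$$
Equivalently, we can write
$$\frac{1}{\tau_1} \frac{\partial \tau_1}{\partial t} = \frac{1}{\tau_2} \frac{\partial \tau_2}{\partial t} = \frac{1}{\tau} \frac{\partial \tau}{\partial t} .$$
This shows that the combination $(1/\tau)(\partial \tau / \partial t)$ is independent of the system and is a universal function of the common variable $t$. Taking this function as $g(t)$ and integrating, we get
$$\tau = \Sigma(\sigma_1, \sigma_2)\,C \exp\!\left( \int^t dt'\,g(t') \right), \qquad \tau_1 = \Sigma_1(\sigma_1)\,C \exp\!\left( \int^t dt'\,g(t') \right), \qquad \tau_2 = \Sigma_2(\sigma_2)\,C \exp\!\left( \int^t dt'\,g(t') \right) .$$
The $\tau$'s are determined up to a function of the $\sigma$'s; we take this arbitrariness as $C\,\Sigma$, where $C$ is a constant and $\Sigma$ is a function of the $\sigma$'s involved. We can now define the absolute temperature as
$$T = C \exp\!\left( \int^t dt'\,g(t') \right) . \tag{9.38}$$
Notice that, in the case under consideration, $T_1 = T_2 = T$, as expected for equilibrium. This gives $dQ_1 = T\,\Sigma_1\,d\sigma_1$, etc. The relation $dQ = dQ_1 + dQ_2$ now reduces to
$$\Sigma\,d\sigma = \Sigma_1\,d\sigma_1 + \Sigma_2\,d\sigma_2 . \tag{9.39}$$
In the two-dimensional space with coordinates $\sigma_1$, $\sigma_2$, the vector $(\Sigma_1, \Sigma_2)$ has vanishing curl, i.e., $\partial \Sigma_2 / \partial \sigma_1 - \partial \Sigma_1 / \partial \sigma_2 = 0$, since $\Sigma_1$ only depends on $\sigma_1$ and similarly for $\Sigma_2$. Thus (9.39) shows that $\Sigma\,d\sigma$ is a perfect differential. This means that there exists a function $S$ such that $\Sigma\,d\sigma = dS$; this also means that $\Sigma$ can depend on $\sigma_1$ and $\sigma_2$ only through the combination $\sigma(\sigma_1, \sigma_2)$. Thus finally we have
$$dQ = T\,dS .$$
In this way, the Carathéodory principle leads to the definition of entropy $S$.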
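It may help to see these steps in a familiar case. The following worked example is our own illustration, assuming an ideal gas with constant heat capacity $C_V$, equation of state $PV = nRt$ with the ideal gas temperature used as the empirical temperature $t$, and the first law in the form $dQ = C_V\,dt + P\,dV$; none of these specifics are part of the general argument.

```latex
\begin{align*}
dQ &= C_V\,dt + \frac{nRt}{V}\,dV
  && \text{(not exact: } \partial_V C_V = 0 \neq \partial_t (nRt/V) = nR/V \text{)} \\
\frac{dQ}{t} &= d\bigl( C_V \ln t + nR \ln V \bigr)
  && \Longrightarrow\quad \tau = t, \quad \sigma = C_V \ln t + nR \ln V \\
g(t) &= \frac{1}{\tau}\,\frac{\partial \tau}{\partial t} = \frac{1}{t}
  && \Longrightarrow\quad T = C \exp\!\left( \int^t \frac{dt'}{t'} \right) = C\,t \\
\tau &= \Sigma\,T
  && \Longrightarrow\quad \Sigma = \frac{1}{C}, \quad S = \frac{\sigma}{C}, \quad dQ = T\,dS
\end{align*}
```

With the choice $C = 1$ the absolute temperature coincides with the ideal gas temperature and $S = C_V \ln t + nR \ln V$ up to an additive constant.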
One can also see how this leads to the principle of increase of entropy. For this, consider a system with $n$ thermodynamic variables. The entropy will be a function of these. We can alternatively choose $n-1$ of the given variables and the entropy $S$ to characterize states of the system. Now we ask the question: Given a state $A$, can we find a path which takes us via adiabatic processes to another state $C$? It is useful to visualize this in a diagram, with $S$ as one of the axes, as in Fig. 9.1. We show one of the other axes, but there could be many. To get to $C$, we can start from $A$ and go along a quasi-static reversible adiabatic to $B$ and then, via some nonquasi-static process such as stirring, mixing, etc., get to $C$, keeping the system in adiabatic isolation. This second process can be irreversible. The idea is that the first part does not change the entropy, but brings the other variables to their desired final values. Then we move to the required value of $S$ by some irreversible process. As shown, $S_C > S_B = S_A$. Suppose the second process can also decrease the entropy in some cases, so that we can go from $B$ to $C'$ by some similar process. Then we see that all states close to $B$ are accessible. Starting from any point, we can move along the surface of constant $S$ to get to the desired value of the variables, except for $S$, and then jump to the required value of $S$ by the second process. This contradicts the Carathéodory principle. Thus, if we postulate this principle, then we have to conclude that in all irreversible processes in adiabatic isolation the entropy has to either decrease or increase; we cannot have it increase in some processes and decrease in some other processes. So $S$ should be either a nondecreasing quantity or a nonincreasing quantity. The choice of the sign of the absolute temperature, via the choice of the sign of the constant $C$ in (9.38), is related to which case we choose for entropy. The conventional choice, of course, is to take $T \ge 0$ and entropy to be nondecreasing. In other words,
$$\Delta S \ge 0 .$$
Thus effectively, we have obtained the version of the second law as given in Proposition 4 in chapter 3.
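A simple numerical check of $\Delta S \ge 0$ is the irreversible equalization of temperature between two bodies in adiabatic isolation. This is a sketch under our own assumptions, not an example from the text: both bodies have the same constant heat capacity `C`, and each body's entropy change is obtained by integrating $dS = C\,dT/T$ along a reversible path between its initial and final states. The function name is hypothetical.

```python
import math

def delta_S_equalize(C, T1, T2):
    """Total entropy change when two bodies with equal, constant heat
    capacity C, initially at absolute temperatures T1 and T2, reach a
    common temperature while isolated from the surroundings."""
    Tf = 0.5 * (T1 + T2)  # energy conservation with equal heat capacities
    # Integrate dS = C dT / T for each body from its initial temperature to Tf.
    return C * math.log(Tf / T1) + C * math.log(Tf / T2)

if __name__ == "__main__":
    print(delta_S_equalize(1.0, 300.0, 400.0))  # positive
    print(delta_S_equalize(1.0, 300.0, 300.0))  # zero: nothing happens
```

The result is $C \ln\bigl(T_f^2 / (T_1 T_2)\bigr)$, which is nonnegative because the arithmetic mean of $T_1$ and $T_2$ is never smaller than their geometric mean; it vanishes only when $T_1 = T_2$.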