Statistical mechanics

Statistical mechanics is the application of statistics, which includes mathematical tools for dealing with large populations, to the field of mechanics, which is concerned with the motion of particles or objects when subjected to a force.

It provides a framework for relating the microscopic properties of individual atoms and molecules to the macroscopic or bulk properties of materials that can be observed in everyday life, therefore explaining thermodynamics as a natural result of statistics and mechanics (classical and quantum) at the microscopic level. In particular, it can be used to calculate the thermodynamic properties of bulk materials from the spectroscopic data of individual molecules.

This ability to make macroscopic predictions based on microscopic properties is the main advantage of statistical mechanics over thermodynamics. Both theories are governed by the second law of thermodynamics through the medium of entropy. However, entropy in thermodynamics can only be known empirically, whereas in statistical mechanics it is a function of the distribution of the system over its microstates.

Fundamental postulate

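The fundamental postulate in statistical mechanics (also known as the equal a priori probability postulate) is the following: an isolated system in equilibrium is equally likely to be found in any of its accessible microstates.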

This postulate is a fundamental assumption in statistical mechanics: it states that a system does not have any preference for any of its available microstates. Given Ω microstates at a particular energy, the probability of finding the system in a particular microstate is p = 1/Ω.

This postulate is necessary because it allows one to conclude that for a system at equilibrium, the thermodynamic state (macrostate) which could result from the largest number of microstates is also the most probable macrostate of the system.
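
As a small illustration of why the macrostate with the most microstates is the most probable, consider n fair coins, all of whose configurations have the same energy. This toy system and the function name below are assumptions made here for illustration; they are not part of the original argument.

```python
from math import comb

def macrostate_probabilities(n):
    """Probability of each macrostate (number of heads k) for n fair coins.

    Each of the 2**n microstates gets the same weight 1/2**n, which is
    the equal a priori probability postulate applied to this toy system.
    """
    total = 2 ** n  # total number of microstates, Omega
    return [comb(n, k) / total for k in range(n + 1)]

probs = macrostate_probabilities(10)
# The most probable macrostate is the one reachable from the most microstates.
most_probable = max(range(len(probs)), key=probs.__getitem__)
print(most_probable, probs[most_probable])
```

The macrostate with equal numbers of heads and tails wins simply because more microstates map onto it, not because any single microstate is favoured.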

This allows for the definition of the information function (in the context of information theory):

I = \sum_i \rho_i \ln\rho_i = \langle \ln \rho \rangle

When all the ρ_i are equal, I is minimal, which reflects the fact that we have minimal information about the system. When our information is maximal, i.e. one ρ_i is equal to one and the rest are zero (we know what state the system is in), the function is maximal.

Up to a sign, this "information function" is the reduced entropic function of thermodynamics: I = -S/k.
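
The two limiting cases of the information function can be checked numerically. The sketch below assumes the usual convention 0 ln 0 = 0; the three example distributions are hypothetical.

```python
from math import log

def information(rho):
    """I = sum_i rho_i ln rho_i, using the convention 0 ln 0 = 0."""
    return sum(p * log(p) for p in rho if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]  # no knowledge of the state: I is minimal
skewed = [0.7, 0.1, 0.1, 0.1]       # partial knowledge
certain = [1.0, 0.0, 0.0, 0.0]      # state known exactly: I is maximal

print(information(uniform))  # equals -ln 4
print(information(skewed))
print(information(certain))  # equals 0
```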

Microcanonical ensemble

Since the second law of thermodynamics applies to isolated systems, this is the first case investigated. The microcanonical ensemble describes an isolated system.

The entropy of such a system can only increase, so that the maximum of its entropy corresponds to an equilibrium state for the system.

Because an isolated system keeps a constant energy, the total energy of the system does not fluctuate. Thus, the system can access only those of its microstates that correspond to a given value E of the energy. The internal energy of the system is then strictly equal to its energy.

Let us call Ω(E) the number of microstates corresponding to this value of the system's energy. The macroscopic state of maximal entropy for the system is the one in which all microstates are equally likely to occur during the system's fluctuations.

Canonical ensemble

Invoking the concept of the canonical ensemble, it is possible to derive the probability P_i that a macroscopic system in thermal equilibrium with its environment will be in a given microstate with energy E_i:

P_i = {\exp\left(-\beta E_i\right)\over{\sum_j^{j_{max}}\exp\left(-\beta E_j\right)}}
where \beta={1\over{kT}} and k is the Boltzmann constant.

The temperature T arises from the fact that the system is in thermal equilibrium with its environment. The probabilities of the various microstates must add to one, and the normalization factor in the denominator is the canonical partition function:

Z = \sum_j^{j_{max}} \exp\left(-\beta E_j\right)

where E_i is the energy of the i-th microstate of the system. The partition function is a measure of the number of states accessible to the system at a given temperature. See derivation of the partition function for a proof of Boltzmann's factor and the form of the partition function from first principles.

To sum up, the probability of finding a system at temperature T in a particular state with energy E_i is

P_i = \frac{\exp(-\beta E_i)}{Z}
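
A minimal sketch of this formula for a hypothetical three-level system is given below. The level energies and the choice of units with k = 1 are assumptions made for illustration only.

```python
from math import exp

def canonical_distribution(energies, beta):
    """Return (Z, [P_i]) for the given microstate energies at inverse temperature beta."""
    weights = [exp(-beta * e) for e in energies]  # Boltzmann factors
    z = sum(weights)                              # canonical partition function
    return z, [w / z for w in weights]

# Hypothetical three-level system, energies in units where k = 1.
levels = [0.0, 1.0, 2.0]
z_cold, p_cold = canonical_distribution(levels, beta=5.0)   # low temperature
z_hot, p_hot = canonical_distribution(levels, beta=0.01)    # high temperature

print(z_cold, p_cold)  # nearly all weight sits on the ground state
print(z_hot, p_hot)    # the three states are nearly equally likely
```

At low temperature Z is close to 1 and at high temperature it approaches 3, which is the sense in which the partition function counts the states accessible to the system.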

Thermodynamic connection

The partition function can be used to find the expected (average) value of any microscopic property of the system, which can then be related to macroscopic variables. For instance, the expected value of the microscopic energy E is interpreted as the microscopic definition of the thermodynamic variable internal energy (U), and can be obtained by differentiating the partition function with respect to β (equivalently, with respect to the temperature). Indeed,

\langle E\rangle={\sum_i E_i e^{-\beta E_i}\over Z}=-{dZ\over d\beta}/Z

implies, together with the interpretation of <E> as U, the following microscopic definition of internal energy:

U\colon = -{d\ln Z\over d \beta}.

The entropy can be calculated by (see Shannon entropy)

{S\over k} = - \sum_i p_i \ln p_i = \sum_i {e^{-\beta E_i}\over Z}(\beta E_i+\ln Z) = \ln Z + \beta U

which implies that

-\frac{\ln(Z)}{\beta} = U - TS = F

is the Helmholtz free energy of the system, or in other words,

Z=e^{-\beta F}\,
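
The relations above can be checked numerically. The sketch below assumes a hypothetical three-level system in units with k = 1, so that T = 1/β, and compares the direct averages with a finite-difference derivative of ln Z.

```python
from math import exp, log

def ln_z(energies, beta):
    """Natural logarithm of the canonical partition function."""
    return log(sum(exp(-beta * e) for e in energies))

def thermodynamics(energies, beta):
    """Return (U, S/k, F) computed directly from the Boltzmann probabilities."""
    z = sum(exp(-beta * e) for e in energies)
    probs = [exp(-beta * e) / z for e in energies]
    u = sum(p * e for p, e in zip(probs, energies))   # U = <E>
    s_over_k = -sum(p * log(p) for p in probs)        # S/k = -sum p ln p
    f = -log(z) / beta                                # F = -ln Z / beta
    return u, s_over_k, f

levels = [0.0, 1.0, 3.0]  # hypothetical energies, units with k = 1
beta = 0.7
u, s_over_k, f = thermodynamics(levels, beta)

# Central finite difference for U = -d ln Z / d beta.
h = 1e-6
u_from_derivative = -(ln_z(levels, beta + h) - ln_z(levels, beta - h)) / (2 * h)

print(u, u_from_derivative)    # the two values of U should agree closely
print(f, u - s_over_k / beta)  # F = U - TS with T = 1/beta
```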

Having microscopic expressions for the basic thermodynamic potentials U (internal energy), S (entropy) and F (free energy) is sufficient to derive expressions for other thermodynamic quantities. The basic strategy is as follows. There may be an intensive or extensive quantity that enters explicitly in the expression for the microscopic energy E_i, for instance magnetic field (intensive) or volume (extensive). Then, the conjugate thermodynamic variables are derivatives of the internal energy. For instance, the macroscopic magnetization (extensive) is the derivative of U with respect to the (intensive) magnetic field, and the pressure (intensive) is the derivative of U with respect to volume (extensive).

The treatment in this section assumes no exchange of matter (i.e. fixed mass and fixed particle numbers). However, the volume of the system is variable which means the density is also variable.

This probability can be used to find the average value, which corresponds to the macroscopic value, of any property, J, that depends on the energetic state of the system by using the formula:

\langle J \rangle  = \sum_i p_i J_i = \sum_i J_i \frac{\exp(-\beta E_i)}{Z}

where < J > is the average value of property J. This equation can be applied to the internal energy, U:

U = \sum_i E_i \frac{\exp(-\beta E_i)}{Z}

Subsequently, these equations can be combined with known thermodynamic relationships between U and V to arrive at an expression for pressure in terms of only temperature, volume and the partition function. Similar relationships in terms of the partition function can be derived for other thermodynamic properties as shown in the following table.

Helmholtz free energy: F = - {\ln Z\over \beta}
Internal energy: U = -\left( \frac{\partial\ln Z}{\partial\beta} \right)_{N,V}
Pressure: P = -\left({\partial F\over \partial V}\right)_{N,T}= {1\over \beta} \left( \frac{\partial \ln Z}{\partial V} \right)_{N,T}
Entropy: S = k (\ln Z + \beta U)\,
Gibbs free energy: G = F+PV=-{\ln Z\over \beta} + {V\over \beta} \left( \frac{\partial \ln Z}{\partial V}\right)_{N,T}
Enthalpy: H = U + PV\,
Constant-volume heat capacity: C_V = \left( \frac{\partial U}{\partial T} \right)_{N,V}
Constant-pressure heat capacity: C_P = \left( \frac{\partial H}{\partial T} \right)_{N,P}
Chemical potential: \mu_i = -{1\over \beta} \left( \frac{\partial \ln Z}{\partial N_i} \right)_{T,V,N_{j\neq i}}

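The last entry needs clarification: we are not working with a grand canonical ensemble here. The particle numbers N_i are fixed parameters of the canonical ensemble, and the derivative is taken with respect to these parameters.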
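
As a worked illustration of the pressure entry in the table, assume that the partition function depends on the volume only through a factor V^N, as is the case for an ideal gas of N non-interacting particles. This form of Z, and the function g collecting all volume-independent terms, are assumptions introduced here. The ideal gas law then follows directly:

```latex
\ln Z = N \ln V + g(\beta, N)
\quad\Longrightarrow\quad
P = \frac{1}{\beta}\left(\frac{\partial \ln Z}{\partial V}\right)_{N,T}
  = \frac{N}{\beta V}
  = \frac{N k T}{V}
```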

It is often useful to consider the energy of a given molecule to be distributed among a number of modes. For example, translational energy refers to that portion of energy associated with the motion of the center of mass of the molecule. Configurational energy refers to that portion of energy associated with the various attractive and repulsive forces between molecules in a system. The other modes are all considered to be internal to each molecule. They include rotational, vibrational, electronic and nuclear modes. If we assume that each mode is independent (a questionable assumption) the total energy can be expressed as the sum of each of the components:

E = E_t + E_c + E_n + E_e + E_r + E_v\,

where the subscripts t, c, n, e, r, and v correspond to translational, configurational, nuclear, electronic, rotational and vibrational modes, respectively. The relationship in this equation can be substituted into the definition of the partition function to give:

Z = \sum_i \exp\left(-\beta(E_{ti} + E_{ci} + E_{ni} + E_{ei} + E_{ri} + E_{vi})\right)
= \sum_i \exp\left(-\beta E_{ti}\right) \exp\left(-\beta E_{ci}\right) \exp\left(-\beta E_{ni}\right) \exp\left(-\beta E_{ei}\right) \exp\left(-\beta E_{ri}\right) \exp\left(-\beta E_{vi}\right)

If we can assume all these modes are completely uncoupled and uncorrelated, so all these factors are in a probability sense completely independent, then

Z = Z_t Z_c Z_n Z_e Z_r Z_v\,
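
The factorization can be verified on a small example. The sketch below assumes two independent modes with hypothetical level schemes (units with k = 1), so that every combined microstate pairs one level of the first mode with one level of the second.

```python
from itertools import product
from math import exp

def partition_function(energies, beta):
    """Canonical partition function for a list of microstate energies."""
    return sum(exp(-beta * e) for e in energies)

beta = 1.3
# Hypothetical level schemes for two independent modes.
rotational = [0.0, 0.2, 0.6]
vibrational = [0.5, 1.5, 2.5, 3.5]

# Independence means the combined microstates are all pairs of levels,
# and the energy of a pair is the sum of the two mode energies.
combined = [e_r + e_v for e_r, e_v in product(rotational, vibrational)]

z_total = partition_function(combined, beta)
z_product = partition_function(rotational, beta) * partition_function(vibrational, beta)
print(z_total, z_product)  # the two numbers agree to rounding error
```

If the modes were coupled, the energy of a combined microstate would no longer be a simple sum and the product form would fail.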

Thus a partition function can be defined for each mode. Simple expressions have been derived relating each of the various modes to various measurable molecular properties, such as the characteristic rotational or vibrational frequencies.

Expressions for the various molecular partition functions are shown in the following table.

Nuclear: Z_n = 1 \qquad (T < 10^8 K)
Electronic: Z_e = W_0 \exp(D_e/kT) + W_1 \exp(-\theta_{e1}/T) + \cdots
Vibrational: Z_v = \prod_j \frac{\exp(-\theta_{vj} / 2T)}{1 - \exp(-\theta_{vj} / T)}
Rotational (linear): Z_r = \frac{T}{\sigma \theta_r}
Rotational (non-linear): Z_r = \frac{1}{\sigma}\sqrt{\frac{{\pi}T^3}{\theta_A \theta_B \theta_C}}
Translational: Z_t = \frac{(2 \pi mkT)^{3/2}}{h^3}
Configurational (ideal gas): Z_c = V\,
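
A minimal sketch of how two of these expressions are evaluated is given below. It uses a single vibrational mode and the high-temperature linear-rotor form Z_r = T/(σ θ_r). The characteristic temperatures are hypothetical values chosen only for illustration, not data for any particular molecule.

```python
from math import exp

def z_vibrational(theta_v, temperature):
    """One harmonic mode, with energies measured from the bottom of the well."""
    return exp(-theta_v / (2 * temperature)) / (1 - exp(-theta_v / temperature))

def z_rotational_linear(theta_r, temperature, sigma=1):
    """High-temperature form T / (sigma * theta_r), valid only for T >> theta_r."""
    return temperature / (sigma * theta_r)

# Hypothetical characteristic temperatures in kelvin.
THETA_V = 3000.0
THETA_R = 3.0
T = 300.0

print(z_vibrational(THETA_V, T))                 # much less than 1: mode is frozen out
print(z_rotational_linear(THETA_R, T, sigma=1))  # about 100 rotational states accessible
```

With these numbers the vibration is essentially in its ground state while many rotational levels are populated, which is the typical ordering of the two characteristic temperatures.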

These equations can be combined with those in the first table to determine the contribution of a particular energy mode to a thermodynamic property. For example, the "rotational pressure" could be determined in this manner. The total pressure could be found by summing the pressure contributions from all of the individual modes, i.e.:

P = P_t + P_c + P_n + P_e + P_r + P_v\,

Grand canonical ensemble

If the system under study is an open system (matter can be exchanged) and particle number is conserved, we have to introduce chemical potentials μ_j, j = 1, ..., n, and replace the canonical partition function with the grand canonical partition function:

\Xi(V,T,\mu) = \sum_i \exp\left(\beta \left[\sum_{j=1}^n \mu_j N_{ij}-E_i\right ]\right)

where N_{ij} is the number of particles of the j-th species in the i-th configuration. Sometimes, we also have other variables to add to the partition function, one corresponding to each conserved quantity. Most of them, however, can be safely interpreted as chemical potentials. In most condensed matter systems, things are nonrelativistic and mass is conserved. However, most condensed matter systems of interest also conserve particle number approximately (metastably) and the mass (nonrelativistically) is none other than the sum of the number of each type of particle times its mass. Mass is inversely related to density, which is the conjugate variable to pressure. For the rest of this article, we will ignore this complication and pretend chemical potentials don't matter. See grand canonical ensemble.

Let's rework everything using a grand canonical ensemble this time. The volume is left fixed and does not figure in at all in this treatment. As before, j is the index for those particles of species j and i is the index for microstate i:

U = \sum_i E_i \frac{\exp(-\beta (E_i-\sum_j \mu_j N_{ij}))}{\Xi}
N_j = \sum_i N_{ij} \frac{\exp(-\beta (E_i-\sum_k \mu_k N_{ik}))}{\Xi}
Grand potential: \Phi_G = - {\ln \Xi\over \beta}
Internal energy: U = -\left( \frac{\partial\ln \Xi}{\partial\beta} \right)_{\mu}+\sum_j{\mu_j\over\beta}\left({\partial \ln \Xi\over \partial \mu_j}\right )_{\beta}
Particle number: N_j={1\over\beta}\left({\partial \ln \Xi\over \partial \mu_j}\right)_\beta
Entropy: S = k (\ln \Xi + \beta U- \beta \sum_j \mu_j N_j)\,
Helmholtz free energy: F = \Phi_G+\sum_j \mu_j N_j=-{\ln \Xi\over \beta} +\sum_j{\mu_j\over \beta} \left( \frac{\partial \ln \Xi}{\partial \mu_j}\right)_{\beta}
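
The particle-number relation can be checked on the simplest open system. The sketch below assumes a single orbital of energy epsilon that holds either 0 or 1 particle of one species, so that Ξ = 1 + exp(β(μ - ε)). This model and the parameter values are assumptions made for illustration (units with k = 1).

```python
from math import exp, log

def ln_xi(beta, mu, epsilon):
    """ln of the grand partition function for one orbital holding 0 or 1 particle."""
    return log(1.0 + exp(beta * (mu - epsilon)))

def mean_occupation(beta, mu, epsilon):
    """Direct average of N over the two microstates (N = 0 and N = 1)."""
    w = exp(beta * (mu - epsilon))  # weight of the occupied microstate
    return w / (1.0 + w)

beta, mu, epsilon = 2.0, 0.3, 0.5  # hypothetical values
h = 1e-6

# N = (1/beta) d ln(Xi) / d mu, estimated by a central finite difference.
n_from_derivative = (ln_xi(beta, mu + h, epsilon) - ln_xi(beta, mu - h, epsilon)) / (2 * h * beta)

print(mean_occupation(beta, mu, epsilon), n_from_derivative)
```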

Equivalence between descriptions at the thermodynamic limit

All the above descriptions differ in the way they allow the given system to fluctuate between its configurations.

In the microcanonical ensemble, the system exchanges no energy with the outside world, and is therefore not subject to energy fluctuations, while in the canonical ensemble, the system is free to exchange energy with the outside in the form of heat.

In the thermodynamic limit, which is the limit of large systems, fluctuations become negligible, so that all these descriptions converge to the same description. In other words, the macroscopic behaviour of a system does not depend on the particular ensemble used for its description.
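
How quickly fluctuations become negligible can be illustrated with a simple model. The sketch below assumes a system of independent two-level units in the canonical ensemble (a hypothetical model, units with k = 1); because the units are independent, their mean energies and variances add.

```python
from math import exp, sqrt

def relative_energy_fluctuation(n_units, beta, gap=1.0):
    """sigma_E / <E> for n_units independent two-level units with level spacing gap."""
    p_excited = exp(-beta * gap) / (1.0 + exp(-beta * gap))
    mean_one = gap * p_excited                          # mean energy of one unit
    var_one = gap ** 2 * p_excited * (1.0 - p_excited)  # energy variance of one unit
    # Means and variances of independent units add.
    return sqrt(n_units * var_one) / (n_units * mean_one)

for n in (10**2, 10**4, 10**6):
    print(n, relative_energy_fluctuation(n, beta=1.0))
```

The relative fluctuation falls off as one over the square root of the system size, so for a macroscopic number of units the canonical energy is effectively as sharp as the microcanonical one.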

Given these considerations, the best ensemble to choose for the calculation of the properties of a macroscopic system is that ensemble which allows the result to be most easily derived.

Random walkers

The study of long-chain polymers has been a source of problems within the realm of statistical mechanics since about the 1950s. One of the reasons physicists were interested in their study, however, is that the equations governing the behaviour of a polymer chain are independent of the chain chemistry. What is more, the governing equation turns out to describe a random (diffusive) walk in space. Indeed, Schrödinger's equation is itself a diffusion equation in imaginary time, t' = it.

Random walks in time

The first example of a random walk is one that unfolds in time, whereby a particle undergoes random motion through space due to external forces from its surrounding medium. A typical example would be a pollen grain in a beaker of water. If one could somehow "dye" the path the pollen grain has taken, the path observed is defined as a random walk.

Consider a toy problem of a train moving along a 1D track in the x-direction. Suppose that the train moves a fixed distance b in either the +x or the -x direction, depending on whether a coin lands heads or tails when flipped. Let us start by considering the statistics of the steps S_i the toy train takes:

\langle S_{i} \rangle = 0 ; due to a priori equally likely probabilities
\langle S_{i} S_{j} \rangle = b^2 \delta_{ij}

The second quantity is known as the correlation function. The delta is the Kronecker delta, which tells us that if the indices i and j are different the result is 0, but if i = j the Kronecker delta is 1, so the correlation function returns a value of b^2. This makes sense, because if i = j then we are considering the same step. Rather trivially, then, it can be shown that the average displacement of the train along the x-axis is 0:

x = \sum_{i=1}^{N} S_{i}
\langle x \rangle = \langle \sum_{i=1}^{N} S_{i} \rangle
\langle x \rangle = \sum_{i=1}^{N} \langle S_{i} \rangle

As stated, \langle S_{i} \rangle is 0, so the sum is also 0. The same method can be used to calculate the root mean square displacement for this problem. The result of this calculation is given below:

x_{rms} = \sqrt {\langle x^2 \rangle} = b \sqrt N
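
Both results can be confirmed without simulation by summing over all outcomes of the coin flips. The sketch below is an exact enumeration using binomial weights; the function name is an assumption made for illustration.

```python
from math import comb, sqrt

def walk_statistics(n_steps, b=1.0):
    """Exact (<x>, x_rms) after n_steps fair coin flips, each moving the train by +b or -b."""
    total = 2 ** n_steps  # number of equally likely flip sequences
    mean_x = 0.0
    mean_x2 = 0.0
    for heads in range(n_steps + 1):
        x = b * (2 * heads - n_steps)        # heads move forward, tails backward
        prob = comb(n_steps, heads) / total  # fraction of sequences with this many heads
        mean_x += prob * x
        mean_x2 += prob * x * x
    return mean_x, sqrt(mean_x2)

for n in (4, 16, 100):
    print(n, walk_statistics(n))  # x_rms equals b * sqrt(n)
```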

From the diffusion equation it can be shown that the distance a diffusing particle moves in a medium is proportional to the square root of the time the system has been diffusing for, where the proportionality constant is the square root of the diffusion constant. The above relation, although cosmetically different, reveals similar physics, where N is simply the number of steps moved (and is loosely connected with time) and b is the characteristic step length. As a consequence we can consider diffusion as a random walk process.

Random walks in space

Random walks in space can be thought of as snapshots of the path taken by a random walker in time. One such example is the spatial configuration of long chain polymers.

There are two types of random walk in space: self-avoiding walks, where the links of the polymer chain interact and do not overlap in space, and pure random walks, where the links of the polymer chain are non-interacting and free to lie on top of one another. The former type is most applicable to physical systems, but its solutions are harder to obtain from first principles.

By considering a freely jointed, non-interacting polymer chain, the end-to-end vector is \mathbf{R} = \sum_{i=1}^{N} \mathbf r_i where \mathbf {r}_{i} is the vector position of the i-th link in the chain. As a result of the central limit theorem, if N >> 1 then we expect a Gaussian distribution for the end-to-end vector. We can also make statements about the statistics of the links themselves:
\langle \mathbf{r}_{i} \rangle = 0 ; by the isotropy of space
\langle \mathbf{r}_{i} \cdot \mathbf{r}_{j} \rangle = b^2 \delta_{ij} ; all the links in the chain are uncorrelated with one another
Using the statistics of the individual links, it is easily shown that \langle \mathbf R \rangle = 0 and \langle \mathbf R \cdot \mathbf R \rangle = Nb^2. Notice that this last result is the same as that found for random walks in time.
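
For completeness, the mean square end-to-end distance follows in one line from the link statistics. The step uses only the definition of \mathbf{R} and the link correlation function:

```latex
\langle \mathbf{R} \cdot \mathbf{R} \rangle
  = \left\langle \sum_{i=1}^{N} \mathbf{r}_i \cdot \sum_{j=1}^{N} \mathbf{r}_j \right\rangle
  = \sum_{i=1}^{N} \sum_{j=1}^{N} \langle \mathbf{r}_i \cdot \mathbf{r}_j \rangle
  = \sum_{i=1}^{N} \sum_{j=1}^{N} b^2 \delta_{ij}
  = N b^2
```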

Assuming, as stated, that the distribution of end-to-end vectors for a very large number of identical polymer chains is Gaussian, the probability distribution has the following form:

P = \frac{1}{\left (\frac{2 \pi N b^2}{3} \right )^{3/2}} \exp \left( \frac {-3 \mathbf R \cdot \mathbf R}{2Nb^2} \right)

What use is this to us? Recall that according to the principle of equally likely a priori probabilities, the number of microstates, Ω, at some physical value is directly proportional to the probability distribution at that physical value, viz;

\Omega \left ( \mathbf{R} \right ) = c P\left ( \mathbf{R} \right )

where c is an arbitrary proportionality constant. Given our distribution function, there is a maximum corresponding to \mathbf {R} = 0. Physically this amounts to there being more microstates which have an end-to-end vector of 0 than any other end-to-end vector. Now by considering

S \left ( \mathbf {R} \right ) = k_B \ln \Omega {\left ( \mathbf R \right) }
\Delta S \left( \mathbf {R} \right ) = S \left( \mathbf {R} \right ) - S \left (0 \right )
\Delta F = - T \Delta S \left ( \mathbf {R} \right )

where F is the Helmholtz free energy, it is trivial to show that

\Delta F = k_B T \frac {3R^2}{2Nb^2} = \frac {1}{2} K R^2 \quad ; K = \frac {3 k_B T}{Nb^2}

A Hookean spring!
This result is known as the entropic spring result and amounts to saying that upon stretching a polymer chain you are doing work on the system to drag it away from its (preferred) equilibrium state. An example of this is a common elastic band, composed of long-chain (rubber) polymers. By stretching the elastic band you are doing work on the system and the band behaves like a conventional spring. What is particularly astonishing about this result, however, is that the work done in stretching the polymer chain can be related entirely to the change in entropy of the system as a result of the stretching.
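
The step from the Gaussian distribution to the quadratic free energy can be spelled out as follows, writing R^2 for \mathbf{R} \cdot \mathbf{R}. The constant c and the normalization of P cancel in the entropy difference, and the free energy change is purely entropic, as in the relation \Delta F = -T \Delta S used above for the non-interacting chain.

```latex
\ln \Omega(\mathbf{R}) = \ln c + \ln P(\mathbf{R})
  = \ln c - \frac{3}{2}\ln\!\left(\frac{2\pi N b^2}{3}\right) - \frac{3 R^2}{2 N b^2}

\Delta S(\mathbf{R}) = k_B\left[\ln\Omega(\mathbf{R}) - \ln\Omega(0)\right]
  = -k_B \frac{3 R^2}{2 N b^2}

\Delta F = -T\,\Delta S(\mathbf{R}) = k_B T \frac{3 R^2}{2 N b^2}
  = \frac{1}{2} K R^2, \qquad K = \frac{3 k_B T}{N b^2}
```

The spring constant K grows with temperature, which is the signature of an entropic rather than an energetic restoring force.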