BOLTZMANN STATISTICS
Atomic particles in thermal equilibrium with their surroundings are
in a
state of rapid, but random, motion.
In a gas, each particle moves along a straight
line until it collides with another particle or a
solid object such as the walls of the
containing vessel.
After a collision the particle rebounds in a random direction
with random velocity until the next collision, and so
on. The situation is much
the same in a liquid except that the particles are
closer together and the
collisions are much more frequent. In a solid the heavier particles are more or
less constrained to vibrate rapidly within the close
neighborhood of a specific
location. The three components of the motion at any given instant are random in both speed and direction.
I will now present some background
material intended to stimulate my reader's intuition
about such matters.
The atoms in a solid tend to arrange themselves in orderly arrays,
called
crystal lattices, at least on a local scale.
This can be seen by taking a
photograph of X-rays scattered from almost any solid.
If we know the wavelength
of the X-rays, we can calculate the lattice spacings
from the observed angles of
diffraction. Lattice
spacings are usually on the order of several Angstroms, while
the atoms themselves are often a factor of 3 or 4
less in diameter. (1 Angstrom =
1E-10 meter). For
our purposes here, we may imagine that the lattice sites in a
solid are occupied by the heavy nuclei of the atoms
which make up the solid and
these are surrounded by a swarm of highly mobile
electrons closely confined, for
the most part, near the nuclei. The mass of a nuclear particle is roughly 1800
times the mass of an electron. In the case of electrical conductors, a fraction of
the electrons are more or less free to wander about
through the lattice, able to
transfer heat and electrical charge.
In the case of insulating solids at low
temperatures, almost all of the electrons are closely
confined to the nuclei.
These may aid in heat transfer by interactions with
neighboring electrons, but
they are not free to carry significant electric
charge. At elevated
temperatures,
however, most insulators become conductors to some
degree.
Although the atoms in a solid tend to be confined to specific
locations in
the lattice structure, even at room temperature they
may occasionally escape the
confines of their neighbors and wander to a similar
location elsewhere in the
lattice. At
temperatures approaching the melting point the individual atoms may
relocate themselves more or less rapidly while the
lattice structure remains
intact, but at the melting point even the structure
disintegrates.
In order to understand the process of diffusion in solids, it is
necessary to
understand the concepts of binding energy and random
thermal motion. We note
that the moon is bound to the earth by gravitational
attraction. This pair as well
as the other planets and their moons are also bound
to the sun. Even the comets
are bound to the sun, although they move at such high
speed that they may
spend years or centuries far from the sun or even the
outermost planets before
returning. When
Newton reasoned out the laws of motion in a gravitational field,
he realized that bodies of mass moving at or above a
certain critical velocity
could escape forever the gravitation field of the
earth or the sun. We learned
in
freshman physics that the critical escape velocity
for a mass in the earth's field of
gravity is roughly 7 miles/second. In an idealized two body model, a projectile
fired in an upward direction at a lower velocity
would eventually come to rest and
return to the earth, but if the velocity was greater
the projectile would continue
moving away indefinitely.
The earth's atmosphere, for example, is relatively free
of hydrogen while the moon is relatively free of all
gas particles. A significant
fraction of any hydrogen molecules, H2, in the
atmosphere acquire a velocity in
excess of that needed to escape the earth and the
expected lifetime of a free
hydrogen molecule in the atmosphere is short compared
to the age of the earth.
The escape velocity on the surface of the moon is
roughly 1.5 miles/second and
molecules lighter than Argon will have an expected
lifetime short compared to
the age of the moon.
When the kinetic theory of gases was being developed, the random
nature
of the motion of the individual particles was widely
appreciated before Ludwig
Boltzmann came up with a satisfactory mathematical
expression to describe the
situation. Before
putting forth a non-rigorous argument intended to stimulate my
reader's intuition, allow me to offer some other basic ideas, also intended as an aid to the intuition.
This is perhaps the time and place to meet my primary challenge
head on...
how to explain what is essentially a mathematical
argument to people, mostly
children, with little or no mathematical background.
There is no shortage of
wisdom to the effect that it cannot be done and
should not be tried, but I have
watched enough television to know what is expected of
me as a person of the
male persuasion, namely that I will blunder on ahead
heedless of the advice of
wiser heads. "Ah,
gentle dames, it gars me greet... etc.", in the immortal words of
Bobby Burns.
A great deal of modern mathematics is devoted to the concept of the
function. Roughly
speaking, a function is a recipe for finding one number, called
the dependent variable, given another number, called
the independent variable.
For example, my pocket calculator has some functions
like EXP, SIN, COS, TAN,
LOG, LN, SQRT, 1/x, etc.
I can enter the independent variable from the keyboard,
press the appropriate key, and the dependent variable
appears in the register. I
know a number of people who are quite adept at using
these and many similar
buttons and computer icons to solve their everyday
problems in business and
engineering without any significant grasp of the
underlying logic, although
someone somewhere must understand. I can hope that at least one or two of my
readers are sufficiently curious to take a few
college level math courses
whenever the opportunity may arise.
Others may do very well indeed by simply
forming a symbiotic relationship with a math nerd who
is apt to be impaired with
regard to the basic social skills. The mutual benefits could be substantial.
Variables may be thought of as symbols to which we can assign
numerical
values. To
illustrate the concept further, consider the simple function y = 2x^2.
x
and y are variables which can take on numerical
values within some range, or
domain, which we are relatively free to specify.
If we restrict the domain of x to
be the set of real numbers between -2 and 2, for
example, we may write this as -2
< x < 2. In
this case the range of y will be 0 <= y < 8. x is the independent
variable, which means that we must specify its value
first, and then, using the
functional relationship, find y, the dependent
variable. x is normally
thought of
as a continuous variable, which means that we may
choose the value as precisely
as we please. Furthermore,
we can say that between any two specific values of x,
however close to each other these values may be, we
can always find an
arbitrarily large number of values in between.
The real number set is infinitely
fine grained. Once
x is chosen, we may then square it, multiply the result by 2,
and find the value of y.
The range of y is continuous as well.
For each value of
x, there is one, and only one, value for y.
In this case, y is a single valued
function of x.
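For readers with a computer handy, these ideas are easy to try out. I will sketch them from here on in short Python routines... Python is simply a convenient choice on my part, and the particular numbers are only illustrative. Here is the function y = 2x^2:

    # A function is a recipe: given the independent variable x,
    # it produces the dependent variable y.
    def f(x):
        return 2 * x**2

    # Sample the domain -2 < x < 2 and confirm that every result lands
    # in 0 <= y < 8, with exactly one y for each x (single valued).
    for i in range(-19, 20):
        x = i / 10.0
        assert 0 <= f(x) < 8
    print(f(1.5))    # prints 4.5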
For any mathematical operation, we routinely insist on being able
to invoke
the inverse operation.
My pocket calculator has an INV button which I can use
whenever I wish to find the inverse of any of the
functions listed above. With
regard to the function y = 2x^2, we might prefer to
specify y first and find the
consequent value of x.
We may, following the laws of algebra, divide both sides
of this expression by 2 and then take the square root
of both sides to get a new
expression x = (+/-)SQRT(y/2) with y being the
independent variable and x being
the dependent variable.
Note that these two functions are not the same.
In
particular, x is not a single valued function of y.
For each y in the range 0 < y < 8 there are two values of x since, as we are told in freshman algebra, (-x)^2 = +x^2.
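A short continuation of the sketch, using the standard library square root, shows the inverse failing to be single valued:

    import math

    # The inverse of y = 2x^2: for each y > 0 there are two values of x.
    def inverse(y):
        root = math.sqrt(y / 2)
        return (+root, -root)

    print(inverse(4.5))    # prints (1.5, -1.5)... two values, not one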
I was also taught in high school that since (-1)^2 = +1 it followed
that SQRT(-1)
= +i or -i, where i is the unit imaginary, take it or
leave it. These concepts were
deeply troubling to me and I was unable to find
anyone to explain away the
difficulty I was having until my senior year in
college. I finally found the
satisfaction I was looking for in the concept of
closure where we require that all
operations, and their inverse, on numbers in a domain
yield other numbers in the
same domain. The
set of positive integers, for example, yields only positive
integers if all we are allowed to do is add and
multiply. If, however, we
wish to
divide positive integers by positive integers of all
sizes, we must invent rational
fractions if we insist on closure. If we wish to subtract as well as add, we must
invent negative numbers.
If we wish to take squares and square roots when the
independent domain includes fractions and negative
numbers, we must then
invent irrational
numbers and imaginary numbers. I
don't like the term,
imaginary, because i = +SQRT(-1) is no more
imaginary than -1 itself. I
prefer the
term, quadrature, since multiplication by i rotates a
complex number by +90
degrees counter-clockwise in the complex plane.
Finally, we must invent the
complex number set where each variable has both a
real and a quadrature
component, thus, x = a + ib is the complex form where
a and b are real numbers.
The beauty of the complex number set is that a wide variety of common functions, including the partial differential equations of physics and their solutions, are to be found in the complex plane, where closure is always assured. If, in the above example, we allow x to be a complex variable, x = a + ib, then y = 2x^2 = 2 * (a + ib)^2 = 2 * (a^2 - b^2) + i4ab is a complex variable as well.
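Python happens to carry complex numbers in the language itself (it writes i as j), so the algebra above can be checked directly; a minimal sketch with sample values of a and b of my own choosing:

    # Verify that 2 * (a + ib)^2 = 2 * (a^2 - b^2) + i4ab.
    a, b = 3.0, 2.0
    x = complex(a, b)                            # x = a + ib
    y = 2 * x**2
    expected = complex(2 * (a**2 - b**2), 4 * a * b)
    print(y, expected)                           # both print (10+24j)
    assert y == expected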
This one-to-one relationship is a primary reason why mathematical
abstractions are so useful in describing concrete
physical relationships. We
can
often find one-to-one relationships between
measurable physical quantities and
go on to find one-to-one mathematical relationships
to correspond. We can then
work with the mathematics and reliably predict what
will happen in the physical
situation. For
example, people have observed that the distance travelled by a
vehicle moving at a certain speed is proportional to
the time travelled and they
have expressed
this physical fact by the mathematical model D = S * t where D is
the distance, miles, S is the speed, miles/hour, and
t is the time, hours. We can
reason out, to a moral certainty, the relationship
between time, speed, and
distance without actually having to carry out the
exercise. This kind of
reasoning
has been applied to almost every observable
relationship in nature. From
careful
measurements of the positions of the planets over
many years Johannes Kepler
was able to set forth several functional
relationships which serve as major
milestones in modern scientific thinking.
From these relationships and his own
intuition, Newton was able to work out his underlying
theory of gravity from
which he could derive Kepler's Laws.
Newton published his PRINCIPIA in 1687
and roughly 100 years later a new planet, Uranus, was
discovered by accident in
1781. The
only problem was that the orbit of Uranus didn't quite obey the laws of
motion as established.
Going on the hope that Newton's Laws were indeed valid
after all and the discrepancies in the motion of
Uranus were due to the presence
of another planet, as yet undiscovered, astronomers
made a number of tedious
calculations trying to discover where to look for the
new planet. Neptune was
finally discovered in 1846 and, since the discovery
of Neptune did not entirely
explain the orbit of Uranus, Pluto was found in 1930.
These discoveries illustrate
an essential characteristic of scientific theories
and mathematical models...
although based on well established observations there
are almost always a few
fine points which, upon closer examination, lead to
new and unexpected
discoveries.
Having introduced the concepts of dependent and independent
variables,
we now turn to the concept of random variables.
We recall that the dependent
variable is found after we are given the independent
variable and the functional
relationship between the two variables.
A random variable is a number whose
value is determined as the result of an experiment,
either a physical experiment
or an experiment in the mind. Consider, for example, a roll of a single die.
The
number of spots showing on the top of the cube is a
random variable whose
domain is the set of integers... 1, 2, 3, 4, 5, and
6.
Much of the theory of probability is concerned with finding the
distribution
of random variables.
This is a device for relating random variables to continuous
variables, or at least piecewise continuous
variables. We seek a
continuous, or
piecewise continuous, function which represents the
probability that the
numerical outcome of an experiment is less than some
continuous variable, or
lies within some range of a continuous variable.
The notation, P{Nrv < Xcv} = F(X)
is often used. This
says the probability that N, a random variable outcome of an
experiment, is less than a continuous variable X, is
a continuous, or piecewise
continuous, function of X.
F(X) is sometimes referred to as the Cumulative
Distribution Function, CDF, of the random variable N. When N is a real number and the domain of X is the real number line, F(-∞) = 0 while F(+∞) = 1. There is no probability that the numerical outcome of any experiment will have a value less than -∞. The probability that the numerical outcome of any experiment is less than +∞ is certainty, or 1.
All CDF's have the value zero when X is arbitrarily large in the
negative direction.
All CDF's increase in value, and never decrease in value, as X
increases. All
CDF's -> 1 as X -> +∞. At
the risk of belaboring the point, consider
the CDF when N is the number of spots showing at the
roll of a single die. Since
we can think of no rational argument to the contrary,
we will assume that all
outcomes are equally likely, thus P{N=1} = P{N=2} = ... = P{N=6} = 1/6. Thus
for -∞ < X < 1, F(X) = 0
for 1 <= X < 2, F(X) = 1/6
for 2 <= X < 3, F(X) = 2/6 = 1/3
for 3 <= X < 4, F(X) = 3/6 = 1/2
for 4 <= X < 5, F(X) = 4/6 = 2/3
for 5 <= X < 6, F(X) = 5/6
for 6 <= X < +∞, F(X) = 1
The Cumulative Distribution Function of X, a continuous real variable, is continuous everywhere except at the six discrete values X = 1, 2, 3, 4, 5, and 6, where it jumps by 1/6.
The CDF is thus a six step staircase from 0 to 1 from
which we can find the value of P{Nrv < Xcv} = F(X)
for any real value of X.
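The staircase is simple enough to write out as a little routine; a sketch of the table above:

    import math

    # The CDF of the single-die experiment: F(X) = 0 below X = 1, rises
    # by 1/6 at each integer from 1 through 6, and stays at 1 thereafter.
    def F(X):
        steps = math.floor(X)              # whole stairs climbed so far
        return max(0, min(6, steps)) / 6

    print(F(0.5), F(3.7), F(100))          # prints 0.0 0.5 1.0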
Another common probability function regarding random variables is
the
Probability Density Function, PDF, which specifies
the probability that the
numerical outcome of an experiment lies within a
specified range of values. We
are interested in P{X < Nrv < X + dX} = f(X) *
dX. In other words, the
probability
that the random variable, Nrv, falls between X and X
+ dX, where dX is an
arbitrarily small increment, is given by f(X) * dX.
In the case of the number of
spots showing at the roll of a single die, this
function is zero except when the interval from X to X + dX includes one of the integers 1, 2, 3, 4, 5, or 6. The value of f(X) cannot be determined when X = 1, 2, 3, 4, 5, or 6, but the product, f(X) * dX = 1/6 in each case.
In the discussion of random variables so far, I have only
considered the
case of a particular discrete random variable.
Before the proliferation of digital
computers, most math and science students were
heavily schooled in analytic
functions which deal, for the most part, with
continuous complex numbers.
Discontinuous events, such as the instantaneous onset
of the flow of electric
current when a switch is closed, were a bit awkward
to describe even though a
number of completely satisfying devices had been
invented to deal with them.
Gases were known to consist of discrete particles,
roughly 3e19 of them per
cm^3 in the air we breathe, but slide rules were only
accurate to 2 or 3 decimal
places so for all practical purposes gases were a
continuous medium and could
be adequately described by analytic functions.
Modern computers are quite
capable of discrete function analysis to more
significant figures than are usually
needed, but to discard analytic functions and the
lore surrounding them would be
a great mistake... like forgetting how to make stone
tools when someone leaves
you a pile of iron strapping.
Purists today might recognize that the variables described by the
partial
differential equations of physics are really the
expected values of random
variables. The
expected value of a random variable is, roughly speaking, the
average value. In
the case of the number of spots showing after the roll of a
single die, the expected value is 3.5.
Some explanation seems in order since we
never expect to see 3.5 spots showing.
By way of sharpening your intuition,
consider the following carnival game:
a single die is rolled and the payoff is $1
for each spot showing.
What is a fair wager to play this game?
If the only
outcome which paid $1 was the showing of a single
spot, the player would win
only 1/6th of the time, on average, and the fair
price to play would be $1/6 = $.166.
If 2
spots paid $2, the fair price of that chance alone is $2/6 = $.333, and so
on.
The value of an expectation is the sum of the payoff
at a specific outcome times
the probability of that outcome taken over all
possibilities. The sum, 1/6 +
2/6 +
3/6 + 4/6 + 5/6 + 6/6 = 21/6 = 3.5.
The fair price to play the carnival game is $3.50...
a greater price favors the house, a lesser price
favors the player.
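The sum is quickly checked by machine; a sketch:

    # Expected value: the payoff at each outcome times the probability
    # of that outcome, summed over all outcomes.  One die, $1 per spot.
    fair_price = sum(spots * (1 / 6) for spots in range(1, 7))
    print(fair_price)    # prints 3.5... the fair wager is $3.50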
Before getting back to Boltzmann and the energy distribution of gas
particles, let us play a few more mind games with the
dice. Consider the number
of spots showing at the roll of 2 dice, a red one and
a green one. There is only
one way to get snake eyes or box cars... both dice
must come up 1 for snake
eyes or 6 for box cars.
There are 2 ways to get 3 spots showing... Nred=2 and
Ngreen=1 or vice versa.
There are 3 ways to have 4 spots showing, and so on.
We can enumerate each case and find that there are 36
ways all told for the two
dice to turn up.
If we propose a carnival game in which the payoff is the number
of spots showing, in dollars, we can use the
principle stated above to find the fair
price to play the game is $7. The expected number of spots showing on the roll
of 2 dice is 7.
The probability of snake eyes and the probability of boxcars is
1/36
in each case. The
reader is encouraged to calculate these results for himself as
well as the expected number of spots showing.
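For readers who would rather let the machine do the enumerating, a sketch:

    # List all 36 ways a red die and a green die can come up.
    outcomes = [(r, g) for r in range(1, 7) for g in range(1, 7)]
    print(len(outcomes))                          # prints 36

    # Expected number of spots showing on the pair.
    print(sum(r + g for r, g in outcomes) / 36)   # prints 7.0

    # Snake eyes, (1, 1), occurs just once in the 36 ways.
    print(outcomes.count((1, 1)) / 36)            # prints 0.0277... = 1/36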
In general, the
expected number of spots showing after the roll of M dice
is 3.5 * M. Consider
the roll of 10 dice. We
expect to see somewhere in the
neighborhood of 35 spots showing. Quite generally, if we can choose an
outcome in P ways and then independently choose a
second outcome in Q ways
and a third outcome in R ways, etc., we can make all
three choices in P * Q * R
ways, and so on.
Thus there are 6^10 ways all told that 10 dice could come up... 60,466,176 ways, in fact. It is certainly possible that all of the dice could show a 1 or a 6 or any other single number, but the
probabilities are extremely
small. The
expected number of spots is 35, but 33, 34, 36, & 37 are almost
equally likely.
An exact calculation of the number of ways 10 dice can show in
the vicinity of 35 spots is a tedious proposition,
but we can make a fairly good
estimate using the logic set forth below.
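With a computer, though, the exact count is no trouble at all. Here is a sketch that tallies the possible totals one die at a time (the estimate promised below still repays study):

    # ways[s] = number of ways the dice counted so far can total s spots.
    ways = {0: 1}                          # zero dice: one way to total zero
    for die in range(10):                  # fold in ten dice, one at a time
        next_ways = {}
        for total, count in ways.items():
            for face in range(1, 7):
                next_ways[total + face] = next_ways.get(total + face, 0) + count
        ways = next_ways

    print(sum(ways.values()))    # prints 60466176 = 6**10
    print(ways[35])              # prints 4395456, the ways to show exactly 35
    print(ways[35] / 6**10)      # about 0.0727, the chance of exactly 35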
The expected value of a random variable is the sum, taken over all
possible
outcomes, of the probability of each outcome times
the numerical value of that
outcome. We
also need to know something about the spread of values... how
widely dispersed about the expected value can we
expect the outcomes to vary.
The variance of a random variable is the measure we
seek. The variance is
defined as the sum, taken over all possible outcomes,
of the probability of each
outcome times the square of the difference between
each outcome and the
expected value.
In the case of a single die, the calculation is carried out as
follows:
Outcome   (EV - OC)^2 * P{OC} = Value
1         (3.5 - 1)^2 / 6 = 6.25/6
2         (3.5 - 2)^2 / 6 = 2.25/6
3         (3.5 - 3)^2 / 6 = 0.25/6
4         (3.5 - 4)^2 / 6 = 0.25/6
5         (3.5 - 5)^2 / 6 = 2.25/6
6         (3.5 - 6)^2 / 6 = 6.25/6
SUM = VARIANCE = 17.5/6 = 2.91667
The Standard Deviation may be a more familiar term to
some. This is simply the
square root of the variance, in this case the standard deviation, sigma = SQRT(2.91667) = 1.7078. This result doesn't do much for our intuition in this case
This result doesn't do much for our intuition in this case
except to indicate that the dispersion of the number
of spots showing as related
to the expected value is fairly broad for a single
die. As the number of dice is
increased, however, the dispersion narrows
considerably as we shall see.
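The variance table collapses to a few lines of code; a sketch:

    import math

    # Variance: the probability of each outcome times the squared distance
    # of that outcome from the expected value, summed over all outcomes.
    ev = 3.5
    variance = sum((1 / 6) * (spots - ev) ** 2 for spots in range(1, 7))
    print(variance)               # prints 2.9166... = 17.5/6
    print(math.sqrt(variance))    # prints 1.7078..., the standard deviation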
Let us revisit the Central Limit Theorem, mentioned earlier. The number of
spots showing at the toss of a number of dice is the
sum of the spots showing on
each individual die.
The Central Limit Theorem tells us that the distribution of a
random variable which is made up as the sum of random
variables tends to a
centralized distribution having a specific form
without regard, in many cases, to
the distributions of the component random variables.
The centralized
distribution is the Gaussian, or Normal, distribution
which is the familiar bell
shaped curve we study in statistics courses.
The expected value of the SUM is the sum of the expected values of its components, while the variance of the SUM is the sum of the variances of the components. The tendency toward the Normal form holds provided that the expected value and the variance of each component are small compared to the total. In other
words, the Central Limit Theorem breaks down if any
single component of the
SUM dominates the process.
In the case of a roll of 10 dice, the expected number of spots
showing is 10
* 3.5 = 35, while the variance is 10 * 17.5 = 175.
The standard deviation is SQRT(175) =
13.229. We
will not, therefore, be surprised to see as many as 35 + 13 = 48 spots
or as few as 35 - 13 = 22 spots, but outside of this
range, the probabilities get
progressively smaller, and rapidly at that.
We do not expect to see all 10 dice
show the same number of spots in our lifetime,
although it is clearly a possibility.
We can,
however, write a simple BASIC computer routine to roll the dice in
cyberspace and look for such an outcome, all the
while keeping track of the
number of rolls.
When we get around to looking at diffusion in rocks we will see
that some extremely rare events must happen in nature
often enough to result in
measurable concentration profile changes over
geological or archaeological
times.
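Here is the routine I have in mind, sketched in Python rather than BASIC:

    import random

    # Roll 10 dice over and over until all ten show the same face,
    # keeping track of the number of rolls.  The chance on any one roll
    # is 6/6**10, about 1 in 10 million, so expect a long wait...
    # a rare event made visible.
    rolls = 0
    while True:
        rolls += 1
        dice = [random.randint(1, 6) for _ in range(10)]
        if len(set(dice)) == 1:
            break
    print("all ten matched after", rolls, "rolls")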
I once shared an office with someone who spent a great deal of time
studying the history of science and I may have got
the impression from him that
some of the universally accepted theorems in
probability had never been properly
proved. Moreover,
certain major aspects of the Central Limit Theorem had only been proved through the heroic efforts of scientists working on wartime projects during WWII. The
name, Norbert Wiener, of CYBERNETICS fame rings a bell.
Boltzmann probably worried about the lack of rigor inherent in his treatment of the
statistics of gases, but that didn't stop him from
using his intuition and
speculating. Newton
had pretty well established that applying a force to a mass
for a period of time would result in a change of the
velocity of the mass. If we
could somehow introduce a gas particle at rest into
an aggregate of other gas
particles, it would experience a series of impacts at
random time intervals and in
random directions and thus come into thermal
equilibrium with its neighbors.
The final random velocity components in the x, y, and
z directions would result
from the sum of the random impacts in those
directions. From the state of
the
Central Limit Theorem in his day, Boltzmann reasoned
that the three components
of the velocity of a particle in thermal equilibrium
in a gas were, most likely,
distributed normally.
The expected value of each component of the
instantaneous velocity would be the sum of the
expected values of the individual
impulses while the variance of the instantaneous
velocity would be the sum of
the variances of the individual impulses.
The number of impulses is arbitrarily
large and we are in no position to estimate the
indicated sums. We can,
however,
take the bull by the horns and say that, since the
gas in a stationary closed
vessel is not going anywhere, the average, or
expected, value of the velocity
components is zero.
The variance presents a similar dilemma, but we can
suppose that the sum of the individual variances is
neither zero nor infinite.
Boltzmann suggested that the variance of each velocity component was simply kT/m, where m is the mass of the particle and k, known today as Boltzmann's Constant, is to be determined experimentally while T is the absolute temperature in degrees Kelvin.
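To put a number on this, here is a sketch with round illustrative values of my own choosing: the spread of one velocity component for nitrogen molecules, N2, at room temperature, taking the variance to be kT/m:

    import math

    k = 1.380649e-23    # Boltzmann's Constant, joules per kelvin
    T = 300.0           # room temperature, kelvins
    m = 4.65e-26        # mass of one N2 molecule, kilograms (approximate)

    # Standard deviation of a single velocity component.
    sigma = math.sqrt(k * T / m)
    print(sigma)        # roughly 300 meters per second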
Moving ahead using Boltzmann's assumption, we can calculate a number
of
the properties of a gas such as its pressure,
viscosity, specific heat, and thermal
conductivity and compare the results with experiment.
The value of Boltzman's
Constant was thus found in a number of independent
ways, all giving about the same
result. Continued
experimental refinement over the years has led us to believe that we
now know Boltzmann's Constant to within 32 parts per
million.
Although the x, y, and z velocity components are normally
distributed, the
random kinetic energy of a particle in any specific
direction is exponentially
distributed. The
probability that the kinetic energy exceeds some critical value, say
Qd electron volts, is just EXP(-Qd/kT) = e^(-Qd/kT). (I will describe the EXPonential
(I will describe the EXPonential
function in a bit more detail later when it comes
time to discuss Fick's Laws). At
room temperature, roughly 300 DgKelvin (26.8 DgC), kT is approximately 0.026 ev, where 1 ev (electron volt) is the energy acquired by an electron falling through a potential of 1 volt.
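A sketch of the arithmetic, with a threshold energy picked purely for illustration:

    import math

    k_eV = 8.617e-5    # Boltzmann's Constant in electron volts per kelvin
    T = 300.0          # room temperature, kelvins
    print(k_eV * T)    # prints 0.0258..., i.e. kT is about 0.026 ev

    # Probability weight that the thermal energy exceeds a threshold Q.
    Q = 0.5            # a hypothetical threshold energy, in ev
    print(math.exp(-Q / (k_eV * T)))    # about 4e-9... extraordinarily small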
No course in bonehead chemistry would be complete without a
discussion of
surface tension and the heat of vaporization.
The emphasis is apt to be on the
evaporation of water... the transformation of water
molecules on the surface of the
liquid to water vapor in the space adjacent to the
surface. The molecules within
a
liquid, as in a gas, are in rapid thermal motion and
the random energy distribution
also follows the Boltzmann Law. In a gas, the mean free path between collisions is
In a gas, the mean free path between collisions is
typically hundreds or thousands of molecular
diameters while in a liquid the
molecules are typically 1 to 3 molecular diameters
apart. In an ideal gas,
essentially
all of the energy is stored in the kinetics of
motion, while in a liquid a large fraction of
the total energy is stored in the attractive forces
between the molecules. Work
must
be done to separate the molecules from each other.
In the case of water, the escape
energy is roughly 0.456 ev.
When evaporation occurs, the escape energy comes from
the tail end of the random thermal energy
distribution. The threshold
energy for the
transition from liquid to vapor is called the heat of
vaporization. The threshold
energy
for the transition from solid to liquid is called the
heat of fusion. The threshold
energy
for the transition from one site of residence within
a solid to a similar site of residence
is called the heat of diffusion. There are other transitions from one state of matter to
another characterized by a specific threshold of
energy... adsorption, absorption, and
desorption, to name a few.
The probability that a particle in thermal equilibrium with
its surroundings has sufficient energy at any
specific time in excess of some
threshold energy, Q, is just EXP(-Q/kT), typically an
extraordinarily small number.
Transitions do occur, however, because the frequency
of attempts is very large, on
the order of 1e12 to 1e13 times per second.
The dwell time in such a situation is a
random variable whose expectation is the reciprocal
of the product of the probability
of escape at a trial and the frequency of trials... a
very small number times a very large
number. The
range of expected dwell times in our ordinary experience extends from
nanoseconds to the age of the universe, depending on
the temperature and the
binding energy.
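The reciprocal-product arithmetic, sketched with illustrative binding energies of my own choosing:

    import math

    k_eV = 8.617e-5    # Boltzmann's Constant, electron volts per kelvin
    f = 1e13           # attempt frequency, trials per second

    # Expected dwell time = 1 / (escape probability per trial * trial rate).
    def dwell_time(Q, T):
        return 1.0 / (f * math.exp(-Q / (k_eV * T)))

    print(dwell_time(0.5, 300.0))    # about 2.5e-5 seconds... microseconds
    print(dwell_time(2.0, 300.0))    # about 4e20 seconds... far longer than
                                     # the age of the universe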
Some old-time math teachers have told me that these shortcut
devices have cost
us a generation of mathematicians or at least most of
those who survived New Math.
My earlier story about the barrel hoops comes to
mind. Algebra was invented in
antiquity by an Arab olive oil merchant, one Al
Jebra, by way of keeping his financial
affairs in order, or so I have been told.
I will discuss irrational numbers further in an appendix.
My mother-in-law could not bring herself to get on an airplane.
She said that
she couldn't see what held it up. It didn't help her when I explained that
mathematics held it up.