Patil S. Elements of Modern Physics 2021
Elements of Modern Physics
S. H. Patil
Department of Physics
Indian Institute of Technology Bombay
Mumbai, India
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publishers, the authors, and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publishers nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publishers remain neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Dedicated
to
My Parents
Preface
This book has been thoroughly revised and updated as per the requirement of
the students. The book provides a perspective of the important concepts and
applications in contemporary physics.
With modern physics developing so rapidly, there is a constant need to
revise and update the presentation. The present book tries to do this. Starting
with a discussion of the special theory of relativity and quantum theory, it
describes their applications to atoms, molecules, solids and nuclei. There are
two special chapters on the modern description of elementary particles and
on the general theory of relativity and cosmology. The emphasis is on a logical
development of ideas, and historical aspects are referred to mainly as an aid
to this. An effort has been made to maintain rigour in analytical discussions and
precision in descriptions. It is hoped that the book will be useful to an advanced
undergraduate student, and as a review to a graduate student.
I am grateful to my colleagues, Dr. S.M. Bharati, Dr. S.M. Chitre,
Dr. P.P. Divakaran, Dr. Y.K. Gambhir, Dr. G.V. Dass, Dr. Dipan K. Ghosh, Dr.
K.S. Kulkarni, Dr. R.C. Mehrotra, Dr. C.H. Mehta, Dr. G. Mukhopadhyay, Dr.
R.S. Patil, Dr. G. Thyagarajan and Prof. Atul Mody, Dept. of Physics, VES,
College of Arts, Science and Commerce, Mumbai who ungrudgingly gave
me their valuable time in reading parts of the manuscript and made valuable
suggestions. I also thank Mr. Sunil Somalwar for going through a part of the
manuscript.
Mr. S.B. Modak not only provided accurate typing but also executed
the entire organization of the book with the help of Mr. D.S. Nakhawa,
Mr. Kashipathy and Mr. C.A. Sarmalkar. I owe them gratitude. I acknowledge
financial support from the curriculum development programme of IIT, Bombay.
S H Patil
Fundamental Constants
$c = 2.997925 \times 10^{8}$ m/s
$h = 6.6256 \times 10^{-34}$ J s
$m_e = 9.109 \times 10^{-31}$ kg $= 0.511$ MeV/$c^2$
$e = 1.60206 \times 10^{-19}$ C
$k = 1.38044 \times 10^{-23}$ J/K
$m_p = 938.211$ MeV/$c^2$
$m_n = 939.505$ MeV/$c^2$
$\varepsilon_0 = 8.85434 \times 10^{-12}$ F/m or C$^2$/(N m$^2$)
$\mu_0 = 4\pi \times 10^{-7}$ H/m or N/A$^2$
$N = 6.022 \times 10^{26}$ /kmol, the number of atoms in 12 kg of $^{12}$C
We begin our discussion of modern physics with the theory of relativity which
aims at relating the observations made by observers in relative motion with
respect to each other. Here only the restrictive case of the special theory of
relativity is analysed, in which the observers are moving with constant velocity
with respect to each other. This will help in choosing appropriate frames of
reference and in presenting the later topics in a unified manner. After a brief
consideration of the drawbacks of the classical theory, the main results of the
special theory of relativity are obtained, and applied to describe some specific
physical situations.
and the magnitude of $\mathbf{c} - \mathbf{v}$ is $(c^2 - v^2)^{1/2}$, so that the time taken for light to travel
from A to C and back is given by
$$t_2 = \frac{2 l_2}{(c^2 - v^2)^{1/2}} \tag{1.7}$$
Thus the difference in the two times is
$$\Delta = t_1 - t_2 = \frac{2 l_1 c}{c^2 - v^2} - \frac{2 l_2}{(c^2 - v^2)^{1/2}} \tag{1.8}$$
If the apparatus is turned through 90°, the roles of $l_1$ and $l_2$ are interchanged,
and the difference in the times becomes
$$\Delta' = t_1' - t_2' = \frac{2 l_1}{(c^2 - v^2)^{1/2}} - \frac{2 l_2 c}{c^2 - v^2} \tag{1.9}$$
The expected shift in the interference fringe at D is
$$\delta = \frac{c(\Delta' - \Delta)}{\lambda}
= \frac{2(l_1 + l_2)}{\lambda}\left[\frac{1}{(1 - v^2/c^2)^{1/2}} - \frac{1}{1 - v^2/c^2}\right]
\approx -\frac{(l_1 + l_2)}{\lambda}\frac{v^2}{c^2} \quad \text{for } v \ll c \tag{1.10}$$
In the experiment of Michelson and Morley, $l_1 + l_2$ was 22 m, and
$\lambda = 5.9 \times 10^{-7}$ m. The value of v is at least of the order v ≈ 30 km/s, corresponding
to the velocity of the earth’s motion around the sun, even if the motion of the
solar system around the galactic centre is ignored. For these values, the magnitude of the shift is
$$|\delta| \approx 0.37 \tag{1.11}$$
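The estimate in Eq. (1.11) can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative, and the inputs are the values quoted above ($l_1 + l_2$ = 22 m, λ = 5.9 × 10⁻⁷ m, v ≈ 30 km/s).

```python
# Expected fringe shift in the Michelson-Morley experiment, Eq. (1.10).
C = 2.997925e8  # speed of light in m/s (value from the table of constants)

def fringe_shift_exact(total_arm_length, wavelength, v, c=C):
    """Full expression of Eq. (1.10), before the small-v expansion."""
    b2 = (v / c) ** 2
    bracket = 1.0 / (1.0 - b2) ** 0.5 - 1.0 / (1.0 - b2)
    return 2.0 * total_arm_length / wavelength * bracket

def fringe_shift_approx(total_arm_length, wavelength, v, c=C):
    """Leading term of Eq. (1.10), valid for v << c."""
    return -total_arm_length / wavelength * (v / c) ** 2

exact = fringe_shift_exact(22.0, 5.9e-7, 3.0e4)
approx = fringe_shift_approx(22.0, 5.9e-7, 3.0e4)
print(abs(exact), abs(approx))
```

Both expressions give a shift of a fraction of a fringe, which was well within the sensitivity of the apparatus.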
Fig. 1.2 In order that starlight passes along a telescope moving with velocity v,
the telescope should be tilted at an angle of α = v/c.
transformations (1.2) be discarded. He found that space and time are related in
an intimate manner and should be treated on an equal basis. Their relation has a
far-reaching influence on the laws of physics. We begin the discussion of
Einstein’s results with a formal statement of the postulates of the special theory
of relativity.
1. The laws of nature are of the same form in all inertial frames of reference.
2. The speed of light is the same in all inertial frames of reference, and is
independent of the motion of the source.
It is implicit in the first postulate that, since the coordinates of the different
inertial frames are related, the laws of nature written in the various inertial
frames can be deduced from one another. It also follows that the Galilean
transformations (1.2) relating the coordinates of the inertial frames, cannot be
right since they would imply that the speed of light is different in different inertial
frames, in contradiction to the second postulate. Hence, a more general relation
between the coordinates must be obtained, which incorporates the information
that the speed of light is the same in all inertial frames.
Taking into account the possibility that time may not be a universal variable, the relation
$$t' = \gamma (t - \beta x) \tag{1.14}$$
is assumed for the transformation of the time coordinate.
Let an electromagnetic signal be emitted at t = 0, from the origin of F, which
also coincides with the origin of F′ at that time. Since the speed of light is the
same in all inertial frames, the wavefront is described in the two frames by the
equations
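$$x^2 + y^2 + z^2 = c^2 t^2 \tag{1.15}$$
and
$$x'^2 + y'^2 + z'^2 = c^2 t'^2 \tag{1.16}$$
respectively. Substituting Eqs. (1.12)-(1.14) in Eq. (1.16) gives
$$(\alpha^2 - c^2\gamma^2\beta^2)\, x^2 + y^2 + z^2 = (c^2\gamma^2 - v^2\alpha^2)\, t^2 + 2\,(v\alpha^2 - c^2\beta\gamma^2)\, x t \tag{1.17}$$
This relation is consistent with Eq. (1.15) provided
$$\begin{aligned}
\alpha^2 - c^2\gamma^2\beta^2 &= 1 \\
\gamma^2 - \frac{v^2}{c^2}\alpha^2 &= 1 \\
v\alpha^2 - c^2\beta\gamma^2 &= 0
\end{aligned} \tag{1.18}$$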
These equations are solved by first eliminating $\alpha^2$ by using the third
equation, and then eliminating γ. The final solutions are:
$$\beta = \frac{v}{c^2}, \qquad \alpha = \gamma = \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.19}$$
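As a quick consistency check, the solutions in Eq. (1.19) can be substituted back into the three conditions of Eq. (1.18), written here with the third condition as $v\alpha^2 - c^2\beta\gamma^2 = 0$. The Python sketch below is an illustration only; the function names and the choice of units with c = 1 are assumptions made for the example.

```python
import math

def lorentz_coefficients(v, c=1.0):
    """Return (alpha, gamma, beta) as given in Eq. (1.19)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma, gamma, v / c**2

def residuals(v, c=1.0):
    """Left-hand side minus right-hand side of the three conditions in Eq. (1.18)."""
    alpha, gamma, beta = lorentz_coefficients(v, c)
    return (
        alpha**2 - c**2 * gamma**2 * beta**2 - 1.0,
        gamma**2 - (v**2 / c**2) * alpha**2 - 1.0,
        v * alpha**2 - c**2 * beta * gamma**2,
    )

# All three residuals vanish (up to rounding) for any |v| < c.
print(residuals(0.6))
```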
$$t = \frac{t' + \dfrac{v}{c^2}\, x'}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}$$
From these equations, it is seen that frame F moves with velocity –v with
respect to F′ so that the relative velocities of the frames are reciprocal.
It should be noted that the deviations of the Lorentz transformations from
the Galilean transformations are second order in $v/c$ or $x/ct$, and hence the
experiments which can test Lorentz transformations must be accurate enough
to detect these second-order terms. The Michelson-Morley experiment did have
such an accuracy and could prove the inadequacy of Galilean transformations.
Lorentz transformations, though they differ only slightly from Galilean
transformations in most physical situations, bring in a profoundly new concept in
the kinematics of the universe. They remove the universal character of time
and treat it on the same footing as space coordinates. They require that physical
space be treated as a 4-dimensional space of space and time coordinates. As
might be expected, this mixing of space and time coordinates leads to some
unfamiliar consequences. A few of them are discussed here.
Also, since the relative velocity of the frames is v, $t_1 - t_0 = l/v$, which leads to
$$t_1' - t_0' = \frac{l}{v}\left(1 - \frac{v^2}{c^2}\right)^{1/2}$$
$$t_1' - t_0' = \frac{\dfrac{l}{v} - \delta}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.26}$$
$$\delta = \frac{l v}{c^2} \tag{1.27}$$
The time dilation of moving clocks can be made more physical by considering
a clock which consists of a beam of light bouncing back and forth between two
mirrors kept at a distance l apart along the y′-direction (Fig. 1.3). In frame F′,
each round trip takes a time
$$\Delta t' = \frac{2l}{c} \tag{1.28}$$
Viewed from frame F, the beam travels a longer distance along a line
making an angle θ with the y-axis, given by $\sin\theta = v/c$. Therefore the
corresponding time observed for each trip is
$$\Delta t = \frac{2l}{c\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.29}$$
or, using Eq. (1.28),
$$\Delta t = \frac{\Delta t'}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.30}$$
which tells us that ∆t′ ≤ ∆t . Since ∆t′ is the time indicated by the clock at rest
in F′, this implies that the moving clocks run at a slower rate. This phenomenon
is observed for unstable particles which are found to live for a longer time when
they are moving (see Example 2, Sec. 1.13).
[Figure panels: Frame F′, Frame F]
even explain the observation (1.32) from frame F as follows. The observer in
frame F′ will argue that the scale used by the observer in frame F is shorter by
a factor of $(1 - v^2/c^2)^{1/2}$, and therefore the length observed by F should be
$l_0 (1 - v^2/c^2)$. He will also argue from Eq. (1.21) that $t_2 - t_1 = 0$ corresponds to
$$t_2' - t_1' = -\frac{v}{c^2}(x_2' - x_1') = -\frac{v}{c^2}\, l_0 \tag{1.33}$$
Therefore, the leading end was measured at a time $v l_0/c^2$ earlier than the
trailing end, giving rise to an additional contribution to the measurement of the
length of an amount $v^2 l_0/c^2$. Including both of these corrections, the length of the
rod should be
$$l' = l_0\left(1 - \frac{v^2}{c^2}\right) + \frac{l_0 v^2}{c^2} = l_0 \tag{1.34}$$
in agreement with the measurement from frame F′!
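Let $\mathbf{u} = d\mathbf{r}/dt$ be the velocity of a particle in frame F and $\mathbf{u}' = d\mathbf{r}'/dt'$ be the
corresponding velocity in frame F′ which moves with velocity v. Lorentz
transformations (1.20) imply that
$$dx' = \frac{dx - v\, dt}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}, \qquad dy' = dy, \qquad dz' = dz \tag{1.35}$$
and
$$dt' = \frac{dt - \dfrac{v}{c^2}\, dx}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.36}$$
Dividing the position intervals by the time interval,
$$\begin{aligned}
u_x' &= \frac{u_x - v}{1 - \dfrac{v u_x}{c^2}} \\
u_y' &= \frac{u_y \left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}{1 - \dfrac{v u_x}{c^2}} \\
u_z' &= \frac{u_z \left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}{1 - \dfrac{v u_x}{c^2}}
\end{aligned} \tag{1.37}$$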
where the subscript describes the components of the velocities. These
expressions relate the velocities of a particle measured in different inertial frames.
$$u'^2 - c^2 = \frac{(c^2 - v^2)(u^2 - c^2)}{c^2\left(1 - \dfrac{v u_x}{c^2}\right)^2} \tag{1.38}$$
For v2 < c2, which is required for the Lorentz transformations to be physically
meaningful, the following important results are obtained:
u′ < c if u < c (1.39)
u′ = c if u = c (1.40)
u′ > c if u > c (1.41)
The first result implies that the relativistic addition of velocities with the
speed of each being less than c will again give a velocity with speed less than c.
The statement in Eq. (1.40) is the reappearance of the assumption that the
speed of light is the same in all inertial frames of reference. Eq. (1.41) has been
considered for particles with speeds greater than the speed of light, known as
tachyons. It is interesting to note that for tachyons, the equation uxv = c2, is
possible in which case u′ tends to infinity. Observation of tachyons would be of
great interest since it would imply that information can be transmitted at a speed
greater than the speed of light. However, so far, tachyons have not been observed
experimentally.
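The results (1.39) and (1.40) can be illustrated numerically with the velocity transformation of Eq. (1.37). The sketch below is a hedged example rather than part of the original text: the function names are hypothetical, and units with c = 1 are assumed.

```python
import math

def transform_velocity(u, v, c=1.0):
    """Velocity components in F' (moving with velocity v along x), Eq. (1.37)."""
    ux, uy, uz = u
    root = math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - v * ux / c**2
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)

def speed(u):
    """Magnitude of a three-component velocity."""
    return math.sqrt(sum(comp * comp for comp in u))

# Two sub-luminal velocities combine to a sub-luminal velocity, Eq. (1.39).
print(speed(transform_velocity((0.9, 0.0, 0.0), -0.9)))

# A light signal along y still has speed c in F', Eq. (1.40).
print(speed(transform_velocity((0.0, 1.0, 0.0), 0.5)))
```

Choosing v = −0.9 c for a particle with $u_x$ = 0.9 c corresponds to the familiar "addition" of two velocities of 0.9 c each.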
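$$\begin{aligned}
A_x' &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}\left(A_x - \frac{v}{c} A_0\right) \\
A_y' &= A_y \\
A_z' &= A_z \\
A_0' &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}\left(A_0 - \frac{v}{c} A_x\right)
\end{aligned} \tag{1.42}$$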
The scalar product between two Lorentz vectors, $A_\mu = (\mathbf{A}, A_0)$ and
$B_\mu = (\mathbf{B}, B_0)$, can be defined as
$$A \cdot B \equiv A_0 B_0 - \mathbf{A} \cdot \mathbf{B} \tag{1.43}$$
which can easily be shown to be equal to $A' \cdot B'$, and hence is called a Lorentz
scalar. In particular, the ‘length’ of a vector is defined as $(A \cdot A)^{1/2}$, which has the
same value in all inertial frames:
$$(A \cdot A)^{1/2} = (A_0^2 - \mathbf{A} \cdot \mathbf{A})^{1/2} \tag{1.44}$$
Here, unlike the three-dimensional case, the length of the vector may be
real or imaginary depending on whether $(A_0^2 - \mathbf{A} \cdot \mathbf{A}) \geq 0$ or $(A_0^2 - \mathbf{A} \cdot \mathbf{A}) < 0$,
respectively.
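The invariance of the scalar product (1.43) under the transformation (1.42) is easy to verify numerically. The following Python sketch is illustrative only; the tuple ordering $(A_x, A_y, A_z, A_0)$, the function names and the sample vectors are assumptions made for the example, with c = 1.

```python
import math

def boost(a, v, c=1.0):
    """Transform a four-vector (Ax, Ay, Az, A0) according to Eq. (1.42)."""
    ax, ay, az, a0 = a
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (g * (ax - v / c * a0), ay, az, g * (a0 - v / c * ax))

def scalar_product(a, b):
    """Lorentz scalar product A0*B0 - A.B of Eq. (1.43)."""
    return a[3] * b[3] - (a[0] * b[0] + a[1] * b[1] + a[2] * b[2])

a = (1.0, 2.0, 3.0, 4.0)   # arbitrary sample four-vectors
b = (0.5, -1.0, 2.0, 3.0)
print(scalar_product(a, b), scalar_product(boost(a, 0.6), boost(b, 0.6)))
```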
Two examples of four-vectors are given which are of particular
importance. Consider two events characterized by the vectors $x_\mu = (\mathbf{r}, ct)$,
$x_\mu^* = (\mathbf{r}^*, ct^*)$. The separation between these two events,
$$x_\mu^* - x_\mu = (\mathbf{r}^* - \mathbf{r},\; c\,(t^* - t)) \tag{1.45}$$
is a 4-vector, and the interval between the two events is defined to be Δτ where
$$(\Delta\tau)^2 = (t^* - t)^2 - \frac{1}{c^2}(\mathbf{r}^* - \mathbf{r}) \cdot (\mathbf{r}^* - \mathbf{r}) \tag{1.46}$$
This interval Δτ is a scalar invariant and is called the proper time interval
between the two events. The proper time interval is said to be timelike if
$(\Delta\tau)^2 > 0$, spacelike if $(\Delta\tau)^2 < 0$ and lightlike if $(\Delta\tau)^2 = 0$. Since $(\mathbf{r}^* - \mathbf{r},\; c\,(t^* - t))$
transforms as a 4-vector, it may be observed that for a timelike interval, there is
an inertial frame in which the events occur at the same place. This is the frame
which moves with velocity $\mathbf{v} = (\mathbf{r}^* - \mathbf{r})/(t^* - t)$ with respect to the given frame
($|\mathbf{v}| < c$ since $|\mathbf{r}^* - \mathbf{r}| < c\,|t^* - t|$). On the other hand, for a spacelike interval, the
events occur at the same time in the frame which moves with velocity
$\mathbf{v} = c^2 (\mathbf{r}^* - \mathbf{r})(t^* - t)/|\mathbf{r}^* - \mathbf{r}|^2$ with respect to the given frame ($|\mathbf{v}| < c$ since $|\mathbf{r}^* - \mathbf{r}| > c\,|t^* - t|$).
Finally, for a lightlike interval, a light pulse starting at $(\mathbf{r}, ct)$ would just
reach $(\mathbf{r}^*, ct^*)$.
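The classification of intervals based on Eq. (1.46) can be written as a short routine. This is a minimal sketch under stated assumptions: the function name and argument layout are hypothetical, and c = 1 is used by default.

```python
def classify_interval(r, t, r_star, t_star, c=1.0):
    """Classify the interval between events (r, t) and (r*, t*) using Eq. (1.46)."""
    dt = t_star - t
    dr2 = sum((b - a) ** 2 for a, b in zip(r, r_star))
    tau2 = dt**2 - dr2 / c**2  # (delta tau)^2
    if tau2 > 0:
        return "timelike"
    if tau2 < 0:
        return "spacelike"
    return "lightlike"

origin = (0.0, 0.0, 0.0)
print(classify_interval(origin, 0.0, (1.0, 0.0, 0.0), 2.0))
print(classify_interval(origin, 0.0, (2.0, 0.0, 0.0), 1.0))
print(classify_interval(origin, 0.0, (3.0, 4.0, 0.0), 5.0))
```

With floating-point input the lightlike case is only detected when the cancellation is exact, so in practice a small tolerance would be needed.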
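As a second example of a Lorentz four-vector, consider the set of operators
$\left(-\dfrac{\partial}{\partial x},\; -\dfrac{\partial}{\partial y},\; -\dfrac{\partial}{\partial z},\; \dfrac{1}{c}\dfrac{\partial}{\partial t}\right)$. The transformations for these operators can be
obtained from Eq. (1.21) by using the chain rule and are
$$\begin{aligned}
-\frac{\partial}{\partial x'} &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}\left[-\frac{\partial}{\partial x} - \frac{v}{c}\,\frac{1}{c}\frac{\partial}{\partial t}\right] \\
-\frac{\partial}{\partial y'} &= -\frac{\partial}{\partial y} \\
-\frac{\partial}{\partial z'} &= -\frac{\partial}{\partial z} \\
\frac{1}{c}\frac{\partial}{\partial t'} &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}\left[\frac{1}{c}\frac{\partial}{\partial t} - \frac{v}{c}\left(-\frac{\partial}{\partial x}\right)\right]
\end{aligned} \tag{1.47}$$
Comparing these with Eq. (1.42), it is seen that
$$\left(-\frac{\partial}{\partial x},\; -\frac{\partial}{\partial y},\; -\frac{\partial}{\partial z},\; \frac{1}{c}\frac{\partial}{\partial t}\right) \tag{1.48}$$
is a 4-vector operator, i.e. it is an operator which transforms like a four-vector.
The relations (1.47) further imply that the negative of the scalar product of the
operator with itself,
$$\Box \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \tag{1.49}$$
is a scalar operator. This operator, which is invariant under Lorentz
transformations, is called the d'Alembertian operator.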
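$$(\mathbf{p}, p_0) = m_0 \lim_{\Delta\tau \to 0}\left(\frac{\Delta \mathbf{r}}{\Delta\tau},\; \frac{c\,\Delta t}{\Delta\tau}\right) \tag{1.50}$$
$$\mathbf{p} = \frac{m_0 \mathbf{u}}{\left(1 - \dfrac{u^2}{c^2}\right)^{1/2}}, \qquad
p_0 = \frac{m_0 c}{\left(1 - \dfrac{u^2}{c^2}\right)^{1/2}} \tag{1.52}$$
$$\begin{aligned}
p_y' &= p_y; \quad p_z' = p_z && (b) \\
p_0' &= \frac{p_0 - \dfrac{v}{c}\, p_x}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} && (c)
\end{aligned} \tag{1.53}$$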
The vector $\mathbf{p}$ is the relativistic generalization of the Newtonian momentum
vector $m_0 \mathbf{u}$. For the interpretation of $p_0$, it is noted that for $u \ll c$,
$$c p_0 = m_0 c^2 + \frac{1}{2} m_0 u^2 + \frac{3}{8} m_0 \frac{u^4}{c^2} + \ldots \tag{1.54}$$
where the second term is the Newtonian kinetic energy. Therefore $c p_0$ may be
defined as the energy of the particle, $m_0 c^2$ being the rest energy and the remaining
terms being the relativistic generalization of the Newtonian kinetic energy. If T
denotes the kinetic energy, then
$$T = c p_0 - m_0 c^2 \tag{1.55}$$
The rest energy does not play a significant role if there is no change of mass
in a process, but becomes important if there is a change of mass. Finally, it is
noted that the scalar product $p \cdot p$ is
$$p_0^2 - \mathbf{p} \cdot \mathbf{p} = m_0^2 c^2 \tag{1.56}$$
an invariant scalar as expected.
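The invariant in Eq. (1.56) and the low-speed expansion in Eq. (1.54) can both be checked with a few lines of Python. The sketch below is only an illustration; the function names are hypothetical, motion along a single axis is assumed, and units with c = 1 and $m_0$ = 1 are used.

```python
import math

def four_momentum(m0, u, c=1.0):
    """Return (p, p0) of Eq. (1.52) for speed u along one axis."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return m0 * g * u, m0 * g * c

def kinetic_energy(m0, u, c=1.0):
    """T = c*p0 - m0*c^2, Eq. (1.55)."""
    _, p0 = four_momentum(m0, u, c)
    return c * p0 - m0 * c**2

p, p0 = four_momentum(1.0, 0.6)
print(p0**2 - p**2)  # equals m0^2 c^2, Eq. (1.56)

# For u << c the kinetic energy approaches the first terms of Eq. (1.54).
u = 0.01
print(kinetic_energy(1.0, u), 0.5 * u**2 + 0.375 * u**4)
```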
$$\frac{d\mathbf{p}}{dt} = \mathbf{f} \tag{1.57}$$
where $\mathbf{f}$ is the force on the particle. This equation is similar to Newton’s equation
of motion, with the important difference that the momentum is now given by the
relativistic expression (1.52). An equation for $dp_0/dt$ can be deduced using
Eq. (1.56) to give
$$\frac{dp_0}{dt} = \frac{\mathbf{p}}{p_0} \cdot \mathbf{f} = \frac{\mathbf{u}}{c} \cdot \mathbf{f} \tag{1.58}$$
On multiplying by c, this equation just relates the rate of change of energy to
the rate of work done. It should be appreciated that since t is not a scalar, the
transformation properties of $\mathbf{f}$ are rather involved. It is however straightforward
to obtain the transformation relations for $\mathbf{f}$ from those of $d\mathbf{p}/dt$, giving
$$\begin{aligned}
f_x' &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}\,\frac{p_0}{p_0'}\left[f_x - \frac{v}{c}\,\frac{\mathbf{p}}{p_0} \cdot \mathbf{f}\right] \\
f_y' &= \frac{p_0}{p_0'}\, f_y; \qquad f_z' = \frac{p_0}{p_0'}\, f_z
\end{aligned} \tag{1.59}$$
It also follows directly from Eqs. (1.57) and (1.58) that if $\mathbf{f} = 0$, both energy
and momentum of the particle are constants of motion.
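$$\nabla^2 \mathbf{A} - \frac{1}{c^2}\frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla\left(\nabla \cdot \mathbf{A} + \frac{1}{c^2}\frac{\partial \phi}{\partial t}\right) = -\mu_0 \mathbf{J}$$
It may be observed that Eqs. (1.62) and (1.63) do not determine $\mathbf{A}$ and φ
uniquely. A transformation
$$\mathbf{A} \to \mathbf{A} + \nabla \Lambda \tag{1.64}$$
$$\phi \to \phi - \frac{\partial \Lambda}{\partial t} \tag{1.65}$$
where Λ is a scalar function, does not alter $\mathbf{B}$ and $\mathbf{E}$. Therefore some subsidiary
conditions can be imposed on $\mathbf{A}$. This is done by requiring that
$$\nabla \cdot \mathbf{A} + \frac{1}{c^2}\frac{\partial \phi}{\partial t} = 0 \tag{1.66}$$
a condition known as the Lorentz condition. Then the equations for the potentials simplify to:
$$\begin{aligned}
\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi &= -\frac{\rho}{\varepsilon_0} \\
\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\mathbf{A} &= -\mu_0 \mathbf{J}
\end{aligned} \tag{1.67}$$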
$$A_\mu = \left(\mathbf{A},\; \frac{\phi}{c}\right) \tag{1.70}$$
also transforms as a Lorentz 4-vector. Hence, Maxwell’s equations are seen to
be consistent with the special theory of relativity.
To show the form invariance of Eq. (1.61) for the motion of a charged
particle, the expression for the derivative with respect to the proper time τ is
written using Eq. (1.51). Substituting expressions (1.62) and (1.63) for B and E
in the expression for the electromagnetic force, and simplifying, gives
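$$\frac{d\mathbf{p}}{d\tau} = -q\left[\frac{1}{m_0}\nabla\left(p_0 \frac{\phi}{c} - \mathbf{p} \cdot \mathbf{A}\right) + \frac{d\mathbf{A}}{d\tau}\right] \tag{1.71}$$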
The quantity inside the parentheses is a scalar product between the
4-vectors pµ and Aµ, while ∇ is the space part of the 4-vector operator (1.48).
Therefore, both the right hand side and the left hand side of Eq. (1.71)
transform as space components of 4-vectors. The corresponding equation for
p0, making use of Eqs. (1.51), (1.56) and (1.61), is
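$$\frac{dp_0}{d\tau} = -q\left[-\frac{1}{m_0}\frac{\partial}{c\,\partial t}\left(p_0 \frac{\phi}{c} - \mathbf{p} \cdot \mathbf{A}\right) + \frac{d}{d\tau}\left(\frac{\phi}{c}\right)\right] \tag{1.72}$$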
where the partial derivative applies only to φ and A; and not to p0 or p. It is now
clear that the left hand sides of Eqs. (1.71) and (1.72), as also the right hand
sides, transform as 4-vectors and hence the equation of motion of a charged
particle in the presence of electromagnetic fields, is form-invariant under Lorentz
transformations.
Since photons are supposed to have zero mass, one has from Eq. (1.56)
$$p = \frac{h\nu}{c} \tag{1.74}$$
Substituting these expressions in Eq. (1.53c) for the transformation of $p_0$,
an expression for the frequency of radiation observed from a moving frame is
obtained as:
$$\nu' = \nu\, \frac{1 - \dfrac{v}{c}\cos\alpha}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.75}$$
where $p_x = p\cos\alpha = \dfrac{h\nu}{c}\cos\alpha$, α being the angle between $\mathbf{p}$ and the
x-axis. This formula is the exact expression for the Doppler effect. Similarly, using
Eqs. (1.53a) and (1.53c), the ratio $p_x'/p_0'$ is obtained as
$$\cos\alpha' = \frac{\cos\alpha - \dfrac{v}{c}}{1 - \dfrac{v}{c}\cos\alpha} \tag{1.76}$$
This equation relates the directions of propagation in the two frames. In
particular, it gives us the relativistic aberration of starlight reaching us. Unlike
the classical Doppler effect, the relativistic Doppler shift for
radiation depends only on the relative velocity between the source and the
observer. To observe the relativistic correction, one may consider the
transverse Doppler shift for which cos α′ = 0, i.e. the observer is moving in a
direction orthogonal to the direction of propagation. In this case (cos α = v/c), the transverse
Doppler shift is
$$\nu' \approx \nu\left(1 - \frac{v^2}{2c^2}\right) \quad \text{for } v \ll c \tag{1.77}$$
in contrast to the classical result of ν′ = ν. The small second-order change in the
frequency, $|\Delta\nu|/\nu \approx 5.6 \times 10^{-16}$ for v ≈ 10 m/s, was observed (1960) by using the
Mössbauer effect.
In the Mössbauer effect, there is recoil-free emission and absorption of photons
by atoms embedded in crystals at low temperatures (low temperatures are
required so that energy is not carried away by the lattice vibrations). The
frequencies of the radiation are fairly well-defined except for the uncertainty due
to the natural lifetime τ of the excited atom. The radiation has a frequency
distribution
$$\rho(\nu) \sim \frac{1}{1 + \dfrac{4h^2(\nu - \nu_0)^2}{\Gamma^2}} \tag{1.78}$$
where $\nu_0$ is the central value, and Γ is the uncertainty in the energy related to
the lifetime of the excited state by
$$\tau\Gamma = \hbar \tag{1.79}$$
where $\hbar = h/2\pi$, $h = 6.63 \times 10^{-34}$ J s being Planck’s constant. In an experiment
performed by Hay et al. (1960), photons are emitted by excited $^{57}$Fe atoms
embedded in the crystal, which have energy centred around $h\nu_0$ = 14.4 keV and
a linewidth Γ = 4.7 × 10⁻⁹ eV. The emitter is placed at the centre of a centrifuge.
The photons are absorbed by $^{57}$Fe atoms in the ground state, embedded in a
crystal and kept at the edge of the centrifuge. When the centrifuge is not
rotating, the photons from the emitter are absorbed by the absorber, since the
photons have just the right energy for exciting the $^{57}$Fe atoms. However, once
the centrifuge starts rotating, the absorber sees the photons with shifted frequency
given by Eq. (1.77) (the speed is v = ωr, ω being the angular speed and r being
the radial distance of the absorber from the centre) and the rate of absorption
goes down. The experimental observations for the shifts agree with the shifts
given by Eq. (1.77) within experimental errors, thus confirming the predictions
of the special theory of relativity for the transverse Doppler shift.
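The size of the transverse Doppler shift quoted above for v ≈ 10 m/s follows directly from Eq. (1.75) with cos α = v/c. The following Python sketch is illustrative; the function names are hypothetical. Because the shift is of order 10⁻¹⁶, the code assumes that a direct subtraction would lose precision in double arithmetic and uses `math.expm1` and `math.log1p` instead.

```python
import math

C = 2.997925e8  # speed of light in m/s (value from the table of constants)

def doppler_ratio(beta, cos_alpha):
    """nu'/nu from Eq. (1.75); beta = v/c."""
    return (1.0 - beta * cos_alpha) / math.sqrt(1.0 - beta * beta)

def transverse_fractional_shift(v, c=C):
    """(nu' - nu)/nu for cos(alpha') = 0, i.e. sqrt(1 - v^2/c^2) - 1.

    Written as expm1(0.5 * log1p(-beta^2)) so that the tiny result is
    not swamped by rounding when v/c is very small.
    """
    beta2 = (v / c) ** 2
    return math.expm1(0.5 * math.log1p(-beta2))

print(doppler_ratio(0.6, 0.6))            # transverse case at v = 0.6 c
print(transverse_fractional_shift(10.0))  # v = 10 m/s
```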
1.13 EXAMPLES
A few examples are now discussed to illustrate some applications, and elaborate
the ideas that have been analysed.
Example 1
This example shows that Galilean transformations are not consistent with
Maxwell’s equations.
Consider an infinitely long, stationary line-charge and a positively charged
particle P with charge q, moving away from the line charge with velocity u. The
only force acting on P is the repulsive force due to the electric field, and it acts
in a direction perpendicular to the line charge. Now, an observer in a frame
moving parallel to the line charge with velocity v sees both electric and magnetic
fields. The electric field gives rise to a force, again perpendicular to the line
charge. However, the magnetic field $\mathbf{B}$, which is perpendicular to $\mathbf{u}$ and $\mathbf{v}$, gives
rise to a force $q(\mathbf{u} - \mathbf{v}) \times \mathbf{B}$ which has a component parallel to the line charge.
This contradicts the result of Galilean transformations that force is invariant.
Example 2
The time-dilation of a moving clock has a dramatic manifestation in terms of the
increased lifetime of a moving particle.
In nature, unstable particles are found, whose decay is described in quantum
mechanics as a transition from the initial state to the final state. The rate of
decay is determined by the transition probability λ, which is defined by
$$dN(t) = -\lambda N(t)\, dt \tag{1.80}$$
where N(t) is the number of particles at time t, and |dN(t)| is the number of
particles which decay in time dt. The number of particles remaining at time t is
obtained from Eq. (1.80) to be
$$N(t) = N(0)\, e^{-\lambda t} \tag{1.81}$$
The mean lifetime of the particle is then given by
$$\tau_0 = \int t\, \frac{|dN|}{N(0)} = \frac{1}{\lambda} \tag{1.82}$$
Now, if the unstable particles are moving, their dilated lifetime is given by
$$\tau = \frac{\tau_0}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.83}$$
where v is the speed of the particles, i.e., the particles live for a longer time.
The dilation of lifetime of moving particles has important implications in the
design of experiments. Consider, for example, the production of K⁺-mesons by
fast-moving protons colliding with a target. For a beam of K⁺-mesons of
momentum 3 GeV/c, corresponding to v ≈ 0.98645 c, the bubble chamber where
the K⁺-particles will interact with protons is kept at a distance of 100 m. Since
$\tau_0$ for K⁺-mesons is 1.23 × 10⁻⁸ s, the value of τ is 7.5 × 10⁻⁸ s, so that the
fraction of K⁺-mesons reaching the chamber at t = d/v is
$$\frac{N(t)}{N(0)} = 1.1 \times 10^{-2} \tag{1.84}$$
Without the time-dilation, the fraction would have been about 1.12 × 10⁻¹²,
so that with a typical pulse carrying about 10³ K⁺-mesons the experiment would
not have been feasible. Therefore, it is time-dilation which makes the experiment
feasible.
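The numbers in this example follow from Eqs. (1.81) and (1.83). The sketch below recomputes them in Python; it is an illustration only, with a hypothetical function name and the inputs quoted in the example (d = 100 m, v = 0.98645 c, $\tau_0$ = 1.23 × 10⁻⁸ s).

```python
import math

C = 2.997925e8  # speed of light in m/s (value from the table of constants)

def surviving_fraction(distance, beta, tau0, dilated=True, c=C):
    """N(t)/N(0) after a flight of the given distance at speed beta*c.

    With dilated=True the lifetime is stretched according to Eq. (1.83);
    with dilated=False the rest lifetime tau0 is used unchanged.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta) if dilated else 1.0
    t = distance / (beta * c)  # time of flight in the laboratory
    return math.exp(-t / (gamma * tau0))

with_dilation = surviving_fraction(100.0, 0.98645, 1.23e-8)
without_dilation = surviving_fraction(100.0, 0.98645, 1.23e-8, dilated=False)
print(with_dilation, without_dilation)
```

The ten orders of magnitude between the two results show how strongly the exponential decay law amplifies a dilation factor of about 6.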
Example 3
It should be emphasized that it is only the speed of light in vacuum that is invariant.
In particular, the speed of light in water is not invariant with respect to different
inertial observers, as was shown by Fizeau (1851).
Consider the passage of light through a tube of length l, containing water at
rest. The speed of light in water is c/n, where n is the refractive index of water.
If the water now flows with velocity v parallel to the direction of propagation of
light, the observed speed of light can be obtained from Eq. (1.37):
$$u = \frac{\dfrac{c}{n} + v}{1 + \dfrac{v}{cn}} \tag{1.85}$$
which is different from $c/n$. This causes a change in the time of passage,
$$\Delta t = \frac{l\,n}{c} - \frac{l\,n\left(1 + \dfrac{v}{cn}\right)}{c\left(1 + \dfrac{vn}{c}\right)} \approx \frac{l v}{c^2}\,(n^2 - 1) \tag{1.86}$$
and the corresponding phase shift is
$$\Delta\phi \approx 2\pi\, l v \nu_0\, (n^2 - 1)/c^2 \tag{1.87}$$
where $\nu_0$ is the frequency of the beam of light.
In the experiment of Fizeau, two parts of a beam traverse the water tube in
opposite directions (the set-up is similar to the Michelson-Morley experiment),
and interfere to produce a fringe shift
$$\Delta N \approx 2\, l v \nu_0\, (n^2 - 1)/c^2 \tag{1.88}$$
This result agrees with experimental observations. It is worth noting that for
the classical case, the denominator in Eq. (1.85) is 1 and the corresponding
fringe shift is given by Eq. (1.88) but with $n^2 - 1$ replaced by $n^2$.
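The approximation in Eq. (1.86) can be compared with the exact difference in transit times. The Python sketch below is a hedged illustration: the function names are hypothetical, and the tube length, refractive index and flow speed are illustrative values chosen for the example, not data from the text.

```python
C = 2.997925e8  # speed of light in m/s (value from the table of constants)

def speed_in_moving_water(n, v, c=C):
    """Observed speed of light in water flowing with velocity v, Eq. (1.85)."""
    return (c / n + v) / (1.0 + v / (c * n))

def delay_exact(l, n, v, c=C):
    """Transit time in still water minus transit time in flowing water."""
    return l * n / c - l / speed_in_moving_water(n, v, c)

def delay_approx(l, n, v, c=C):
    """Leading-order result of Eq. (1.86)."""
    return l * v * (n * n - 1.0) / c**2

l, n, v = 1.5, 1.33, 7.0  # illustrative values: metres, dimensionless, m/s
print(delay_exact(l, n, v), delay_approx(l, n, v))
print(l * v * n * n / c ** 2 if (c := C) else None)  # classical value, n^2 in place of n^2 - 1
```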
Example 4
Whenever energy is extracted from a reaction, chemical or nuclear, it is at the
expense of the rest-mass energy. For chemical reactions, the change in the rest
mass is so small that it is not possible to observe it directly. On the other hand,
for nuclear fusion or fission reactions, the change in the mass is quite substantial
and can be measured directly.
In the fusion of hydrogen nuclei into helium nuclei, which is the basic reaction
in stars, four protons combine either through the proton-proton cycle or the
carbon cycle (see Sec. 9.8) to yield a helium nucleus and two positrons.
Example 5
The transformation properties of the electric field due to a moving charge can in
principle be obtained from Eqs. (1.62) and (1.63), where $\left(\mathbf{A},\; \dfrac{\phi}{c}\right)$ transforms
as a four-vector. However, it is simpler to deduce them from the expression for
the Lorentz force in Eq. (1.61) and the transformation of force given in
Eq. (1.59).
Consider the field due to a charge q at the origin, moving with velocity u in
the x-direction. Then, the force on a unit charge at rest at P is
$$\mathbf{f} = \mathbf{E} \tag{1.91}$$
However, in a frame F′ which moves with velocity u, the charge is at rest
and therefore, the force on the unit charge is
$$\mathbf{f}' = \mathbf{E}' = \frac{q}{4\pi\varepsilon_0}\,\frac{\mathbf{r}'}{r'^3} \tag{1.92}$$
where r′ is the distance between P and the point charge. The two forces are
related by Eq. (1.59), so that
$$\begin{aligned}
E_x' &= E_x \\
E_y' &= \left(1 - \frac{u^2}{c^2}\right)^{1/2} E_y \\
E_z' &= \left(1 - \frac{u^2}{c^2}\right)^{1/2} E_z
\end{aligned} \tag{1.93}$$
Hence, from Eq. (1.92),
$$\begin{aligned}
E_x &= \frac{q}{4\pi\varepsilon_0}\,\frac{x'}{r'^3} \\
E_y &= \frac{q}{4\pi\varepsilon_0}\,\frac{y'}{r'^3\left(1 - \dfrac{u^2}{c^2}\right)^{1/2}} \\
E_z &= \frac{q}{4\pi\varepsilon_0}\,\frac{z'}{r'^3\left(1 - \dfrac{u^2}{c^2}\right)^{1/2}}
\end{aligned} \tag{1.94}$$
which, expressed in terms of the coordinates in frame F, gives
$$\mathbf{E} = \frac{q\,\mathbf{r}\left(1 - \dfrac{u^2}{c^2}\right)}{4\pi\varepsilon_0\, r^3\left(1 - \dfrac{u^2 \sin^2\theta}{c^2}\right)^{3/2}} \tag{1.95}$$
where θ is the angle which the line joining the point P and the charge makes
with the direction of the velocity of the charge. The field is seen to be weaker
for small angles and angles near π, and has the largest value for θ = π/2.
PROBLEMS
1. Show that two successive parallel Lorentz transformations are equivalent
to a single Lorentz transformation.
2. A rod of length l0 in its rest frame, moves with velocity v parallel to itself.
Obtain the Lorentz contraction of the rod by calculating the time taken by
the rod to pass a point (use time dilation) and then multiplying this
time by v.
3. Obtain an expression for time dilation considering a clock with light bouncing
back and forth along the direction of relative velocity, and using the concept
of Lorentz contraction.
4. What is the visually observed rate of a clock which moves with velocity v
along the line of vision?
5. The incoming primary cosmic rays (mostly protons) create µ-mesons in
the upper atmosphere. The lifetime of µ-mesons at rest is 2.15 × 10⁻⁶ s. If
the mean speed of the meson is 0.998 c, what fraction of the µ-mesons
created at a height of 20 km reach the sea level? What is the mean distance
travelled by the mesons before they decay?
6. A rod AB parallel to the x-axis, moves along the y-axis with velocity u.
Show that in a frame F′ which moves with velocity v along the x-direction,
this rod is inclined to the x′-axis at an angle
$$\tan^{-1}\left[\frac{uv}{c^2\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}\right].$$
7. A ρ-meson of mass 760 MeV/c2 decays at rest into two π-mesons of
mass 140 MeV/c2 each. What is the relative velocity of the π-mesons
with respect to each other?
8. Show that when force f is not parallel to velocity u, the acceleration is in
general not parallel to either the force or the velocity.
9. A particle of mass M decays at rest into a particle of mass m and a
photon. What is the energy of the photon emitted? Apply this to
(a) Σ⁺ (1189.4 MeV) → p (938.3 MeV) + γ, (b) H (2p) → H (1s) + γ, the
binding energy being 3.4 eV and 13.6 eV respectively.
10. A charged particle emits radiation when subjected to an external field.
This is known as bremsstrahlung. Show that energy-momentum
conservation does not allow a particle in isolation (no external forces) to
radiate. The argument is very simple in the centre of mass frame.
11. A proton gains an energy of 1 electron volt (eV) or 1.6 × 10⁻¹⁹ J when it
traverses a potential difference of 1 V. If the proton has a mass of
1.67 × 10⁻²⁷ kg, what is the velocity of the proton which starts from rest
and traverses a potential difference of 10⁹ V? What is the velocity of a
proton which comes out of the CERN super proton synchrotron with an
energy of 270 GeV (1 GeV is equal to 10⁹ eV)?
12. A star is observed at the zenith taken to be the z-direction. If the star is
moving in the x-direction, and its radiation shows a redshift of 0.003 Å for
the Hα Balmer line (λ = 6563Å), what is its velocity with respect to us?
If the star were moving towards us with the same speed, what would be
the observed wavelength for the Hα line?
13. A charged particle can move in a material medium with a velocity greater
than the velocity of light in that medium. It polarizes the nearby atoms
which then emit radiation known as Cerenkov radiation. The envelope of
the spherical waves is a cone with the vertex at the charged particle and
whose surface makes an angle θ with the direction of motion of the particle.
Show that sin θ = c/nv, where v is the speed of the particle and n is the
refractive index of the medium (v ≥ c/n).
2
Introduction to Quantum Ideas
Fig. 2.1 Intensity of radiation from a black-body, I (λ, T), in watts per square
centimetre per micron.
with $n_x$, $n_y$ and $n_z$ being positive integers (negative integers give the same mode).
The wave number and therefore the frequency ν = c | k | is obtained from these
relations as
$$\nu = \frac{c}{2l}\,(n_x^2 + n_y^2 + n_z^2)^{1/2} \tag{2.7}$$
Every possible set of positive integers (nx, ny, nz) gives a possible standing
wave, which may be depicted by a point in the 3-dimensional plot of (nx, ny, nz).
Since there is one such point per unit volume, the number of states is essentially
equal to the volume in this space (provided the volume is large). Therefore, the
number of stationary modes with frequency between 0 and ν (which corresponds
to the volume in the first octant with $n \leq 2l\nu/c$) is
$$N(\nu) = 2 \cdot \frac{1}{8} \cdot \frac{4\pi}{3}\left(\frac{2l\nu}{c}\right)^3 = \frac{8\pi l^3 \nu^3}{3c^3} \tag{2.8}$$
where a factor of 2 has been introduced to take into account the fact that for
each frequency v, there are two transverse modes of electromagnetic oscillations.
In the theory of statistical mechanics, the principle of equipartition states
that a mean energy of ½kT (k is the Boltzmann constant, which has a value of
1.38 × 10⁻²³ J K⁻¹) is associated with each degree of freedom. For example, for
an ideal gas with molecules treated as geometric points, the mean energy of
each molecule is (3/2)kT, corresponding to its translational motion in three
independent directions. However, for a molecule in oscillatory motion,
corresponding to each mode of translational motion there is a potential energy
term which also contributes a mean energy of ½kT. Now, if a mean energy of
kT is assigned to each mode of electromagnetic oscillation, then according to
the principle of equipartition, the energy per unit volume, between frequencies
v and v + dv, is given by
$$u(\nu)\,d\nu = (kT)\,\frac{dN(\nu)}{l^3} \tag{2.9}$$
so that the energy per unit volume, per unit frequency is
$$u(\nu) = \frac{8\pi \nu^2 kT}{c^3} \tag{2.10}$$
$$\varepsilon = \frac{\sum_{n=0}^{\infty} nh\nu \exp(-nh\nu/kT)}{\sum_{n=0}^{\infty} \exp(-nh\nu/kT)} = \frac{h\nu}{\exp(h\nu/kT) - 1} \tag{2.11}$$
Using this expression in Eq. (2.10) in place of kT, the following expression
is obtained
$$u(\nu) = \frac{8\pi h \nu^3}{c^3}\,\frac{1}{\exp(h\nu/kT) - 1} \tag{2.12}$$
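The relation between Eq. (2.12) and the classical result in Eq. (2.10) can be checked numerically. The following is a minimal sketch, assuming SI values for the constants and an illustrative temperature of 300 K; the function names are chosen here for illustration and do not come from the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K

def rayleigh_jeans(nu, T):
    """Classical energy density per unit frequency, Eq. (2.10)."""
    return 8 * math.pi * nu**2 * K * T / C**3

def planck(nu, T):
    """Planck energy density per unit frequency, Eq. (2.12)."""
    return 8 * math.pi * H * nu**3 / C**3 / math.expm1(H * nu / (K * T))

T = 300.0  # illustrative temperature in kelvin (an assumption)
for nu in (1e10, 1e12, 1e13, 1e14):
    ratio = planck(nu, T) / rayleigh_jeans(nu, T)
    print(f"nu = {nu:.0e} Hz   Planck / classical = {ratio:.3e}")
```

For hv << kT the ratio approaches unity, while for hv >> kT the Planck density is exponentially suppressed, which removes the divergence of the classical expression at high frequencies.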
$$\sigma = \frac{2\pi^5 k^4}{15 h^3 c^2} \tag{2.15}$$
whose numerical value is in good agreement with the experimental value of
σ = 5.67 × 10⁻⁸ J s⁻¹ m⁻² K⁻⁴ (2.16)
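The numerical value of Eq. (2.15) is easily evaluated. A short sketch, assuming SI values for h, k and c:

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K

# Stefan-Boltzmann constant from Eq. (2.15)
sigma = 2 * math.pi**5 * K**4 / (15 * H**3 * C**2)
print(f"sigma = {sigma:.4e} J s^-1 m^-2 K^-4")
```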
Planck’s law in Eq. (2.12) can also be used to deduce the properties of λm
at which the radiation density is maximum. Noting that ū(λ) = u(v) |dv/dλ|,
$$\bar{u}(\lambda) = \frac{8\pi hc}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1} = \frac{8\pi k^5 T^5}{h^4 c^4}\,\frac{x^5}{e^x - 1} \tag{2.17}$$
with x = hc/λkT. This function has a maximum at a wavelength given by the
condition d u (λ)/dλ = 0, and a numerical solution (see Example 2) gives
xm = hc/λmkT = 4.965 (2.18)
or λmT = 2.90 × 10⁻³ m K (2.19)
This relation, called Wien’s displacement law, is in very good agreement
with the experimental observations. It should be emphasized that Eq. (2.17)
implies
$$\frac{\bar{u}(\lambda)}{T^5} = f(\lambda T) \tag{2.20}$$
which had been deduced earlier by Wien (1893) from thermodynamic
considerations. It implies that a function of a single variable, namely of λT, gives
the complete description of u (λ) as a function of variables λ and T.
Planck’s hypothesis that the energy of the oscillators is quantized is a rather
ad hoc assumption, though it leads to Planck’s law of black-body radiation,
which is in excellent agreement with the experimental observations. The law of
Fig. 2.2 (a) Schematic diagram of the equipment used for studying photoelectric
effect. (b) Typical photoelectric current against collector voltage.
$$\frac{1}{2} m v_m^2 = eV_0 \tag{2.21}$$
It is independent of the intensity of incident radiation but is proportional to
v – v0, i.e.,
$$\frac{1}{2} m v_m^2 \propto (\nu - \nu_0) \tag{2.22}$$
It is very difficult to reconcile the classical wave theory of light with these
observations. For example, with an incident radiation of intensity 10⁻¹⁰ J m⁻² s⁻¹, it
would require about 5 × 10¹¹ s to absorb an energy of 3 eV by a cross-sectional
area of about 10⁻²⁰ m² presented by an atom. Actually, a more detailed analysis
shows that an atomic oscillator presents an effective area of about λ² to light of
wavelength λ corresponding to its resonant frequency.
For radiation of λ = 10⁻⁷ m, this means an area of about 10⁻¹⁴ m², which still
implies an accumulation time of about 5 × 10⁵ s, in contradiction with the
observation that there is no noticeable time delay in the emission of electrons.
Nor can the wave theory of radiation explain the existence of the sharp
threshold v0 for the emission of electrons, if the energy is absorbed continuously.
The fact that the maximum kinetic energy of the electrons emitted does not
depend on the intensity of incident radiation and that it is proportional to v – v0,
is equally puzzling.
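The accumulation times quoted above follow from dividing the required energy by the power intercepted by the absorbing area. A minimal sketch, using the intensity, energy and areas given in the text (the function name is introduced here for illustration):

```python
E_CHARGE = 1.602176634e-19  # joules per electron volt

def accumulation_time(intensity, area, energy_ev):
    """Time in seconds for a classical absorber of the given area (m^2) to
    collect energy_ev from a wave of the given intensity (J m^-2 s^-1)."""
    return energy_ev * E_CHARGE / (intensity * area)

intensity = 1e-10  # J m^-2 s^-1, value quoted in the text
t_geometric = accumulation_time(intensity, 1e-20, 3.0)       # atomic cross-section
t_resonant = accumulation_time(intensity, (1e-7) ** 2, 3.0)  # effective area ~ lambda^2
print(f"geometric area: {t_geometric:.1e} s, resonant area: {t_resonant:.1e} s")
```

Even with the larger resonant area the classical estimate is of the order of days, whereas no measurable delay is observed.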
A simple explanation of the various observations of the photoelectric effect,
was provided by Einstein (1905). Inspired by Planck’s work, Einstein proposed
that electromagnetic radiation itself is quantized into quanta of energy hv where h
is Planck’s constant. It is these quanta, called photons, that are absorbed as
single units by the electrons. If the energy hv of the photons is high enough, the
electrons are knocked out. From the law of conservation of energy, the maximum
kinetic energy of the electron is
$$\frac{1}{2} m v_m^2 = h\nu - e\phi, \quad \text{for } h\nu > e\phi \tag{2.23}$$
where eφ is the minimum energy with which the electron is bound within the
metal and is called the work function. This is the famous Einstein’s relation for
$$\nu_0 = \frac{e\phi}{h} \tag{2.24}$$
However, since the current increases gradually as V increases from –V0 ,
the effective binding of the electrons inside the metal varies as also the
velocity of the emitted electrons. Therefore, the critical frequency in
Eq. (2.24) refers to the emission of electrons with minimum binding energy.
3. The maximum kinetic energy of the emitted electrons (having minimum
binding energy) is given in terms of the critical frequency, by the relation
$$\frac{1}{2} m v_m^2 = h\,(\nu - \nu_0) \tag{2.25}$$
In terms of the stopping potential V0, one has
eV0 = h (v–v0) (2.26)
Thus, it not only explains all the experimental observations but also gives the
ratio of h/e from the slope of the linear plot of V0 against v. Using the known
value for the charge of the electron, an independent determination of Planck’s
constant, in good agreement with the value obtained from other considerations
such as the black-body radiation, can be obtained.
Some additional observations related to the photoelectric effect are:
1. Only a small fraction (about 5%) of the incident photons, succeeds in
ejecting photoelectrons while most of them are absorbed by the system as
a whole and generate thermal energy.
2. Photoelectric effect is also possible for isolated atoms in the form of a gas,
e.g. Na, K vapour, and the process is known as photoionization. It is
observed by passing a beam of ultraviolet radiation through a chamber
containing Na or K vapour, and collecting the electrons ejected by subjecting
them to an electric field. It is interesting to note that in photoionization, since
the atoms are isolated, there is no collective absorption of photons and
every photon absorbed succeeds in ejecting an electron. This can be verified
by comparing the number of photons absorbed as deduced from the
decrease in the intensity of the beam, with the number of electrons collected
by the electric field.
3. The energy required for ejecting the electrons may also be provided by
heating the metal, which results in the thermionic emission of the
electrons. This allows us to calculate, from quantum statistical mechanics,
the work function eφ. The value obtained agrees with the one obtained
from the photoelectric effect.
4. So far, it has been assumed that an electron receives energy only from a
single photon, the process being called a single-photon process. The
development of lasers has provided light beams of very high intensity
which allow us to observe multi-photon processes, in particular the multi-
photon photoelectric effect. In this process, an electron ejected from a
metal receives energy from N photons. Its kinetic energy is given by
$$\frac{1}{2} m v_m^2 = Nh\nu - e\phi \tag{2.27}$$
and the critical frequency is eφ/Nh which is smaller than the corresponding
frequency for single-photon processes by a factor of 1/N.
In the analysis of the photoelectric effect, a photon was regarded as a
wave packet of energy, with no statements made for the quantization of
momentum. In fact since a significant amount of momentum was carried away
by the metal, conservation of momentum could not be usefully applied to the
photon-electron system. For the photon to acquire the bonafides of a particle, it
should have both a quantum of energy as well as a quantum of momentum. This
was demonstrated by the discovery of Compton effect (1922) in the scattering
of x-rays by electrons.
Fig. 2.3 (a) The intensity I of the scattered x-rays as a function of (λ – λ0) Å × 10², for
scattering angle θ = 90°. (b) Compton scattering as particle-particle scattering.
If the initial and final photons have energies hv0 and hv, and momenta (hv0/c) n0
and (hv/c) n, respectively, where n0 and n are unit vectors in the directions of
propagation, and the momentum of the scattered electron is p, then from momentum
and energy conservation [Fig. 2.3 (b)],
$$\frac{h\nu_0}{c} = \frac{h\nu}{c}\cos\theta + p\cos\phi \tag{2.28}$$
$$0 = \frac{h\nu}{c}\sin\theta - p\sin\phi \tag{2.29}$$
$$h\nu_0 + mc^2 = h\nu + (p^2c^2 + m^2c^4)^{1/2} \tag{2.30}$$
Eliminating φ from the first two equations,
$$p^2 = \frac{h^2}{c^2}\,(\nu_0^2 + \nu^2 - 2\nu\nu_0\cos\theta) \tag{2.31}$$
Also from Eq. (2.30)
$$p^2 = \frac{h^2}{c^2}\,(\nu_0 - \nu + mc^2/h)^2 - m^2c^2 \tag{2.32}$$
Equating the two expressions for p² leads to
$$\nu_0 - \nu = \frac{h\nu\nu_0}{mc^2}\,(1 - \cos\theta) \tag{2.33}$$
or
$$\lambda - \lambda_0 = \frac{h}{mc}\,(1 - \cos\theta) \tag{2.34}$$
This is Compton’s expression for the shift in the wavelength of the scattered
x-rays. It identifies the Compton wavelength as
$$\lambda_c = \frac{h}{mc} \tag{2.35}$$
which depends only on the mass of the scattering particle and has a value of
2.43 × 10⁻² Å for the electron. It has an interesting interpretation, that a
photon with wavelength λc has an energy hν = mc², i.e., the rest energy of
the particle.
In the discussion presented here, it is assumed that the target electron is
stationary and free. It is also valid if the electron is weakly bound to the
atom with the binding energy of a few eV which is quite small compared
with the energies of the x-ray photons, which are about 10 keV or greater.
However, it may so happen that the electron remains bound in the same
state to the atom, even after the collision with the photon (this is more likely
to happen if the electrons are strongly bound). In this case, the transfer of
energy and momentum is to the atom as a whole, so that the mass of the
atom must be used in place of the mass of the electron. Therefore, the
corresponding Compton wavelength is much smaller (at least by a factor of
1800) and the resulting Compton shift in the wavelength is negligible. This
explains the unshifted component in the spectrum of the scattered x-rays,
which is called the Thomson component.
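The numbers quoted in this discussion can be reproduced from Eqs. (2.34) and (2.35). The sketch below assumes SI values for the constants and, for the comparison with scattering by the atom as a whole, takes the lightest possible atom (hydrogen, approximated by the proton mass); the function names are illustrative.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg
M_P = 1.67262192369e-27  # proton mass, kg (stand-in for a hydrogen atom)

def compton_wavelength(mass):
    """Compton wavelength h/mc in metres, Eq. (2.35)."""
    return H / (mass * C)

def compton_shift(theta, mass=M_E):
    """Wavelength shift in metres for scattering angle theta (radians), Eq. (2.34)."""
    return compton_wavelength(mass) * (1.0 - math.cos(theta))

print(f"lambda_c (electron) = {compton_wavelength(M_E) * 1e10:.4f} angstrom")
print(f"shift at 90 degrees = {compton_shift(math.pi / 2) * 1e10:.4f} angstrom")
print(f"electron/atom ratio = {compton_wavelength(M_E) / compton_wavelength(M_P):.0f}")
```

The last line shows why the shift for scattering by the whole atom is negligible: the Compton wavelength scales inversely with the mass of the recoiling object.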
Some interesting additional features of Compton effect are:
1. The fact that the shift in the wavelength of the radiation is indeed due to
the scattering of the radiation by the electron was confirmed by observing
the scattered electron (Bothe and Geiger, 1925).
2. The main reason for the spread in the wavelength of the Compton-shifted
x-rays is that the initial electron is in general not stationary but has a
momentum spread even inside the atom. The correction due to the binding
of the electron can be taken into account in terms of its momentum
distribution (see Example 4).
3. Though the shift λ – λ0 is independent of λ0, the intensity of scattering
depends on λ0. It actually increases as λ0 → 0 and hence the effect is
more easily observable for x-rays than for lower frequency radiation (indeed
this is the reason why the sky is blue).
4. The scattering angle φ is given by
$$\tan\phi = \frac{\sin\theta}{\nu_0/\nu - \cos\theta} = \frac{\cot(\theta/2)}{1 + h\nu_0/mc^2} \tag{2.36}$$
Since in most cases hv0 is of the order of 10 keV, and therefore hv0 << mc²,
one has the simple relation φ ≈ ½(π − θ), i.e., the direction in which the electron is
scattered bisects the angle which is supplementary to the angle made by the
final photon momentum with the initial photon momentum.
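The two forms of Eq. (2.36), and the low-energy approximation for the recoil angle, can be compared numerically. This is a sketch under the assumption of a 10 keV photon and an electron rest energy of 510.999 keV; the function names are introduced here for illustration.

```python
import math

MC2_KEV = 510.999  # electron rest energy in keV

def recoil_angle_from_frequencies(theta, e0_kev):
    """Electron angle from the first form of Eq. (2.36), with the ratio
    nu0/nu taken from Eq. (2.33)."""
    alpha = e0_kev / MC2_KEV
    nu0_over_nu = 1.0 + alpha * (1.0 - math.cos(theta))
    return math.atan2(math.sin(theta), nu0_over_nu - math.cos(theta))

def recoil_angle_closed_form(theta, e0_kev):
    """Electron angle from the second form of Eq. (2.36)."""
    alpha = e0_kev / MC2_KEV
    return math.atan(1.0 / math.tan(theta / 2.0) / (1.0 + alpha))

for deg in (60.0, 90.0, 120.0):
    theta = math.radians(deg)
    exact = recoil_angle_closed_form(theta, 10.0)
    approx = 0.5 * (math.pi - theta)
    print(f"theta = {deg:5.1f} deg   phi = {math.degrees(exact):6.2f} deg"
          f"   (pi - theta)/2 = {math.degrees(approx):6.2f} deg")
```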
$$\lambda = \frac{h}{p} \tag{2.37}$$
called the de Broglie wavelength. This idea was an important step in the
development of wave mechanics.
The wave properties become easily noticeable only when the obstructing
bodies have dimensions comparable with the wavelength. For macroscopic bodies,
the de Broglie wavelength is negligibly small. For atomic systems, this wavelength
becomes more significant: for an electron with an energy of 100 eV the
de Broglie wavelength is about 1 Å, comparable with the wavelength of x-rays
as also with the size of an atom. Their wave properties may therefore be
observed in their scattering by crystals. This was confirmed experimentally by
Davisson and Germer (1927) who studied the scattering of electrons by a
monocrystal of nickel.
The electron diffraction experiments establish the fact that the electrons
possess wave properties exactly as de Broglie had suggested. To clarify that
the wave property is not because of the simultaneous participation of a large
number of electrons, but is associated with each electron, experiments have
been done with very low intensity electron beams so that the electrons pass
through the instruments essentially one at a time. With sufficiently long exposure,
a diffraction pattern was obtained which differed in no way from the pattern
obtained with normal intensity beams, thus suggesting that the wave-like properties
are to be associated with individual electrons. The de Broglie wavelength has
also been verified for neutral molecules (Estermann and Stern, 1930) and for
neutrons; however, in order that they have the same de Broglie wavelength as
the x-ray wavelength, their energies (E = p²/2m) have to be of the order of
0.02 eV, i.e., one uses thermal molecules and neutrons.
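The wavelengths mentioned above follow from Eq. (2.37) with the non-relativistic relation E = p²/2m. A minimal sketch, assuming SI values for the constants (the function name is illustrative):

```python
import math

H = 6.62607015e-34        # Planck constant, J s
E_CHARGE = 1.602176634e-19  # joules per electron volt
M_E = 9.1093837015e-31    # electron mass, kg
M_N = 1.67492749804e-27   # neutron mass, kg

def de_broglie_wavelength(mass, energy_ev):
    """Non-relativistic de Broglie wavelength in metres for a particle of
    the given mass (kg) and kinetic energy (eV)."""
    momentum = math.sqrt(2.0 * mass * energy_ev * E_CHARGE)
    return H / momentum

lam_electron = de_broglie_wavelength(M_E, 100.0)  # 100 eV electron
lam_neutron = de_broglie_wavelength(M_N, 0.02)    # thermal neutron
print(f"electron, 100 eV : {lam_electron * 1e10:.2f} angstrom")
print(f"neutron, 0.02 eV : {lam_neutron * 1e10:.2f} angstrom")
```

Both wavelengths are of the order of 1 Å, comparable with interatomic spacings in crystals.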
Electron and neutron diffraction, along with x-ray diffraction have become
an indispensable tool for the study of the structures of solids. Their diffraction
patterns though qualitatively similar, have important differences which need to
be mentioned.
Since the electrons are sensitive to electrostatic forces, electron beams
have little penetration. With the development of slow electron beams
(10 to 1000 eV) which undergo negligible penetration and for which diffraction
occurs essentially at the first atomic layer, electron diffraction has become an
important tool for the study of surfaces. Electron beams of fairly high energies,
i.e., about 50 keV, have quite small de Broglie wavelengths, and therefore are
used in electron microscopes for high resolution studies of small specimens.
On the other hand neutrons, like x-rays, are fairly insensitive to electrostatic
forces, and therefore penetrate easily. Neutron diffraction has several advantages:
1. Since neutron scattering is essentially by the nucleus and depends quite
distinctively on the structure of the nucleus, neutron diffraction can give
more information about crystals formed from different atoms which have
nearly equal atomic number and which cannot easily be distinguished by
x-rays.
2. The scattering of neutrons by light nuclei such as hydrogen is large, and
hence neutron diffraction is an important technique in the study of structures
of organic compounds. The scattering of x-rays by light atoms is weak
(the coherent cross-section is approximately proportional to Z², where Z is
the number of electrons).
3. The magnetic ordering of atoms in a crystal, which have nonzero magnetic
moment, can be studied by neutron diffraction which therefore is a valuable
method of investigating magnetic materials. The main disadvantages of
using neutron diffraction are the low intensity of neutron beams available
and the difficulty of detecting neutrons which are electrically neutral. They
are usually detected by the α particle emitted through their reaction with
boron nuclei.
Fig. 2.5 Hydrogen spectrum in the visible and near ultraviolet region.
where λ∞ ~ 3646 Å is the limiting value shown in Fig. 2.5. A simple but important
step in the analysis of atomic spectra was taken by Rydberg (1890) who pointed
out that the sequence in Eq. (2.39) could be represented in a more suggestive
form in terms of the reciprocal of the wavelength called the wave number,
related to the frequency:
$$\frac{1}{\lambda_n} = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right), \quad n = 3, 4, \ldots \tag{2.40}$$
R = 1.0972 × 10⁷ m⁻¹
where R is called the Rydberg constant. The spectral lines represented by
Eq. (2.40) form what is known as the Balmer series. Further investigations
showed that the hydrogen spectrum has other series, in the ultraviolet and infrared
regions. They are represented by formulae similar to Eq. (2.40), e.g. Lyman
series (1906) in the ultraviolet region by
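$$\frac{1}{\lambda_n} = R\left(\frac{1}{1^2} - \frac{1}{n^2}\right), \quad n = 2, 3, \ldots \tag{2.41}$$
Paschen series (1908) in the infrared region by
$$\frac{1}{\lambda_n} = R\left(\frac{1}{3^2} - \frac{1}{n^2}\right), \quad n = 4, 5, \ldots \tag{2.42}$$
Brackett series (1922) in the infrared region by
$$\frac{1}{\lambda_n} = R\left(\frac{1}{4^2} - \frac{1}{n^2}\right), \quad n = 5, 6, \ldots \tag{2.43}$$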
The frequencies of the lines in the hydrogen spectrum can be obtained from
a single formula
$$\frac{1}{\lambda_{m,n}} = R\left(\frac{1}{m^2} - \frac{1}{n^2}\right), \quad m < n \tag{2.44}$$
m = 1 giving the Lyman series, m = 2 giving the Balmer series, etc.
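The series formulae, Eqs. (2.40) to (2.43), differ only in the lower integer m, so one function reproduces all of them. The sketch below uses the value of R quoted in the text; the function names are chosen here for illustration.

```python
R = 1.0972e7  # Rydberg constant in m^-1, value quoted in the text

def wavelength_angstrom(m, n):
    """Wavelength in angstrom of the line with lower integer m and upper integer n."""
    if not m < n:
        raise ValueError("need m < n")
    inverse_wavelength = R * (1.0 / m**2 - 1.0 / n**2)  # m^-1
    return 1e10 / inverse_wavelength

def series_limit_angstrom(m):
    """Limiting wavelength of the series (n -> infinity) in angstrom."""
    return 1e10 * m**2 / R

for name, m in (("Lyman", 1), ("Balmer", 2), ("Paschen", 3), ("Brackett", 4)):
    first = wavelength_angstrom(m, m + 1)
    limit = series_limit_angstrom(m)
    print(f"{name:9s} first line {first:9.1f} angstrom, limit {limit:9.1f} angstrom")
```

For m = 2 the first line is the Hα line near 6563 Å and the limit is close to the value λ∞ ~ 3646 Å quoted for the Balmer series.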
The spectra of other atoms also show some order, and their frequencies
can be represented by
$$\frac{1}{\lambda_{m,n}} = T_1(m) - T_2(n), \quad m < n \tag{2.45}$$
However, the form of T(n) is generally more complicated than for the
hydrogen atom, one of the most useful being R(n – δ)⁻², where δ is a constant
known as the quantum defect, δ << n.
$$E_n(\mathrm{H}) = -\frac{hcR}{n^2} \tag{2.47}$$
The numerical value of hcR is 13.6 eV, and is called the ionization potential
of the hydrogen atom.
One of the first attempts to have a model of an atom, which can explain the
discrete spectrum of the atom, was due to J.J. Thomson (1903). It had been
recognized that the negatively charged electron is one of the fundamental
constituents of the atom. Since the atom as a whole is neutral, it should also
contain a positively-charged part, called positive ion to balance the negative
charge of the electron. It was also known that a great majority of the mass is
associated with the positive ion. This led Thomson to propose a model of the
atom in the form of a sphere of uniform positive charge, in which the small,
negatively charged electrons are embedded. The electrons perform simple
harmonic motion about their positions of equilibrium, which results in the emission
of radiation of characteristic frequencies. Quantitatively, the potential energy of
the electron inside the atom is
$$V(r) = \frac{Ze^2}{8\pi\varepsilon_0 R_0^3}\,(r^2 - 3R_0^2) \tag{2.48}$$
where R0 is the radius of the atom and Z is the atomic number. This potential will
cause the electron to oscillate with a frequency
$$\nu = \frac{1}{2\pi}\left(\frac{Ze^2}{4\pi\varepsilon_0 m R_0^3}\right)^{1/2} \tag{2.49}$$
(impact parameter is the distance of the nucleus from the initial direction of
motion of the α particle) of b to b + db be scattered into angles between θ and
θ + dθ (Fig. 2.6). Then, the fraction of α particles scattered into angles between
θ and θ + dθ is
$$\frac{dN}{N} = n\,\frac{2\pi b\,db}{A} = 2\pi\rho t\,b\,\frac{db}{d\theta}\,d\theta \tag{2.51}$$
$$b = \frac{Ze^2}{2\pi\varepsilon_0 m v^2}\cot\frac{\theta}{2} \tag{2.52}$$
where m is the mass of α particle and v is its speed. Using this relation in
Eq. (2.51), the fraction of particles scattered into solid angle dΩ = 2π sin θ dθ,
is
$$\frac{dN}{N} = \rho t\left(\frac{Ze^2}{4\pi\varepsilon_0 m v^2}\right)^2 \frac{d\Omega}{\sin^4(\theta/2)} \tag{2.53}$$
This is the Rutherford formula for the scattering of α particles by nuclei of
charge Ze.
The important features to note in this formula are that the fraction is
(i) proportional to the thickness t of the metal foil, (ii) proportional to Z²,
(iii) inversely proportional to T², where T = ½mv² is the kinetic energy of the
incoming α particles, and (iv) inversely proportional to sin⁴(θ/2), where θ is the
angle of scattering. These properties were tested by Geiger and Marsden by
varying the thickness and the composition of the foil, the energy of the incident
α particles and the angle of scattering, and were found to be in excellent
agreement with the experimental observations. For example, they found in
an experiment with silver foil, that dN was proportional to 111, 680 and 8800
for θ = 150°, 75° and 37.5°, respectively, other variables remaining the same.
For these values, the product (dN) sin⁴(θ/2) is proportional to 96.6, 93.4, 93.9,
respectively. The near-constancy of the product, though dN itself varies by a
large factor, indicates the essential correctness of the θ-dependence of
scattering rates.
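The near-constancy of the product can be verified directly from the counts quoted above. A short sketch (the variable names are illustrative):

```python
import math

# Relative counts dN quoted in the text for silver foil, keyed by angle in degrees
counts = {150.0: 111.0, 75.0: 680.0, 37.5: 8800.0}

# Product dN * sin^4(theta/2), which the Rutherford formula predicts to be constant
products = {
    angle: dn * math.sin(math.radians(angle) / 2.0) ** 4
    for angle, dn in counts.items()
}

for angle, value in products.items():
    print(f"theta = {angle:5.1f} deg   dN sin^4(theta/2) = {value:.1f}")

spread = max(products.values()) / min(products.values())
print(f"dN varies by a factor of {8800.0 / 111.0:.0f}, the product by {spread:.3f}")
```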
It may be noted that the Rutherford formula is the same for attractive and
repulsive Coulomb potentials. It is not valid for values of the impact parameter
b larger than interatomic distances for which an α particle cannot be regarded
as being scattered by a single atom in the metal foil. It is also not valid if the
α particle approaches the nucleus to a distance (for a head-on collision
rmin = Ze²/πε0mv²) less than the size of the nucleus, i.e., about 10⁻¹⁴ m, at which
the nuclear forces become important.
While the Rutherford model of the atom provides a fairly comprehensive
description of the scattering of low-energy α particles by the atoms, there are
some implications of the simple model which are in conflict with the classical
interpretations of experimental observations. The stability of the atom demands
that the electron must revolve around the nucleus. But such an electron, since it
is accelerating, must radiate energy continuously according to the classical theory
of electromagnetism, and ultimately coalesce with the nucleus. Experimentally,
an atom is a highly stable object. Furthermore, it can absorb radiation only of
some well-defined frequencies and then emit radiation again of well-defined
frequencies. The Rutherford model of the atom is unable to explain these
experimental observations. It is with the intention of reconciling the Rutherford
model with the observed stability and spectrum of the atom, that Bohr began his
search for a model of the atom and came up with what is known as the Bohr
model of the atom.
$$\mathbf{r}_n = -\frac{m_e}{m_n + m_e}\,\mathbf{r} \tag{2.55}$$
It is seen that in the centre of mass frame, the electron and the nucleus are
on opposite sides of the centre of mass.
The total energy is
$$E = \frac{1}{2}\,(m_e r_e^2 + m_n r_n^2)\,\omega^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} = \frac{1}{2}\,m_r r^2\omega^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.56}$$
where mr = memn/(me + mn) (2.57)
is the reduced mass (it is only slightly smaller than the electron mass), and the
angular momentum is
$$L = (m_e r_e^2 + m_n r_n^2)\,\omega = m_r r^2\omega \tag{2.58}$$
$$m_e r_e\omega^2 = m_r r\omega^2 = \frac{Ze^2}{4\pi\varepsilon_0 r^2} \tag{2.59}$$
while the quantum condition for the angular momentum is
$$L = m_r r^2\omega = n\hbar \tag{2.60}$$
Solving for r, Eqs. (2.59) and (2.60) give
$$r = \frac{4\pi\varepsilon_0\hbar^2 n^2}{m_r Ze^2} \tag{2.61}$$
This is the radius of the nth Bohr orbit. Furthermore, the equilibrium condition
in Eq. (2.59) allows us to write the total energy as
$$E = -\frac{Ze^2}{8\pi\varepsilon_0 r} \tag{2.62}$$
Using the value of r given in Eq. (2.61), the allowed energies of the atom
are
$$E_n = -\frac{hcR}{n^2}, \quad n = 1, 2, \ldots \tag{2.63}$$
where
$$R = \frac{m_r}{4\pi c\hbar^3}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2 \tag{2.64}$$
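The orbit radii of Eq. (2.61) and the energies of Eq. (2.62) can be evaluated numerically. The following is a minimal sketch for hydrogen, assuming SI values for the constants and taking the nucleus to be a single proton; the function names are introduced here for illustration.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # permittivity of free space, F/m
M_E = 9.1093837015e-31      # electron mass, kg
M_P = 1.67262192369e-27     # proton mass, kg

def reduced_mass(m_nucleus):
    """Reduced mass of the electron-nucleus system, Eq. (2.57)."""
    return M_E * m_nucleus / (M_E + m_nucleus)

def bohr_radius(n, Z=1, m_nucleus=M_P):
    """Radius of the nth Bohr orbit in metres, Eq. (2.61)."""
    m_r = reduced_mass(m_nucleus)
    return 4.0 * math.pi * EPS0 * HBAR**2 * n**2 / (m_r * Z * E_CHARGE**2)

def bohr_energy(n, Z=1, m_nucleus=M_P):
    """Energy of the nth level in eV, from Eq. (2.62) with r from Eq. (2.61)."""
    r = bohr_radius(n, Z, m_nucleus)
    return -Z * E_CHARGE**2 / (8.0 * math.pi * EPS0 * r) / E_CHARGE

for n in (1, 2, 3):
    print(f"n = {n}: r = {bohr_radius(n) * 1e10:.3f} angstrom, E = {bohr_energy(n):.3f} eV")
```

The n = 1 energy reproduces the ionization energy of about 13.6 eV, and the energies scale as 1/n² as in Eq. (2.63).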
$$\nu = cR\left(\frac{1}{n^2} - \frac{1}{m^2}\right) \quad \text{for absorption}$$
It is also implied that if the atom is in the ground state, i.e., the state with the
lowest energy, n = 1, it continues to remain in that state unless an external
Fig. 2.7 Energy levels of the hydrogen atom, in the Bohr model.
The result in Eq. (2.63) can also be applied to other one-electron atoms
such as the deuterium, singly ionized helium atom He+ or doubly ionized lithium
atom Li++, etc. For the deuterium, only the reduced mass is slightly larger than
for the hydrogen atom and the energies (being negative) are slightly lower. The
existence of the corresponding lines can be used for the detection of the presence
of deuterium. For He+, Li++, etc. the energy levels differ by an additional factor
of Z². However, since Z is an integer, some of their energy levels will be close to
those of the hydrogen atom, the small differences being due to the different
reduced masses. For example, the energy levels of He+ for even n, n = 2p, are
related to the hydrogen energy levels, by
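$$E_{2p}(\mathrm{He}^+) = \frac{m_r(\mathrm{He}^+)}{m_r(\mathrm{H})}\,E_p(\mathrm{H}) \tag{2.67}$$
The ratio of the reduced masses is approximately
$$\frac{m_r(\mathrm{He}^+)}{m_r(\mathrm{H})} \approx 1 + m_e\left(\frac{1}{m(\mathrm{H})} - \frac{1}{m(\mathrm{He}^+)}\right) \tag{2.68}$$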
i.e., about 1.000408. Therefore, the transitions between these He+ levels
correspond to frequencies which are slightly higher than those of the hydrogen
atom. Indeed, measurements of these small differences give a fairly accurate
determination of the ratio of me /m(H).
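The value of the reduced-mass ratio can be checked against the approximation in Eq. (2.68). The sketch below assumes that m(H) and m(He⁺) denote the nuclear masses (proton and α particle) and uses SI values for them; the variable names are illustrative.

```python
M_E = 9.1093837015e-31      # electron mass, kg
M_P = 1.67262192369e-27     # proton mass, kg (nucleus of hydrogen)
M_ALPHA = 6.6446573357e-27  # alpha-particle mass, kg (nucleus of He+)

def reduced_mass(m_nucleus):
    """Reduced mass of the electron-nucleus system, Eq. (2.57)."""
    return M_E * m_nucleus / (M_E + m_nucleus)

exact_ratio = reduced_mass(M_ALPHA) / reduced_mass(M_P)
approx_ratio = 1.0 + M_E * (1.0 / M_P - 1.0 / M_ALPHA)  # Eq. (2.68)

print(f"exact ratio       = {exact_ratio:.7f}")
print(f"approximate ratio = {approx_ratio:.7f}")
```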
Bohr’s ideas can be extended to noncircular orbits also. This leads to the
conclusion (see Example 7) that the angular momentum does not uniquely
determine the energy of the atom, and that there are several angular momentum
states which correspond to the same energy. This is an example of what is
known as the degeneracy of an energy level. However, inclusion of the relativistic
corrections shows that these different angular momentum states have slightly
different energies. This results in the multiplicity of the corresponding spectral
lines. Such a fine structure of the lines (the structure is narrower for larger n
values), is indeed observed experimentally, but the quantitative predictions of
the simple model are not in agreement with the experimental observations.
The Bohr theory of the atom is essentially a theory of single-electron atom.
It does not allow a simple generalization to many-electron atoms, not even to
helium, and being ad hoc, it is not logically consistent. But its picture of an atom
with quantized orbits for the electrons, has retained its utility till today, especially
for qualitative arguments.
The existence of discrete atomic energy levels can be observed
experimentally from an analysis of collisions between atoms and electrons with
known energy. In these collisions, since the mass of the atom is much larger
than that of an electron, very little energy is carried away as kinetic energy of
the atom. However, if the energy of the electron is sufficient to raise a bound
electron to a higher energy orbit, the electron may transfer most of its energy to
the atom. This phenomenon was demonstrated by Franck and Hertz (1914).
Electrons from a filament are gradually accelerated through a vapour in a tube
[Fig. 2.8 (a)], towards a grid G and are subjected to a small retarding potential
V0 between the grid and the plate P. When the accelerating potential is sufficiently
large to excite an atom, the electron may undergo a collision near G and transfer
most of its energy to the atom.
It will then be unable to reach the plate P. Thus, as the accelerating potential
V is raised from zero, the current arriving at P will increase. When it is just
greater than the excitation potential for the atoms, there is a sharp decrease in
the current [Fig. 2.8 (b)]. It begins to increase again till the next energy level
can be excited, and falls again. That the atoms are indeed excited is confirmed
by the appearance of the corresponding spectral lines in the radiation emitted as
the electrons fall back to the lower energy levels.
2.8 EXAMPLES
A few examples that provide some details and extensions of the ideas discussed
are now given.
Example 1
For obtaining the expression for the energy radiated by a unit area per unit time,
given in Eq. (2.13), it is noted that the energy crossing a unit area, per unit time,
per unit frequency is
$$\frac{dU}{d\nu} = \int v_z\,dE \tag{2.69}$$
where the direction perpendicular to the area is taken as the z direction. Now
dE = ½ u(v) d cos θ, where u(v) is the energy density, vz is c cos θ, and the
range of integration is from θ = 0 to ½π, so that
$$\frac{dU}{d\nu} = \frac{1}{2}\,c\int_0^1 u(\nu)\cos\theta\;d\cos\theta = \frac{1}{4}\,c\,u(\nu) \tag{2.70}$$
Example 2
The position of the maximum in Eq. (2.17) is determined numerically by iteration.
Equating the derivative of u(λ) to zero, one gets the condition
$$x = 5\,(1 - e^{-x}) \tag{2.71}$$
where x = hc/λkT. Inspection suggests that the solution to this is close to x ≈ 5.
First iteration gives
$$x \approx 5\,(1 - e^{-5}) = 4.9663 \tag{2.72}$$
while the second iteration gives
$$x \approx 5\,(1 - e^{-4.9663}) = 4.96516 \tag{2.73}$$
which agrees with Eq. (2.18).
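The iteration can be carried to convergence in a few lines. This is a sketch of the same fixed-point scheme, assuming SI values for h, c and k when converting the root to the constant in Wien's displacement law; the function name, tolerance and iteration cap are choices made here.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K

def wien_root(x0=5.0, tol=1e-12, max_iter=100):
    """Solve x = 5 (1 - exp(-x)), Eq. (2.71), by repeated substitution."""
    x = x0
    for _ in range(max_iter):
        x_new = 5.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

x_m = wien_root()
lambda_m_T = H * C / (x_m * K)  # metre kelvin
print(f"x_m = {x_m:.5f}, lambda_m T = {lambda_m_T:.4e} m K")
```

The scheme converges quickly because the derivative of the right-hand side, 5e⁻ˣ, is small near the root.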
Example 3
An experiment on the photoelectric effect of a metal gives stopping potentials
of 4.62 V for λ = 1850 Å, and 0.18 V for λ = 5460 Å. These results can be used
to calculate Planck’s constant and the work function of the metal. From
Einstein’s relation in Eq. (2.23),
$$\frac{hc}{\lambda} = e\phi + eV_0 \tag{2.74}$$
which, on using the experimental values, leads to two linear equations in h and φ.
Solving them, we get h = 6.64 × 10⁻³⁴ J s and eφ = 2.1 eV.
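The two linear equations can be solved explicitly by subtracting one from the other. A minimal sketch using the data of this example, with SI values for e and c (the variable names are illustrative):

```python
E_CHARGE = 1.602176634e-19  # elementary charge, C
C = 2.99792458e8            # speed of light, m/s

# Measured (wavelength in metres, stopping potential in volts) from the example
lam1, v1 = 1850e-10, 4.62
lam2, v2 = 5460e-10, 0.18

# Subtracting the two relations hc/lambda = e*phi + e*V0 eliminates phi
h = E_CHARGE * (v1 - v2) / (C * (1.0 / lam1 - 1.0 / lam2))

# Work function in eV from either relation (here the second one)
work_function_ev = h * C / (lam2 * E_CHARGE) - v2

print(f"h = {h:.3e} J s, work function = {work_function_ev:.2f} eV")
```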
Example 4
The wavelength of x-rays scattered by bound electrons (Sec. 2.3) has a spread,
mainly due to the fact that the bound electron has a momentum distribution. For
estimating the correction due to the non-zero initial momentum, it is noted that
the binding energy of the electrons is usually quite small, about 10 eV, compared
to the x-ray energies of about 10 keV.
The momentum and energy conservation relations give
$$(\mathbf{p}_f - \mathbf{p}_i)^2 = \frac{h^2}{c^2}\left(\nu_0^2 + \nu^2 - 2\nu_0\nu\cos\theta\right) \tag{2.75}$$
h λ λ 1 2
λ − λ0 = (1 − cos θ) + 0 p f ⋅ pi − pi + ∆EM (2.77)
mc hmc 2
where ∆E has been neglected compared to mc2. For x-ray energies of a few
tens of keV and binding energies of the order of a few eV, the second term on
the right hand side is smaller than the first term. Because of the variation of pi,
this term gives rise to a spread in the frequency of the scattered beam
[see Fig. 2.3 (a)].
Example 5
In the Davisson-Germer experiment, the electron beam was incident normally on
the surface AB (see Fig. 2.9). The condition for coherent, maximum reflection
is
d′ sin θ = mλ (2.78)
where m is a positive integer. It can be shown that the Bragg condition reduces
to this condition.
For Bragg reflection, the incident and reflected beams make equal angles
with the reflection plane AC, the angles being (π – θ)/2. The Bragg condition
for coherent reflection is
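$$2d\sin\left[\tfrac{1}{2}(\pi - \theta)\right] = n\lambda \tag{2.79}$$
Fig. 2.9 (figure not reproduced; labels: A, B, C, D, spacings d and d′, angles θ, θ/2 and (π – θ)/2)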
But d = d′ sin (θ/2), so that the Bragg condition becomes
2d′ cos (θ/2) sin (θ/2) = nλ (2.80)
which is the same as Eq. (2.78) if n = m.
Example 6
The orbit for the motion of a particle in a Coulomb potential can be derived by
considering the change in the momentum of the particle.
For a particle with impact parameter b (see Fig. 2.6) and scattered at an
angle θ, the change in the momentum is
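$$\Delta p = 2mv\cos\left[\tfrac{1}{2}(\pi - \theta)\right] = \int_{-\infty}^{\infty} F_p\,dt = \int F\cos\phi\,\frac{d\phi}{\dot{\phi}} \tag{2.81}$$
where Fp is the component of the force parallel to ∆p and φ is the angle between
the position vector r and ∆p (see Fig. 2.6). But
$$F = \frac{Ze\,(2e)}{4\pi\varepsilon_0 r^2} \tag{2.82}$$
and
$$mvb = mr^2\dot{\phi} \tag{2.83}$$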
which follows from the conservation of angular momentum.
Therefore
$$2mv\sin(\theta/2) = \frac{Ze^2}{\pi\varepsilon_0 vb}\cos(\theta/2) \tag{2.84}$$
which leads to
$$b = \frac{Ze^2}{2\pi\varepsilon_0 mv^2}\cot(\theta/2) \tag{2.85}$$
used in Eq. (2.52).
Example 7
The Bohr model was generalized by Sommerfeld (1916) to include non-circular
elliptic orbits. The generalized quantum conditions are
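$$2\int_{r_{\min}}^{r_{\max}} dr\left(2m_rE - \frac{n^2\hbar^2}{r^2} + \frac{m_rZe^2}{2\pi\varepsilon_0 r}\right)^{1/2} = kh \tag{2.90}$$
The integral can be evaluated using the theory of complex variables and
leads to
$$\frac{Ze^2}{2\varepsilon_0}\left(-\frac{m_r}{2E}\right)^{1/2} - nh = kh \tag{2.91}$$
so that
$$E = -\frac{m_r}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{(k+n)^2}, \quad (k+n) \ge 1 \tag{2.92}$$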
Thus, for a given value of the principal quantum number (k + n), states with
the same energy exist, for k = 0, ..., k + n – 1. Thus we have what is called
a degeneracy of order k + n (see Sec. 3.4).
Example 8
Suppose in addition to the Coulomb attraction, there is a potential energy term
g/r². Then the expression for the total energy E is
$$E = \frac{1}{2m_r}\,p_r^2 + \frac{L^2}{2m_rr^2} + \frac{g}{r^2} - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.93}$$
The energy levels are now different for a given k + n but different k or n
values. Thus, the degeneracy due to different angular momentum states is
removed by the addition of the potential g/r². Indeed, the Coulomb degeneracy
is removed by any additional interaction.
Example 9
A quick estimation of the binding energies of the hydrogen atom is obtained by
the following simple argument.
A stationary state may be thought of as one for which an integral number of
de Broglie wavelengths can be fitted over the orbit, e.g. for circular orbits
2πr = n (h/p), n = 1, 2, ... (2.95)
Using this relation, the total energy is
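$$E = \frac{1}{2m_r}\,p^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} = \frac{n^2\hbar^2}{2m_rr^2} - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.96}$$
If the state is stable, it corresponds to a minimum of this energy:
$$\frac{dE}{dr} = -\frac{n^2\hbar^2}{m_rr^3} + \frac{Ze^2}{4\pi\varepsilon_0 r^2} = 0 \tag{2.97}$$
so that
$$r_{\min} = \frac{n^2\hbar^2\,4\pi\varepsilon_0}{m_rZe^2} \tag{2.98}$$
This leads to the energy levels
$$E_n = -\frac{m_r}{2\hbar^2n^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2 \tag{2.99}$$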
PROBLEMS
1. If the continuum spectrum of the sun approximates that of a black body,
peaking at λm ≈ 5000 Å, what is the surface temperature of the sun? What
can you infer about the temperature of the material surrounding the sun
from the observation of Balmer absorption lines in the spectrum?
$$\nu = \frac{E_2 - E_1}{h}\left[1 + \frac{E_2 - E_1}{2m_1c^2}\right]$$
where m1 is the mass of the initial atom. The recoil correction given by the
second term is of the order of $10^{-8}$, or smaller, for the hydrogen atom. In the
Mössbauer effect, the atom is embedded in a crystal so that its effective
mass is that of the crystal. As a result, the second term is negligible and
one has recoil-less absorption of photons.
22. In a Franck-Hertz experiment, hydrogen atoms are bombarded with
electrons. What are the wavelengths of the emission lines observed when
the electrons are accelerated through a potential difference of 12.5 V?
23. Assuming that the earth is a black body in equilibrium at a temperature of
300 K, estimate the temperature of the sun.
24. Suppose that man’s power production reaches 20% of the power received
from sunlight (this would happen in about 250 years if the present exponential
growth continues). What is the expected approximate increase in the surface
temperature of the earth?
3
Elements of Quantum Theory
coherent waves coming from S1 and S2, and the intensity at a point on the
screen is given by the modulus squared of the resultant amplitude:
$$I = |\psi(S_1) + \psi(S_2)|^2 \tag{3.1}$$
$$= \left|A e^{2\pi i(\nu t - l_1/\lambda)} + A e^{2\pi i(\nu t - l_2/\lambda)}\right|^2 \tag{3.2}$$
where ν is the frequency, λ is the wavelength, and $l_1$ and $l_2$ are the distances of
the point on the screen from the two slits. Therefore, the intensity of the
interference pattern at any point is given by
$$I = 4|A|^2\cos^2\left(\frac{\pi a x}{\lambda d}\right) \tag{3.3}$$
where a, x and d are as shown in Fig. 3.1 and the path difference is approximately
(ax/d). If either of the slits is closed, the interference fringes disappear and a
uniform intensity distribution of I = |A|² results.
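The step from Eq. (3.2) to Eq. (3.3) can be verified numerically. The sketch below is illustrative only: the slit separation, screen distance and wavelength are assumed values, and the path difference is taken to be exactly ax/d, which is the approximation used in the text.

```python
import cmath
import math


def intensity_from_amplitudes(A, l1, l2, wavelength, nu=0.0, t=0.0):
    """|psi(S1) + psi(S2)|^2 with psi = A exp[2 pi i (nu t - l/lambda)], Eq. (3.2)."""
    psi1 = A * cmath.exp(2j * math.pi * (nu * t - l1 / wavelength))
    psi2 = A * cmath.exp(2j * math.pi * (nu * t - l2 / wavelength))
    return abs(psi1 + psi2) ** 2


def intensity_closed_form(A, a, x, d, wavelength):
    """Eq. (3.3): 4 |A|^2 cos^2(pi a x / (lambda d))."""
    return 4.0 * abs(A) ** 2 * math.cos(math.pi * a * x / (wavelength * d)) ** 2


# Assumed illustrative geometry: 500 nm light, 0.1 mm slit separation, 1 m to the screen
WAVELENGTH = 500e-9
SLIT_SEPARATION = 1.0e-4
SCREEN_DISTANCE = 1.0

for x in (0.0, 1.0e-3, 2.5e-3, 4.0e-3):
    l1 = SCREEN_DISTANCE
    l2 = SCREEN_DISTANCE + SLIT_SEPARATION * x / SCREEN_DISTANCE  # path difference a x / d
    print(x,
          intensity_from_amplitudes(1.0, l1, l2, WAVELENGTH),
          intensity_closed_form(1.0, SLIT_SEPARATION, x, SCREEN_DISTANCE, WAVELENGTH))
```

The time-dependent phase 2πνt is common to both amplitudes and drops out of the modulus, which is why the default values of `nu` and `t` do not affect the result.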
[Fig. 3.1: double-slit arrangement, with labels S, S1, S2, a, d and x]
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\psi = 0 \tag{3.5}$$
where ψ stands for one of the fields. Such an equation is also satisfied by the
electric and magnetic fields as can be seen by applying the curl operator to
equations (1.60c) and (1.60d) and using the other Maxwell equations in
Eq. (1.60). Furthermore, the plane-wave solutions to this equation are of the
form
$$\psi(\mathbf{r}, t) = A e^{-2\pi i(\nu t - \mathbf{k}\cdot\mathbf{r})} \tag{3.6}$$
where ν is the frequency and $\mathbf{k} = (\nu/c)\,\mathbf{n}$, with n being a unit vector in the direction
of propagation. It is noted that hν is the energy of the photon and hk is its
momentum. Therefore, the wave function of the photon is of the form
$$\frac{1}{\hbar^2}\left(\frac{E^2}{c^2} - p^2\right)\psi = 0 \tag{3.8}$$
which is obviously true since for a photon (mass = 0), E² – p²c² = 0 (using the
notation $\mathbf{p}\cdot\mathbf{p} = \mathbf{p}^2 = p^2$).
Suppose we knew first about the photon and wanted the equation which
would determine its wave function. Then one could start with the valid relation
$$\left(\frac{1}{c^2}E^2 - p^2\right)\psi = 0 \tag{3.9}$$
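$$i\hbar\frac{\partial\psi(\mathbf{r},t)}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r},t) + V(\mathbf{r})\,\psi(\mathbf{r},t) \tag{3.23}$$
$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r},t) + V(\mathbf{r})\,\psi(\mathbf{r},t) = i\hbar\frac{\partial\psi(\mathbf{r},t)}{\partial t} = E\psi(\mathbf{r},t) \tag{3.24}$$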
Postulates 1 and 2 allow us to deduce average values of general dynamical
variables. Equation (3.24) on being multiplied by ψ*(r, t) and integrated over the
entire space and on rearrangement of terms leads to
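$$\int\psi^*(\mathbf{r},t)\,V(\mathbf{r})\,\psi(\mathbf{r},t)\,d^3r = \int\psi^*(\mathbf{r},t)\,i\hbar\frac{\partial}{\partial t}\psi(\mathbf{r},t)\,d^3r - \int\psi^*(\mathbf{r},t)\left(-\frac{\hbar^2}{2m}\nabla^2\right)\psi(\mathbf{r},t)\,d^3r \tag{3.25}$$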
The term on the left hand side, as seen from Eq. (3.21), is the average
potential energy. It is therefore reasonable to identify the terms on the right
hand side as the average total energy and the average kinetic energy:
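$$\langle E\rangle = \int\psi^*(\mathbf{r},t)\,i\hbar\frac{\partial}{\partial t}\psi(\mathbf{r},t)\,d^3r \tag{3.26}$$
$$\left\langle\frac{p^2}{2m}\right\rangle = \int\psi^*(\mathbf{r},t)\left(-\frac{\hbar^2}{2m}\nabla^2\right)\psi(\mathbf{r},t)\,d^3r \tag{3.27}$$
Now, since $(\mathbf{p}, E/c)$ transforms as a 4-vector, the relationships in Eqs. (3.10)
and (3.11) suggest that, in addition to Eq. (3.26),
$$\langle\mathbf{p}\rangle = \int\psi^*(\mathbf{r},t)\,(-i\hbar\nabla)\,\psi(\mathbf{r},t)\,d^3r \tag{3.28}$$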
These results are generalized in the following postulate.
Postulate 3: The average values of E and the dynamical variable F (p, r) are
given by
$$\langle E\rangle = \int\psi^*(\mathbf{r},t)\,i\hbar\frac{\partial}{\partial t}\psi(\mathbf{r},t)\,d^3r \tag{3.29}$$
dynamical observables that their average values must be real. It is easy to show
from Eq. (3.30) that both r and p have real average values and are acceptable
as dynamical observables. On the other hand, $x p_x$ is not an observable though
the angular momentum r × p can be shown to have a real average value and
hence is acceptable as a dynamical observable.
It is often convenient to work with the Fourier transform of the wave function
rather than with the wave function itself. Writing ψ (r, t) as
$$\psi(\mathbf{r},t) = \frac{1}{h^{3/2}}\int f(\mathbf{k},t)\exp(i\mathbf{k}\cdot\mathbf{r}/\hbar)\,d^3k \tag{3.31}$$
the inverse Fourier transform is
$$f(\mathbf{k},t) = \frac{1}{h^{3/2}}\int\psi(\mathbf{r},t)\exp(-i\mathbf{k}\cdot\mathbf{r}/\hbar)\,d^3r \tag{3.32}$$
The Fourier transform f (k, t) is called the wave function in the momentum
space. It is easy to show that
$$\int|\psi(\mathbf{r},t)|^2\,d^3r = \int|f(\mathbf{k},t)|^2\,d^3k \tag{3.33}$$
and that the average value of momentum given in Eq. (3.28) reduces to
$$\langle\mathbf{p}\rangle = \int|f(\mathbf{k},t)|^2\,\mathbf{k}\,d^3k \tag{3.34}$$
which justifies the definition of f (k, t) as the wave function in the momentum
space.
$$\psi = \sum_n a_n\phi_n \tag{3.40}$$
with time (except for the phase factor), this means that the eigenvalues of
B for these states do not change with time, and therefore B is conserved.
The solutions to the Schrödinger equation in some simple situations are now
discussed.
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi \tag{3.43}$$
Separating the variables, the solution to Eq. (3.43) can be written in the
form
ψ (r, t) = f (t) φ(r) (3.44)
Substituting this in Eq. (3.43) and dividing the equation by ψ (r, t) gives
$$i\hbar\frac{1}{f(t)}\frac{\partial f(t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{1}{\phi(\mathbf{r})}\nabla^2\phi(\mathbf{r}) \tag{3.45}$$
which can be satisfied only if both the sides are constant, say E. Then
$$i\hbar\frac{\partial f(t)}{\partial t} = Ef(t) \tag{3.46}$$
$$-\frac{\hbar^2}{2m}\nabla^2\phi(\mathbf{r}) = E\phi(\mathbf{r}) \tag{3.47}$$
Equations (3.46) and (3.47) are eigenvalue equations for the energy, E being
the energy eigenvalue and f (t), φ (r) being the corresponding eigenfunctions.
They describe a state with a well-defined energy E.
The solution to Eq. (3.46) is
$$f(t) = \exp(-iEt/\hbar) \tag{3.48}$$
except for an overall constant which will be included in φ(r). For solving
Eq. (3.47), once again a separable form is assumed for φ (r),
φ (r) = A (x) B (y) C(z) (3.49)
Substituting this in Eq. (3.47) and dividing by φ (r) gives
$$-\frac{\hbar^2}{2m}\left[\frac{1}{A(x)}\frac{d^2A(x)}{dx^2} + \frac{1}{B(y)}\frac{d^2B(y)}{dy^2} + \frac{1}{C(z)}\frac{d^2C(z)}{dz^2}\right] = E \tag{3.50}$$
For this relation to be valid, each of the three terms in Eq. (3.50) should be
a constant. Introducing constants kx, ky, kz one gets
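$$\frac{d^2A(x)}{dx^2} = -\frac{k_x^2}{\hbar^2}A(x) \tag{3.51}$$
$$\frac{d^2B(y)}{dy^2} = -\frac{k_y^2}{\hbar^2}B(y) \tag{3.52}$$
$$\frac{d^2C(z)}{dz^2} = -\frac{k_z^2}{\hbar^2}C(z) \tag{3.53}$$
with
$$E = \frac{1}{2m}\left(k_x^2 + k_y^2 + k_z^2\right) \tag{3.54}$$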
Solutions to these equations finally lead to
$$\psi(\mathbf{r},t) = \beta\exp\left[-\frac{i}{\hbar}(Et - \mathbf{k}\cdot\mathbf{r})\right] \tag{3.55}$$
with the constants E, k satisfying the condition in Eq. (3.54). The following
points should be noted about this solution:
1. Since the operators corresponding to energy and momentum, $i\hbar\,\partial/\partial t$ and
$-i\hbar\nabla$, operating on this solution give the wave function back but multiplied
by constants E and k respectively, the solutions describe a particle with
energy E and momentum k.
2. The solution is not normalizable [see Eq. (3.17)] since |ψ| = |β| and
∫ |ψ|² dV = ∞. Nevertheless, it can be used for describing relative
probabilities, the probability of finding the particle anywhere being the
same. The wave function can be interpreted as describing a beam of
noninteracting particles with momentum k, and with |β|² particles
per unit volume.
3. Since Eq. (3.43) is linear, any superposition of solutions in Eq. (3.55) is
also a solution of Eq. (3.43), i.e. the general solution can be written as
$$\psi(\mathbf{r},t) = \frac{1}{h^{3/2}}\int\exp\left[-\frac{i}{\hbar}(Et - \mathbf{k}\cdot\mathbf{r})\right]F(\mathbf{k})\,d^3k \tag{3.56}$$
with E given by Eq. (3.54).
$$\psi(\mathbf{r},t) = \frac{1}{h^{3/2}}\int f(\mathbf{k}-\mathbf{k}_0)\exp\left[-\frac{i}{\hbar}\left(\frac{k^2t}{2m} - \mathbf{k}\cdot\mathbf{r}\right)\right]d^3k \tag{3.57}$$
where f (k – k0) is significantly nonzero only in a small region about k ≈ k0.
There are some general properties of the wave packet which are demonstrated
here by taking the Gaussian form for f (k – k0). For
$$f(\mathbf{k}-\mathbf{k}_0) = \frac{1}{\pi^{3/4}(\hbar b)^{3/2}}\exp\left[-\frac{(\mathbf{k}-\mathbf{k}_0)^2}{2\hbar^2b^2}\right] \tag{3.58}$$
the wave packet ψ (r, t) is obtained from Eq. (3.57) by changing the variable of
integration to q = k – k0, and integrating
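$$\psi(\mathbf{r},t) = \exp\left[-\frac{i}{\hbar}\left(\frac{k_0^2}{2m}t - \mathbf{k}_0\cdot\mathbf{r}\right)\right]u(\mathbf{r},t) \tag{3.59}$$
$$u(\mathbf{r},t) = \frac{b^{3/2}}{\pi^{3/4}(1 + i\hbar tb^2/m)^{3/2}}\exp\left[-\frac{b^2}{2}\left(\mathbf{r} - \frac{\mathbf{k}_0}{m}t\right)^2\Big/\left(1 + i\hbar tb^2/m\right)\right] \tag{3.60}$$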
Thus, the wave packet is a product of a plane wave with momentum k0,
and an envelope which is peaked at r = (k0/m) t. The phase moves with velocity
k0/2m, which is called the phase velocity, and the envelope moves with velocity
k0/m, which is called the group velocity. Since the envelope determines the
location of the particle, it is the group velocity which corresponds to the classical
velocity of the particle.
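The factor of two between the two velocities follows from the free-particle relation E = k²/2m of Eq. (3.54). The sketch below assumes the usual definitions, phase velocity E/k and group velocity dE/dk (with k denoting the momentum, as in the text), and evaluates the derivative by a central difference. The numerical values of the mass and momentum are illustrative assumptions.

```python
def free_energy(k, m):
    """Free-particle energy E = k^2 / 2m, with k the momentum (Eq. 3.54 in one dimension)."""
    return k * k / (2.0 * m)


def phase_velocity(k, m):
    """Velocity of the phase of exp[-i(Et - k x)/hbar]: E / k."""
    return free_energy(k, m) / k


def group_velocity(k, m, rel_step=1e-6):
    """Velocity of the envelope: dE/dk, estimated by a central difference."""
    dk = rel_step * k
    return (free_energy(k + dk, m) - free_energy(k - dk, m)) / (2.0 * dk)


# Assumed illustrative values: an electron-like mass and a momentum of 1e-24 kg m/s
MASS = 9.109e-31
K0 = 1.0e-24

print(phase_velocity(K0, MASS), group_velocity(K0, MASS), K0 / MASS)
```

The group velocity coincides with the classical velocity k0/m, while the phase velocity is half of it.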
The wave packet brings out an important principle regarding the determination
of the position and momentum of a particle. It can be seen from Eq. (3.60) that
the wave packet at t = 0 is significantly nonzero only for |x| < 1/b, so the spread
in the x-component of position of the particle is
$$\Delta x \approx \frac{1}{b} \tag{3.61}$$
$$\Delta p_x \approx p\,\frac{w}{d} \tag{3.67}$$
where w is the width of the central fringe. This uncertainty results from the fact
that the particle may come to any point within this fringe. Since w = λd/a, and
p = h/λ by the de Broglie relation, we get
(∆x) (∆px) ≈ h (3.68)
which is the same as Eq. (3.63) in order of magnitude. This demonstration
brings out the fact that the Heisenberg uncertainty principle is essentially a
consequence of associating wave properties with the particles.
$$i\hbar\frac{\partial\psi(x,t)}{\partial t} = \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\psi(x,t) \tag{3.70}$$
As before, for states with energy E,
$$-\frac{\hbar^2}{2m}\frac{d^2\phi(x)}{dx^2} = [E - V(x)]\,\phi(x) \tag{3.72}$$
The solution for φ (x), for x < 0, is
$$\phi(x) = a_+e^{ipx} + a_-e^{-ipx}, \quad x < 0 \tag{3.73}$$
with
$$p = \frac{1}{\hbar}(2mE)^{1/2} \tag{3.74}$$
where the first term in the solution corresponds to a particle with momentum
ħp, and the second term to a particle with momentum −ħp. The solution for
x ≥ 0 is
$$\phi_r(x) = b_+e^{iqx} + b_-e^{-iqx}, \quad x \ge 0 \tag{3.75}$$
with
$$q = \frac{1}{\hbar}[2m(E - V)]^{1/2} \tag{3.76}$$
Now, since the potential is piece-wise continuous and finite, it follows from
the properties of the differential equation (3.72), that the wave function φ (x)
and its first derivative dφ/dx are continuous everywhere, in particular at x = 0.
Therefore, one has
a+ + a– = b+ + b– (3.77)
$$a_+ - a_- = \frac{q}{p}(b_+ - b_-) \tag{3.78}$$
The solutions are discussed separately for the two qualitatively different
cases, (i) E ≥ V, and (ii) E < V.
Case (i) For E ≥ V, it is assumed that the particle approaches the barrier from
the left and is either transmitted or reflected at x = 0. Hence, for x > 0, there is
only a wave function describing a particle moving to the right which implies that
b– = 0 (3.79)
Therefore Eqs. (3.77) and (3.78) give
$$b_+ = \frac{2p}{p+q}\,a_+, \qquad a_- = \frac{p-q}{p+q}\,a_+ \tag{3.80}$$
$$R = \left|\frac{a_-}{a_+}\right|^2 = \left(\frac{p-q}{p+q}\right)^2 \tag{3.81}$$
In terms of the refractive index n,
$$n = \frac{p}{q} \tag{3.82}$$
Eqs. (3.81) can be written as
$$T = \frac{4n}{(n+1)^2}, \qquad R = \left(\frac{n-1}{n+1}\right)^2 \tag{3.83}$$
which are the same as the classical transmission and reflection coefficients for
electromagnetic waves.
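A short numerical check of Eq. (3.83) is given below as an illustration. It is a sketch under stated assumptions: units are chosen so that m = ħ = 1 by default, the function name is hypothetical, and the case E = V (where q = 0) is excluded.

```python
import math


def step_coefficients(E, V, m=1.0, hbar=1.0):
    """Reflection and transmission coefficients of Eq. (3.83) for a step of height V, E > V."""
    if E <= V:
        raise ValueError("this sketch covers only E > V")
    p = math.sqrt(2.0 * m * E) / hbar          # Eq. (3.74)
    q = math.sqrt(2.0 * m * (E - V)) / hbar    # Eq. (3.76)
    n = p / q                                   # Eq. (3.82)
    reflection = ((n - 1.0) / (n + 1.0)) ** 2
    transmission = 4.0 * n / (n + 1.0) ** 2
    return reflection, transmission


for energy in (1.01, 1.5, 2.0, 10.0):
    R, T = step_coefficients(energy, 1.0)
    print(energy, R, T, R + T)
```

The sum R + T stays equal to one for every energy, and the reflection coefficient falls rapidly as E grows above V, although classically there would be no reflection at all for E > V.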
Case (ii) For E < V, q is imaginary. Writing
q = iα (3.84)
where
$$\alpha = \frac{1}{\hbar}[2m(V - E)]^{1/2} \tag{3.85}$$
the solution for x > 0, is
$$\phi_r(x) = b_+e^{-\alpha x} + b_-e^{\alpha x} \tag{3.86}$$
In order to keep the probability finite as x → ∞,
b– = 0 (3.87)
The continuity equations (3.77) and (3.78) then imply that
$$b_+ = \frac{2p}{p+i\alpha}\,a_+, \qquad a_- = \frac{p-i\alpha}{p+i\alpha}\,a_+ \tag{3.88}$$
1. Since $\int_{-\infty}^{\infty}|\psi|^2\,dx = \infty$, the wave function is not normalizable.
This wave function could have been obtained from Eqs. (3.73) and (3.86)
with b– = 0, by just requiring that the wave function vanishes at x = 0,
φ (0) = φr (0) = 0, but no further conditions on dφ/dx. Indeed this prescription
considerably simplifies the calculations whenever the potentials jump to
infinity.
5. The quantum mechanical phenomenon of a particle penetrating classically
forbidden barriers gives rise to an interesting observation of trapped
particles escaping through classically forbidden barriers. Consider a situation
[Fig. 3.2(b)] where the barrier exists only for 0 ≤ x ≤ d, i.e. V (x) = V for
0 ≤ x ≤ d and V (x) = 0 elsewhere. In this case, if a beam of particles is
incident from the left with energy E < V, the wave function is given by
$$\phi(x) = \begin{cases} a_+e^{ipx} + a_-e^{-ipx}, & x < 0 \\ b_+e^{-\alpha x} + b_-e^{\alpha x}, & 0 \le x \le d \\ c_+e^{ipx}, & x > d \end{cases} \tag{3.94}$$
where p and α are given in Eqs. (3.74) and (3.85) respectively. Continuity
of the wave function and its derivative at x = 0 and x = d, allows us to
determine a–, b+, b– and c+ in terms of a+. Since the particles can penetrate
the forbidden barrier, in general b+, b– and c+ are nonzero. Thus, the particles
can cross a barrier even if classically, the energy is insufficient to pass
over the barrier, and the probability of transmission is given by the ratio
|c+|²/|a+|². This effect is termed tunnelling and provides a satisfactory
explanation for the decay of unstable nuclei (e.g. ²³⁵U, etc.) as a
tunnelling of trapped particles through a potential barrier.
A scanning tunneling microscope (STM) is an instrument for imaging
surfaces at the atomic level. It is based on the concept of quantum
tunneling. When a conducting tip is brought very near to the surface to
be examined, a bias (voltage difference) applied between the two can
allow electrons to tunnel through the vacuum between them. The resulting
tunneling current is a function of tip position, applied voltage, and the
local density of states (LDOS) of the sample. Information is acquired by
monitoring the current as the tip’s position scans across the surface. For
an STM, good resolution is considered to be 0.1 nm lateral resolution and
0.01 nm depth resolution. With this resolution, individual atoms within
materials are routinely imaged and manipulated. (Source : Wikipedia)
[Fig. 3.3: Schematic of a scanning tunneling microscope, showing the piezoelectric tube with electrodes and its control voltages, the tip, the tunnelling voltage, the tunneling current amplifier, the distance control and scanning unit, and data processing and display]
$$p = \frac{1}{\hbar}(2mE)^{1/2} \tag{3.97}$$
The wave function is zero for x < 0 or x > l since the potential is infinite in
this region. Since the potential jumps to infinity at x = 0 and x = l, the boundary
conditions as discussed in Sec. 3.7 are that the wave function should vanish at
x = 0 and x = l. This implies that
sin α = 0, (3.98)
pl = nπ, n = 1, 2, ... (3.99)
Therefore, the solutions are
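$$\phi_n(x) = \left(\frac{2}{l}\right)^{1/2}\sin\left(\frac{n\pi}{l}x\right), \quad n = 1, 2, \ldots \tag{3.100}$$
$$E_n = \frac{\hbar^2\pi^2}{2ml^2}\,n^2 \tag{3.101}$$
where a = (2/l)^{1/2} has been used as required by the normalization condition in
Eq. (3.19)
$$|a|^2\int_0^l\sin^2\left(\frac{n\pi}{l}x\right)dx = 1 \tag{3.102}$$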
2. The discreteness of the energy levels is significant only for small m and l.
For example, if m ≈ $10^{-3}$ kg and l ≈ 0.1 m, the separation between the
energy levels is of the order of $10^{-62}$ J which is quite negligible. On the
other hand, for an electron in an atom, m ≈ $10^{-30}$ kg and l ≈ $10^{-10}$ m, so that
∆En ~ (60n) eV and the discreteness becomes important.
3. It is seen that the states with n ≥ 2 have nodes inside the box. Since
probability density is given by |φn(x)|2, this means that there are some
regions where the particle will not be found, which is totally incompatible
with the classical ideas of trajectories.
4. If the potential in the region x < 0 and x > l, is not infinite but finite, the
wave function can penetrate into this region. Therefore, it is not forced to
be zero at x = 0 or x = l. Hence, the wave function varies more gently
inside the region 0 ≤ x ≤ l, and the energies, which are related to the
second derivative of the wave function are lower than in the case where
the potential is infinite for x < 0 and x > l.
5. It is observed that
$$\int_0^l\phi_n(x)\,\phi_{n'}(x)\,dx = 0, \quad \text{for } n \ne n' \tag{3.103}$$
$$\int_0^l\phi_n(x)\,\phi_n(x)\,dx = 1 \tag{3.104}$$
$$\int_0^l\phi_n(x)\,\phi_{n'}(x)\,dx = \delta_{n,n'} \tag{3.105}$$
where the Kronecker delta δn,n′ is 1 for n = n′ and zero otherwise. Thus,
these states are orthonormal. It can also be shown that any general state
of a particle in the box can be written as a linear combination of the
energy eigenstates, i.e.
$$\psi(x) = \sum_{n=1}^{\infty}a_n\phi_n(x), \qquad \sum_{n=1}^{\infty}|a_n|^2 = 1 \tag{3.106}$$
which means that the eigenstates φn (x) are complete. The orthonormality
and completeness are important properties associated with the eigenstates
of any physical observable (see Sec. 3.4).
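The orders of magnitude quoted in point 2 and the orthonormality relations of point 5 can be reproduced with a few lines of code. This is a minimal sketch: the constants are rounded SI values, the masses and lengths are the round numbers used in the text, and the overlap integrals are evaluated by a simple midpoint rule (an assumed choice of quadrature).

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s (rounded)
EV = 1.602176634e-19     # one electron volt in joules


def box_energy(n, m, l):
    """E_n of Eq. (3.101) for mass m (kg) and box length l (m)."""
    return (HBAR * math.pi * n) ** 2 / (2.0 * m * l ** 2)


def level_spacing(n, m, l):
    """Separation between the levels n + 1 and n."""
    return box_energy(n + 1, m, l) - box_energy(n, m, l)


def phi(n, x, l):
    """Normalized eigenfunction of Eq. (3.100)."""
    return math.sqrt(2.0 / l) * math.sin(n * math.pi * x / l)


def overlap(n1, n2, l=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of phi_n1 * phi_n2 over the box."""
    h = l / steps
    return h * sum(phi(n1, (i + 0.5) * h, l) * phi(n2, (i + 0.5) * h, l)
                   for i in range(steps))


print(level_spacing(1, 1e-3, 0.1))            # macroscopic particle, in joules
print(level_spacing(1, 1e-30, 1e-10) / EV)    # electron-sized box, in eV
print(overlap(1, 1), overlap(1, 2), overlap(2, 3))
```

The spacing between adjacent levels is (2n + 1) times the ground-state energy, so it grows linearly with n, consistent with the estimate ∆En ~ (60n) eV.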
A quantum well laser is a laser diode in which the active region of the
device is so narrow that quantum confinement occurs. The wavelength of the
light emitted by a quantum well laser is determined by the width of the active
region rather than just the bandgap of the material from which it is constructed.
This means that much shorter wavelengths can be obtained from quantum well
lasers than from conventional laser diodes using a particular semiconductor
material. The efficiency of a quantum well laser is also greater than a conventional
laser diode.
(Source : Wikipedia)
$$\frac{d^2\eta(x)}{dx^2} - 2\alpha^2x\,\frac{d\eta(x)}{dx} = \left(\alpha^2 - \frac{2mE}{\hbar^2}\right)\eta(x) \tag{3.109}$$
Substitution of a series solution for η(x) into Eq. (3.109) and equating the
coefficients of xk gives
$$\eta(x) = \sum_{k=0}^{\infty}b_kx^k$$
$$(k+2)(k+1)\,b_{k+2} = \left[2\alpha^2k + \alpha^2 - \frac{2mE}{\hbar^2}\right]b_k \tag{3.110}$$
For a general value of E, the infinite series for η(x) gives an asymptotically
increasing solution φ (x) ~ exp (α²x²/2) which is not normalizable. However, for
some special values of E = (n + 1/2) α²ħ²/m, n a non-negative integer, the series in
Eq. (3.110) terminates at k = n and we get normalizable wave functions in
terms of the Hermite polynomials $H_n(\alpha x)$:
$$\phi_n(x) = \pi^{-1/4}\left(\frac{\alpha}{2^n\,n!}\right)^{1/2}\exp(-\alpha^2x^2/2)\,H_n(\alpha x)$$
$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega, \quad n = 0, 1, 2, \ldots \tag{3.111}$$
where ω = (k/m)^{1/2}. It is observed that again the ground state energy is not zero.
Its value of ½ħω is called the zero-point energy.
The first three Hermite polynomials are
H0 (αx) = 1
H1 (αx) = 2αx (3.112)
H2 (αx) = 4α²x² – 2
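The termination of the series can be made concrete by running the recursion of Eq. (3.110) directly. The sketch below sets α = 1 and 2mE/ħ² = 2n + 1, so that the recursion reads (k + 2)(k + 1) b_{k+2} = 2(k − n) b_k, and rescales the result so that the leading coefficient is 2^n, which is the usual convention for H_n. The function names and the seeding of the recursion are illustrative choices.

```python
from fractions import Fraction


def series_coefficients(n):
    """Coefficients b_0..b_n of eta(x) from Eq. (3.110) with alpha = 1, 2mE/hbar^2 = 2n + 1.

    The recursion is seeded with b_0 = 1 for even n and b_1 = 1 for odd n;
    it gives b_{n+2} = 0, so the series stops at k = n.
    """
    b = [Fraction(0)] * (n + 1)
    b[n % 2] = Fraction(1)
    for k in range(n % 2, n - 1, 2):
        b[k + 2] = Fraction(2 * (k - n), (k + 2) * (k + 1)) * b[k]
    return b


def scaled_to_hermite(n):
    """Rescale the series so that the coefficient of x^n equals 2**n."""
    b = series_coefficients(n)
    factor = Fraction(2 ** n) / b[n]
    return [int(c * factor) for c in b]


for n in range(5):
    # coefficients are listed from the constant term upwards
    print(n, scaled_to_hermite(n))
```

For n = 0, 1, 2 the rescaled coefficients reproduce the three polynomials listed in Eq. (3.112).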
The harmonic oscillator problem can be solved more elegantly by using
operator algebra. Defining
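$$a = \left(\frac{m\omega}{2\hbar}\right)^{1/2}\left(x + \frac{\hbar}{m\omega}\frac{\partial}{\partial x}\right), \qquad a^\dagger = \left(\frac{m\omega}{2\hbar}\right)^{1/2}\left(x - \frac{\hbar}{m\omega}\frac{\partial}{\partial x}\right) \tag{3.113}$$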
it can easily be shown that
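$$\begin{aligned} aa^\dagger - a^\dagger a &= 1 \\ aH - Ha &= \hbar\omega\,a \\ a^\dagger H - Ha^\dagger &= -\hbar\omega\,a^\dagger \end{aligned} \tag{3.114}$$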
with the Hamiltonian H (i.e. energy) being
$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}kx^2 \tag{3.115}$$
Therefore if ψ0 (x) is the ground state with energy E0, then using Eq. (3.114)
$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega \tag{3.120}$$
The normalization is obtained by the repeated use of the first relation in
Eq. (3.114) and Eq. (3.118). It is clear from the definition of a† that ψn (x) are
alternately even and odd functions of x.
$$E = E_n + \lambda\,\frac{\int\phi_n^*\,V\,\psi\,d\tau}{\int\phi_n^*\,\psi\,d\tau} \tag{3.124}$$
$$\psi = \sum_i a_i\,\phi_n^{(i)} \tag{3.126}$$
where $\sum_i|a_i|^2 = 1$, Eq. (3.124) gives
$$\sum_j\left(\lambda\int\phi_n^{(i)*}\,V\,\phi_n^{(j)}\,d\tau\right)a_j = (E - E_n)\,a_i \tag{3.127}$$
Thus, (E – En) and ai are obtained from this set of equations. For example,
if the degeneracy is of order two, i.e. i = 1, 2, the solutions are
E = En + λ (V11 + xV12) (3.128)
where x = a2/a1 is
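$$= \sum_{i,j}\left[r_ip_jr_ip_j - r_ip_jr_jp_i\right]$$
and with $\mathbf{r}\cdot\mathbf{p} = -i\hbar\,r\,\partial/\partial r$, Eq. (3.133) becomes
$$L^2 = r^2p^2 + \hbar^2\left(2r\frac{\partial}{\partial r} + r^2\frac{\partial^2}{\partial r^2}\right) \tag{3.134}$$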
Using this relation the kinetic energy T can be written as
$$T = \frac{1}{2m}p^2 = -\frac{\hbar^2}{2mr^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{L^2}{2mr^2} \tag{3.135}$$
which shows that the angular momentum is an important term in the kinetic
energy.
The expressions for the angular momentum operators in terms of spherical
coordinates are obtained from Eq. (3.131) as
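$$\begin{aligned} L_x &= i\hbar\left(\sin\phi\,\frac{\partial}{\partial\theta} + \cos\phi\cot\theta\,\frac{\partial}{\partial\phi}\right) \\ L_y &= -i\hbar\left(\cos\phi\,\frac{\partial}{\partial\theta} - \sin\phi\cot\theta\,\frac{\partial}{\partial\phi}\right) \\ L_z &= -i\hbar\,\frac{\partial}{\partial\phi} \end{aligned} \tag{3.136}$$
and
$$L^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right] \tag{3.137}$$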
The wave functions corresponding to well-defined values of L2, satisfy the
equation
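$$-\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right]Y(\theta,\phi) = \lambda\,Y(\theta,\phi) \tag{3.138}$$
$$-\frac{1}{F(\phi)}\frac{\partial^2F}{\partial\phi^2} = \frac{\sin\theta}{P(\theta)}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}P(\theta)\right) + \frac{\lambda}{\hbar^2}\sin^2\theta \tag{3.140}$$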
Since the two sides depend on different variables, each must be a constant,
say m2, so that
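$$\frac{d^2F(\phi)}{d\phi^2} + m^2F(\phi) = 0 \tag{3.141}$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP(\theta)}{d\theta}\right) - \frac{m^2}{\sin^2\theta}P(\theta) + \frac{\lambda}{\hbar^2}P(\theta) = 0 \tag{3.142}$$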
The solutions to the Eq. (3.141) are
$$F(\phi) = e^{im\phi} \tag{3.143}$$
However, if the condition that the wave function at every physical point
must be single-valued is imposed, then F(φ) = F(φ + 2π) which means that the
values of m are restricted to m = 0, ± 1, ±2 etc. It is easily seen that
$$-i\hbar\,\frac{\partial}{\partial\phi}F(\phi) = m\hbar\,F(\phi), \quad m = 0, \pm 1, \pm 2, \ldots \tag{3.144}$$
$$(k+1)(k+2)\,b_{k+2} = \left[k^2 + k - \frac{\lambda}{\hbar^2}\right]b_k$$
For an arbitrary value of λ, the series diverges at v = ± 1. However, for
λ = l (l + 1)ħ², l a non-negative integer, the series terminates at k = l and we get
well-behaved solutions:
$$\lambda = l(l+1)\hbar^2, \quad l = 0, 1, 2, \ldots \tag{3.146}$$
$$P_l(v) = \frac{1}{2^l\,l!}\frac{d^l}{dv^l}(v^2 - 1)^l, \quad v = \cos\theta$$
These are Legendre Polynomials of order l, the first few of them being
P0 (cos θ) = 1
P1 (cos θ) = cos θ (3.147)
$$P_2(\cos\theta) = \frac{1}{2}(3\cos^2\theta - 1)$$
The solutions for m ≠ 0 are somewhat more complicated, and for l ≥ m ≥ 0,
are given by
$$P_l^m(v) = (1 - v^2)^{m/2}\frac{d^m}{dv^m}P_l(v), \quad v = \cos\theta \tag{3.148}$$
called the associated Legendre functions. Combining these solutions with those
in Eq. (3.143) gives the solutions to Eq. (3.138) as
$$Y_l^m(\theta,\phi) = \left[\frac{(2l+1)}{4\pi}\frac{(l-m)!}{(l+m)!}\right]^{1/2}(-1)^m\,e^{im\phi}\,P_l^m(\cos\theta) \tag{3.149}$$
with λ = l (l + 1)ħ², l and m being integers, and l ≥ m. $Y_l^m(\theta,\phi)$ are called
spherical harmonics, and are defined for negative integers m by the relation
$$Y_l^m = (-1)^m\,(Y_l^{-m})^* \tag{3.150}$$
Their normalization is chosen such that they are orthonormal,
$$\int Y_l^{m*}(\theta,\phi)\,Y_{l'}^{m'}(\theta,\phi)\,d\cos\theta\,d\phi = \delta_{l,l'}\,\delta_{m,m'} \tag{3.151}$$
They are simultaneous eigenfunctions of $L_z$ and $L^2$ since
$$L_z\,Y_l^m(\theta,\phi) = m\hbar\,Y_l^m(\theta,\phi), \quad |m| \le l$$
Apart from playing an important role in the discussion of the kinetic energy
in spherical coordinates [Eq. (3.135)], the angular momentum operator plays a
significant role in determining the rotational energy levels of a rigid rotator. For
example, the rotational energy levels of a di-atomic molecule are given by the
Hamiltonian
$$H = \frac{1}{2I}L^2 \tag{3.155}$$
where I is the moment of inertia, and the corresponding energy levels are given
by
$$E = \frac{1}{2I}\,l(l+1)\hbar^2, \quad l = 0, 1, \ldots \tag{3.156}$$
3.12 EXAMPLES
Here some important properties of quantum mechanical systems and their
applications are discussed.
Example 1
Since indeterminacy is not an essential part of classical mechanics, it is suggestive
that classical measurements may be related to the averages of quantum
mechanical measurements. This is illustrated by Ehrenfest’s theorem.
Consider the time-derivative of the average position given by
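$$\frac{d}{dt}\langle\mathbf{r}\rangle = \frac{d}{dt}\int\psi^*\,\mathbf{r}\,\psi\,d^3r = \int\psi^*\,\mathbf{r}\,\frac{\partial\psi}{\partial t}\,d^3r + \int\frac{\partial\psi^*}{\partial t}\,\mathbf{r}\,\psi\,d^3r \tag{3.157}$$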
Using the Schrödinger equation, and cancelling the potential energy terms
(potential is real),
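$$\frac{d}{dt}\langle\mathbf{r}\rangle = \frac{i\hbar}{2m}\left[\int\psi^*\,\mathbf{r}\,\nabla^2\psi\,d^3r - \int(\nabla^2\psi^*)\,\mathbf{r}\,\psi\,d^3r\right] \tag{3.158}$$
Integrating by parts gives
$$\frac{d}{dt}\langle\mathbf{r}\rangle = \frac{1}{m}\int\psi^*\,(-i\hbar\nabla)\,\psi\,d^3r = \frac{\langle\mathbf{p}\rangle}{m} \tag{3.159}$$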
which is analogous to the classical result that the momentum is the product of
mass and velocity. Proceeding in a similar way, it can be shown that
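$$\frac{d\langle\mathbf{p}\rangle}{dt} = -i\hbar\left[\int\psi^*\,\nabla\frac{\partial\psi}{\partial t}\,d^3r + \int\frac{\partial\psi^*}{\partial t}\,\nabla\psi\,d^3r\right] \tag{3.160}$$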
which on using the Schrödinger equation once again and integrating by parts,
leads to
$$\frac{d\langle\mathbf{p}\rangle}{dt} = -\int\psi^*\,(\nabla V)\,\psi\,d^3r = \langle-\nabla V\rangle \tag{3.161}$$
This relation is analogous to Newton’s second law in classical mechanics.
It is to be noted that if the uncertainties in the values of the various dynamical
quantities can be neglected, Eqs. (3.159) and (3.161) represent the classical
behaviour of particles in terms of approximate trajectories.
Example 2
The angular momentum operators provide an interesting illustration of the
properties of hermitian operators. It follows from Eq. (3.131) or from Eqs. (3.136)
and (3.137) that the angular momentum operators satisfy the commutation
properties
[Lx, L2] = [Ly, L2] = [Lz, L2] = 0 (3.162)
but
Example 3
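Consider the tunnelling of particles across a barrier potential of height V and
width d. Imposing the conditions of continuity of the wave function in Eq. (3.94),
and its derivative at x = 0 and x = d,
$$\begin{aligned} a_+ + a_- &= b_+ + b_- \\ a_+ - a_- &= \frac{i\alpha}{p}(b_+ - b_-) \\ b_+e^{-\alpha d} + b_-e^{\alpha d} &= c_+e^{ipd} \\ b_+e^{-\alpha d} - b_-e^{\alpha d} &= -\frac{ip}{\alpha}\,c_+e^{ipd} \end{aligned} \tag{3.167}$$
Expressing c+ in terms of a+, we get
$$\frac{a_+}{c_+} = \frac{1}{4}e^{ipd}\left[\left(1 + \frac{i\alpha}{p}\right)\left(1 - \frac{ip}{\alpha}\right)e^{\alpha d} + \left(1 - \frac{i\alpha}{p}\right)\left(1 + \frac{ip}{\alpha}\right)e^{-\alpha d}\right] \tag{3.168}$$
$$T \approx \frac{16\alpha^2p^2}{(\alpha^2 + p^2)^2}\,e^{-2\alpha d} \tag{3.170}$$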
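The quality of the approximation in Eq. (3.170) can be examined by comparing it with the transmission probability |c+|²/|a+|² computed directly from Eq. (3.168). The following sketch treats p and α as given wavenumbers in arbitrary inverse-length units; the numerical values are illustrative assumptions.

```python
import cmath
import math


def transmission_exact(p, alpha, d):
    """|c+ / a+|^2, with a+/c+ taken from Eq. (3.168)."""
    a_over_c = 0.25 * cmath.exp(1j * p * d) * (
        (1 + 1j * alpha / p) * (1 - 1j * p / alpha) * math.exp(alpha * d)
        + (1 - 1j * alpha / p) * (1 + 1j * p / alpha) * math.exp(-alpha * d)
    )
    return 1.0 / abs(a_over_c) ** 2


def transmission_thick_barrier(p, alpha, d):
    """Eq. (3.170), expected to hold when alpha * d is much larger than one."""
    prefactor = 16.0 * alpha ** 2 * p ** 2 / (alpha ** 2 + p ** 2) ** 2
    return prefactor * math.exp(-2.0 * alpha * d)


for width in (0.5, 1.0, 2.0, 5.0):
    exact = transmission_exact(1.0, 1.0, width)
    approx = transmission_thick_barrier(1.0, 1.0, width)
    print(width, exact, approx, approx / exact)
```

For thin barriers the approximate expression overestimates the transmission (it can even exceed one), while for αd of order five or more the two values agree to better than one part in a thousand.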
Example 4
Bohr’s correspondence principle states that a quantum system tends (in a
particular sense) to its classical analogue, for large quantum numbers. This
is demonstrated for a particle in a box.
For a particle in a one-dimensional box of length l, the probability of finding it in
the region B ≤ x ≤ B + b, (see Sec. 3.8), is
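$$P_b = \int_B^{B+b}\frac{2}{l}\sin^2\left(\frac{n\pi}{l}x\right)dx = \frac{b}{l} - \frac{1}{2n\pi}\left[\sin\frac{2n\pi x}{l}\right]_B^{B+b} \to \frac{b}{l} \quad \text{for } n \to \infty \tag{3.171}$$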
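The approach to the classical value b/l can be seen by evaluating the closed form of Eq. (3.171) for increasing n. The sketch below uses a box of unit length and an arbitrarily chosen interval; both are illustrative assumptions.

```python
import math


def probability_in_interval(n, B, b, l=1.0):
    """Closed form of Eq. (3.171): probability of finding the particle in [B, B + b]."""
    upper = math.sin(2.0 * n * math.pi * (B + b) / l)
    lower = math.sin(2.0 * n * math.pi * B / l)
    return b / l - (upper - lower) / (2.0 * n * math.pi)


for n in (1, 2, 10, 100, 1000):
    print(n, probability_in_interval(n, 0.2, 0.3))
```

The oscillating correction is bounded by 1/(nπ), so the probability differs from the classical value b/l = 0.3 by at most that amount.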
Example 5
As an application of perturbation theory, consider a particle of charge q, in the
presence of a constant electric field E, inside a 3-dimensional box of dimensions
(lx) × (ly) × (lz).
In the absence of the electric field, the wave functions and the energies are
[see Eqs. (3.100) and (3.101)]
φn (x, y, z) = φnx (x) φny (y) φnz (z) (3.172)
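$$E_n = \frac{\hbar^2\pi^2}{2m}\left(\frac{n_x^2}{l_x^2} + \frac{n_y^2}{l_y^2} + \frac{n_z^2}{l_z^2}\right), \quad n_x = 1, 2, \text{etc.} \tag{3.173}$$
where
$$\phi_{n_x}(x) = \left(\frac{2}{l_x}\right)^{1/2}\sin\left(\frac{n_x\pi}{l_x}x\right), \quad 0 \le x \le l_x; \qquad \phi_{n_x}(x) = 0 \ \text{for } x < 0 \text{ or } x > l_x \tag{3.174}$$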
and similar expressions for φny (y) and φnz (z). If a weak constant electric
field E is introduced, the additional potential energy is
V= –qE⋅r (3.175)
The change in the energy due to this term is given by perturbation theory
(see Sec. 3.10) as
E – En ≈ – q ∫ |φn (x, y, z)|2 E ⋅ r dτ (3.176)
$$= -\frac{1}{2}\,q\,(E_xl_x + E_yl_y + E_zl_z) \tag{3.177}$$
Thus, to the leading order, all the energy levels are shifted by the same
amount.
Example 6
For the 3-dimensional harmonic oscillator, the Hamiltonian
$$H = \frac{1}{2m}p^2 + \frac{1}{2}kr^2 \tag{3.178}$$
is the sum of 1-dimensional Hamiltonians in the three directions. Hence the
wave function is a product of the three wave functions,
ψ (x, y, z) = ψnx (x) ψny (y) ψnz (z) (3.179)
where ψnx (x), etc. are given in Eq. (3.111), and the energy is
$$E = \left(n_x + n_y + n_z + \frac{3}{2}\right)\hbar\omega \tag{3.180}$$
These solutions can be written in terms of spherical coordinates so as to
exhibit the angular momentum content of the states. For example, the ground
state is
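$$\psi(r,\theta,\phi) = \frac{2\alpha^{3/2}}{\pi^{1/4}}\,Y_0^0(\theta,\phi)\exp(-\alpha^2r^2/2), \qquad E = \frac{3}{2}\hbar\omega \tag{3.181}$$
while the state with nx = 1, ny = nz = 0 can be written as
$$\psi(r,\theta,\phi) = \frac{2\alpha^{5/2}}{\pi^{1/4}\,3^{1/2}}\left[Y_1^{-1}(\theta,\phi) - Y_1^1(\theta,\phi)\right]r\exp(-\alpha^2r^2/2), \qquad E = \frac{5}{2}\hbar\omega \tag{3.182}$$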
PROBLEMS
1. A one-dimensional wave packet has the form ψ(x) = 0 for |x| > a, and
ψ(x) = $(2a)^{-1/2}$ for |x| ≤ a. What is the wave function in the momentum
space? Demonstrate the uncertainty principle for this wave packet.
2. For a particle coming from the left with energy E, and a potential changing
from V (x) = 0 for x < 0 to V (x) = – V0 for x ≥ 0, obtain the transmission
and reflection coefficients T and R respectively. Show that R + T = 1.
3. Show that the frequency of radiation emitted when the particle inside a
one-dimensional box undergoes a transition from (n + 1) state to n state,
tends to the classical frequency of motion inside the box, for n → ∞. This
is another illustration of Bohr’s correspondence principle.
4. Consider a wave function $Ae^{-r/a}$ for the ground state of the hydrogen
atom, r being the separation between the electron and the proton. Determine
A, a, and the ground state energy. What are the classical and quantum
mechanical probabilities of finding the electron at a separation greater
than 2a?
5. If a three-dimensional harmonic oscillator has a solution of the form
$A\,Y_1^0(\theta,\phi)\,r\,e^{-ar^2}$, determine a, A, and the energy in terms of mass and
force-constant of the oscillator.
6. For a particle inside a one-dimensional square well defined by V (x) = 0
for |x| ≥ a and V(x) = –V0 for |x| < a, obtain a relationship between
the binding energy, a, and V0. Show that for V0 → 0, there is a bound state
with energy E → $-2ma^2V_0^2/\hbar^2$ (this is a shallow bound state in the sense
that E/V0 → 0 as V0 → 0).
7. For a particle in a one-dimensional box, obtain the standard deviations
σ(x) and σ(p) for position and momentum respectively. Show that
$\sigma(x)\,\sigma(p) = \hbar\left(\frac{n^2\pi^2}{12} - \frac{1}{2}\right)^{1/2}$, and that it is greater than ħ/2.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 101
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_4
In this chapter, the one-electron atom is analysed within the framework of wave
mechanics. It is the ability of quantum mechanics to describe the detailed
properties of the one-electron atom which has, more than anything else,
established the essential validity of quantum mechanical ideas, at least as a
calculational tool for describing small-distance phenomena.
The wave functions and the energy levels of the nonrelativistic one-electron
atom are first obtained. The corrections due to spin-orbit interaction and other
relativistic effects are then introduced perturbatively. Together, these results
provide a very satisfactory description of the one-electron energy levels including
the fine structure. Finally, the effect of the nuclear spin on the atomic energy
levels is discussed and a brief introduction to the formal description of spin-1/2
particles is given.
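$$E = \frac{1}{2}m_e\dot{\mathbf{r}}_e^2 + \frac{1}{2}m_n\dot{\mathbf{r}}_n^2 - \frac{Ze^2}{4\pi\varepsilon_0|\mathbf{r}_e - \mathbf{r}_n|} \tag{4.1}$$
In the centre of mass frame defined by Eq. (2.54), Eq. (4.1) has the form
$$E = \frac{p^2}{2m_r} - \frac{Ze^2}{4\pi\varepsilon_0r} \tag{4.2}$$
where
$$\mathbf{r} = \mathbf{r}_e - \mathbf{r}_n \tag{4.3}$$
$$\mathbf{p} = m_r\dot{\mathbf{r}} \tag{4.4}$$
$$m_r = \frac{m_em_n}{m_e + m_n} \tag{4.5}$$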
The Schrödinger equation follows from Eq. (4.2). For states with well-defined
energy E, one can write the wave function in the form
$$\psi(\mathbf{r},t) = \phi(\mathbf{r})\exp(-iEt/\hbar) \tag{4.6}$$
with φ (r) satisfying the time-independent Schrödinger equation
$$-\frac{\hbar^2}{2m_r}\nabla^2\phi(\mathbf{r}) - \frac{Ze^2}{4\pi\varepsilon_0r}\,\phi(\mathbf{r}) = E\phi(\mathbf{r}) \tag{4.7}$$
where the first term represents the kinetic energy. Because the potential is a
function only of r, it is preferable to write the Laplacian operator in terms of
spherical coordinates. It is particularly convenient to use the expression in
Eq. (3.135) in terms of which the Schrödinger equation becomes
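$$-\frac{\hbar^2}{2m_rr^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\phi(\mathbf{r})\right) + \frac{L^2}{2m_rr^2}\,\phi(\mathbf{r}) - \frac{Ze^2}{4\pi\varepsilon_0r}\,\phi(\mathbf{r}) = E\phi(\mathbf{r}) \tag{4.8}$$
with the angular momentum term L² given by Eq. (3.137).
The solutions to Eq. (4.8) in the factorizable form can be written as
$$\phi(\mathbf{r}) = R(r)\,Y(\theta,\phi) \tag{4.9}$$
Dividing Eq. (4.8) by φ (r) leads to
$$\frac{1}{R(r)}\left[-\frac{\hbar^2}{2m_rr^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) - \frac{Ze^2}{4\pi\varepsilon_0r} - E\right]R(r) = -\frac{1}{2m_rr^2\,Y(\theta,\phi)}\,L^2\,Y(\theta,\phi) \tag{4.10}$$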
which implies that, since the left hand side is independent of θ and φ, Y (θ, φ)
must satisfy the eigenvalue equation
L2Y(θ, φ) = λY (θ, φ) (4.11)
The solutions to this equation were discussed in Sec. 3.11, and are the
spherical harmonics Ylm (θ, φ) given in Eq. (3.149), which satisfy the equations:
$$L_z\,Y_l^m(\theta,\phi) = m\hbar\,Y_l^m(\theta,\phi), \quad m = 0, \pm 1, \ldots, \pm l$$
$$-\frac{\hbar^2}{2m_r}\left[\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR(r)}{dr}\right) - \frac{l(l+1)}{r^2}\,R(r)\right] - \frac{Ze^2}{4\pi\varepsilon_0r}\,R(r) = ER(r) \tag{4.13}$$
There are two important classes of solutions to the radial equation. It is
found that solutions exist for all positive values of E. They exhibit oscillatory
behaviour for r → ∞ and are not normalizable. These solutions can be used to
describe a beam of particles scattered by the Coulomb potential, and lead to
Rutherford scattering. The solutions which are of greater interest are the ones
for negative E which correspond to bound state solutions. The steps followed in
obtaining the negative energy solutions are as follows:
1. Obtain the asymptotic behaviour of R(r), which is finite for r → ∞.
It is
2. Define
3. Impose the condition that the solution for u(r) does not alter the asymptotic
behaviour of R(r). This constraint on the asymptotic behaviour leads to
the result that the series in Eq. (4.16) must terminate for a finite value of
k, say k = s + p, p is an integer, i.e. bk = 0 for k > (s + p).
Substituting the above expressions in Eq. (4.13), and equating the coefficients
of the same powers of r, in particular of rk–2, gives
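$$b_k\,\frac{\hbar^2}{2m_r}\left[l(l+1) - k(k+1)\right] = b_{k-1}\left[\frac{Ze^2}{4\pi\varepsilon_0} - \frac{\hbar^2k}{m_r}\left(-\frac{2m_rE}{\hbar^2}\right)^{1/2}\right] \tag{4.17}$$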
Since $b_{s-1} = 0$, we have s = l or s = −l − 1. For l ≠ 0, the s = −l − 1 solutions
are not normalizable and hence are discarded. For l = 0, the first term in
Eq. (4.16) for the s = −l − 1 solution is $b_{-1} r^{-1}$. However, Eq. (4.17) for k = 0
gives $b_{-1} = 0$, which is inconsistent. Hence, only the s = l solution need be
considered. The requirement that the series terminates, i.e.
$b_k = 0$ for k = l + p + 1, p ≥ 0, then leads to
$$\frac{m_r Ze^2}{4\pi\varepsilon_0\hbar^2} = (l + p + 1)\left(-\frac{2m_rE}{\hbar^2}\right)^{1/2} \tag{4.18}$$
$$n = l + p + 1, \quad p = 0, 1, \ldots$$
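with l + p being the highest power of r in the series solution for u(r). The wave
functions corresponding to the solutions are related to Laguerre polynomials,
and are given by
$$R_{n,l}(r) = -\left[\left(\frac{2Z}{na_1}\right)^3\frac{(n-l-1)!}{2n\,[(n+l)!]^3}\right]^{1/2} e^{-\rho/2}\,\rho^l\, L_{n+l}^{2l+1}(\rho) \tag{4.20}$$
$$\rho = \frac{2Zr}{na_1}, \qquad a_1 = \frac{4\pi\varepsilon_0\hbar^2}{m_r e^2}$$
where $a_1$ is the radius of the first Bohr orbit for Z = 1, and $L_{n+l}^{2l+1}$ are the
associated Laguerre polynomials given by
$$L_j^i(\rho) = \frac{d^i}{d\rho^i}L_j(\rho) \tag{4.21}$$
$L_j(\rho)$ being the Laguerre polynomials,
$$L_j(\rho) = e^{\rho}\frac{d^j}{d\rho^j}\left(\rho^j e^{-\rho}\right) \tag{4.22}$$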
Here $R_{n,l}(r)$ are normalized to satisfy the condition
$$\int [R_{n,l}(r)]^2\, r^2\, dr = 1 \tag{4.23}$$
Collecting the radial and angular parts, the solutions are
$$\phi_{n,l,m}(r, \theta, \phi) = R_{n,l}(r)\, Y_{lm}(\theta, \phi) \tag{4.24}$$
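with
$$E_n = -\frac{m_r}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{n^2} \tag{4.25}$$
$$\phi_{2,1,0}(\mathbf{r}) = \frac{1}{(32\pi)^{1/2}}\left(\frac{Z}{a_1}\right)^{3/2}\frac{Zr}{a_1}\exp(-Zr/2a_1)\cos\theta \tag{4.26}$$
$$\phi_{2,1,1}(\mathbf{r}) = \frac{1}{(64\pi)^{1/2}}\left(\frac{Z}{a_1}\right)^{3/2}\frac{Zr}{a_1}\exp(-Zr/2a_1)\sin\theta\,\exp(i\phi)$$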
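$$\sum_{l=0}^{n-1}(2l + 1) = n^2 \tag{4.27}$$
$$\langle r\rangle = \frac{a_1 n^2}{Z}\left[1 + \frac{1}{2}\left(1 - \frac{l(l+1)}{n^2}\right)\right] \tag{4.29}$$
$$\left\langle\frac{1}{r}\right\rangle = \frac{Z}{a_1 n^2} \tag{4.30}$$
$$\left\langle\frac{1}{r^2}\right\rangle = \frac{Z^2}{a_1^2\, n^3\left(l + \frac{1}{2}\right)} \tag{4.31}$$
$$\left\langle\frac{1}{r^3}\right\rangle = \frac{Z^3}{a_1^3\, n^3\, l\left(l + \frac{1}{2}\right)(l + 1)}, \quad \text{for } l > 0 \tag{4.32}$$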
Only for 1/r is the average value the same as the corresponding value for
the Bohr orbits.
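The recursion in Eq. (4.17) and the averages in Eqs. (4.29) to (4.31) can be checked with a short script. The following is a minimal sketch, assuming atomic units (ħ = m_r = e²/4πε0 = 1, so that a1 = 1); the function names `radial_coefficients` and `radial_moment` are illustrative and do not come from the text. It builds the terminating series for u(r) with exact rational arithmetic and evaluates the radial averages from the elementary integral of r^m exp(−2κr).

```python
from fractions import Fraction
from math import factorial


def radial_coefficients(n, l, Z=1):
    """Coefficients b_k (k = l, ..., n-1) of u(r), from the recursion of Eq. (4.17).

    Assumes atomic units, in which Eq. (4.17) becomes
        b_k [l(l+1) - k(k+1)] / 2 = b_{k-1} (Z - k*kappa),   kappa = Z/n,
    where kappa = (-2E)^(1/2) has been fixed by the termination condition (4.18).
    """
    kappa = Fraction(Z, n)
    coeffs = {l: Fraction(1)}  # b_l = 1: the overall normalization is left arbitrary
    for k in range(l + 1, n + 1):
        coeffs[k] = coeffs[k - 1] * 2 * (Z - k * kappa) / (l * (l + 1) - k * (k + 1))
    if coeffs.pop(n) != 0:  # the series must terminate: b_n = 0
        raise ValueError("series did not terminate")
    return coeffs


def radial_moment(n, l, power, Z=1):
    """<r^power> for the state (n, l); valid here for power >= -2."""
    kappa = Fraction(Z, n)
    b = radial_coefficients(n, l, Z)

    def integral(extra):
        # Integral of u(r)^2 r^extra exp(-2 kappa r) dr, using
        # int_0^inf r^m exp(-a r) dr = m! / a^(m+1).
        total = Fraction(0)
        for j, bj in b.items():
            for k, bk in b.items():
                m = j + k + extra
                total += bj * bk * factorial(m) / (2 * kappa) ** (m + 1)
        return total

    # The weight r^2 comes from the volume element, as in Eq. (4.23).
    return integral(2 + power) / integral(2)


if __name__ == "__main__":
    print(radial_coefficients(2, 0))   # u(r) proportional to 1 - r/2 for the 2s state
    print(radial_moment(2, 0, 1))      # <r> for 2s, to compare with Eq. (4.29)
    print(radial_moment(2, 0, -1))     # <1/r> for 2s, to compare with Eq. (4.30)
```

For the Bohr orbit of the same n the radius is n²a1/Z, so only the ⟨1/r⟩ value reproduces the Bohr result, as stated above.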
The energy levels of the one-electron atom, deduced from the Schrödinger
equation, are the same as those obtained from the Bohr model. However, it
should be appreciated that the results of the Bohr model follow from ad-hoc,
though interesting, assumptions, while those from the Schrödinger equation are
based on fundamental physical principles.
The solutions of the nonrelativistic Schrödinger equation for the one-electron
atoms provide a satisfactory basis for the understanding of the general features
of the energy spectra of these atoms. However, these spectra have a fine
structure. To understand it, small relativistic corrections must be included
and additional properties of the electron must be postulated.
$$\mu_z = \pm\frac{e\hbar}{2m_e} \tag{4.36}$$
The quantity $e\hbar/2m_e$ is called the Bohr magneton for the electron
($e\hbar/2m_e \approx 9.274 \times 10^{-24}$ m$^2$ C s$^{-1}$).
It was observed in Eq. (3.152) that the possible eigenvalues of $L^2$ are
$l(l+1)\hbar^2$ and those of $L_z$ are $m\hbar$, where m = l, (l − 1), ..., −l. If this
property is assumed to be valid for the spin angular momentum as well, it follows
that since $S_z$ has only two eigenvalues [see Eqs. (4.33), (4.36)], $S_z$ and $S^2$ have
eigenvalues
$$S_z = m_s\hbar, \quad m_s = \pm\frac{1}{2} \tag{4.37}$$
$$S^2 = s(s+1)\hbar^2, \quad s = \frac{1}{2} \tag{4.38}$$
It is now easy to deduce from Eqs. (4.37), (4.36) and (4.33) that
$$\boldsymbol{\mu} = -\frac{e}{m_e}\mathbf{S} \tag{4.39}$$
These relations introduce the new idea of intrinsic spin angular momentum
whose quantum numbers take half-integral values in contrast to the integral
values taken by the quantum numbers of angular momentum originating from
the spatial motion of particles. The relation in Eq. (4.39) also differs (by a factor
of 2) from the relation
$$\boldsymbol{\mu} = -\frac{e}{2m_e}\mathbf{L} \tag{4.40}$$
expected for the magnetic moment of a negatively charged particle moving
around in a circle with angular momentum L. All in all, the spin of an electron,
with half-integral values for its quantum numbers, is a revolutionary idea with no
classical analogue. It has found strong support not only in the wealth of
experimental data it can explain, but also in the elegant formulation of the
linearized relativistic equation of Dirac (1928) describing a spin 1/2 particle.
$$J^2 = j(j+1)\hbar^2 \tag{4.43}$$
It follows from Eq. (4.41), that
mj = ml + ms (4.44)
so that mj has integral values if ms has integral values, and half-integral values if
ms has half-integral values (ml is used in place of m to have a more symmetric
notation). For deducing the possible values for j, it is assumed that l ≥ s. It is also
noted that the magnitude of J is not affected by the choice of the direction of L,
i.e. the choice of $m_l$. Taking the largest possible value, $m_l = l$, gives
$$m_j = l + m_s \tag{4.45}$$
Thus the largest and the smallest values of mj are l + s and l – s. This result,
along with a similar analysis for l ≤ s, implies that the allowed values of j and mj,
are
|l–s| ≤ j≤l+s (4.46)
mj = j, j – 1, ..., – j (4.47)
It therefore follows from Eqs. (4.47) and (4.44), that j takes on integral
values if s takes integral values, and half-integral values if s takes half-integral
values.
An electron in an atom is characterized by the quantum numbers n, l, s
and j. In spectroscopic notation, the state of such an electron is designated by
$$n\,^{2s+1}L_j \tag{4.48}$$
The superscript 2s + 1 gives the multiplicity of the state for l ≥ s, as can be
deduced from Eq. (4.46). The subscript in the notation describes the total angular
momentum. In place of L, a letter which conventionally denotes a particular
orbital angular momentum is used, e.g. s, p, d, f, g and h for l = 0, 1, 2, 3, 4, and
5, respectively. These lowercase letters are used to describe the states of
individual electrons. For the states of the atom, capital letters S, P, D, F, G, H are
used instead.
Table 4.1 Spectroscopic letters for different l values: lowercase letters for
electron states and capital letters for atomic states

Values of l      0      1      2      3      4      5
Letter symbol    s, S   p, P   d, D   f, F   g, G   h, H
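The rules in Eqs. (4.46) to (4.48) and Table 4.1 are easy to enumerate. The following is a small sketch, assuming a single electron with s = 1/2 by default; the helper names `allowed_j`, `allowed_mj` and `term_symbols` are hypothetical and chosen only for this illustration, and the labels are written as plain strings such as `2^2P_3/2`.

```python
from fractions import Fraction

LETTERS = "spdfgh"  # Table 4.1: letters for l = 0, ..., 5


def allowed_j(l, s=Fraction(1, 2)):
    """Allowed j values from |l - s| to l + s in unit steps, Eq. (4.46)."""
    s = Fraction(s)
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values


def allowed_mj(j):
    """m_j = j, j - 1, ..., -j, Eq. (4.47)."""
    j = Fraction(j)
    return [j - k for k in range(int(2 * j) + 1)]


def term_symbols(n, l, s=Fraction(1, 2)):
    """Labels of the form n^(2s+1)L_j, Eq. (4.48), with the capital letter for L."""
    s = Fraction(s)
    multiplicity = int(2 * s + 1)
    return [f"{n}^{multiplicity}{LETTERS[l].upper()}_{j}" for j in allowed_j(l, s)]


if __name__ == "__main__":
    for l in range(3):
        print(l, term_symbols(3, l))
    print(allowed_mj(Fraction(3, 2)))
```

Using `Fraction` keeps the half-integral values exact, so the unit steps in j and m_j never pick up rounding errors.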
$$\mathbf{B} = \frac{Ze}{4\pi\varepsilon_0}\,\frac{m_e\,\mathbf{r}\times\mathbf{v}}{m_e c^2 r^3} \tag{4.49}$$
Therefore, the energy of the spin magnetic moment of the electron interacting
with this field is
$$V_1' = -\boldsymbol{\mu}\cdot\mathbf{B} = \frac{Ze^2}{4\pi\varepsilon_0 m_e^2 c^2}\,\frac{\mathbf{S}\cdot\mathbf{L}}{r^3} \tag{4.50}$$
However, this is the energy seen in the frame of the electron which is being
accelerated. It was shown by Thomas that the corresponding energy in the rest
frame of the nucleus, is smaller by a factor of 1/2, so that the first correction to
the nonrelativistic energy is (for details see Ref. 1)
$$V_1 = \frac{Ze^2}{8\pi\varepsilon_0 m_e^2 c^2}\,\frac{\mathbf{S}\cdot\mathbf{L}}{r^3} \tag{4.51}$$
It is easy to see that this term is smaller than the angular momentum term in
Eq. (4.8) by an order of magnitude of $(Ze^2/4\pi\varepsilon_0 r)(1/m_e c^2)$, i.e. the ratio of binding
energy to rest energy. This illustrates that the spin-orbit interaction, given in
Eq. (4.51), is a relativistic effect and gives corrections which are smaller than
the nonrelativistic energies by a factor of about $10^{-5}$. This correction is present
only for the states with l ≠ 0.
The second correction is obtained from using the relativistic expression for
the kinetic energy
$$T = (p^2c^2 + m_e^2c^4)^{1/2} - m_ec^2 \approx \frac{1}{2m_e}p^2 - \frac{1}{8m_e^3c^2}p^4 + \ldots \tag{4.52}$$
Thus, the leading correction gives rise to an extra term for the energy,
$$V_2 = -\frac{1}{8m_e^3c^2}\,p^4 \tag{4.53}$$
which is smaller than the kinetic energy by a factor of about $(p^2/2m_e)(1/2m_ec^2)$.
Hence, this term also gives rise to corrections which are smaller by a factor of
about $10^{-5}$ than the nonrelativistic energies. Being negative, it will lower the
energy of all the states.
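The size of the correction in Eq. (4.53) can be checked numerically. The sketch below assumes units with m_e = c = 1 and takes p = α (i.e. p ≈ α m_e c) as a typical momentum in the hydrogen ground state; that choice of p, the value of α and the function names are assumptions made for this illustration.

```python
from math import sqrt

ALPHA = 1 / 137.035999  # fine structure constant (approximate value)


def kinetic_energy_exact(p, m=1.0, c=1.0):
    """Relativistic kinetic energy, first line of Eq. (4.52)."""
    return sqrt(p**2 * c**2 + m**2 * c**4) - m * c**2


def kinetic_energy_expanded(p, m=1.0, c=1.0):
    """First two terms of the expansion in Eq. (4.52)."""
    return p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)


def correction_ratio(p, m=1.0, c=1.0):
    """Magnitude of V2 relative to the nonrelativistic kinetic energy."""
    return (p**4 / (8 * m**3 * c**2)) / (p**2 / (2 * m))


if __name__ == "__main__":
    p = ALPHA  # assumed typical momentum, in units of m_e c
    print(kinetic_energy_exact(p), kinetic_energy_expanded(p))
    print(correction_ratio(p))  # equals p^2 / (4 m^2 c^2), of order 1e-5
```

With this choice of p the ratio equals α²/4, which is the origin of the factor of about 10⁻⁵ quoted in the text.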
Finally, there is an additional correction which follows from the relativistic
Dirac equation. It is called the Darwin term and has the form
$$V_3 = \frac{\pi Ze^2\hbar^2}{8\pi\varepsilon_0 m_e^2c^2}\,\delta(\mathbf{r}) \tag{4.54}$$
Because of the presence of the Dirac delta function δ(r), this term
contributes only to the l = 0 states (δ(r) = 0 for r ≠ 0 but ∫δ(r) dτ = 1), since the
wave functions of states with l ≠ 0 vanish at r = 0. This term is of the same
order as the spin-orbit interaction, and therefore contributes corrections of the
order of $10^{-5}$ compared to the leading terms.
Collecting all the corrections together, the additional energy is
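$$V = V_1 + V_2 + V_3 = \frac{Ze^2}{8\pi\varepsilon_0 m_e^2c^2}\,\frac{\mathbf{S}\cdot\mathbf{L}}{r^3} - \frac{1}{8m_e^3c^2}\,p^4 + \frac{\pi Ze^2\hbar^2}{8\pi\varepsilon_0 m_e^2c^2}\,\delta(\mathbf{r}) \tag{4.55}$$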
where the contribution of the first term is only to the l ≠ 0 terms. Since these
terms are small, their contribution to the energy levels can be evaluated by using
the first order perturbation theory described in Sec. 3.10 as
$$\Delta E_n \approx \int \phi_n^*\, V\, \phi_n\, d\tau \tag{4.56}$$
For the evaluation of the contribution of V1 to ∆En, we note that
$$\mathbf{S}\cdot\mathbf{L} = \frac{1}{2}\left[(\mathbf{L} + \mathbf{S})^2 - L^2 - S^2\right] \tag{4.57}$$
This, together with the value of 〈1/r3 〉 given in Eq. (4.32), leads to
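$$\langle V_1\rangle = Z^2\alpha^2|E_n|\,\frac{j(j+1) - l(l+1) - 3/4}{n\,l\,(2l+1)(l+1)}, \quad l \neq 0 \tag{4.58}$$
$$\langle V_1\rangle = 0, \quad \text{for } l = 0$$
where α is the fine structure constant $e^2/(4\pi\varepsilon_0\hbar c)$ and has an approximate
value of 1/137. The contribution of $V_2$ is obtained from
$$\left\langle -\frac{p^4}{8m_e^3c^2}\right\rangle = -\frac{1}{2m_ec^2}\left\langle\left(H_0 + \frac{Ze^2}{4\pi\varepsilon_0 r}\right)^2\right\rangle = -\frac{1}{2m_ec^2}\left[E_n^2 + 2E_n\left\langle\frac{Ze^2}{4\pi\varepsilon_0 r}\right\rangle + \left\langle\left(\frac{Ze^2}{4\pi\varepsilon_0 r}\right)^2\right\rangle\right] \tag{4.59}$$
Using relations (4.30) and (4.31) gives
$$\langle V_2\rangle = \frac{Z^2\alpha^2|E_n|}{4n^2}\left[3 - \frac{4n}{l + 1/2}\right] \tag{4.60}$$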
Finally, the contribution of the Darwin term depends on the wave function
at the origin. Detailed analysis shows that
$\psi(0) = \frac{1}{\pi^{1/2}}\left(\frac{Z}{a_1 n}\right)^{3/2}\delta_{l0}$, which then
leads to
$$\langle V_3\rangle = \begin{cases}\dfrac{Z^2\alpha^2|E_n|}{n} & \text{for } l = 0\\[1ex] 0 & \text{for } l \neq 0\end{cases} \tag{4.61}$$
These relations allow us to obtain $\Delta E_n$,
$$\Delta E_n = \frac{Z^2\alpha^2|E_n|}{4n^2}\left[3 - \frac{4n}{j + 1/2}\right] \tag{4.62}$$
for the fine structure of the energy levels of the hydrogen atom.
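That the three separate contributions in Eqs. (4.58), (4.60) and (4.61) combine into the single expression of Eq. (4.62), which depends only on n and j, can be verified with exact fractions. The sketch below works in units of Z²α²|E_n|; the function names are illustrative, not from the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit(n, l, j):
    """<V1> of Eq. (4.58), in units of Z^2 alpha^2 |E_n|."""
    if l == 0:
        return Fraction(0)
    numerator = j * (j + 1) - l * (l + 1) - Fraction(3, 4)
    return numerator / (n * l * (2 * l + 1) * (l + 1))


def kinetic_correction(n, l):
    """<V2> of Eq. (4.60), in the same units."""
    return Fraction(1, 4 * n**2) * (3 - 4 * n / (l + HALF))


def darwin(n, l):
    """<V3> of Eq. (4.61), in the same units."""
    return Fraction(1, n) if l == 0 else Fraction(0)


def fine_structure(n, j):
    """Total shift of Eq. (4.62), in the same units."""
    return Fraction(1, 4 * n**2) * (3 - 4 * n / (j + HALF))


if __name__ == "__main__":
    for n in range(1, 4):
        for l in range(n):
            for j in (l - HALF, l + HALF):
                if j > 0:
                    total = spin_orbit(n, l, j) + kinetic_correction(n, l) + darwin(n, l)
                    print(n, l, j, total, fine_structure(n, j))
```

For n = 2 the j = 3/2 and j = 1/2 levels differ by 1/4 in these units, i.e. by α²|E_2|/4 for hydrogen.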
The important properties of the fine structure given in Eq. (4.62) and
demonstrated in Fig. 4.2 are:
1. As expected, the fine structure corrections are smaller than $E_n$ by a factor
of about $\alpha^2/4 \sim 10^{-5}$. The hydrogen atom energies $E_n$ themselves may be
written as $E_n = -\alpha^2mc^2/2n^2$, which means that they are smaller than the
rest energy by a factor of about $\alpha^2$. All the shifts in the energy levels due
to fine structure corrections are negative and the magnitude of the shift decreases as
j increases. Furthermore, the corrections decrease rapidly as n increases,
so that their effect is more easily noticeable for small-n states.
2. The fine structure corrections remove some of the degeneracy of the
energy levels En. The states with different j values now have different
energies. For a given n, the allowed j values range from 1/2 to (n –1/2) so
that each n level is now split into n levels.
3. Some degeneracy still survives. For a given n, the level with j = n − 1/2 is
nondegenerate but all the other levels have a degeneracy of order two
corresponding to l = j ± 1/2. For example, for n = 2, $2\,^2P_{3/2}$ is nondegenerate
but $2\,^2P_{1/2}$ and $2\,^2S_{1/2}$ are degenerate. Actually there is a small separation
between the $2\,^2P_{1/2}$ and $2\,^2S_{1/2}$ states also, known as the Lamb shift, which
can be satisfactorily explained in terms of quantum electrodynamics.
4. Not all the transitions between the different levels are allowed. As will be
seen later, there are selection rules for the allowed transitions. For the
most prominent transitions, called the electric dipole transitions, the allowed
transitions satisfy the selection rules
∆l = ± 1
∆j = ± 1, 0, but not j = 0 → j = 0
∆mj = ± 1, 0 (4.63)
∆n = unrestricted
Thus, these transitions are allowed only between adjacent columns in
Fig. 4.2. For example, there are two lines (a doublet) corresponding to transitions
between 2P and 1S levels, two lines (a doublet) corresponding to transitions
between 3S and 2P levels and three lines (a triplet) corresponding to transitions
between 3D and 2P states.
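The line counts quoted above follow directly from the selection rules of Eq. (4.63). The sketch below enumerates the fine-structure levels (l, j) of a one-electron atom and counts the allowed pairs of levels; the function names are hypothetical. It counts pairs of levels, so lines that happen to coincide in frequency because of the remaining degeneracy are still counted separately.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def levels(n):
    """Fine-structure levels (l, j) for principal quantum number n."""
    result = []
    for l in range(n):
        for j in (l - HALF, l + HALF):
            if j > 0:
                result.append((l, j))
    return result


def e1_allowed(upper, lower):
    """Electric dipole rules of Eq. (4.63): delta l = +-1 and delta j = 0, +-1.

    The exclusion of j = 0 -> j = 0 never applies here because j is half-integral.
    """
    (l_up, j_up), (l_lo, j_lo) = upper, lower
    return abs(l_up - l_lo) == 1 and abs(j_up - j_lo) <= 1


def count_lines(n_upper, n_lower, l_upper=None, l_lower=None):
    """Number of allowed level pairs, optionally restricted to given l values."""
    count = 0
    for up in levels(n_upper):
        for lo in levels(n_lower):
            if l_upper is not None and up[0] != l_upper:
                continue
            if l_lower is not None and lo[0] != l_lower:
                continue
            if e1_allowed(up, lo):
                count += 1
    return count


if __name__ == "__main__":
    print(count_lines(2, 1, l_upper=1, l_lower=0))  # 2P -> 1S
    print(count_lines(3, 2, l_upper=0, l_lower=1))  # 3S -> 2P
    print(count_lines(3, 2, l_upper=2, l_lower=1))  # 3D -> 2P
```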
$$I^2 = I(I+1)\hbar^2 \tag{4.64}$$
Associated with I is a magnetic moment $\boldsymbol{\mu}_N$,
$$\boldsymbol{\mu}_N = g\,\frac{e}{m_p}\,\mathbf{I} \tag{4.65}$$
where mp is the mass of the proton. Because the structure of the nucleus is
more complicated than that of an electron, the value of g is generally different
from 1, and is 2.79 for the proton. The nuclear magnetic moment is seen to be
smaller than the electron magnetic moment by a factor of about me/mp ~ 1/1000.
The atomic states are now designated by the total angular momentum F,
F = J+I (4.66)
with eigenvalues
$$F_z = (m_j + m_i)\hbar \tag{4.67}$$
$$F^2 = F(F+1)\hbar^2, \quad |j - I| \leq F \leq j + I$$
This means that each level with a given j has a multiplicity of 2I + 1 if j > I
and a multiplicity of 2j + 1 if j ≤ I. The allowed electric dipole transitions are
found to satisfy the selection rules
∆l = ± 1
∆F = ± 1, 0 but not F = 0 → F = 0 (4.68)
∆mF = ± 1, 0
The nuclear magnetic moment interacts with the magnetic field created at
the nucleus by the electron. The magnetic field is due to (i) the orbital motion of
the electron around the nucleus, and (ii) the intrinsic magnetic moment of the
electron. The magnetic field at the nucleus, due to the orbital motion of the
electron can be deduced from Eq. (4.49) as
$$\mathbf{B} = -\frac{e}{4\pi\varepsilon_0 m_e c^2}\,\frac{\mathbf{L}}{r^3} \tag{4.69}$$
Therefore, the interaction energy due to this field is
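$$\langle V_{\rm orb}\rangle = \begin{cases}\dfrac{ge^2}{4\pi\varepsilon_0 m_e m_p c^2}\,\mathbf{I}\cdot\mathbf{L}\,\langle 1/r^3\rangle & \text{for } l \neq 0\\[1ex] 0 & \text{for } l = 0\end{cases} \tag{4.70}$$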
This is smaller than the fine structure terms by a factor of about me/mp
~ 1/1000.
For calculating the field due to the intrinsic magnetic moment of the electron,
we note that the vector potential due to a magnetic dipole moment µ is
$$\mathbf{A} = \frac{1}{4\pi\varepsilon_0c^2}\,\boldsymbol{\mu}\times\frac{\mathbf{r}}{r^3} \tag{4.71}$$
From this, the magnetic field comes out as
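$$\mathbf{B} = \nabla\times\mathbf{A} = \frac{1}{4\pi\varepsilon_0c^2}\left[\boldsymbol{\mu}\,\nabla\cdot\frac{\mathbf{r}}{r^3} - (\boldsymbol{\mu}\cdot\nabla)\frac{\mathbf{r}}{r^3}\right] \tag{4.72}$$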
Therefore, the energy of the nuclear magnetic moment µN interacting with
this field is
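$$V_{\rm spin} = -\frac{1}{4\pi\varepsilon_0c^2}\,\nabla\cdot\left[(\boldsymbol{\mu}_N\cdot\boldsymbol{\mu})\frac{\mathbf{r}}{r^3} - \left(\boldsymbol{\mu}_N\cdot\frac{\mathbf{r}}{r^3}\right)\boldsymbol{\mu}\right] \tag{4.73}$$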
With this, the perturbative expression for the interaction energy comes out
to be
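$$\langle V_{\rm spin}\rangle = \frac{1}{4\pi\varepsilon_0c^2}\left\langle\frac{\boldsymbol{\mu}_N\cdot\boldsymbol{\mu}}{r^3} - 3\,\frac{(\boldsymbol{\mu}_N\cdot\mathbf{r})(\boldsymbol{\mu}\cdot\mathbf{r})}{r^5}\right\rangle \quad \text{for } l \neq 0 \tag{4.74}$$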
For l = 0, the angular integration in Eq. (4.74) gives zero for r ≠ 0. For
obtaining the correct value of the contribution from r = 0, Gauss theorem is used
in the expectation value of the expression in Eq. (4.73), to get
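$$\langle V_{\rm spin}\rangle_{l=0} = -\frac{1}{4\pi\varepsilon_0c^2}\,\frac{8\pi}{3}\,(\boldsymbol{\mu}_N\cdot\boldsymbol{\mu})\,|\psi(0)|^2 = \frac{2ge^2}{3\varepsilon_0 m_e m_p c^2}\,(\mathbf{I}\cdot\mathbf{S})\,|\psi(0)|^2 \tag{4.75}$$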
where ψ (0) is the wave function at the origin. In particular, using the wave
functions in Eq. (4.26), the hyperfine splitting between the F = 1 and F = 0 levels
of the ground state of the hydrogen atom is
$$E_1(F = 1) - E_1(F = 0) = \frac{16m_e}{3m_p}\,(g\alpha^2)\,|E_1| \tag{4.76}$$
than that of the proton. Therefore, the hyperfine interaction is much larger for
the positronium. In particular, the separation between the F = 1 and F = 0 levels
corresponds to a frequency of $\nu = 2.034 \times 10^{11}$ s$^{-1}$. The electron and the positron
in the positronium annihilate each other, emitting two photons in the F = 0 state
and three photons in the F = 1 state. The lifetimes of the positronium in these
two states are different: $1.25 \times 10^{-10}$ s for the F = 0 state and $1.4 \times 10^{-7}$ s for the
F = 1 state.
Muonium
Muonium is a bound state of an electron and a positive muon µ+ (the muon µ−
and the µ+ are similar to an electron and a positron respectively, except that
they are heavier, their mass being about 206.77 $m_e$). Muonium atoms are produced when a
beam of µ+ is stopped by a gas. Their energy levels are similar to those of the
hydrogen atom except for a small difference due to the difference in the reduced
mass. The hyperfine energy levels of the muonium are of importance since they
can be calculated precisely, and serve as a test of the theory.
Muonic Helium
Muonic helium is formed by replacing one of the electrons in a helium atom
by a muon. Since the Bohr radius of the muon is smaller by a factor of about 207
than that of the electron, the electron essentially sees a nucleus of charge 2|e|
with a muon moving around close to the nucleus. Therefore, the energy levels
are similar to those of the hydrogen atom. However, the hyperfine splitting is
due to the electron magnetic moment interacting with the muon magnetic moment.
Using Eq. (4.76) as a first approximation but taking g = 1 and replacing $m_p$ by
$m_\mu$, it is found that the hyperfine splitting for muonic helium corresponds to a
frequency of $\nu = 4.515 \times 10^9$ s$^{-1}$, close to the experimental value of
$4.465 \times 10^9$ s$^{-1}$.
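The hyperfine frequencies implied by Eq. (4.76) can be estimated in a few lines. The sketch below is only an order-of-magnitude check under stated assumptions: the numerical constants are standard reference values supplied here rather than taken from the text, |E1| is taken as the Z = 1 value 13.6 eV without reduced-mass corrections, and the function name `hyperfine_frequency` is illustrative.

```python
ALPHA = 7.2973525693e-3     # fine structure constant
E1_EV = 13.605693           # |E1| for Z = 1 in eV (infinite nuclear mass assumed)
PLANCK_EV_S = 4.135667696e-15  # Planck constant in eV s
M_PROTON = 1836.15267       # proton mass in units of m_e
M_MUON = 206.768            # muon mass in units of m_e


def hyperfine_frequency(g, heavy_mass):
    """Frequency (Hz) of the F = 1 to F = 0 splitting from Eq. (4.76).

    heavy_mass is the mass of the heavier particle in units of m_e.
    """
    delta_e = (16.0 / 3.0) * g * ALPHA**2 * E1_EV / heavy_mass  # in eV
    return delta_e / PLANCK_EV_S


if __name__ == "__main__":
    print(f"hydrogen (g = 2.79):   {hyperfine_frequency(2.79, M_PROTON):.4e} Hz")
    print(f"muonic helium (g = 1): {hyperfine_frequency(1.0, M_MUON):.4e} Hz")
```

The hydrogen value comes out close to the familiar 21 cm line frequency of about 1.42 × 10⁹ Hz, and the muonic helium value is close to the estimate quoted above.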
Muonic atoms are very useful for probing the structure of nuclei since the
Bohr radius of the muon is quite small, and therefore the probability of finding
the muon inside the nucleus may be quite substantial.
Rydberg Atoms
When an electron in an atom is in a state with a sufficiently large principal
quantum number n, it is influenced mainly by the net positive charge of the ionic
core and not by its distribution. These excited states of atoms are similar to
those of a hydrogen atom. They are termed Rydberg states and the atoms are
called Rydberg atoms. It is the advent of tunable lasers (see Sec. 6.5) that has
helped to excite and investigate the Rydberg states. They are of interest for the
following reasons:
The spin eigenstates of $S_z$ with eigenvalues $\pm\frac{1}{2}\hbar$ are designated by α and
β, so that
$$S_z\alpha = \frac{1}{2}\hbar\,\alpha, \qquad S_z\beta = -\frac{1}{2}\hbar\,\beta \tag{4.78}$$
The spin operator S has the properties,
$$S^2 = s(s+1)\hbar^2 = \frac{3}{4}\hbar^2 \tag{4.79}$$
$$S_x^2 = S_y^2 = S_z^2 = \frac{1}{4}\hbar^2 \tag{4.80}$$
Furthermore, the raising and lowering operators S± defined as
S ± = Sx ± iSy (4.81)
satisfy the properties [see Eq. (3.165)]
$$S_+\beta = b_1\hbar\,\alpha, \qquad S_+\alpha = 0$$
and also
$$S_xS_y + S_yS_x = -\frac{i}{2}\left(S_+^2 - S_-^2\right) = 0 \tag{4.84}$$
Together with the commutation relation $S_xS_y - S_yS_x = i\hbar S_z$ satisfied by all
angular momentum operators [see Eq. (3.163)], this implies
$$S_xS_y = -S_yS_x = \frac{i\hbar}{2}S_z \tag{4.85}$$
By symmetry,
$$S_yS_z = -S_zS_y = \frac{i\hbar}{2}S_x, \qquad S_zS_x = -S_xS_z = \frac{i\hbar}{2}S_y \tag{4.86}$$
For writing down the Schrödinger equation for a free, spin 1/2 particle of
mass m, it is proposed that the kinetic energy be written as
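$$E = \frac{2}{m\hbar^2}\,(\mathbf{p}\cdot\mathbf{S})(\mathbf{p}\cdot\mathbf{S}) \tag{4.87}$$
This expression is equivalent to $\frac{1}{2m}p^2$ for the free particle, as can be
shown by using Eqs. (4.80), (4.85) and (4.86). On using the operator expressions
for E and p, it leads to the Schrödinger equation for a free particle,
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{2}{m}\,(\nabla\cdot\mathbf{S})(\nabla\cdot\mathbf{S})\,\psi \tag{4.88}$$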
The interaction with the electrostatic potential φ, can be introduced by the
prescription that
E → E – qφ (4.89)
1
where q is the charge of the particle. However, since both p, Etot and
c
zero of the energy. Equations (4.89) and (4.90) introduce what is known as the
minimal electromagnetic interaction. With these prescriptions, the Schrödinger
equation for a spin 1/2 particle in the presence of electromagnetic fields, comes
out as
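$$i\hbar\frac{\partial\psi}{\partial t} = \frac{2}{m\hbar^2}\left[(-i\hbar\nabla - q\mathbf{A})\cdot\mathbf{S}\right]^2\psi + q\phi\,\psi \tag{4.91}$$
$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}(-i\hbar\nabla - q\mathbf{A})^2\psi - \frac{q}{m}\,\mathbf{S}\cdot(\nabla\times\mathbf{A} + \mathbf{A}\times\nabla)\,\psi + q\phi\,\psi \tag{4.93}$$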
Finally, noting that ∇ in ∇ × A operates on A as well as on ψ, and writing
V for qφ, gives
$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}(-i\hbar\nabla - q\mathbf{A})^2\psi - \frac{q}{m}\,\mathbf{S}\cdot\mathbf{B}\,\psi + V\psi \tag{4.94}$$
This expression for the energy contains a term which corresponds to the
interaction of a particle with magnetic moment
$$\boldsymbol{\mu} = \frac{q}{m}\,\mathbf{S} \tag{4.95}$$
with the external magnetic field B. Thus, the particle which satisfies Eq. (4.91),
has an intrinsic spin S and an associated magnetic moment given by Eq. (4.95).
For the electron q = –|e|. These results are in conformity with the experimental
observations discussed in Sec. 4.2.
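$$E^2 = \frac{4c^2}{\hbar^2}\,(\mathbf{p}\cdot\mathbf{S})(\mathbf{p}\cdot\mathbf{S}) + m^2c^4 \tag{4.96}$$
which is equivalent to $E^2 = p^2c^2 + m^2c^4$. On taking the momentum term to the
left hand side and factorizing, it leads to the equation
$$\left(i\hbar\frac{\partial}{\partial t} - 2ic\,\nabla\cdot\mathbf{S}\right)\left(i\hbar\frac{\partial}{\partial t} + 2ic\,\nabla\cdot\mathbf{S}\right)\psi = m^2c^4\,\psi \tag{4.97}$$
where ψ has the form given in Eq. (4.92). It can be linearized by defining
$$\left(i\hbar\frac{\partial}{\partial t} + 2ic\,\nabla\cdot\mathbf{S}\right)\psi = mc^2\,\chi \tag{4.98}$$
substitution of which in Eq. (4.97) leads to
$$\left(i\hbar\frac{\partial}{\partial t} - 2ic\,\nabla\cdot\mathbf{S}\right)\chi = mc^2\,\psi \tag{4.99}$$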
Equations (4.98) and (4.99) together are equivalent to the Dirac equation
(1928) for a spin 1/2 particle.
The free particle solutions can be written by noting that
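$$\psi = (b_1\alpha + b_2\beta)\exp\left[-i(Et - \mathbf{k}\cdot\mathbf{r})/\hbar\right] \tag{4.100}$$
satisfies Eq. (4.97) provided
$$E^2 = k^2c^2 + m^2c^4 \tag{4.101}$$
The corresponding χ is obtained from Eq. (4.98), as
$$\chi = \frac{1}{mc^2}\left[E - \frac{2c}{\hbar}\,\mathbf{k}\cdot\mathbf{S}\right](b_1\alpha + b_2\beta)\exp\left[-i(Et - \mathbf{k}\cdot\mathbf{r})/\hbar\right] \tag{4.102}$$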
The most striking property of these solutions is that negative energy solutions
with E = – (k2c2 + m2c4)1/2 are allowed in addition to the usual positive energy
solutions with E = (k2c2 + m2c4)1/2.
Further discussion of the Dirac equation is not within the scope of this book.
We will be content with making a few remarks.
1. The problem of one-electron atoms can be considered by the replacement
$$i\hbar\frac{\partial}{\partial t} \to i\hbar\frac{\partial}{\partial t} + \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{4.103}$$
in Eqs. (4.98) and (4.99). The various fine structure terms can then be
deduced by carrying out suitable expansions.
2. The existence of negative energy states creates some complications. Since
no negative energy particles are observed in nature, how are possible
transitions to negative energy states explained ? Dirac overcame this
difficulty by postulating that vacuum consists of a sea of electrons which
fill all the negative energy levels. Hence, transitions to negative energy
[Figure 4.3: energy-level diagram with levels marked E = mc², E = 0 and E = −mc²; panels (a) and (b).]
Fig. 4.3 (a) Forbidden transition to a filled negative energy state, and
(b) allowed transition leading to annihilation of an electron and a hole with the
emission of two photons. Filled circles are electrons and the open
circle is a hole or a vacancy.
3. The Dirac hypothesis of the negative energy sea suggests that if enough
energy is provided, a negative energy electron may become a positive
energy electron. The energy of vacuum can be written as
$$E_v = \sum_i (-E_i) \tag{4.104}$$
where the summation is over all the negative energy states (ignoring the
complication of the infinite sum). Suppose two photons with energies $h\nu_1$
and $h\nu_2$ come together and give all their energy to an electron with energy
$-E_l$, which now has a positive energy $E_n$. Then, by energy conservation
$$h\nu_1 + h\nu_2 + \sum_i (-E_i) = E_n + \sum_{i \neq l} (-E_i) = E_n + E_l + \sum_i (-E_i) \tag{4.105}$$
The final state consists of an electron with energy En and another particle
with energy El corresponding to the hole in the negative energy sea. A
similar analysis for charge conservation shows that the final state consists
of a negatively charged electron (with energy En) and a positively charged
particle (corresponding to the hole in the negative-charge sea). The hole
therefore has properties exactly opposite to those of the vacant negative
energy electron state, i.e. it has positive energy and positive charge (also
opposite momentum and spin). This hole state is called the positron, which
is an example of what are known as antiparticles. The overall process is
equivalent to two photons annihilating each other to produce an electron
and a positron. The process is known as pair creation. Similarly, pair
annihilation occurs when a positive energy electron drops into a vacancy
in the negative energy sea, giving out radiation [Fig. 4.3 (b)]. Pair creation
and annihilation are important processes in particle physics, though the
associated particles may not always be two photons or electrons.
4. The presence of an external charge polarizes the sea of negative charges
thus reducing the effective charge of the external particle. This is called
vacuum polarization. As a result, an s-wave electron in the hydrogen
atom, which is ‘nearer’ to the proton than a p-wave electron, sees a greater
charge for the proton. Hence, the vacuum polarization lowers the s-wave
levels compared to the p-wave levels. This contributes to the removal of
j-degeneracy, in particular, the degeneracy between 2p1/2 and 2s1/2 states.
However, there are additional contributions to the separation of these energy
levels, called the Lamb shift, from other effects such as self interaction, field
fluctuations, etc., which can be treated within the framework of quantum
electrodynamics. The predictions of the theory for the Lamb shift are in
excellent agreement with the experimental observations.
4.9 EXAMPLES
A few examples to illustrate the properties of the one-electron atoms are discussed
here.
Example 1
Though the solutions to the radial equation, Eq. (4.13), are in general complicated,
the solutions for l = n – 1, are fairly simple.
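Consider a solution of the form
$$R(r) = r^{n-1}\exp\left[-\left(-\frac{2m_rE}{\hbar^2}\right)^{1/2} r\right] \tag{4.106}$$
Substituting it in Eq. (4.13) gives
$$\frac{\hbar^2}{2m_r}\left[2n\left(-\frac{2m_rE}{\hbar^2}\right)^{1/2}\frac{1}{r} - \frac{n(n-1)}{r^2} + \frac{l(l+1)}{r^2}\right] - \frac{Ze^2}{4\pi\varepsilon_0 r} = 0 \tag{4.107}$$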
This relation can be satisfied if
$$l = n - 1, \qquad E = -\frac{m_r}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{n^2} \tag{4.108}$$
Example 2
The scaling properties of Eq. (4.13) provide a useful insight into the solutions.
Consider a transformation
r → λr (4.110)
which takes Eq. (4.13) to the form
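$$\frac{\hbar^2}{2m_r}\left[-\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) + \frac{l(l+1)}{r^2}\right]R(\lambda r) - \frac{(\lambda Z)e^2}{4\pi\varepsilon_0 r}R(\lambda r) = \lambda^2E\,R(\lambda r) \tag{4.111}$$
Taking λ = 1/Z, this becomes Eq. (4.13) with Z = 1, so that
$$R(r, 1) = \frac{1}{Z^{3/2}}\,R(r/Z, Z), \qquad E(Z = 1) = \frac{1}{Z^2}\,E(Z) \tag{4.112}$$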
where Z is shown as an additional variable. The factor of Z–3/2 in the first relation
is due to normalization. Thus, we can obtain the solutions for Eq. (4.13) in terms
of solutions for Z = 1. It also follows that
$$\langle r^n\rangle_Z = \frac{1}{Z^n}\,\langle r^n\rangle_{Z=1} \tag{4.113}$$
Example 3
A beam of sodium atoms with velocity $10^3$ m/s moves along the magnetic poles
over a distance of 0.15 m, in the x-direction. We determine the separation between
the two components of the beam, at a distance of 0.6 m from the magnet, given
that the magnetic induction between the poles varies in the z-direction as
$$B = (1 - 100z)\ \text{Wb/m}^2 \tag{4.114}$$
The force on the atoms is
$$F_z = \pm\frac{e\hbar}{2m_e}\,(100) = \pm 9.28 \times 10^{-22}\ \text{N} \tag{4.115}$$
If t1 is the time taken by the atoms to traverse the poles and t2 is the time
taken to move from the magnets to the plane of observation, the separation
between the two components of the beam is
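$$\Delta z = 2\left[\frac{1}{2}\frac{|F_z|}{m_n}\,t_1^2 + \frac{|F_z|}{m_n}\,t_1t_2\right] \tag{4.116}$$
Since $t_1 \approx 1.5 \times 10^{-4}$ s and $t_2 \approx 6.0 \times 10^{-4}$ s, one gets
$$\Delta z = 4.9 \times 10^{-3}\ \text{m} \tag{4.117}$$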
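The arithmetic of this example can be reproduced with a short script. This is a sketch under stated assumptions: $m_n$ is taken to be the mass of a sodium atom (22.99 atomic mass units, a value supplied here and not given in the text), the magnetic moment is the value implied by Eq. (4.115), and the function name `beam_separation` is illustrative.

```python
MAGNETIC_MOMENT = 9.28e-24        # J/T, value implied by Eq. (4.115)
FIELD_GRADIENT = 100.0            # |dB/dz| in T/m, from Eq. (4.114)
SODIUM_MASS = 22.99 * 1.66054e-27  # kg, assumed mass of a sodium atom


def beam_separation(speed, magnet_length, drift_length):
    """Separation of the two beam components, Eq. (4.116), in metres."""
    force = MAGNETIC_MOMENT * FIELD_GRADIENT   # |F_z| of Eq. (4.115)
    acceleration = force / SODIUM_MASS
    t1 = magnet_length / speed                 # time spent between the poles
    t2 = drift_length / speed                  # time from magnet to observation plane
    # Deflection inside the magnet plus the drift with the acquired transverse
    # velocity; the factor 2 accounts for the two oppositely deflected components.
    return 2 * (0.5 * acceleration * t1**2 + acceleration * t1 * t2)


if __name__ == "__main__":
    dz = beam_separation(speed=1.0e3, magnet_length=0.15, drift_length=0.6)
    print(f"separation = {dz:.2e} m")
```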
Example 4
Some general properties of fine structure lines are now enumerated.
1. Transitions from the level with principal quantum number n > 1 to the
ground state give rise to doublets corresponding to
$$np_{1/2} \to 1s_{1/2}, \qquad np_{3/2} \to 1s_{1/2} \tag{4.118}$$
2. Transitions from the level with principal quantum number n > 2 to the level
with n = 2 give rise to seven lines corresponding to
$$np_{1/2,\,3/2} \to 2s_{1/2}, \qquad ns_{1/2} \to 2p_{1/2,\,3/2}, \qquad nd_{3/2} \to 2p_{1/2,\,3/2}, \qquad nd_{5/2} \to 2p_{3/2} \tag{4.119}$$
3. If $n_0 \geq 2$, where $n_0$ is the principal quantum number of the final state,
every increase in $n_0$ increases the number of fine structure lines by 6.
These additional lines correspond to
$$\nu = \nu_0 + \frac{Z^2\alpha^2|E_n|}{4n^2h}\,(3 - 2n) - \frac{Z^2\alpha^2|E_{n'}|}{4n'^2h}\,(3 - 4n') \tag{4.124}$$
where $\nu_0$ is the frequency in the absence of fine structure correction given
by
$$\nu_0 = \frac{E_n - E_{n'}}{h} \tag{4.125}$$
Example 5
The spin 1/2 space contains only two linearly independent states α and β as
defined in Eq. (4.78). It is convenient to regard these states with $S_z = \pm\frac{1}{2}\hbar$ as
two-component column vectors
$$\alpha = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{4.126}$$
and describe the spin operators by 2 × 2 matrices. These matrices must satisfy
the commutation relations in Eq. (3.163), and the properties stated in Eqs. (4.78),
(4.80), (4.85) and (4.86). One set of such matrices is given by
$$S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad S_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.127}$$
The operation of the spin operator is then given by the operation of these
matrices on the column vectors given in Eq. (4.126). It is easy to see that these
matrices give the results in Eq. (4.82) with the specific choice of
b1 = b2 = 1.
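The stated properties of these matrices can be confirmed by direct multiplication. The sketch below uses plain Python lists and complex numbers, with ħ set to 1 as a simplifying assumption; the helper names (`matmul`, `combine`, `same` and so on) are illustrative.

```python
# Spin matrices of Eq. (4.127) in units of hbar (hbar = 1 is assumed here).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
SPIN_UP = [1, 0]     # column vector alpha of Eq. (4.126)
SPIN_DOWN = [0, 1]   # column vector beta of Eq. (4.126)
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2 x 2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def apply(a, v):
    """Action of a 2 x 2 matrix on a two-component column vector."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]


def same(a, b, tol=1e-12):
    """True if two 2 x 2 matrices agree within a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def commutator(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))


def anticommutator(a, b):
    return combine(1, matmul(a, b), 1, matmul(b, a))


def p_dot_s(px, py, pz):
    """The matrix p.S for a momentum with the given Cartesian components."""
    return combine(1, combine(px, SX, py, SY), pz, SZ)


S_PLUS = combine(1, SX, 1j, SY)    # raising operator of Eq. (4.81)
S_MINUS = combine(1, SX, -1j, SY)  # lowering operator of Eq. (4.81)

if __name__ == "__main__":
    print(same(commutator(SX, SY), combine(1j, SZ, 0, SZ)))  # [Sx, Sy] = i Sz
    print(same(anticommutator(SX, SY), ZERO))                # Eq. (4.84)
    print(apply(S_PLUS, SPIN_DOWN))                          # S+ beta = alpha
    ps = p_dot_s(1.0, 2.0, 3.0)
    print(matmul(ps, ps))                                    # (p.S)^2 = p^2 / 4
```

The last line illustrates why the kinetic energy in Eq. (4.87) reduces to p²/2m: the square of p·S is proportional to the unit matrix.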
PROBLEMS
1. Calculate the expectation value $\langle -Ze^2/4\pi\varepsilon_0 r\rangle$ for the one-electron
atoms in the ground state and hence deduce the expectation value
$\langle p^2/2m_r\rangle$ of the kinetic energy.
2. For the ground state of the hydrogen atom, the wave function is of the
form ψ = b exp (–r/a), where b is a constant and a is the Bohr radius.
Determine the probability of finding the electron at a separation greater
than 2a. What is the corresponding classical probability?
3. Consider a wave function of the form $R(r) = (1 + br)\,e^{-gr}$ for a one-electron
atom. What is the possible eigenvalue of this state?
4. What is the value of r at which $4\pi r^2|\phi(r)|^2$ has a maximum for the
ground state? What is the probability density as a function of r, at this
value?
5. The effect of the finite size of a nucleus may be taken into account by
modifying the potential for $r < r_n$, such that the potential
$V(r) = -Ze^2/4\pi\varepsilon_0 r_n$ for $r < r_n$, $r_n$ being the radius of the nucleus. Treating the
modification perturbatively, show that the correction to the ground state
energy is approximately $\dfrac{4r_n^2Z^2}{3a_1^2}\,|E_1|$. What is the order of magnitude
of this correction?
Hence show that $\langle r^n\rangle = \dfrac{1}{m_r^n}\,\langle r^n\rangle_{m_r = 1}$.
$$\psi - \chi = \frac{2c}{\hbar\,(E + mc^2)}\,\mathbf{p}\cdot\mathbf{S}\,(\psi + \chi)$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 131
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_5
$$P_{ij}^2 = 1 \tag{5.4}$$
which implies that the possible eigenvalues of Pij are + 1 or –1. It is experimentally
observed that the physical states indeed are (not just ‘can be’) eigenstates of Pij
and the eigenvalues are characteristic of the nature of the particles. This result
is stated in terms of the following rules:
Atoms and Molecules 133
$$\left[-\frac{\hbar^2}{2m}\nabla_i^2 + V(i)\right]\psi_{a_i}(i) = E_{a_i}\,\psi_{a_i}(i) \tag{5.9}$$
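$$\psi_+(1, 2, \ldots, N) = \frac{1}{(N!)^{1/2}}\sum_{\rm perm}\psi_{a_1}(1)\,\psi_{a_2}(2)\cdots\psi_{a_N}(N) \tag{5.11}$$
$$\psi_-(1, 2, \ldots, N) = \frac{1}{(N!)^{1/2}}\,\det\begin{pmatrix}\psi_{a_1}(1) & \psi_{a_1}(2) & \cdots & \psi_{a_1}(N)\\ \psi_{a_2}(1) & \psi_{a_2}(2) & \cdots & \psi_{a_2}(N)\\ \vdots & \vdots & & \vdots\\ \psi_{a_N}(1) & \psi_{a_N}(2) & \cdots & \psi_{a_N}(N)\end{pmatrix} \tag{5.12}$$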
where the summation is over all the permutations of the particle indices (the
normalization is different if some of the ai are the same). For the simple case of
two identical particles, the symmetric and antisymmetric states are
$$\psi_\pm(1, 2) = \frac{1}{2^{1/2}}\left[\psi_{a_1}(1)\,\psi_{a_2}(2) \pm \psi_{a_1}(2)\,\psi_{a_2}(1)\right] \tag{5.13}$$
In these equations, i.e. Eqs. (5.11) to (5.13), the solutions ψ+ with the plus
sign are applicable to bosons while the solutions ψ– with the minus sign apply to
fermions. These solutions are of great importance as solutions for N identical
particles, and serve as approximate starting solutions even for systems whose
Hamiltonian cannot be written in the separable form in Eq. (5.7).
An extremely important point to note in Eqs. (5.12) and (5.13), is that the
fermion wave functions vanish if any two of the ai are equal. This means that
no two identical, noninteracting fermions can be in states described by the same
set of quantum numbers. This rule was first stated for electrons in an atom by
Pauli (1925), no two electrons in an atom can have the same set of quantum
numbers n, l, ml and ms, and is known as Pauli’s exclusion principle. It is central
to the understanding of the structure of atoms. We now begin the analysis of the
structure and the energy levels of an atom, subject to the constraints of Pauli’s
exclusion principle.
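The symmetry properties of Eqs. (5.11) and (5.12), including the vanishing of the fermion wave function when two of the $a_i$ coincide, can be demonstrated numerically. The sketch below is an illustration only: the single-particle "orbitals" are arbitrary one-dimensional test functions chosen here, not atomic wave functions, and the function names are hypothetical.

```python
from itertools import permutations
from math import factorial, sqrt


def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation of 0, ..., n-1, by counting swaps."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def many_body_amplitude(orbitals, coords, symmetric):
    """Value of psi_+ (Eq. 5.11) or psi_- (Eq. 5.12) at the given coordinates.

    orbitals[k] is the single-particle function psi_{a_k}; coords[p] is the
    coordinate of particle p. The antisymmetric case is the expanded determinant.
    """
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = 1.0
        for state, particle in enumerate(perm):
            term *= orbitals[state](coords[particle])
        total += term if symmetric else permutation_sign(perm) * term
    return total / sqrt(factorial(n))


if __name__ == "__main__":
    # Arbitrary test functions standing in for three different states.
    orbitals = [lambda x: 1.0, lambda x: x, lambda x: x * x]
    coords = [0.3, 1.1, 2.0]
    swapped = [1.1, 0.3, 2.0]  # particles 1 and 2 exchanged
    print(many_body_amplitude(orbitals, coords, symmetric=False))
    print(many_body_amplitude(orbitals, swapped, symmetric=False))  # opposite sign
    repeated = [orbitals[0], orbitals[0], orbitals[2]]              # two equal states
    print(many_body_amplitude(repeated, coords, symmetric=False))   # vanishes
```

The explicit sum over permutations grows as N!, so this form is useful only for small N; it is meant to mirror the equations, not to be efficient.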
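where
$$H_0 = \sum_i\left[\frac{1}{2m}\,p_i^2 - \frac{Ze^2}{4\pi\varepsilon_0 r_i} + V(r_i)\right] \tag{5.15}$$
$$H_1 = \frac{1}{2}\sum_{i \neq j}\frac{e^2}{4\pi\varepsilon_0 r_{ij}} - \sum_i V(r_i) \tag{5.16}$$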
$$H_2 = \sum_i\frac{1}{2m^2c^2}\,\frac{1}{r_i}\frac{dV}{dr_i}\,\mathbf{l}_i\cdot\mathbf{s}_i \tag{5.17}$$
$$V(r) = \frac{(Z-1)e^2}{4\pi\varepsilon_0 r}\left(1 - e^{-r/b}\right) \tag{5.18}$$
which has the correct behaviour for r → 0 and r → ∞. Since b is expected to be
large compared to the radius of the inner orbits, the potential can be expanded in
powers of r to obtain an approximate expression for V(ri) as
$$V(r_i) \approx \frac{(Z-1)e^2}{4\pi\varepsilon_0}\left[\frac{1}{b} - \frac{1}{2b^2}\,r_i + \ldots\right] \tag{5.19}$$
The second term H1 represents the deviations of the actual repulsive potential
from the average potential V(ri). The term H2 describes the spin-orbit interaction
of the electrons. It is of the same form as Eq. (4.51) except that $Ze^2/4\pi\varepsilon_0 r^2$ has
been replaced by the more general expression dV/dr. Of these three terms, the
general structure of the atoms is determined mainly by H0. The terms H1 and
H2, however, play an important role in the determination of the energy levels of
the atom, in particular the fine structure of the levels.
For obtaining the structure of the atoms, one starts with only the kinetic
energy of the electrons and the electrostatic interaction of the electrons with the
nucleus, i.e. the first two terms in H0 [Eq. (5.15)]. The energy levels of the
electrons are then those of a one-electron atom, i.e.
$$E_n^{(0)} = -\frac{m}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{n^2} \tag{5.20}$$
and the total energy is the sum of the energies of the N electrons,
$$E^{(0)} = \sum_{i=1}^{N}E_n^{(0)}(i) \tag{5.21}$$
However, the states that can be occupied by the electrons are constrained
by Pauli’s exclusion principle. The ground-state energy is therefore obtained by
placing successive electrons in the lowest-energy, unoccupied states. It may be
noted (see Eq. (4.27)) that for each value of the principal quantum number n,
there are $2n^2$ states (including the factor of 2 due to the two spin states) with the same
energy. Thus, the first two electrons are to be placed in the n = 1 states, the next
8 electrons in the n = 2 states, the next 18 electrons in the n = 3 state, etc.
Electrons with the same value of n form what are known as shells which are
designated by the letters K for n = 1, L for n = 2, M for n = 3, etc.
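The counting of states per shell can be checked with a few lines of code. The following is only an illustrative sketch in Python; the function name `shell_capacity` and the label string are introduced here and are not part of the text.

```python
# Count the states in the shell with principal quantum number n:
# each l <= n - 1 contributes (2l + 1) orbital states, and each
# orbital state can hold 2 spin states.
def shell_capacity(n):
    return sum(2 * (2 * l + 1) for l in range(n))

SHELL_LETTERS = "KLMN"  # labels for n = 1, 2, 3, 4

for n in range(1, 5):
    print(SHELL_LETTERS[n - 1], "n =", n, "capacity =", shell_capacity(n))
```

The sum over l reproduces the closed form 2n² for every n, which is the content of the degeneracy statement above.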
It may be recollected (Sec. 4.1) that the degeneracy of the different l states
(with l ≤ n – 1) for a given value of the principal quantum number n, is a special
property of the 1/r potential. The average potential V(ri), arising from the
interaction with the other electrons will remove this degeneracy and states with
different l values but the same n value will have different energies. Since V(ri)
is positive and becomes more important as ri increases, it may be expected that
the states with larger l values will be raised more than those with smaller l
values. Explicit perturbative calculations can be made for the first two terms of
the potential V(ri) given in Eq. (5.19). From Eq. (3.125),
$$E_{n,l} \approx E_n^{(0)} + \frac{(Z-1)e^2}{4\pi\varepsilon_0}\left[\frac{1}{b} - \frac{a_1}{4b^2 Z}\left(3n^2 - l(l+1)\right)\right] \tag{5.22}$$
where Eq. (4.29) has been used for 〈 r 〉, a1 being the radius of the first Bohr
orbit with Z = 1. It is seen here that the screening effects due to other electrons
remove the l-degeneracy, the energies now increasing as l increases. This implies
that each shell is made up of subshells that have the same n value but different
l values, the subshells with larger l values having higher energy. Indeed, it so
happens that the energy of a subshell with sufficiently large l may be higher
than that of another with larger n but a lower l. The relative positions of the
various energy levels which follow from detailed calculations, and also from
experimental observations, are shown in Fig. (5.1) and form the basis of the
shell structure of the atoms.
Examples
In its ground state H (hydrogen) has an electron with n = 1, l = 0, ml = 0 and
ms = 1/2 or –1/2. This configuration is designated by 1s. The two electrons of
The shells are filled in the order shown, the exceptions being shown in bold type. For
the heaviest elements (Z = 89 to 102), the electrons are in both 5f and 6d subshells.
Consider an atom with Z electrons, i of which are in the last subshell. Then
the potential seen by an electron in the last shell will be that due to the nucleus
(with charge Z| e |) screened by Z – i electrons, i.e. essentially an attractive
potential due to charge i | e |. Therefore, the ionization potential (or the binding
energy of an electron in the last subshell) may be expected to increase as
i increases. This trend is generally observed, with the ionization potential being
a minimum for atoms with only one electron in the last shell, e.g. Li, Na, K, Ga,
Rb, and a maximum for atoms with the last shell being complete, e.g. He, Ne,
Ar, Kr, and Zn (less prominent). Of course, these arguments are very qualitative.
IA IIA IIIA IVA VA VIA VIIA VIIIA IB IIB IIIB IVB VB VIB VIIB VIIIB
1s  1 H  2 He
2s  3 Li  4 Be  |  2p  5 B  6 C  7 N  8 O  9 F  10 Ne
3s  11 Na  12 Mg  |  3p  13 Al  14 Si  15 P  16 S  17 Cl  18 Ar
4s  19 K  20 Ca  |  3d  21 Sc  22 Ti  23 V  24 Cr  25 Mn  26 Fe  27 Co  28 Ni  29 Cu  30 Zn  |  4p  31 Ga  32 Ge  33 As  34 Se  35 Br  36 Kr
5s  37 Rb  38 Sr  |  4d  39 Y  40 Zr  41 Nb  42 Mo  43 Tc  44 Ru  45 Rh  46 Pd  47 Ag  48 Cd  |  5p  49 In  50 Sn  51 Sb  52 Te  53 I  54 Xe
6s  55 Cs  56 Ba  |  5d  71 Lu  72 Hf  73 Ta  74 W  75 Re  76 Os  77 Ir  78 Pt  79 Au  80 Hg  |  6p  81 Tl  82 Pb  83 Bi  84 Po  85 At  86 Rn
7s  87 Fr  88 Ra  |  6d  89 Ac  90 Th  91 Pa  92 U  93 Np  94 Pu
4f  57 La  58 Ce  59 Pr  60 Nd  61 Pm  62 Sm  63 Eu  64 Gd  65 Tb  66 Dy  67 Ho  68 Er  69 Tm  70 Yb
Number of electrons in the subshell: 1 to 2 (s), 1 to 10 (d), 1 to 6 (p), 1 to 14 (f)
Since they all have a similar outer shell, their chemical properties also are
similar. The three series of elements are 21 ≤ Z ≤ 30 for n = 3, 39 ≤ Z ≤ 48 for
n = 4, and 71 ≤ Z ≤ 80 for n = 5. The partially filled inner 3d shell allows some
of these elements to have large magnetic moments. In particular, Fe, Co and Ni
are found to be ferromagnetic materials (see Sec. 8.6). (v) The rare-earths are
a group of 14 elements corresponding to a progressive filling of the 4f subshell,
though the 6s subshell is already complete. These elements have 57 ≤ Z ≤ 70.
In summary, the electronic structure of atoms provides an insight into their
properties.
given shell then reduces to the problem of determining the different possible
total angular momentum states for a given shell configuration. In this context, it
is noted that the total angular momentum of an assembly of electrons forming a
complete shell is zero. This follows from the observation that the z components
of the total orbital angular momentum and the total spin momentum for the
assembly, are both zero, i.e.
$$\sum_{m_l = -l}^{l} m_l = 0, \qquad \sum_{m_s = -1/2}^{1/2} m_s = 0 \tag{5.23}$$
and therefore
$$\sum_{\text{shell}} m_j = \sum_{\text{shell}} (m_l + m_s) = 0 \tag{5.24}$$
Since the z-axis can be taken along any arbitrary direction, this implies that
the total angular momentum is zero. Therefore, the total angular momentum of
an atom is due to contributions from only the unfilled shells.
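The cancellation in Eqs. (5.23) and (5.24) can be verified by direct enumeration of the states of a closed subshell. This is a minimal sketch in Python, with a hypothetical helper name, and it checks only the z components as in the argument above.

```python
from fractions import Fraction

def closed_subshell_sums(l):
    """Enumerate all (ml, ms) states of a filled subshell with orbital
    quantum number l and return (number of states, sum of ml, sum of ms)."""
    half = Fraction(1, 2)
    states = [(ml, ms) for ml in range(-l, l + 1) for ms in (-half, half)]
    sum_ml = sum(ml for ml, _ in states)
    sum_ms = sum(ms for _, ms in states)
    return len(states), sum_ml, sum_ms

for l in range(4):  # s, p, d, f subshells
    print(l, closed_subshell_sums(l))
```

For every l the two sums vanish, so the total mj of the filled subshell is zero as well.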
A detailed perturbative calculation of the contribution of H1 and H2 gives
the important result that the atoms fall into two main categories:
1. For most atoms the residual mutual interaction between the electrons, i.e.,
H1, is more important than the spin-orbit interaction represented by H2.
This situation is treated as LS coupling or Russell-Saunders coupling.
2. For some atoms, mainly heavy atoms with large nuclear charges, the spin-orbit
interaction, i.e. H2, is more important than H1. This is treated as j-j
coupling.
Russell-Saunders or LS Coupling
In this scheme, the spin-orbit interaction is first neglected. Since H1, being
independent of S, commutes with S and therefore also L, the energy eigenstates
can be designated by L and S quantum numbers. Having determined the different
states and their qualitative ordering, the effect of spin-orbit interaction can then
be introduced as a small perturbation to deduce the final multiplicity of the energy
levels.
For determining the ordering of the energy eigenstates, it is noted that for
the state with the largest total spin, the spins are essentially parallel to each
other and therefore the spatial wave function will be antisymmetric under the
$$\Delta E_J = \frac{1}{2}\,C\left[J(J+1) - L(L+1) - S(S+1)\right] \tag{5.26}$$
where the subscripts of C have been dropped. It is found that the constant C is
positive for multiplets formed from a subshell that is half or less than half-filled,
and negative for multiplets formed from a subshell which is more than
half-filled. Therefore, for a subshell which is half-filled or less, the energy within
the multiplet increases as J increases, and for a subshell which is more than
half-filled, the energy within the multiplet decreases as J increases. This is
known as the multiplet rule. In particular, this means that the ground state of an
atom has the smallest J value subject to Hund’s rule if the subshell is half-filled
or less, and the largest J value if the subshell is more than half-filled. The
separation between the levels with values J + 1 and J (but the same L and S) is
obtained as
$$E_{J+1} - E_J = C\,(J+1) \tag{5.27}$$
This is known as the Landé interval rule which states that the spacing
between consecutive levels of a fine-structure multiplet is proportional to
the larger of the two J values of the levels. These ideas are made explicit by
the following two examples.
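Before turning to those examples, the interval rule can be checked numerically from Eq. (5.26). The sketch below is in Python with exact fractions; the function names are illustrative, and C is set to 1 in arbitrary units.

```python
from fractions import Fraction

def j_values(L, S):
    """Allowed J values |L - S|, ..., L + S in steps of 1."""
    L, S = Fraction(L), Fraction(S)
    J = abs(L - S)
    values = []
    while J <= L + S:
        values.append(J)
        J += 1
    return values

def delta_E(J, L, S, C=1):
    """Spin-orbit shift of Eq. (5.26)."""
    return Fraction(C) * (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2

def intervals(L, S, C=1):
    """Pairs (larger J, spacing to the level just below it)."""
    js = j_values(L, S)
    return [(hi, delta_E(hi, L, S, C) - delta_E(lo, L, S, C))
            for lo, hi in zip(js, js[1:])]

# Triplet P term (L = 1, S = 1): levels J = 0, 1, 2
print(intervals(1, 1))
```

Each spacing equals C times the larger J of the pair, which is the statement of Eq. (5.27).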
Consider an atom with two valence electrons, one in the ns state and the
second in the n′ l state where l ≠ 0. This state has a total degeneracy of
4(2l + 1) corresponding to two states for the s electron and 2(2l + 1) states for
the l electron. The total wave function is a product of the spatial part
$$u_{n,n',l}(\mathbf{r}_1, \mathbf{r}_2) = R_n(r_1)\,Y_{00}(\theta_1, \phi_1)\,R_{n'}(r_2)\,Y_{lm}(\theta_2, \phi_2) \tag{5.28}$$
and the spin part
It is clear that the total orbital angular momentum of the state u corresponds
to L = l. The state can be symmetrized or antisymmetrized with respect to the
two electrons, so that there are two states with L = l given by
$$u^{\pm}_{n,n',l} = \frac{1}{2^{1/2}}\left[u_{n,n',l}(\mathbf{r}_1, \mathbf{r}_2) \pm u_{n,n',l}(\mathbf{r}_2, \mathbf{r}_1)\right] \tag{5.30}$$
For the spin part, since both the electrons have s = 1/2, the allowed values
of S are S = 1, 0. It can be shown that the three symmetric states
$$v_1^{+} = v^{m_s}(1)\,v^{m'_s}(2) + v^{m_s}(2)\,v^{m'_s}(1) \tag{5.31}$$
The second example considered is an atom with one valence electron in the
np state, and the second in the n′l state, the total degeneracy being 12(2l + 1).
Since l = 0 case was considered in the first example, it is assumed here that
l ≠ 0, and also that l ≠ 1. Then the spatial wave functions are still given by
Eq. (5.30) except that the allowed values of the angular momentum quantum
numbers now are l + 1, l, l – 1, i.e.
$$u^{\pm}_{n,n',L} = \frac{1}{2^{1/2}}\left[u_{n,n',L}(\mathbf{r}_1, \mathbf{r}_2) \pm u_{n,n',L}(\mathbf{r}_2, \mathbf{r}_1)\right], \quad L = l+1,\ l,\ l-1 \tag{5.35}$$
The total wave functions are given by Eqs. (5.33) and (5.34) except that
u±n,n′,l are replaced by u±n,n′,L, with L = l + 1, l, l – 1. The spin-orbit interaction
removes the J degeneracy, and the final energy levels are shown in Fig. 5.4.
As before, they are described by the notation (2S+1)LJ. If l = 1 but n ≠ n′, there
is only one level corresponding to L = l – 1, for each S, with J = 0 for S = 0, and
J = 1 for S = 1. The other levels, i.e. L = l, l + 1 are singlets or triplets, as
shown in Fig. (5.4). Finally, the case of l = 1 and n = n′ requires a special
treatment. In this case, the allowed values of L are L = 2, 0 for un+, n, L and
L = 1 for u–n, n, L. Thus, the S = 0 state has L = 2, 0 states associated with it,
while the S = 1 state has L = 1 associated with it. However, the S = 1 state
splits into J = 2, 1, 0 states because of the spin-orbit interaction, for which the
energy increases with J.
electron state with a given set of ji contains states with different J values, which
are degenerate. This degeneracy is removed by the residual electrostatic
interaction between the electrons. The final levels are characterized by the
quantum numbers ji and the total J. A schematic illustration of the levels is given in
Fig. 5.5 for the fine-structure splitting of the levels with one electron in the np
state and another in the n′l state (l ≥ 2). These levels are generally represented
by np n′l (j1, j2)_J. For example, the ground state is np n′l (1/2, l – 1/2)_{l–1}. It is to
be noted that if l = 0, the only allowed value of j2 is 1/2, so that J = 1, 0 for
j1 = 1/2, j2 = 1/2 and J = 2, 1 for j1 = 3/2, j2 = 1/2. For l = 1 but n′ ≠ n, one has
only J = 2, 1 for j1 = 3/2, j2 = 1/2. Finally, for l = 1 and n′ = n, the electrons are
in the same subshell. There is only one set of antisymmetric states corresponding
to j1 = 1/2, j2 = 3/2 and j1 = 3/2, j2 = 1/2. Furthermore, the Pauli exclusion
principle restricts the allowed states to (3/2, 3/2)2,0, {(3/2, 1/2), (1/2, 3/2)}2,1
and (1/2, 1/2)0 for (j1, j2)J, where the last state is the ground state. It should be
observed that the number of final levels and the allowed J values are the same
in both the LS coupling scheme and the j-j coupling scheme [compare Figs.
(5.4) and (5.5)].
2. The allowed changes in the quantum numbers of the whole state are
∆S = 0, ∆L = 0, ± 1
∆J = 0, ± 1, but not J = 0 → J = 0 LS coupling (5.38)
∆mJ = 0, ± 1
and
Fig. 5.6 Some energy levels and allowed transitions for the mercury atom
(the fine structure due to the spin-orbit interaction removing
the J degeneracy is not shown).
$$\nu = R\left[\frac{1}{(n' - \delta')^2} - \frac{1}{(n - \delta)^2}\right], \quad n > n' \tag{5.40}$$
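[Figure: intensity per unit frequency dI/dν plotted against frequency ν, with the continuous spectrum ending at ν_max = eV/h.]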
Fig. 5.7 The spectral distribution of intensity per unit frequency showing
two characteristic lines superposed over a continuous spectrum.
When the high-velocity electrons reach the anode, they are subjected to
large accelerations in the vectorial sense by the strong electrostatic interaction
with the nuclei of the anode, which causes them to emit electromagnetic radiation.
This radiation due to deceleration, called bremsstrahlung, forms the continuous
radiation of the x-rays from an x-ray tube. As might be expected from this
mechanism, the maximum energy of the radiation that an electron with energy eV
(V is the voltage difference in the tube) can emit is eV, so that the
highest frequency in the continuous spectrum is
$$\nu_{\max} = \frac{e}{h}\,V \tag{5.40a}$$
This relation provides a means of obtaining an accurate measurement of the
ratio e/h. As V increases, the intensity increases at all frequencies, and vmax
increases in proportion to V. It is also found that as the nuclear charge Z increases
(V being the same), vmax remains unaltered but the intensity increases (since the
decelerating forces increase with Z).
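The cutoff of Eq. (5.40a) is easy to evaluate for a given tube voltage. The following Python sketch uses SI values of e, h and c; the 30 kV tube voltage is only an assumed illustration and does not come from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C
PLANCK = 6.62607015e-34     # Planck constant in J s
C_LIGHT = 2.99792458e8      # speed of light in m/s

def nu_max(voltage):
    """Highest frequency of the continuous spectrum, Eq. (5.40a)."""
    return E_CHARGE * voltage / PLANCK

def lambda_min(voltage):
    """Corresponding shortest wavelength in metres."""
    return C_LIGHT / nu_max(voltage)

V = 30e3  # assumed tube voltage in volts
print(f"nu_max = {nu_max(V):.3e} Hz, lambda_min = {lambda_min(V) * 1e10:.3f} angstrom")
```

Doubling V doubles the cutoff frequency, while the anode material does not enter the expression at all, in line with the remarks above.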
In contrast to the continuous spectrum, the line spectrum is independent of
the accelerating voltage V, but depends only on the material of which the anode
is made. When the fast-moving electrons strike the atoms of the anode, they
will occasionally knock out an electron in one of the inner shells, creating a
vacancy there. Subsequently, an electron from an outer shell will undergo a
transition to the vacant level by emitting a photon of energy equal to the difference
in the energies of the two levels. This gives rise to the observed characteristic
line spectrum whose frequencies depend only on the energy levels of the atoms
in the anode.
Emission Spectrum
The line spectrum of x-rays is due to transitions between states in the inner
shells of heavy metals (Z > 30), for which the nuclear interaction is dominant.
Therefore, for these states, it is reasonable to use an approximate Hamiltonian
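$$H_i = \frac{1}{2m}\,p_i^2 - \frac{Ze^2}{4\pi\varepsilon_0 r_i} + H' \tag{5.41}$$
$$E_{n,l} = E_n^{(0)} + \frac{(Z-1)e^2}{4\pi\varepsilon_0}\left[\frac{Z}{a_1 n^2} - \left\langle \frac{e^{-r/b}}{r} \right\rangle\right] + \frac{Z^2\alpha^2\,|E_n^{(0)}|}{4n^2}\left[3 - \frac{4n}{j + 1/2}\right] \tag{5.43}$$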
where the last term includes the relativistic corrections discussed in sec. 4.4 and
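$$\left\langle \frac{e^{-r/b}}{r} \right\rangle = \frac{Z}{a_1}\,\frac{(n+l)!\,(2b_0)^{2l+2}}{n^{2l+4}\,(2l+1)!\,N!}\,\left(1 + 2b_0/n\right)^{-2n}\,F\!\left(-N, -N, 2l+2, 4b_0^2/n^2\right) \tag{5.44}$$
where
$$F(\alpha, \alpha, \beta, x) \equiv 1 + \frac{\alpha^2}{\beta}\,\frac{x}{1!} + \frac{[\alpha(\alpha+1)]^2}{\beta(\beta+1)}\,\frac{x^2}{2!} + \ldots \tag{5.45}$$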
with b 0 = bZ/a 1 (a 1 is the radius of the first Bohr orbit with Z = 1),
and N = n – l – 1. This expression is quite simple for n = 1 and 2:
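$$\left\langle \frac{e^{-r/b}}{r} \right\rangle = \frac{Z}{a_1}\left(\frac{2b_0}{2b_0 + 1}\right)^2 \quad \text{for } n = 1$$
$$\left\langle \frac{e^{-r/b}}{r} \right\rangle = \frac{Z}{4a_1}\,\frac{b_0^2\,(2 + b_0^2)}{(1 + b_0)^4} \quad \text{for } n = 2,\ l = 0 \tag{5.46}$$
$$\left\langle \frac{e^{-r/b}}{r} \right\rangle = \frac{Z}{4a_1}\left(\frac{b_0}{1 + b_0}\right)^4 \quad \text{for } n = 2,\ l = 1$$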
A more detailed analysis (based on what is called the Fermi-Thomas
model) indicates that the screening parameter b has the form
$$b = c\,a_1 Z^{-1/3} \tag{5.47}$$
and the experimental fits give the result b0 ≈ 0.80 Z^{2/3}.
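The general expression in Eq. (5.44) can be compared numerically with the closed forms of Eq. (5.46). The sketch below is written in Python; the function names are introduced here for illustration, and the choice Z = 29 is only an example input for the fitted relation b0 ≈ 0.80 Z^{2/3}.

```python
from math import factorial

def F(alpha, beta, x, max_terms=50):
    """Series of Eq. (5.45); it terminates when alpha is a non-positive integer."""
    total, term = 1.0, 1.0
    for k in range(max_terms):
        term *= (alpha + k) ** 2 * x / ((beta + k) * (k + 1))
        if term == 0.0:
            break
        total += term
    return total

def screened_inverse_r(n, l, b0):
    """<exp(-r/b)/r> of Eq. (5.44), in units of Z/a1."""
    N = n - l - 1
    prefactor = (factorial(n + l) * (2 * b0) ** (2 * l + 2)
                 / (n ** (2 * l + 4) * factorial(2 * l + 1) * factorial(N)))
    return (prefactor * (1 + 2 * b0 / n) ** (-2 * n)
            * F(-N, 2 * l + 2, 4 * b0 ** 2 / n ** 2))

def closed_form(n, l, b0):
    """Special cases of Eq. (5.46), in units of Z/a1."""
    if (n, l) == (1, 0):
        return (2 * b0 / (2 * b0 + 1)) ** 2
    if (n, l) == (2, 0):
        return b0 ** 2 * (2 + b0 ** 2) / (4 * (1 + b0) ** 4)
    if (n, l) == (2, 1):
        return (b0 / (1 + b0)) ** 4 / 4
    raise ValueError("no closed form listed for this (n, l)")

b0 = 0.80 * 29 ** (2 / 3)  # example: Z = 29
for n, l in [(1, 0), (2, 0), (2, 1)]:
    print(n, l, screened_inverse_r(n, l, b0), closed_form(n, l, b0))
```

The two columns printed for each (n, l) agree, which is a useful consistency check on the indices and factorials in Eq. (5.44).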
In x-ray spectroscopy, the shells are designated by the capital letters K, L,
M, etc. corresponding to the principal quantum number n = 1, 2, 3, etc.
respectively, and the subshells by the sub indices I, II, III, etc. in the order of
increasing energy. For example, in the K shell (n = 1) there is only one level,
whereas in the L shell (n = 2) there are three subshells, LI(n = 2, l = 0), LII(n =
2, l = 1, j = 1/2) and LIII(n = 2, l = 1, j = 3/2). The energy level of the K and L
shells of some elements, obtained from the expression in Eq. (5.43) are given in
Table (5.3). The agreement between the predicted values and the experimental
values is generally good (except in the case ELII − ELI of heavy elements),
especially considering the fact that the predictions use only first order perturbative
calculations. It is important to observe that the spin-relativity separation
(e.g. ELIII − ELII )
Table 5.3 The energy levels (in keV) of K and L shells for some atoms,
obtained from Eq. (5.43), along with the experimental values in brackets
Kβ for the transition from the M shell to the K shell, etc. The multiplets are given
an additional number index, e.g. Kα1 for LIII → K, Kα2 for LII → K, etc. The first
few allowed transitions are shown in Fig. 5.8.
[Fig. 5.8: energy-level diagram showing the levels K; LI (2s1/2), LII (2p1/2), LIII (2p3/2); MI (3s1/2), MII (3p1/2), MIII (3p3/2), MIV (3d3/2), MV (3d5/2), with the allowed transitions of the K series and the L series.]
[Plot: (hν/E0)^{1/2} on the vertical axis (10 to 70) against Z on the horizontal axis (0 to 80), for the Kα1 and Kα2 lines.]
Fig. 5.9 Moseley diagram for the plot of (hν/E0)^{1/2} against Z for the Kα lines.
$$h\nu \approx |E_0|\left[\frac{(Z - \sigma_f)^2}{n_f^2} - \frac{(Z - \sigma_i)^2}{n_i^2}\right] \tag{5.51}$$
where E0 is the energy of the ground state of the hydrogen atom. This expression
may be written in the approximate form
$$h\nu \approx |E_0|\left[\frac{1}{n_f^2} - \frac{1}{n_i^2}\right](Z - \sigma)^2 \tag{5.52}$$
in conformity with the experimentally observed relation in Eq. (5.50). It is found
that σn ≈ 1 for the ground state n = 1 and σn ≈ 7.5 for n = 2. It may also be noted
that the separation between Kα1 and Kα2 lines, being related to the spin-relativity
separation, increases rapidly as Z increases.
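As a rough numerical illustration of Eq. (5.52), the Kα energy can be estimated for a given Z. The Python sketch below assumes |E0| = 13.6 eV and a single screening constant σ = 1 (the n = 1 value quoted above); both the function names and the choice of copper (Z = 29) as an example are illustrative.

```python
E0_EV = 13.6  # |E0|, ground-state energy of hydrogen in eV

def line_energy_ev(Z, sigma=1.0, n_f=1, n_i=2):
    """Photon energy from Eq. (5.52); the defaults describe a K-alpha line."""
    return E0_EV * (1 / n_f ** 2 - 1 / n_i ** 2) * (Z - sigma) ** 2

def moseley_ordinate(Z, sigma=1.0):
    """(h nu / E0)^(1/2), the quantity plotted in a Moseley diagram."""
    return (line_energy_ev(Z, sigma) / E0_EV) ** 0.5

print(f"Estimated K-alpha energy for Z = 29: {line_energy_ev(29) / 1000:.2f} keV")
print([round(moseley_ordinate(Z), 2) for Z in (20, 40, 60, 80)])
```

The ordinate grows linearly with Z, with slope (3/4)^{1/2} in this approximation, which is the straight line of the Moseley diagram.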
X-ray Absorption Spectrum
X-rays can pass through matter. The intensity is reduced in the process, the
reduction depending upon the nature of the material (which forms the basis of
many practical applications), and on the frequency. High frequency x-rays are
generally absorbed less than low frequency x-rays.
The amount of absorption of x-rays by a given material is studied in terms
of the mass absorption coefficient which is defined by the relation
dI = – µ ρ I dx (5.53)
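Integrating Eq. (5.53) over a slab of uniform material gives an exponential attenuation, I(x) = I0 exp(–µρx). The Python sketch below illustrates this; the numerical values of µ and ρ are hypothetical placeholders chosen only to exercise the functions, not data from the text.

```python
from math import exp, log

def transmitted_fraction(mu, rho, x):
    """I(x)/I0 obtained by integrating dI = -mu * rho * I dx over thickness x."""
    return exp(-mu * rho * x)

def half_value_thickness(mu, rho):
    """Thickness at which the intensity falls to half its initial value."""
    return log(2) / (mu * rho)

mu = 5.0    # hypothetical mass absorption coefficient in m^2/kg
rho = 2.7e3 # hypothetical density in kg/m^3
x_half = half_value_thickness(mu, rho)
print(x_half, transmitted_fraction(mu, rho, x_half))
```

Because only the product µρx appears, the same fractional reduction is obtained for any material and thickness with the same mass per unit area.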
[Figure: (a) absorption coefficient µ(ν) against frequency, showing the absorption edges LI, LII, LIII and K; (b) x-ray intensity against frequency, showing the L-series and K-series lines near νL and νK.]
the Auger electron is not knocked out by the photo-electric absorption of a photon
emitted by the electron which undergoes a transition to the K shell, but emerges
directly as a part of the process of readjustment of the atom. For example, the
vacancy in the K shell may be filled by an electron in the LI shell and the electron
in the LII shell may be knocked out, with the result that there will be two vacancies
in the L shell. Thus, the de-excitation of the atom may be accompanied either by
the emission of a photon (characteristic radiation) or an electron (Auger electron).
The two processes together essentially account for the number of vacancies in
the K shell. Finally, it is noted that the basic process in the Auger effect is also
known as auto-ionization or internal conversion (in nuclear transitions).
atom which has five valence electrons in the 3p shell, can attract another electron
(because of its incomplete shell) and bind it with a binding energy of 3.80 eV.
However, if an electron is transferred from a K atom to the Cl atom, resulting in
K+ and Cl– ions, there will be an additional electrostatic attraction between the
ions. Including the van der Waals repulsion (the – 1/r6 attraction may be neglected
as compared to the electrostatic attraction), the energy of the system is
$$E = -3.80 - \frac{14.4}{r} + \frac{b}{r^n} \tag{5.59}$$
where E is expressed in eV and r is in Å (the small kinetic energy of the atoms
has not been included). If this energy is less than –4.34 eV (the binding energy
of the electron in the K atom), then it is favourable for the electron from the
K atom to be transferred to the Cl atom, with the resulting ions held together by
the electrostatic attraction between them. This gives rise to ionic bonding. The
details are shown in Fig. 5.11, the system together having a minimum energy of
–8.76 eV at a separation 2.79 Å. It is observed that the dissociation energy, i.e.
the energy required to separate the KCl molecule into K and Cl atoms is (8.76
–4.34) eV, i.e. 4.42 eV.
[Fig. 5.11: energy E (in eV) of the K–Cl system against the separation r (in Å), with the levels –3.8 eV and –4.34 eV marked and a minimum of –8.76 eV at r0 = 2.79 Å.]
Covalent Bonds
In some cases, the valence electrons of the atoms have no particular preference
for either of the two atoms, and are shared by both the atoms. This is especially
true in the case of identical atoms forming molecules, e.g. H2, O2, N2, etc. The
bonds resulting from the sharing of the valence electrons are known as covalent
or homopolar bonds.
Consider a particle moving in the presence of two similar, one-dimensional,
attractive potentials V1 and V2 which are centred at positions x1 and x2. If the
positions x1 and x2 are separated by a large distance d, the ground state energy
E0 will be essentially degenerate, the degenerate eigenstates being ψ1 and ψ2
which are eigenstates with only V1 or V2 being present, respectively. As the
separation distance d decreases, one may consider as possible eigenstates,
$$\psi_\pm = \frac{1}{\sqrt{2}}\left(\psi_1 \pm \psi_2\right) \tag{5.60}$$
where the overlap integral is ignored in the normalization. If it is also assumed
that ψ1 is small at x2 and ψ2 is small at x1, the expectation value of the energy is
[Plot: energy E (in eV) against separation r (in Å), with a minimum of –4.75 eV at r0 = 0.74 Å.]
Fig. 5.12 The energy of the H2 molecule for the bonding and anti-bonding states.
$$V(r) = V_0 + \frac{1}{2}\left(\frac{d^2V}{dr^2}\right)_{r = r_0}(r - r_0)^2 + \ldots \tag{5.63}$$
where the constant term only defines the zero of the energy of the system.
Neglecting the higher order terms in the expansion, the Hamiltonian for vibrational
and rotational motion is
$$H_{vr} \approx \frac{1}{2M}\,p_r^2 + \frac{1}{2I}\,J^2 + \frac{1}{2}\,k\,(r - r_0)^2 \tag{5.64}$$
where the first term is the kinetic energy of the vibrational motion (M is the
reduced mass) and the second term is that of the rotational motion J being the
rotational angular momentum (I is the moment of inertia about the centre of
mass), and k is d2V/dr2. The energy eigenvalues of this Hamiltonian are easily
obtained from Eqs. (3.120) and (3.156), leading to the total energy E,
$$E = E_e + \left(n + \frac{1}{2}\right)\hbar\left(\frac{k}{M}\right)^{1/2} + \frac{\hbar^2}{2I}\,J(J+1), \quad n = 0, 1, 2, \ldots,\ J = 0, 1, 2, \ldots \tag{5.65}$$
When the molecule undergoes a transition, there is a change in the energy
of the state. In an emission process, the frequency of the photon is given by
$$h\nu = E_e - E_e' + \hbar\left(\frac{k}{M}\right)^{1/2}(n - n') + \frac{\hbar^2}{2I}\,J(J+1) - \frac{\hbar^2}{2I}\,J'(J'+1) \tag{5.66}$$
It is found, both from theory and experiments, that the separation between
electronic energy levels is of the order of 5 eV while that between vibrational
energy levels is about 1 eV and that between rotational levels is about 10^{–5} to 10^{–3}
eV. Therefore, for weak excitations, only changes in rotational states are observed
whereas changes in vibrational and electronic states require stronger excitations
to be observed. Here, we will confine our discussion to changes in the rotational
and vibrational states (symmetric molecules such as H2 require a special
treatment).
Selection Rules
The electric dipole transitions (which are the most prominent transitions) for
vibrational and rotational states, satisfy the selection rules
∆n = 0, ± 1
∆J = ± 1 (5.67)
For transitions with ∆n = 0, the emission frequency is given by
$$h\nu = \frac{\hbar^2}{2I}\,J(J+1) - \frac{\hbar^2}{2I}\,(J-1)J = \frac{\hbar^2}{I}\,J, \quad J = 1, 2, \ldots \ \text{for } \Delta n = 0 \tag{5.68}$$
I
This gives a band of spectral lines (Fig. 5.13), known as the rotational band,
with equally spaced frequencies
$$\nu = \frac{h}{4\pi^2 I}\,J \tag{5.69}$$
The spacing is of the order of 10^{12} s^{–1}, which falls in the very far infrared
region. The spacing allows us to evaluate I and hence the equilibrium distance
[r0 = (I/M)^{1/2}]. For HCl, the spacing is ∆ν ≈ 6.2 × 10^{11} s^{–1} which gives the
values of I ≈ 2.7 × 10^{–47} kg m² and therefore r0 ≈ 1.29 Å.
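The numbers quoted for HCl follow directly from Eq. (5.69). The Python sketch below repeats the arithmetic; the atomic masses of H and Cl (1.008 and 35.45 in atomic mass units) are assumptions supplied here, since the text does not list them.

```python
from math import pi, sqrt

PLANCK = 6.62607015e-34  # Planck constant in J s
AMU = 1.66053907e-27     # atomic mass unit in kg

def moment_of_inertia(spacing_hz):
    """I from the line spacing of Eq. (5.69): spacing = h / (4 pi^2 I)."""
    return PLANCK / (4 * pi ** 2 * spacing_hz)

def reduced_mass(m1_u, m2_u):
    """Reduced mass in kg from two masses in atomic mass units."""
    return m1_u * m2_u / (m1_u + m2_u) * AMU

I_hcl = moment_of_inertia(6.2e11)     # spacing quoted in the text
M_hcl = reduced_mass(1.008, 35.45)    # assumed atomic masses
r0_hcl = sqrt(I_hcl / M_hcl)          # r0 = (I/M)^(1/2)

print(f"I = {I_hcl:.2e} kg m^2, r0 = {r0_hcl * 1e10:.2f} angstrom")
```

With these inputs the result reproduces the values of I and r0 given above to the quoted precision.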
For transitions with ∆n = 1,
$$h\nu = \hbar\left(\frac{k}{M}\right)^{1/2} \pm \frac{\hbar^2}{I}\,J, \quad J = 1, 2, \ldots \ \text{for } \Delta n = 1 \tag{5.70}$$
where the plus sign is for J′ = J – 1 and the minus sign is for J′ = J + 1. This
again gives us a band of spectral lines which have the same spacing as the lines
in the rotational band, but with the centre at ν0 = (1/2π)(k/M)^{1/2} (which has a
value of about 8.67 × 10^{13} s^{–1} for HCl) and with the central frequency missing.
This is known as the vibrational-rotational band (Fig. 5.13).
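The central frequency fixes the force constant through ν0 = (1/2π)(k/M)^{1/2}, and Eq. (5.70) then gives the line positions. The Python sketch below uses the HCl frequencies quoted in the text; the atomic masses (1.008 and 35.45 atomic mass units) and all function names are assumptions made for the illustration.

```python
from math import pi

PLANCK = 6.62607015e-34      # Planck constant in J s
E_CHARGE = 1.602176634e-19   # elementary charge in C (J per eV)
AMU = 1.66053907e-27         # atomic mass unit in kg

def force_constant(nu0_hz, reduced_mass_kg):
    """k = M (2 pi nu0)^2, from nu0 = (1/2 pi)(k/M)^(1/2)."""
    return reduced_mass_kg * (2 * pi * nu0_hz) ** 2

def band_lines(nu0_hz, nu1_hz, j_max):
    """Frequencies nu0 +/- nu1 * J for J = 1..j_max, as in Eq. (5.70)."""
    lines = []
    for J in range(1, j_max + 1):
        lines.append(nu0_hz - nu1_hz * J)
        lines.append(nu0_hz + nu1_hz * J)
    return sorted(lines)

M_hcl = 1.008 * 35.45 / (1.008 + 35.45) * AMU   # assumed atomic masses
k_hcl = force_constant(8.67e13, M_hcl)
quantum_ev = PLANCK * 8.67e13 / E_CHARGE         # vibrational quantum h*nu0 in eV

print(f"k = {k_hcl:.0f} N/m, h nu0 = {quantum_ev:.3f} eV")
print(band_lines(8.67e13, 6.2e11, 3))
```

In the sorted list the gap across the band centre is twice the regular spacing, which reflects the missing central frequency.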
Symmetric Molecules
Symmetrical molecules do not have an electric dipole moment and the associated
dipole transitions, and hence do not exhibit the pure rotational or vibrational-
rotational bands just described. The changes in their states are due to higher-
order effects, so that the radiation emitted is much weaker. These higher-order
transitions obey the selection rules:
∆J = 0, ± 1, ± 2 (5.71)
Since the nuclei of a symmetrical molecule are identical, the total nuclear
wave function must satisfy the requirements of exchange symmetry, i.e. the
total wave function must be symmetric for an integral nuclear spin I, and
antisymmetric for a half-integral nuclear spin I, under the interchange of the
nuclei. The exchange symmetry of the spatial part of the wave function is
determined by the rotational states (i.e. the Yl m (θ, φ) functions) which for the
exchange of the nuclei (i.e. θ → π – θ, and φ → π + φ) are even for even l and
odd for odd l. Here l plays the role of J. Of the spin states of nuclei with spin I,
there are (I + 1) (2I + 1) states which are even and I (2I + 1) states which are
odd, under the exchange of spin. States with even spin state are called ortho-
modifications while those with odd spin state are called para-modifications.
For nuclei with an integral I, the ortho states are associated with even l values
and the para states with odd l values, while for nuclei with a half-integral I, the
ortho states are associated with odd l values and the para states with even
l values. Since nuclear spin has only a weak interaction, it does not change in a
normal transition.
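The counting of even and odd nuclear spin states can be confirmed by enumeration. In the Python sketch below (function name illustrative), each unordered pair of single-nucleus states gives one symmetric combination, and each pair of distinct states also gives one antisymmetric combination.

```python
from fractions import Fraction
from itertools import product

def spin_state_counts(I):
    """Number of (even, odd) two-nucleus spin states for nuclear spin I."""
    I = Fraction(I)
    n = int(2 * I + 1)                      # states of a single nucleus
    m_values = [-I + k for k in range(n)]
    pairs = list(product(m_values, repeat=2))
    even = sum(1 for m1, m2 in pairs if m1 <= m2)  # symmetric combinations
    odd = sum(1 for m1, m2 in pairs if m1 < m2)    # antisymmetric combinations
    return even, odd

for I in (Fraction(1, 2), 1, Fraction(3, 2)):
    print(I, spin_state_counts(I))
```

For I = 1/2 this gives three even states and one odd state, so the ortho and para modifications have statistical weights in the ratio 3 : 1.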
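[Fig. 5.13: rotational levels J = 0, 1, 2, 3 for the vibrational states n = 0 and n = 1, with the resulting line spectra: (a) the rotational band, ν = ν1 J; (b) the vibrational-rotational band, ν = ν0 ± ν1 J.]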
5.8 EXAMPLES
Here, a few examples will be discussed to illustrate and extend some of the
ideas introduced in this chapter.
Example 1
Hund’s rule can be used to deduce the ground state of the elements. In particular,
consider the period from Na to Ar.
Sodium has one 3s electron in the valence shell, and hence its ground state
is $^2S_{1/2}$. Magnesium has (3s)² in the valence shell. Since this corresponds to a
closed subshell, its ground state is $^1S_0$. For Al, there is one electron in the
3p subshell so that its ground state is $^2P_{1/2}$ (the smallest J value allowed is 1/2).
For Si, the valence shell has (3p)² so that the ground state has S = 1. The largest
allowed orbital angular momentum has L = 1 (since the space part is
antisymmetric, the largest ML corresponds to the two electrons having ml = 1
and ml = 0, so that the largest value of L is 1). Thus, the ground state is $^3P_0$
(smallest value of J is zero). For P, the valence shell is (3p)³ so that the ground
state has S = 3/2. The only allowed value of L is L = 0 (the antisymmetric spatial
wave function corresponds to electrons having ml = 1, ml = 0, and ml = – 1).
Therefore, the ground state is $^4S_{3/2}$.
For sulphur the shell is more than half-filled. Since a closed shell has J = L
= S = 0, it is easier to consider the unfilled shell as hole states (two holes for
sulphur). As in the case of holes in the Dirac sea, these holes may be regarded
as having positive charge. The spin of the two-hole state for sulphur is S = 1, the
orbital angular momentum is L = 1 (as for the two electron state), and J = 2 (the
holes have positive charge so that the constant C in Eq. (5.26) is negative and the
ground state has the largest allowed J value). Therefore, the ground state is denoted
by $^3P_2$. For Cl, there is only one hole which gives for its ground state, $^2P_{3/2}$ (largest
J value). Finally, for Ar, the subshell is closed, giving its ground state as $^1S_0$.
As an example of two unfilled subshells, consider molybdenum whose unfilled
shells are (4d)⁵ (5s). The largest spin has S = 3, and the only allowed value of L
is 0 (ml = 2, 1, 0, –1, –2 for the five d-shell electrons) so that the ground state
is $^7S_3$.
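The reasoning of this example can be condensed into a small routine for a single unfilled subshell. The following Python sketch is one possible encoding of Hund's rule together with the multiplet rule, not a general term calculator; the function name and the plain-text term notation (for example 3P0 for the triplet P term with J = 0) are choices made here.

```python
from fractions import Fraction

L_LETTERS = "SPDFGHI"

def hund_ground_term(l, k):
    """Ground term of k electrons in a subshell with orbital quantum number l."""
    capacity = 2 * (2 * l + 1)
    if not 0 <= k <= capacity:
        raise ValueError("k must lie between 0 and the subshell capacity")
    ml_order = list(range(l, -l - 1, -1))  # fill the largest ml first
    up = min(k, 2 * l + 1)                 # spin-up electrons (maximise S)
    down = k - up                          # remaining electrons pair up
    S = Fraction(up - down, 2)
    L = abs(sum(ml_order[:up]) + sum(ml_order[:down]))
    # Multiplet rule: smallest J up to half filling, largest J beyond it
    J = abs(L - S) if k <= 2 * l + 1 else L + S
    return f"{int(2 * S + 1)}{L_LETTERS[L]}{J}"

# 3p subshell from Al (k = 1) to Ar (k = 6)
for symbol, k in zip(["Al", "Si", "P", "S", "Cl", "Ar"], range(1, 7)):
    print(symbol, hund_ground_term(1, k))
```

The routine treats one subshell at a time, so a case such as molybdenum, with two unfilled subshells, still has to be combined by hand as in the text.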
Example 2
In Sec. 5.4, it was shown that the number of J states for a two-electron system
is the same in LS and in j-j schemes, if one of the electrons has l = 0 or 1. This
result is now extended to l > 1.
Let 1 < l2 < l1 for electrons 1 and 2. The allowed values of L are L = l1 +
l2,..., l1 – l2, while the allowed values of S are S = 0, 1. The corresponding J
values in the LS coupling scheme are:
S = 0 : J = l1 + l2, ..., l1 – l2
S = 1 : J = l1 + l2 + 1, ..., l1 – l2 + 1 (5.73)
J = l1 + l2, ..., l1 – l2
J = l1 + l2 – 1, ..., l1 – l2 – 1
In the j-j coupling scheme, the allowed values of j1 and j2 are l1 ± 1/2 and
l2 ± 1/2 respectively. Therefore, the allowed J values are:
j1 = l1 + 1/2, j2 = l2 + 1/2 : J = l1 + l2 + 1, ..., l1 – l2
j1 = l1 + 1/2, j2 = l2 – 1/2 : J = l1 + l2, ..., l1 – l2 + 1
j1 = l1 – 1/2, j2 = l2 + 1/2 : J = l1 + l2, ..., l1 – l2 – 1
j1 = l1 – 1/2, j2 = l2 – 1/2 : J = l1 + l2 – 1, ..., l1 – l2
(5.74)
It is observed that each J is repeated the same number of times in Eq. (5.73),
i.e. the LS coupling scheme, and in Eq. (5.74) i.e. the j-j coupling scheme. For
two inequivalent electrons but with l1 = l2 > 1, the allowed J values are those
given in Eqs. (5.73) and (5.74) except that the state with J = l1 – l2 – 1 is not
allowed.
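The equality of the two lists of J values for inequivalent electrons can be confirmed by brute-force enumeration. The Python sketch below builds the multiset of J values in each scheme; the helper names are illustrative, and the Pauli restrictions for equivalent electrons are deliberately not included.

```python
from collections import Counter
from fractions import Fraction

def coupled(a, b):
    """Values |a - b|, ..., a + b obtained by coupling angular momenta a and b."""
    a, b = Fraction(a), Fraction(b)
    j = abs(a - b)
    values = []
    while j <= a + b:
        values.append(j)
        j += 1
    return values

def ls_levels(l1, l2):
    """Multiset of J values in the LS scheme: couple l1, l2 to L, then L with S."""
    return Counter(J for L in coupled(l1, l2)
                   for S in (0, 1)
                   for J in coupled(L, S))

def jj_levels(l1, l2):
    """Multiset of J values in the j-j scheme: couple each l with s = 1/2 first."""
    half = Fraction(1, 2)
    def js(l):
        return [j for j in (l - half, l + half) if j > 0]
    return Counter(J for j1 in js(l1) for j2 in js(l2)
                   for J in coupled(j1, j2))

print(sorted(ls_levels(3, 2).items()))
print(sorted(jj_levels(3, 2).items()))
```

Both calls print the same list of (J, multiplicity) pairs, as stated in the comparison of Eqs. (5.73) and (5.74).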
For equivalent electrons, l1 = l2 > 1, the allowed J values in the LS coupling
scheme are:
S = 0 : J = l1+ l2, l1 + l2 – 2, ..., 0
S = 1 : J = l1 + l2, l1 + l2 – 2, ...,2 (5.75)
J = l1 + l2 – 1, l1 + l2 – 3, ..., 1
J = l1 + l2 – 2, l1 + l2 – 4, ..., 0
In the j-j coupling scheme, the allowed j1 and j2 values are l1 ± 1/2 and
l2 ± 1/2 respectively. Therefore, the J values are
{j1 = l1 + 1/2, j2 = l2 – 1/2; j1 = l1 – 1/2, j2 = l2 + 1/2} : J = l1 + l2, l1 + l2 – 1, ..., 1
Example 3
The spin-orbit interaction splits the levels of the LS coupling scheme into multiplets.
The multiplet structure of the first few observed lines in mercury is as follows:
The triplet levels are split into (6s)(np) $^3P_{2,1,0}$, (6s)(nd) $^3D_{3,2,1}$, etc.,
whereas (6s)(ns) $^3S_1$ has only one level. The allowed transitions are:
(6s)(6p) $^1P_1$ → (6s)(6s) $^1S_0$, λ = 1849.6 Å
(6s)(6p) $^3P_1$ → (6s)(6s) $^1S_0$, λ = 2536.5 Å
(6s)(7s) $^1S_0$ → (6s)(6p) $^1P_1$, λ = 10,139.7 Å (5.77)
(6s)(7s) $^3S_1$ → (6s)(6p) $^3P_0$, λ = 4046.6 Å
Example 4
The spin-orbit interaction breaks the degeneracy of a given LS level into levels
with different J values. It may be observed that the average of the L·S interaction,
summed over all the states of a given LS level, is zero, i.e.
$$\sum_{M_S}\sum_{M_L} \langle \mathbf{L}\cdot\mathbf{S} \rangle = 0 \tag{5.78}$$
(this follows from the fact that with a given orientation of S, for every term with
a given L, there is another term with –L). Now, the summation over the states
can equally well be carried over MJ and J, which implies that
$$\sum_{J}\sum_{M_J} \langle \mathbf{L}\cdot\mathbf{S} \rangle = 0 \tag{5.79}$$
For a given J, the expectation value is the same for all MJ values, so that this
relation is equivalent to
$$\sum_{J} (2J+1)\,\langle \mathbf{L}\cdot\mathbf{S} \rangle = 0 \tag{5.80}$$
$$E_{LS} = \frac{\sum_J (2J+1)\,E_J}{\sum_J (2J+1)} \tag{5.82}$$
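The weighted sum in Eq. (5.80) can be checked explicitly using the spin-orbit shifts of Eq. (5.26). The Python sketch below does this with exact fractions; the function names are illustrative and C is taken as 1 in arbitrary units.

```python
from fractions import Fraction

def multiplet(L, S, C=1):
    """List of (J, shift) pairs for a given LS level, with shifts from Eq. (5.26)."""
    L, S = Fraction(L), Fraction(S)
    J = abs(L - S)
    levels = []
    while J <= L + S:
        shift = Fraction(C) * (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2
        levels.append((J, shift))
        J += 1
    return levels

def weighted_mean_shift(L, S, C=1):
    """Average shift weighted by the degeneracy 2J + 1 of each level."""
    levels = multiplet(L, S, C)
    total_weight = sum(2 * J + 1 for J, _ in levels)
    return sum((2 * J + 1) * dE for J, dE in levels) / total_weight

print(multiplet(1, 1))
print(weighted_mean_shift(1, 1))
```

The weighted mean vanishes for every L and S tried, so the degeneracy-weighted average of the multiplet energies coincides with the unsplit LS energy, as in Eq. (5.82).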
Example 5
The calculation of the energy levels of a many-electron atom is in general quite
difficult. For the helium atom, a perturbative estimation of the ground state
energy can be made.
The Hamiltonian for the helium atom is
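$$H = \frac{1}{2m_e}\left(p_1^2 + p_2^2\right) - \frac{Ze^2}{4\pi\varepsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{5.83}$$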
Taking the unperturbed Hamiltonian as
$$H_0 = \frac{1}{2m_e}\left(p_1^2 + p_2^2\right) - \frac{Z'e^2}{4\pi\varepsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) \tag{5.84}$$
with Z′ representing the screened charge of the nucleus, and the perturbation as
$$\lambda V = -(Z - Z')\,\frac{e^2}{4\pi\varepsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{5.85}$$
A good perturbative estimation of the energy can be obtained if the
perturbation λV is small. It is plausible to ‘optimize’ the smallness of the
perturbation by requiring that the expectation value of λV is zero,
$$\langle \lambda V \rangle = 0 \tag{5.86}$$
which will determine Z'.
The unperturbed ground-state wave function is
$$\psi_0(\mathbf{r}_1, \mathbf{r}_2) = \frac{Z'^3}{\pi a_1^3}\,\exp\left[-\frac{Z'}{a_1}\,(r_1 + r_2)\right] \tag{5.87}$$
π a1 a1
and the ground state energy is
$$E_0 = -\frac{e^2 Z'^2}{4\pi\varepsilon_0 a_1} \tag{5.88}$$
The calculation of the expectation value of λV is straightforward, and gives
$$\langle \lambda V \rangle = -\frac{(Z - Z')\,Z' e^2}{2\pi\varepsilon_0 a_1} + \frac{5 e^2 Z'}{32\pi\varepsilon_0 a_1} \tag{5.89}$$
From the condition in Eq. (5.86),
$$Z' = Z - \frac{5}{16} \tag{5.90}$$
which gives an estimation of the screening of the nuclear charge. With this
value for Z', the ground state energy is
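$$E = -\frac{e^2}{4\pi\varepsilon_0 a_1}\left(Z - \frac{5}{16}\right)^2 \tag{5.91}$$
On subtracting this expression from the energy of the ionized state, the
ionization energy is
$$I = \frac{e^2}{4\pi\varepsilon_0 a_1}\left[\left(Z - \frac{5}{16}\right)^2 - \frac{1}{2}\,Z^2\right] \tag{5.92}$$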
For the helium atom Z = 2, so that
I ≈ 23.1 eV (5.93)
which is in very good agreement with the experimental value of 24.6 eV. For the
singly ionized lithium, Li+, the ionization energy from Eq. (5.92) with Z = 3,
comes out to be 74.1 eV which may be compared with the experimental value
of 75.6 eV.
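The two numerical values quoted above can be reproduced with a few lines of arithmetic. The Python sketch below assumes e²/(4πε0 a1) = 2 × 13.6 eV, i.e. twice the hydrogen ground-state binding energy used elsewhere in the chapter; the function names are introduced here for illustration.

```python
ENERGY_UNIT_EV = 2 * 13.6  # e^2 / (4 pi eps0 a1) in eV (assumed value)

def screened_charge(Z):
    """Effective charge Z' = Z - 5/16 from Eq. (5.90)."""
    return Z - 5 / 16

def ground_state_energy_ev(Z):
    """Two-electron ground-state energy of Eq. (5.91)."""
    return -ENERGY_UNIT_EV * screened_charge(Z) ** 2

def ionization_energy_ev(Z):
    """Energy of the one-electron ion minus the two-electron ground-state energy."""
    ionized = -ENERGY_UNIT_EV * Z ** 2 / 2
    return ionized - ground_state_energy_ev(Z)

for name, Z in [("He", 2), ("Li+", 3)]:
    print(f"{name}: Z' = {screened_charge(Z):.4f}, I = {ionization_energy_ev(Z):.1f} eV")
```

The estimates fall within about 1.5 eV of the experimental values quoted in the text for both He and Li+.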
Example 6
The highest-energy characteristic x-ray lines are obtained from uranium (Z = 92).
The energies of the K-lines are estimated from
$$E \approx 13.6\,Z^2\ \text{eV} \approx 115\ \text{keV} \qquad (5.94)$$
with a corresponding wavelength of about 0.11 Å.
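As a quick numerical check of this estimate, the following sketch evaluates Eq. (5.94) and converts the energy to a wavelength. The conversion constant hc ≈ 12398.4 eV Å is an assumption supplied here, not a value taken from the text.

```python
# Sketch: K-line energy estimate of Eq. (5.94) and the matching wavelength.
RYDBERG_EV = 13.6          # hydrogen ground-state binding energy, eV
HC_EV_ANGSTROM = 12398.4   # h*c in eV*Angstrom (assumed constant)


def k_line_energy_ev(Z):
    """Rough K-line energy, E ~ 13.6 Z^2 eV."""
    return RYDBERG_EV * Z ** 2


def wavelength_angstrom(energy_ev):
    """Photon wavelength in Angstrom for a given photon energy in eV."""
    return HC_EV_ANGSTROM / energy_ev


energy = k_line_energy_ev(92)
print(round(energy / 1000.0), "keV", round(wavelength_angstrom(energy), 2), "Angstrom")
```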
Example 7
A knowledge of the dissociation energy of a molecule enables the estimation of
its repulsive energy.
For a KCl molecule, the dissociation energy is 4.42 eV (E = – 8.76 eV), so
that from Eq. (5.59),
$$-8.76 = -3.80 - \frac{14.4}{r} + \frac{b}{r^n} \qquad (5.95)$$
The equilibrium condition implies that at the equilibrium separation r0
$$\frac{14.4}{r_0^2} = \frac{bn}{r_0^{\,n+1}} \qquad (5.96)$$
Substituting this in Eq. (5.95)
$$-4.96 = -\frac{14.4}{r_0}\left(1 - \frac{1}{n}\right) \qquad (5.97)$$
From the information that r0 ≈ 2.79 Å, n ≈ 25. This is an overestimate and
suggests that the terms that have been neglected (such as the van der Waals
attraction) are important in the determination of n. The equilibrium value of r0
gives the result that the net repulsion is about 0.2 eV at r = 2.79 Å.
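The arithmetic of this example can be reproduced as follows. This is a minimal sketch of Eq. (5.97) solved for n; the function names are illustrative, and the constant 14.4 eV Å is the value of e²/4πε0 used in the example.

```python
# Sketch: repulsion exponent n and repulsive energy for KCl from Eq. (5.97).
COULOMB_EV_ANGSTROM = 14.4  # e^2/(4*pi*eps0) in eV*Angstrom, as used in the text


def repulsion_exponent(binding_ev, r0_angstrom):
    """Solve binding = (14.4/r0) * (1 - 1/n) for n."""
    coulomb = COULOMB_EV_ANGSTROM / r0_angstrom
    return 1.0 / (1.0 - binding_ev / coulomb)


def repulsive_energy_ev(binding_ev, r0_angstrom):
    """Net repulsion b/r0^n = (14.4/r0)/n, i.e. Coulomb term minus binding."""
    return COULOMB_EV_ANGSTROM / r0_angstrom - binding_ev


print(round(repulsion_exponent(4.96, 2.79), 1))   # exponent n
print(round(repulsive_energy_ev(4.96, 2.79), 2))  # repulsion in eV
```

The result for n is very sensitive to the inputs, because it comes from the small difference between 14.4/r0 and 4.96 eV.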
PROBLEMS
1. Show that the expectation value of $(\mathbf{r}_1 - \mathbf{r}_2)^2$ is greater for $\psi_-$ than for $\psi_+$,
where $\psi_\pm = \frac{1}{\sqrt{2}}\,[\psi_i(\mathbf{r}_1)\,\psi_j(\mathbf{r}_2) \pm \psi_i(\mathbf{r}_2)\,\psi_j(\mathbf{r}_1)]$, $\psi_i$ and $\psi_j$ being
orthogonal to each other. This would suggest that the two particles are
closer together in the symmetric states.
2. From the relation $\sum_m |Y_{lm}(\theta, \varphi)|^2 = (2l + 1)/4\pi$, show that the charge density
of a closed shell is isotropic.
3. Show that the sum of the degeneracies for the (ns) (n' l) system is 4 (2l + 1)
and for the (np) (n' l) system it is 12 (2l + 1) (assume that the electrons
are inequivalent). What are the sums of the degeneracies for equivalent
electrons?
4. Show the energy levels of C in a diagram similar to Fig. 5.6, and indicate
the first few allowed transitions.
5. Discuss the energy levels in the j-j coupling scheme for two valence
electrons (nd) (n' d). What happens if n = n′?
6. What are the ground-state terms for elements from K to Zn?
7. The wavelengths corresponding to the transitions (6s) (6d) $^3D_2$ → (6s) (6p)
($^3P_1$, $^3P_2$) in mercury are 3125.66 Å and 3654.83 Å respectively. What is
the value of $C_{LS}$ in Eq. (5.25) for the L = 1 and S = 1 state?
8. Discuss the energy level diagram of the valence electron in the sodium
atom. If the $^2P_{1/2}$ to $^2S_{1/2}$ transition corresponds to a wavelength of 5895.923
Å, what is the minimum energy of the bombarding electrons required to
excite this Na line? (Assume that the Na atoms are in the ground state.)
9. Using experimental information (Table 5.3) about the energy levels of Ag,
determine the minimum potential required across the x-ray tube, to excite
the K lines and the L lines. What are the wavelengths of the Kα lines?
What are the frequencies of K and L absorption edges?
10. If the Kα1 radiation from silver is incident on a material, what is the largest
Z value of the material for which the K electrons can be ejected (use
Moseley’s law)? What is the kinetic energy of the ejected electron for
Cu?
11. Given that the K-absorption edge for lead is at 0.140 Å, and the minimum
voltage required for producing K lines in lead is 88.6 kV, determine the
ratio h/e.
12. For Cu, determine the kinetic energy of the Auger electron for the transition
in which two vacancies are created in the LI shell in filling up a K-shell
vacancy (some simplifying assumptions may be required.)
13. Assuming that Na+ and Cl– behave like hard balls of radii 1.0 Å and 1.8 Å
respectively (as far as repulsive forces are concerned), estimate the
dissociation energy for a NaCl molecule. The ionization potential of Na is
5.1 eV and the electron affinity for Cl is 3.8 eV.
14. For the HCl molecule, lines are found at ν/c equal to 2944, 2926, 2908,
2866, 2844, 2821 cm$^{-1}$. Determine the force constant for the vibrational
motion, and the distance of separation for the ions.
15. The rule of equal spacing is not strictly valid for the vibrational-rotational
band. Calculate the change in the energy if the moment of inertia in the
two vibrational states is different, say I0 and I1. Estimate (I1 – I0)/I0 for
HCl for the states corresponding to the spectrum observed in Problem 14.
6
Interaction with External Fields
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 173
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_6
In Chapter 5, the structure and the energy levels of atoms and molecules were
discussed. Here, their interaction with external electromagnetic fields will be
considered. While time-independent fields allow the investigation and
modification of the energy levels to suit our convenience, it is the time-dependent
fields which lead to transitions between the states. These effects are of great
importance not only in deducing atomic and molecular properties, but also in
devising useful practical applications such as lasers and masers.
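$$H(i) = \frac{1}{2m}\left[-i\hbar\nabla_i - q\mathbf{A}(\mathbf{r}_i, t)\right]^2 - \frac{q}{m}\,\mathbf{s}_i\cdot\mathbf{B} + V(\mathbf{r}_i, t) \qquad (6.1)$$
where V/q and A are the scalar (electrostatic) and vector potentials. For the
special case of the external magnetic and electric fields being constant,
$$\mathbf{A}(\mathbf{r}_i, t) = -\frac{1}{2}\,\mathbf{r}_i\times\mathbf{B}$$
$$V(\mathbf{r}_i, t) = -q\,\mathbf{r}_i\cdot\mathbf{E} + V_{int} \qquad (6.2)$$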
where B and E are the constant magnetic and electric fields, respectively, and
Vint is the potential due to the nucleus and the other electrons. Substitution of
these expressions in Eq. (6.1), after some simplification, leads to
$$H(i) = -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{q}{2m}(\mathbf{l}_i + 2\mathbf{s}_i)\cdot\mathbf{B} + \frac{q^2}{8m}(\mathbf{r}_i\times\mathbf{B})^2 - q\,\mathbf{r}_i\cdot\mathbf{E} + V_{int} \qquad (6.3)$$
where, except for extremely strong magnetic fields (such as B ~ $10^9$ G), the
quadratic term in B can be neglected. Therefore, the Hamiltonian for the atom
is given by
H = H0 + H1 + H2 + H3 + H′ (6.4)
where H0, H1 and H3, defined in Eq. (5.14), describe the atom in the absence of
the external fields, and with q = – e, e > 0,
$$H' = \frac{e}{2m}\sum_i (\mathbf{l}_i + 2\mathbf{s}_i)\cdot\mathbf{B} + e\sum_i \mathbf{r}_i\cdot\mathbf{E} \qquad (6.5)$$
We first consider the case of E = 0, which is known as the Zeeman effect for
weak B field and as the Paschen-Back effect for strong B field. The interaction
with a constant external electric field leads to what is known as the Stark effect
and is discussed as an example in Sec. 6.8. The interaction with radiation is
important in transitions and is discussed in Sec. 6.3.
$$\int \psi^*_{J,M_J}\,\mathbf{S}\,\psi_{J,M'_J}\,d\tau = a\int \psi^*_{J,M_J}\,\mathbf{J}\,\psi_{J,M'_J}\,d\tau \qquad (6.8)$$
where a is a constant, independent of MJ and MJ′.
This theorem (see Example 1 in Sec. 6.8 for the proof) essentially implies
that S is proportional to J within the sub space of states with a given value of J.
A similar result is also valid for L.
Zeeman Effect
For the weak magnetic field, H′ is regarded as a small perturbation. In the LS
coupling scheme, the states are characterized by the quantum numbers L, S, J
and MJ, so that the perturbation in energy is obtained by using Eq. (6.8) as
$$\Delta E = g\,\frac{e}{2m}\,\mathbf{B}\cdot\langle L, S, J, M_J|\,\mathbf{J}\,|L, S, J, M_J\rangle \qquad (6.9)$$
where the notation indicates taking an average with respect to the states with
given L, S, J, and MJ ; and g is defined by the relation
L + 2S = g J (6.10)
in the subspace of states with a given J. The constant of proportionality g, is
determined by taking the scalar product of Eq. (6.10) with J, and calculating
the expectation value between the states with given L, S, J and MJ:
$$\langle L, S, J, M_J|\,(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{J}\,|L, S, J, M_J\rangle = g\,\langle L, S, J, M_J|\,\mathbf{J}^2\,|L, S, J, M_J\rangle \qquad (6.11)$$
which gives (using $2\mathbf{L}\cdot\mathbf{J} = \mathbf{L}^2 + \mathbf{J}^2 - \mathbf{S}^2$ and $2\mathbf{S}\cdot\mathbf{J} = \mathbf{S}^2 + \mathbf{J}^2 - \mathbf{L}^2$)
$$g = \frac{J(J+1) + L(L+1) - S(S+1)}{2J(J+1)} + \frac{J(J+1) + S(S+1) - L(L+1)}{J(J+1)} \qquad (6.12)$$
This on simplification leads to
$$g = \frac{3}{2} + \frac{S(S+1) - L(L+1)}{2J(J+1)} \qquad (6.13)$$
The quantity g is called the Landé g-factor. The shift in the energy, due to
the magnetic field taken along the z-direction, is given by
$$\Delta E = \frac{e\hbar}{2m}\,g B M_J \qquad (6.14)$$
which implies that the degenerate states with a given J split into 2J + 1 equidistant
levels. This is known as the Zeeman effect, and is illustrated in Figs. 6.1 and
6.2.
The Landé g-factor is often derived from what is called the vector model.
In this model, the L + 2S vector is supposed to precess rapidly around J so that
for the purpose of taking averages, only the component of L + 2S along J is
considered,
$$\langle \mathbf{L} + 2\mathbf{S} \rangle = \frac{(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{J}}{J^2}\,\mathbf{J} \qquad (6.15)$$
which again leads to the expression in Eq. (6.11) with g given by Eq. (6.13).
The weak-field approximation is valid if the energy shift in Eq. (6.14) is
small compared to the fine-structure splitting. The energy shift for ordinary
fields is rather small, e.g. for B ≈ $10^4$ G (i.e. 1 Wb/m²), the energy splitting is of
the order of 0.58 × $10^{-4}$ eV for g = 1.
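Equations (6.13) and (6.14) are easy to evaluate numerically. The sketch below uses exact fractions for the g-factor; the value of the Bohr magneton eħ/2m ≈ 5.788 × 10⁻⁵ eV/T is an assumption supplied here, and the function names are illustrative.

```python
# Sketch: Lande g-factor of Eq. (6.13) and weak-field Zeeman shift of Eq. (6.14).
from fractions import Fraction

BOHR_MAGNETON_EV_PER_T = 5.788e-5  # e*hbar/(2m) in eV per tesla (assumed value)


def lande_g(L, S, J):
    """g = 3/2 + [S(S+1) - L(L+1)] / [2J(J+1)], valid for J > 0."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    return Fraction(3, 2) + (S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))


def zeeman_shift_ev(g, M_J, B_tesla):
    """Energy shift g * (e*hbar/2m) * B * M_J, in eV."""
    return float(g) * BOHR_MAGNETON_EV_PER_T * B_tesla * float(M_J)


half = Fraction(1, 2)
print(lande_g(0, half, half))           # 2S1/2
print(lande_g(1, half, half))           # 2P1/2
print(lande_g(1, half, Fraction(3, 2))) # 2P3/2
print(zeeman_shift_ev(1, 1, 1.0))       # g = 1, M_J = 1, B = 1 T
```

For a singlet state (S = 0, J = L) the function returns g = 1, and for B = 1 T the shift per unit M_J is about 0.58 × 10⁻⁴ eV, as stated above.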
The selection rules for transitions between the MJ multiplets of two levels,
are:
∆MJ = 0, ± 1 (6.16)
and the shift in the frequency of the radiation emitted is given by
$$\Delta\omega = \frac{eB}{2m}\,(g M_J - g' M'_J), \qquad M'_J - M_J = 0, \pm 1 \qquad (6.17)$$
It is observed that the frequency shifts of the spectral lines have a simple
relation if the levels do not have fine structure, i.e. S = 0 (singlet states). In this
case, since ∆S = 0 for electric dipole transitions, one has J = L for which
g = g′ = 1, (S = S′ = 0) (6.18)
The shifts of the spectral lines in this case are
$$\Delta\omega = 0,\ \pm\frac{eB}{2m} \qquad (6.19)$$
This is known as the normal Zeeman effect, and results in each line splitting
into three lines symmetrically placed about the unshifted line, one of which is
the unshifted line (see Fig. 6.1). It may be noted that the shifts for ordinary
magnetic fields are quite small, ∆ω ~ 8 × $10^{10}$ rad/s for B ~ $10^4$ G, compared to
ω ~ 3 × $10^{15}$ rad/s for visible light.
Fig. 6.1 (a) The energy levels of the $^1P_1$ ($M_J$ = 1, 0, –1) and $^1D_2$
($M_J$ = 2, 1, 0, –1, –2) states as a function of the magnetic field, and
(b) the splitting into three components, at ω0 – ∆ω, ω0 and ω0 + ∆ω,
illustrating the normal Zeeman effect.
If the states have fine structure arising from spin-orbit interaction, the spectral
lines break into more than three components, and the frequency shifts are given
by rational fractions of the normal Zeeman shift,
$$\Delta\omega = \frac{p}{q}\,\Delta\omega_0, \qquad \Delta\omega_0 = \frac{eB}{2m} \qquad (6.20)$$
where p and q are integers. This case is known as the anomalous Zeeman effect. As
a specific example, consider the splitting of the alkali doublet lines which
correspond to the transitions $^2P_{1/2} \to {}^2S_{1/2}$ and $^2P_{3/2} \to {}^2S_{1/2}$. The Landé
g-factors for these states are obtained from Eq. (6.13) as
$$^2S_{1/2}:\ g = 2, \qquad ^2P_{1/2}:\ g = 2/3, \qquad ^2P_{3/2}:\ g = 4/3 \qquad (6.21)$$
Substituting the values in Eq. (6.17) gives
$$\Delta\omega = (\pm 2/3,\ \pm 4/3)\,\Delta\omega_0 \quad \text{for } {}^2P_{1/2} \to {}^2S_{1/2} \qquad (6.22)$$
and
$$\Delta\omega = (\pm 1/3,\ \pm 1,\ \pm 5/3)\,\Delta\omega_0 \quad \text{for } {}^2P_{3/2} \to {}^2S_{1/2} \qquad (6.23)$$
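The line shifts in Eqs. (6.22) and (6.23) follow from enumerating the allowed pairs (M_J, M′_J) in Eq. (6.17). A minimal sketch, with illustrative function names and the g-factors of Eq. (6.21) passed in by hand, is:

```python
# Sketch: Zeeman line shifts (g*M_J - g'*M'_J) in units of eB/2m, Eq. (6.17),
# keeping only pairs that satisfy the selection rule M'_J - M_J = 0, +1, -1.
from fractions import Fraction


def m_values(J):
    """All magnetic quantum numbers -J, -J+1, ..., J."""
    J = Fraction(J)
    return [-J + k for k in range(int(2 * J) + 1)]


def zeeman_line_shifts(g_upper, J_upper, g_lower, J_lower):
    """Sorted list of distinct shifts for transitions between two levels."""
    shifts = set()
    for M in m_values(J_upper):
        for M_prime in m_values(J_lower):
            if abs(M - M_prime) <= 1:
                shifts.add(g_upper * M - g_lower * M_prime)
    return sorted(shifts)


half = Fraction(1, 2)
print(zeeman_line_shifts(Fraction(2, 3), half, 2, half))            # 2P1/2 -> 2S1/2
print(zeeman_line_shifts(Fraction(4, 3), Fraction(3, 2), 2, half))  # 2P3/2 -> 2S1/2
print(zeeman_line_shifts(1, 2, 1, 1))                               # 1D2 -> 1P1
```

With g = g′ = 1 the same routine returns only the three shifts –1, 0 and +1, which is the normal Zeeman pattern.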
The energy levels as a function of the field B, the allowed transitions, and
the splitting of the spectral lines, are shown in Fig. 6.2. It is worth noticing that
some of the fine structure lines cross each other for $e\hbar B/m \sim \Delta E$. At these
values of B, there is a mixing of states which, under some circumstances, causes
a sharp change in the intensity of radiation emitted. This phenomenon is known
as the Hanle effect (particularly, the case of crossover at B = 0), and has been used
to determine the constants involved in the fine structure multiplets. Of course,
for $e\hbar B/m \sim \Delta E$, the perturbative analysis is not strictly valid (H′ ~ H2), and a
more complicated, nonperturbative analysis has to be carried out.
Fig. 6.2 (a) The energy levels of the $^2S_{1/2}$, $^2P_{1/2}$ and $^2P_{3/2}$ states in the presence of a
magnetic field, plotted as (E – E0)/∆E against $e\hbar B/(m\,\Delta E)$, where ∆E is the
fine structure splitting; the strong-field levels are labelled by $M_L$ and $M_S$.
(b) Splitting of the spectral lines for transitions from the $^2P_{1/2}$ and $^2P_{3/2}$
states to the $^2S_{1/2}$ states at $e\hbar B/m = 0.5\,\Delta E$.
Paschen-Back Effect
If the magnetic field is so strong that the splitting of the energy levels due to the
magnetic field is larger than the fine structure separation, the resulting splitting
is known as the Paschen-Back effect. In this case, H′ in Eq. (6.6) is treated as the
main term and H2, representing the spin-orbit interaction, as a small perturbation,
which makes the calculations relatively simple.
In the absence of spin-orbit interaction, the unperturbed states may be
specified by the quantum numbers L, S, ML and MS. Taking the magnetic field
along the z-direction, the energy shift due to the magnetic field is given by
Eq. (6.6) as:
$$\Delta E' = \frac{e\hbar}{2m}\,(M_L + 2M_S)\,B \qquad (6.24)$$
and the line-splitting is an integral multiple of ∆ω0. The selection rules for
electric dipole transitions in this case are:
∆ML = 0, ± 1, ∆MS = 0 (6.25)
so that we essentially get back the normal Zeeman shifts
$$\Delta\omega = 0,\ \pm\frac{eB}{2m} \qquad (6.26)$$
This is expected since the only role played by spin here is to change the
energy levels by an amount $e\hbar M_S B/m$ which, in view of the selection rules in
Eq. (6.25), does not affect the frequency associated with the transitions.
The effect of spin-orbit interaction can be included perturbatively by using
the theorem stated earlier, but applied to $\mathbf{l}_i$ whose sum is L, and also to $\mathbf{s}_i$ whose
sum is S. The expectation value of the spin-orbit interaction can then be written
as
$$\langle L, S, M_L, M_S|\,H_2\,|L, S, M_L, M_S\rangle = a\,\langle L, S, M_L, M_S|\,\mathbf{L}\cdot\mathbf{S}\,|L, S, M_L, M_S\rangle \qquad (6.27)$$
where a is independent of ML and MS. Since $L_xS_x + L_yS_y$ [which can be written as
$\frac{1}{2}(L_+S_- + L_-S_+)$] changes the values of ML and MS, the only term which
contributes in Eq. (6.27) is the $L_zS_z$ term. Therefore, the total energy shift is
given by
$$\Delta E = \frac{e\hbar}{2m}\,(M_L + 2M_S)\,B + a\hbar^2 M_L M_S \qquad (6.28)$$
Consider the effect of this term on the $^2P$ levels shown in Fig. 6.2(a). The
allowed values for ML are 1, 0, – 1 and for 2MS, 1, – 1, so that ML + 2MS can take
on the values 2, 1, 0, – 1, – 2. Thus, the $^2P$ levels are split into five equidistant
levels by the first term in Eq. (6.28) [the lines in Fig. 6.2(a) for B → ∞]. This
equidistance is removed by the second term which, for these levels, has the values
$a\hbar^2/2$, 0, $-a\hbar^2/2$, 0, $a\hbar^2/2$. The shifts in the transition frequencies are now given
by
$$\Delta\omega = (\Delta\omega_0 + a\hbar M_S)\,\Delta M_L \qquad (6.29)$$
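The level pattern described above can be tabulated directly from Eq. (6.28). The sketch below lists, for each (M_L, M_S), the coefficient of eħB/2m and the coefficient of aħ²; the function name and the tuple layout are illustrative choices, not taken from the text.

```python
# Sketch: strong-field (Paschen-Back) levels from Eq. (6.28).
# Each entry is (M_L, M_S, M_L + 2*M_S, M_L*M_S): the third item multiplies
# e*hbar*B/(2m), the fourth multiplies a*hbar^2.
from fractions import Fraction


def paschen_back_levels(L=1, S=Fraction(1, 2)):
    """Enumerate the (M_L, M_S) states of a term with integer L and spin S."""
    S = Fraction(S)
    levels = []
    for two_ms in range(int(2 * S), -int(2 * S) - 1, -2):
        M_S = Fraction(two_ms, 2)
        for M_L in range(L, -L - 1, -1):
            levels.append((M_L, M_S, M_L + 2 * M_S, M_L * M_S))
    return levels


for M_L, M_S, field_term, spin_orbit_term in paschen_back_levels():
    print(M_L, M_S, field_term, spin_orbit_term)
```

For the ²P term this gives six states on five field levels; the two states with M_L + 2M_S = 0 share the same spin-orbit coefficient –1/2.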
For the $^2P \to {}^2S$ transitions, the frequency shifts for $M_S$ = 1/2 are slightly
j-j Coupling
In the analysis so far, it has been assumed that LS coupling is valid. The theory
can easily be modified to apply to heavy atoms where j-j coupling is dominant.
The effect of the magnetic field on two electrons with j-j coupling is briefly
discussed here.
In j-j coupling, the states are characterized by j1, j2, J and MJ. The energy
shift due to the interaction of the two electrons with a weak external magnetic
field is given by
$$\Delta E = \frac{e}{2m}\,\langle j_1, j_2, J, M_J|\,\mathbf{l}_1 + 2\mathbf{s}_1 + \mathbf{l}_2 + 2\mathbf{s}_2\,|j_1, j_2, J, M_J\rangle\cdot\mathbf{B} \qquad (6.30)$$
Using the results of the theorem in Eq. (6.8), we can write
$$\langle j_1|\,\mathbf{l}_1\,|j_1\rangle = a_1\,\langle j_1|\,\mathbf{j}_1\,|j_1\rangle, \qquad \langle j_1|\,\mathbf{s}_1\,|j_1\rangle = b_1\,\langle j_1|\,\mathbf{j}_1\,|j_1\rangle \qquad (6.31)$$
and similar relations for $\mathbf{l}_2$ and $\mathbf{s}_2$. The constants $a_i$ and $b_i$ are determined by
following steps similar to those leading to Eq. (6.13), giving
$$a_i = \frac{1}{2} + \frac{l_i(l_i+1) - s_i(s_i+1)}{2j_i(j_i+1)}, \qquad b_i = \frac{1}{2} + \frac{s_i(s_i+1) - l_i(l_i+1)}{2j_i(j_i+1)} \qquad (6.32)$$
where i = 1, 2. Then one gets
$$\Delta E = \frac{e}{2m}\,\langle j_1, j_2, J, M_J|\,(a_1 + 2b_1)\,\mathbf{j}_1 + (a_2 + 2b_2)\,\mathbf{j}_2\,|j_1, j_2, J, M_J\rangle\cdot\mathbf{B} \qquad (6.33)$$
Again, applying the theorem in Eq. (6.8) to $\mathbf{j}_i$ whose sum is J, gives
$$\langle J|\,\mathbf{j}_{1,2}\,|J\rangle = A_{1,2}\,\langle J|\,\mathbf{J}\,|J\rangle \qquad (6.34)$$
with
$$A_1 = \frac{1}{2} + \frac{j_1(j_1+1) - j_2(j_2+1)}{2J(J+1)}, \qquad A_2 = \frac{1}{2} + \frac{j_2(j_2+1) - j_1(j_1+1)}{2J(J+1)} \qquad (6.35)$$
Finally, one gets
$$\Delta E = \frac{e\hbar B}{2m}\,M_J\,[A_1(a_1 + 2b_1) + A_2(a_2 + 2b_2)] \qquad (6.36)$$
which again results in (2J + 1) equidistant energy levels.
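The bracket in Eq. (6.36) plays the role of an effective g-factor for the j-j coupled state. A minimal sketch of its evaluation from Eqs. (6.32) and (6.35) is given below; the function names are illustrative, and the two-s-electron case is included only as a simple check.

```python
# Sketch: effective g-factor A1*(a1 + 2*b1) + A2*(a2 + 2*b2) of Eq. (6.36).
from fractions import Fraction


def single_electron_g(l, s, j):
    """a + 2b for one electron, with a and b from Eq. (6.32); needs j > 0."""
    l, s, j = Fraction(l), Fraction(s), Fraction(j)
    a = Fraction(1, 2) + (l * (l + 1) - s * (s + 1)) / (2 * j * (j + 1))
    b = Fraction(1, 2) + (s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))
    return a + 2 * b


def jj_g_factor(g1, j1, g2, j2, J):
    """Combine two single-electron factors with A1, A2 of Eq. (6.35); needs J > 0."""
    j1, j2, J = Fraction(j1), Fraction(j2), Fraction(J)
    A1 = Fraction(1, 2) + (j1 * (j1 + 1) - j2 * (j2 + 1)) / (2 * J * (J + 1))
    A2 = Fraction(1, 2) + (j2 * (j2 + 1) - j1 * (j1 + 1)) / (2 * J * (J + 1))
    return A1 * g1 + A2 * g2


half = Fraction(1, 2)
g_s = single_electron_g(0, half, half)     # s electron
print(g_s, jj_g_factor(g_s, half, g_s, half, 1))
```

For a single electron, a + 2b reduces to the Landé factor of Eq. (6.13) with (L, S, J) replaced by (l, s, j).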
In the strong field case, the unperturbed states are characterized by the
quantum numbers $j_1$, $m_{j_1}$ and $j_2$, $m_{j_2}$ so that the energy shifts can be obtained
from the relations in Eq. (6.31) as
$$\Delta E = \frac{e\hbar B}{2m}\,[(a_1 + 2b_1)\,m_{j_1} + (a_2 + 2b_2)\,m_{j_2}] \qquad (6.37)$$
To this, the contribution of the spin-orbit interaction can be added, which
also can be estimated by using arguments similar to those used in the discussion
for the LS coupling. It gives a contribution proportional to m j1 m j2 .
In all discussions so far, only the effect of the linear term in B in Eq. (6.3)
has been considered. The quadratic term becomes important for atoms for which
the magnetic dipole moment is zero, e.g., He, Ne, etc. which have L = S = 0.
The magnetic properties of these materials such as magnetic susceptibility, are
determined by the quadratic term. The quadratic term is also important in
astrophysics where enormously large magnetic fields are encountered (in pulsars
and neutron stars) and in some solid state problems. The energy shift due to the
quadratic term is known as the quadratic Zeeman effect.
which is quite large compared with atomic sizes (~ 1 Å). Therefore, the electric
field can be regarded as being constant over atomic distances (this is called the
electric dipole approximation). Such an electric field is provided by the scalar
potential – Ez cos ωt, where it is assumed that the electric field is in the
z-direction and has an amplitude E. The interaction of the charged particle with
this scalar potential leads to the potential energy term
V = – q Ez cos ωt (6.38)
The effect of this interaction on a bound state can be treated perturbatively.
Let the particle be in a bound eigenstate φ0 of the Hamiltonian H0, before
the radiation is incident on it. After the radiation is introduced at t = 0, the
particle can undergo a transition to any of the other eigenstates φn where
H0 φn = Enφn (6.39)
which satisfy the orthonormality conditions
$$\int \varphi_m^*\,\varphi_n\,d\tau = 0 \ \text{ for } m \neq n, \qquad = 1 \ \text{ for } m = n \qquad (6.40)$$
The state of the particle can then be represented by
$$\varphi = \sum_n a_n(t)\,\exp(-iE_n t/\hbar)\,\varphi_n \qquad (6.41)$$
Multiplying both sides by φm* and integrating, and using the orthonormality
conditions gives
$$i\hbar\,\frac{\partial a_m(t)}{\partial t} = -qE\,\exp(i\omega_m t)\,(\cos\omega t)\int \varphi_m^*\,z\,\varphi\,d\tau \qquad (6.45)$$
If the incident radiation is not very strong, we can approximate φ in Eq.
(6.45) by its unperturbed expression, i.e., φ ≈ exp (– i ω0t) φ0, and get as a first
order approximation
$$a_m(t) = \frac{iqE}{\hbar}\,z_{m0}\int_0^t \exp[i(\omega_m - \omega_0)t']\,\cos(\omega t')\,dt', \qquad m \neq 0, \qquad (6.46)$$
where we have used the boundary conditions in Eq. (6.42), and
where dI/dω is the flux density per unit frequency. This gives
For large t, the integrand being sharply peaked at ω = ωm0, dI/dω can be
evaluated at ω = ωm0, and the remaining integration can be carried out to give
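$$|a_m(t)|^2 = \frac{q^2\pi t}{\varepsilon_0 c\,\hbar^2}\,|z_{m0}|^2\left(\frac{dI}{d\omega}\right)_{\omega = \omega_{m0}}, \qquad \left[\int_{-\infty}^{\infty}\frac{\sin^2 ax}{x^2}\,dx = \pi a\right] \qquad (6.56)$$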
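$$|(\mathbf{n}\cdot\mathbf{r})_{nm}|^2 \to \frac{1}{3}\,|(\mathbf{r})_{nm}|^2 \qquad (6.68)$$
Equating $W_{m \to n}$ (with the above replacement) and $P_{nm}$ then yields
$$B_{nm} = \frac{q^2\pi}{3\varepsilon_0\hbar^2}\,|(\mathbf{r})_{nm}|^2 \qquad (6.69)$$
so that
$$A_{nm} = \frac{2q^2\omega^3}{3\varepsilon_0 h c^3}\,|(\mathbf{r})_{nm}|^2 \qquad (6.70)$$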
This is the expression for the probability of spontaneous electric dipole
transitions.
Selection Rules
It is observed that the induced and the spontaneous transition probabilities
[Eqs. (6.57) and (6.70)] depend on the same matrix element, $(\mathbf{r})_{nm}$. Therefore
these electric dipole transitions are allowed only if this matrix element is nonzero.
This imposes certain conditions on the allowed transitions. In particular, it is to
be noted that since r is odd under parity transformation (i.e., r → – r) the
product of $\psi_n^*$ and $\psi_m$ also should be odd. If these are single-particle, angular
momentum states [see Eq. (3.153)] with orbital angular momentum quantum
numbers $l_n$ and $l_m$, the parity of the product of these states is $(-1)^{l_n + l_m}$ so that
$(l_m + l_n)$ and therefore $(l_n - l_m)$ are odd. In addition, the angular dependence of r
is of the form $Y_{1m}(\theta, \varphi)$ from which it can be shown that $|l_n - l_m| = 1$ and
$|j_m - j_n| = 1, 0$, $j_m$ and $j_n$ being the total angular momentum quantum numbers.
Thus the allowed electric dipole transitions satisfy the selection rules:
$$\Delta l = \pm 1, \qquad \Delta j = \pm 1, 0, \qquad \Delta s = 0 \qquad (6.71)$$
where the ∆s = 0 result follows from the fact that spin is unchanged in the
transitions. More detailed arguments also show that
$$\Delta m_j = \pm 1, 0, \qquad j = 0 \nrightarrow j = 0 \qquad (6.72)$$
∆ω = 1/τ (6.74)
The average lifetime τ and therefore the linewidth are related to the transition
probability.
If A is the transition probability, the number of particles (– dN) which undergo
transition in time dt is [see Eq. (1.80)]
dN = – AN (t) dt (6.75)
which on integration gives
$$N(t) = N(0)\,e^{-At} \qquad (6.76)$$
Now, the average lifetime of the particles is
$$\tau = \sum_i t_i\,\frac{N(t_i) - N(t_i + \Delta t)}{N(0)} = A\int_0^\infty t\,e^{-At}\,dt = \frac{1}{A} \qquad (6.77)$$
so that the linewidth in Eq. (6.74) is equal to the transition probability A. For
most atomic systems which admit electric dipole transitions, τ ~ $10^{-8}$ s. This
gives rise to a spread of ∆ω ~ $10^{8}$ s$^{-1}$. For λ ~ 5000 Å, the corresponding spread
in the wavelength is
$$\Delta\lambda \sim 10^{-4}\ \text{Å} \qquad (6.78)$$
For some excited states which are stable against electric dipole transitions,
e.g., the 2 $^2S_{1/2}$ state in the hydrogen atom, known as metastable states, the
lifetime is usually about $10^5$ times larger, i.e., τ ~ $10^{-3}$ s. Metastable states play
a very important role in lasers and masers.
There are other effects which also contribute to the observed linewidth.
One of them is due to Doppler effect. Since the atoms are moving around (thermal
motion), the observed radiation is Doppler shifted from the frequency ω0
expected for atoms at rest. If the velocity of the particle makes an angle of
α with the line of observation, the observed frequency is
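$$\omega \approx \omega_0\left(1 - \frac{v}{c}\cos\alpha\right) \qquad (6.79)$$
For v ~ 6000 m/s (corresponding to atomic hydrogen at about 1400 K), and
λ ~ 5000 Å, the Doppler shift is
$$\Delta\omega \sim 7.5\times 10^{10}\ \text{rad/s}, \qquad \Delta\lambda \sim 0.1\ \text{Å} \qquad (6.80)$$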
Another phenomenon that contributes to the linewidth is atomic collisions
which effectively change the lifetime of the excited states. The observed
linewidth ∆ω is the sum of the linewidths arising from the different effects.
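The orders of magnitude quoted for the natural and Doppler widths can be verified with a few lines of code. This is a sketch in SI units; the conversion ∆λ = λ²∆ω/(2πc) for the natural width is a standard relation supplied here rather than one written out in the text, and the function names are illustrative.

```python
# Sketch: natural linewidth (Delta omega = 1/tau) and Doppler width (v/c scaling).
import math

C = 2.998e8  # speed of light, m/s


def angular_frequency(wavelength_m):
    """omega = 2*pi*c/lambda."""
    return 2.0 * math.pi * C / wavelength_m


def natural_width(lifetime_s, wavelength_m):
    """Return (Delta omega in rad/s, Delta lambda in m) for a given lifetime."""
    d_omega = 1.0 / lifetime_s
    d_lambda = wavelength_m ** 2 * d_omega / (2.0 * math.pi * C)
    return d_omega, d_lambda


def doppler_width(speed_m_s, wavelength_m):
    """Return (Delta omega, Delta lambda) for atoms moving along the line of sight."""
    d_omega = angular_frequency(wavelength_m) * speed_m_s / C
    d_lambda = wavelength_m * speed_m_s / C
    return d_omega, d_lambda


print(natural_width(1e-8, 5000e-10))    # about 1e8 rad/s and 1.3e-14 m
print(doppler_width(6000.0, 5000e-10))  # about 7.5e10 rad/s and 1e-11 m
```

The Doppler width is thus roughly three orders of magnitude larger than the natural width for these values.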
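[Fig. 6.3: energy-level diagrams in three panels: (a) levels E1, E2 and E3; (b) the (1s)² and 2s levels of He and the 2p, 3s, 3p and 5s levels of Ne; (c) levels labelled S0, Sn, T*, S0* and Sn*. The caption is not recoverable.]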
stimulate the emission of similar other photons and the chain reaction quickly
develops a beam of photons all moving parallel to the rod, which is
monochromatic (well-defined frequency) and is coherent (well-defined phase
and polarization). When the beam develops sufficient intensity, it emerges
through the partially silvered end. The ruby laser is a solid-state laser and operates
in pulses (several pulses per minute). The large amount of heat released in the
crystal is removed by cooling with liquid air.
Helium-neon laser: An example of a continuously operating laser is the
helium-neon laser (Fig. 6.4). In this laser (Javan, 1960), an electric discharge is
created by a dc current in a tube containing a mixture of helium and neon in the
ratio of 5 : 1. The discharge raises some of the helium atoms into the 2s level
[see Fig. 6.3 (b)] which is a metastable state (i.e., it has a long lifetime). The
energy of this level (20.61 eV) is almost the same as the energy of the 5s level
(20.66 eV) in neon. Hence, the energy of the helium atoms is easily transferred
to the neon atoms when they collide. This preferential transfer of the neon atoms
to the 5s state results in a population inversion between the 5s and the 3p states.
The spontaneous transitions from the 5s state to the 3p state, produce photons
of wavelength 6328 Å, which then trigger stimulated transitions. Photons
travelling parallel to the tube are reflected back and forth between the mirrors
placed at the ends, and rapidly build up into an intense beam which escapes
through the end with the lower reflectivity. The energy taken out by the laser
beam is continuously replaced by the dc supply, so that it is a continuously
operating laser. The usual efficiency of conversion of energy into the laser beam
energy is quite small, about $10^{-3}$%.
The hydrogen maser: Since the nucleus of the hydrogen atom has I = 1/2,
the ground state of the atoms splits into two levels with total angular momentum
quantum numbers F = 0 and F = 1, which have a small energy difference. Of
the two levels, the one with F = 0 has a slightly lower energy. The hydrogen
maser is based on stimulated transition from the F = 1 state to the F = 0 state.
[Fig. 6.4: the helium-neon laser, showing the window at the Brewster angle and the emerging beam. The caption is not recoverable.]
that the beam is essentially described by a single plane wave with a width equal
to the cross-section of the beam. This gives rise to a directionality constrained
only by the width. If a beam with a cross-sectional area (∆x)2 is travelling in the
z-direction, uncertainty relation implies
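$$\Delta p_x \approx \hbar/\Delta x \qquad (6.88)$$
so that the angular spread is
$$\alpha = \frac{\Delta p_x}{p_z} \approx \frac{\hbar\lambda}{h\,\Delta x} = \frac{\lambda}{2\pi\,\Delta x} \approx 10^{-4}\ \text{rad for } \lambda \approx 5000\ \text{Å},\ \Delta x \approx 10^{-3}\ \text{m} \qquad (6.89)$$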
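A one-line numerical check of Eq. (6.89) is sketched below; it uses $p_z = h/\lambda$, and the function name is illustrative.

```python
# Sketch: diffraction-limited angular spread alpha ~ lambda / (2*pi*Delta x).
import math


def angular_spread(wavelength_m, beam_width_m):
    """Angular spread in radians from Delta p_x / p_z with p_z = h/lambda."""
    return wavelength_m / (2.0 * math.pi * beam_width_m)


print(angular_spread(5000e-10, 1e-3))  # of the order of 1e-4 rad
```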
Nonlinear Optics
For ordinary light sources, the electric field is so small that the induced
polarization P is approximately proportional to the electric field E, and the
various properties of the medium such as polarizability α, refractive index, etc.
are independent of the field intensity (here, for simplicity the vector nature of
P and E is neglected). Thus, we have what is known as linear optics for which
the superposition principle holds i.e., P1 = α E1, P2 = α E2 implies P1 + P2 = α
(E1 + E2). However, with laser fields of high intensity, there is no longer a linear
relation between P and E, and the description is in terms of nonlinear optics.
with a frequency of 2ω. This is called frequency doubling. For example, when a
dielectric medium is irradiated with a powerful ruby laser beam with λ = 6943
Å, an ultraviolet component with λ ≈ 3472 Å is observed to emerge from the
medium. In general, higher harmonics with frequencies 3ω, 4ω, etc. also may
be present.
If two beams with different frequencies, ω1 and ω2, at least one of them
being a laser beam, are incident on the medium, the non-linear term will have
terms with frequencies 2ω1, 2ω2, ω1 + ω2 and ω1 – ω2. The emerging beam
therefore will contain components with these frequencies. Thus, the effect of a
low frequency beam (e.g., ω2 in the infra-red region) may be observed in the
optical region by choosing ω1 in the optical range.
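The appearance of the combination frequencies can be illustrated numerically. The sketch below assumes, for illustration only, that the non-linear part of the polarization is proportional to E²; it samples a two-frequency field, squares it, and lists the frequency bins with non-zero amplitude using a plain discrete Fourier transform. The integer frequencies 5 and 3 (in arbitrary units) are hypothetical choices.

```python
# Sketch: frequencies generated by a quadratic response P_nl ~ E^2 when
# E(t) = cos(w1 t) + cos(w2 t). Frequencies are integer bin numbers k1, k2.
import cmath
import math


def mixing_frequencies(k1, k2, n_samples=64, threshold=1e-6):
    """Return the bins (0 <= k < n_samples/2) present in the squared field."""
    field = [math.cos(2 * math.pi * k1 * j / n_samples)
             + math.cos(2 * math.pi * k2 * j / n_samples)
             for j in range(n_samples)]
    response = [e * e for e in field]  # assumed quadratic non-linearity
    present = []
    for k in range(n_samples // 2):
        amplitude = sum(p * cmath.exp(-2j * math.pi * k * j / n_samples)
                        for j, p in enumerate(response)) / n_samples
        if abs(amplitude) > threshold:
            present.append(k)
    return present


print(mixing_frequencies(5, 3))  # bins 0, w1-w2, 2*w2, w1+w2, 2*w1
```

Besides the components at 2ω1, 2ω2, ω1 + ω2 and ω1 – ω2, the squared field also contains a constant (zero-frequency) term.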
In many substances, the refractive index of the substance increases as the
intensity increases. Thus, the effective refractive index of the material is larger
near the centre of the propagating laser beam so that the rays bend towards the
beam axis. This is known as self-focussing and is again a consequence of non-
linear optics. This property is utilized in fibre-optics communication.
Holography
An extremely interesting application of lasers is to holography, i.e., the
production of the whole or complete, 3-dimensional picture of an object. It is
based on the reconstruction of the electromagnetic fields reflected by the object.
Preparation of the photographic plate: Consider the electromagnetic field
of a laser beam reflected by an object. For simplicity, the reflected beam is
assumed to be a plane wave. This wave with amplitude R is allowed to interfere
with a reference laser beam of amplitude A, at an angle θ, on a photographic
plate [Fig. 6.5 (a)]. The exposed plate registers the interference fringes and is
processed to give a hologram. The intensity registered is [Fig. 6.5 (a)]
$$I = \left|A\exp[i2\pi(x - ct)/\lambda] + R\exp[i2\pi(\mathbf{n}\cdot\mathbf{r} - ct)/\lambda]\right|^2_{x=0} = A^2 + R^2 + 2AR\cos[2\pi y(\sin\theta)/\lambda] \qquad (6.91)$$
where it is assumed that A and R are real. The separation between the interference
fringes is
$$d = \frac{\lambda}{\sin\theta} \qquad (6.92)$$
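The fringe spacing of Eq. (6.92), and the fact that a grating with this spacing diffracts the first-order maximum back into the original angle θ, can be checked with a short sketch. The wavelength (6328 Å) and the angle (30°) used below are hypothetical illustrative values, and the function names are not from the text.

```python
# Sketch: hologram fringe spacing d = lambda/sin(theta) and the first-order
# diffraction angle of a grating with that spacing, sin(theta') = lambda/d.
import math


def fringe_spacing(wavelength_m, theta_rad):
    """Separation between interference fringes, Eq. (6.92)."""
    return wavelength_m / math.sin(theta_rad)


def first_order_angle(wavelength_m, spacing_m):
    """Angle of the first diffraction maximum for a grating of given spacing."""
    return math.asin(wavelength_m / spacing_m)


theta = math.radians(30.0)             # assumed angle between the beams
d = fringe_spacing(6328e-10, theta)    # assumed He-Ne wavelength
print(d, math.degrees(first_order_angle(6328e-10, d)))
```

Calling `first_order_angle` with a longer reconstruction wavelength than the recording one gives a larger angle, in line with the magnification mentioned in the text for a longer-wavelength reference beam.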
Reconstruction of the wavefront: A similar reference beam is allowed to
fall on the hologram which acts as a diffraction grating (grating separation
d = λ/sin θ) and produces diffraction images. The angular separation between
the central maximum and the first maximum on the two sides, is [see Fig. 6.5(b)]
$$\sin\theta' = \frac{\lambda}{d} = \sin\theta \qquad (6.93)$$
Fig. 6.5 (a) Interference of the plane wave object beam and the reference
beam producing a hologram with fringe spacing d = λ/sin θ. (b) The hologram
producing three components, one of which is the same as the object beam.
or θ′ = θ. Therefore, of the two maxima, one has the same
directionality as the beam reflected from the object. The wavefront
corresponding to this maximum is the same as that of the reflected beam and
hence produces a three-dimensional image. Analytically, the modulated
amplitude is
$$B = I A\exp[i2\pi(x - ct)/\lambda]\,\big|_{x=0} = A(A^2 + R^2)\exp(-i\omega t) + A^2R\exp[-i(\omega t + 2\pi y(\sin\theta)/\lambda)] + A^2R\exp[-i(\omega t - 2\pi y(\sin\theta)/\lambda)] \qquad (6.94)$$
where I given in Eq. (6.91) has been used. The first component corresponds to
the central maximum, the second component corresponds to the first maximum
along the direction of the original beam (it is to be noted that this wave moves
downward as t increases), and the third component represents the other first
maximum (the wave moves upward as t increases).
If the object beam is a diverging beam (as is normally the case), it is easily
seen that the maximum intensity points 2 and 3 both move up with respect to 1
(see Fig. 6.7). As a consequence, the lower beam will also be diverging and will
appear to start from the object, whereas the upper beam will converge to a real
image of the object. This is seen by constructing wavefronts with circles of
radius b with centre at point 1, radius b ∓ λ with centre at point 2, and radius b ± λ
with centre at point 3. The upper signs correspond to the lower beam and the
lower signs correspond to the upper beam. It may also be noted that in viewing
a hologram, the reference beam may have a frequency different from that of the
beam used for recording the image. If the wavelength of the second reference
beam is longer, the image observed will be magnified.
Holograms are very useful in studying the conditions at different levels by
focussing the microscope at different planes of the reconstructed image, e.g., in
the investigation of sizes and distribution of particles, mechanical strains, etc.
Laser Cooling
In 1985, S. Chu and co-workers (among them Ashkin and J.E.
Bjorkholm) at Bell Laboratories, Holmdel, NJ, reported success in cooling a
dilute vapour of about $10^5$ neutral sodium atoms in a volume of 0.2 cm³ to a
temperature of about 0.2 mK.
Fig. 6.6 Schematic drawing of the vacuum chamber, intersecting laser beams and
atomic beam used for the Doppler cooling experiment. The laser beams enter the
UHV windows vertically and horizontally. (Labelled parts: mirrors, AR coated UHV
window, light baffles, sodium pellet, sample manipulator; scale 30 cm.)
Fig. 6.7 (a) Interference of the object beam and the reference laser beam,
on the photographic plate P, producing a hologram, (b) the hologram
producing three components, one of which has the same
wavefront as the object beam.
at the wall, while the second component with MI = 1/2 moves along the trajectory
shown. The second field B2 is homogeneous and introduces an energy difference
$$\Delta E = g_N\,\frac{e\hbar}{m_p}\,B_2 \qquad (6.101)$$
between the energy levels. In this region, there is also a radiation field of radio
frequency ω (ω ~ $10^8$ rad/s). If ω satisfies the resonance condition
$$\omega = g_N\,\frac{e}{m_p}\,B_2 \qquad (6.102)$$
some of the particles will undergo resonant transition to the MI = – 1/2 state. The
third field B3 also is inhomogeneous but has a gradient opposite to that of B1,
which will remove the particles with MI = – 1/2, at the wall, while those with
MI = 1/2 pass along the trajectory shown and register in the detector.
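Fig. 6.8 Schematic diagram of the atomic/molecular beam resonance experiment:
the beam passes through the inhomogeneous field B1 (gradient ∂B1/∂z), the
homogeneous field B2 with the radio-frequency field ω, and the inhomogeneous
field B3 (gradient ∂B3/∂z) before reaching the detector.
The dashed lines indicate the components removed.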
In the actual experiment, the frequency ω is held fixed and the field B2 is
varied. When the resonance condition in Eq. (6.102) is satisfied, some of the
particles undergo transition to the MI = – 1/2 state and are removed at the wall,
which reduces the recorded beam intensity. The value of the field B2 at which
the minimum beam intensity is recorded can be used to calculate the value of gN
and hence the magnetic moment of the particles. For example, the reduction in
intensity is observed for $^{31}$P at B = $10^4$ G and ω = 1.08 × $10^8$ rad/s, which gives
a value of $g_N$ = 1.13.
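The value of $g_N$ quoted for $^{31}$P follows directly from the resonance condition, Eq. (6.102). In the sketch below the electron charge and proton mass are standard values supplied here, and the function name is illustrative.

```python
# Sketch: nuclear g-factor from the resonance condition omega = g_N * e * B / m_p.
E_CHARGE = 1.602177e-19     # elementary charge, C (assumed value)
PROTON_MASS = 1.672622e-27  # proton mass, kg (assumed value)


def nuclear_g(omega_rad_s, B_tesla):
    """Solve Eq. (6.102) for g_N."""
    return omega_rad_s * PROTON_MASS / (E_CHARGE * B_tesla)


print(round(nuclear_g(1.08e8, 1.0), 2))  # B = 1e4 G = 1 T
```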
The applications of nuclear magnetic resonance best known to the general public are magnetic resonance imaging (MRI) for medical diagnosis and magnetic resonance microscopy in research settings. However, it is also widely used in chemical studies, notably in NMR spectroscopy such as proton NMR, carbon-13 NMR, deuterium NMR and phosphorus-31 NMR. Biochemical information can also be obtained from living tissue (e.g. human brain tumors) (see Fig. 6.9) with the technique known as in vivo magnetic resonance spectroscopy or chemical shift NMR microscopy.
Raman Effect
It was observed in 1928, by Raman and Krishnan, and simultaneously by
Landsberg and Mandelshtam, that the spectrum of light scattered by gases, liquids
and crystals, contains apart from the unshifted original frequency ω, new lines
whose frequencies are given by
ω′ = ω ± ω1 (6.103)
This is known as Raman effect, or more descriptively, as combination
scattering of light.
The process of scattering of radiation may be regarded as being made up of
absorption of the incoming photon and emission of the outgoing photon. If, as
a result, the final state of the atom or molecule is the same as the initial state, the
frequency of the photon is unchanged, giving the unshifted line. This process is
known as Rayleigh scattering. On the other hand, if the final state of the atom
or molecule is different, the process is an inelastic scattering of the photon and
the frequency ω′ of the final photon is given by the energy conservation relation
$$\hbar\omega' + E' = E + \hbar\omega \tag{6.104}$$
or
$$\omega' = \omega + \frac{E - E'}{\hbar} \tag{6.105}$$
The shifted frequency is less than the original frequency if E < E′ and the
corresponding lines are called the Stokes lines. It is more than the original
frequency if E > E′ and the associated lines are called the anti-Stokes lines. At
ordinary temperatures, there are more particles in the lower energy states, so
that there are more transitions with E1 → E2 than those with E2 → E1, E2 > E1.
Therefore, anti-Stokes lines are generally fainter (in some cases not even
observable) than the Stokes lines. The anti-Stokes lines increase in intensity as
the temperature is raised since this will increase the relative population of the
higher energy states.
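The temperature dependence described above can be illustrated with a rough sketch. It assumes that the anti-Stokes to Stokes intensity ratio is governed only by the Boltzmann population ratio of the two levels, exp(−ΔE/kT), and ignores degeneracies and the frequency dependence of the scattering. The level spacing of 0.2 eV is a hypothetical value chosen for illustration.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def anti_stokes_to_stokes(delta_e_ev, temperature_k):
    """Population ratio of the upper to the lower level (Boltzmann factor only)."""
    return math.exp(-delta_e_ev / (K_B_EV * temperature_k))

# Hypothetical level spacing of 0.2 eV at two temperatures
for temperature in (300.0, 600.0):
    ratio = anti_stokes_to_stokes(0.2, temperature)
    print(f"T = {temperature:5.0f} K  ratio = {ratio:.2e}")
```

The ratio is always below one and grows with temperature, in line with the statement that the anti-Stokes lines are fainter but strengthen as the temperature is raised.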
Since Raman effect is a two-step process, the selection rules can be deduced
from those for the two separate steps. In particular, the selection rules for the
transition between the rotational states of molecules, are ∆J = ± 1 for emission
or absorption of photons, and hence Raman effect is observed for transitions
with
∆J = ± 2, 0 (6.106)
For purely rotational transitions, only the ∆J = ± 2 transitions need be
considered (∆J = 0 does not involve changes in energy). The change in the
energy in the case of diatomic molecules, is given by
$$\Delta E = \pm\frac{\hbar^2}{2I}\,[J(J+1) - (J-2)(J-1)] = \pm\frac{\hbar^2}{I}\,(2J-1), \qquad J \ge 2 \tag{6.107}$$
where J refers to the higher state. For J = 2, $\Delta E = \pm 3\hbar^2/I$ (that is, $\Delta\omega = \pm 3\hbar/I$); for J = 3, $\Delta E = \pm 5\hbar^2/I$; for J = 4, $\Delta E = \pm 7\hbar^2/I$, etc. These lines are illustrated in Fig. 6.9. It is instructive to compare them with the equi-spaced rotational lines in absorption spectra [see Fig. 5.13(a)].
For transitions which involve changes in the vibrational states, ∆J can be
0 or 2. These involve larger changes in energy and hence anti-Stokes lines are
generally very faint. The frequency shift for a change in the vibrational state
but with ∆J = 0, corresponds to the missing central line in the vibrational-
rotational spectrum. The spacing of the ∆J = 2 lines about the ∆J = 0 is given by
Eq. (6.107). Raman spectra, involving changes in the vibrational states, provide
useful information about the structure of the molecules.
Raman spectra are characteristic of the molecules (and atoms) and are
extremely useful in the analysis of the complicated mixtures of molecules,
especially of organic molecules. They are also important in the determination
of the rotational and vibrational levels, and in the analysis of the structures of
the molecules.
Fig. 6.9 The Raman spectrum for transitions within the rotational levels, with lines at ν, ν ± 3ν1, ν ± 5ν1 and ν ± 7ν1; ν1 is the spacing of the rotational levels (see Fig. 5.13).
Measurement of Lifetimes
The lifetimes of excited atoms and molecules are of the order of $10^{-8}$ s, which makes them difficult to measure. A few techniques for measuring such short lifetimes are discussed here.
The lifetimes can be obtained by measuring the intensity of radiation from
a collection of excited atoms, as a function of time. A voltage pulse is used to
excite the atoms by electron bombardment. The pulse starts the multi-channel
analyser in which channel n is active during the time nδ to (n + 1)δ, δ being a
time interval short compared with the lifetime τ of the atoms. The channel
records the pulses produced by the photoelectrons generated by the radiation
emitted. The pulse intensity is proportional to the number of excited atoms, and
hence its time dependence allows us to calculate the lifetime [from Eq. (6.76)].
One of the difficulties is that the population in the decaying state may be
continuously replenished by the particles in a higher excited state decaying to
the lower excited state under consideration.
In another method for measuring the lifetimes of excited ions, called the beam-foil technique, fast-moving ions (accelerated by a potential difference) are excited by passing them through a thin foil. The intensity of the radiation emitted as a function of the distance travelled by these excited ions gives information about the number of excited ions as a function of time, and hence allows us to calculate the lifetime τ [from Eq. (6.76)].
An indirect method of calculating the natural lifetime is to measure the
linewidth of the level, and use the relation τ = 1/∆ω (essentially the uncertainty
relation) to deduce the lifetime of the state. In this method, the Doppler linewidth
and the collision linewidth (collisions affect the lifetime of a state), must be
taken into account in isolating the natural linewidth from the total observed
linewidth (∆ω used in the uncertainty relation is the natural line width).
6.8 EXAMPLES
The discussion in this chapter is now supplemented with some technical details
and examples.
Example 1
Here, the proof of the important theorem stated in Sec. 6.2 is outlined. To prove
the equality in Eq. (6.8), the z-axis is taken along the direction under
consideration. Then it has to be proved that
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = a \int \psi^*_{J,M_J}\, J_z\, \psi_{J,M_J'}\, d\tau \tag{6.108}$$
for [L, S] = 0, J = L + S.
$$(M_J' - M_J)\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = 0 \tag{6.110}$$
or
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = 0, \quad \text{for } M_J \neq M_J' \tag{6.111}$$
$$a M_J' \hbar \int \psi^*_{J,M_J}\, \psi_{J,M_J'}\, d\tau = 0, \quad M_J \neq M_J' \tag{6.112}$$
the equation is satisfied for $M_J \neq M_J'$.
To prove Eq. (6.108) for $M_J = M_J'$, it is observed that
$$\int \psi^*_{J,M_J+1}\, S_z\, \psi_{J,M_J+1}\, d\tau - \int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J}\, d\tau = a\hbar \tag{6.113}$$
where a is a constant independent of $M_J$. This result is plausible, in the sense that every increment of $M_J$ (or $J_z$) may be expected to cause an increase in the average value of $S_z$ which depends only on the increment of $M_J$ and not on $M_J$ itself, but it is quite difficult to prove (see Ref. 22, p. 236). This result then leads to
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J}\, d\tau = a M_J \hbar + b \tag{6.114}$$
It is then noted that if all the angular momenta J, L and S take opposite values, the average value of $S_z$ should also change its sign:
$$\int \psi^*_{J,-M_J}\, S_z\, \psi_{J,-M_J}\, d\tau = -\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J}\, d\tau \tag{6.115}$$
Substituting Eq. (6.114) in this relation gives b = 0. Since $M_J\hbar$ is the eigenvalue of $J_z$, the required relation is obtained as
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = a \int \psi^*_{J,M_J}\, J_z\, \psi_{J,M_J'}\, d\tau \tag{6.116}$$
in which both sides are zero for $M_J \neq M_J'$. This proves the equality in Eq. (6.8).
Example 2
The interaction of an atom with an external constant electric field E in the z-direction is obtained from Eq. (6.5):
$$H' = e \sum_i z_i\, |E| \tag{6.117}$$
This interaction has the interesting new feature that it becomes indefinitely large and negative as $z_i \to -\infty$, so that the electrons in an atom can tunnel through the potential barrier and ultimately escape to infinity ($z \to -\infty$). Thus, there are no longer any true bound states, and each level (including the ground state) acquires a linewidth because it has a finite lifetime ($\Delta\omega \sim 1/\tau$). Also, the first-order energy shift given by Eq. (3.125) is zero for nondegenerate states,
$$\int \psi_n^*\, z_i\, \psi_n\, d\tau = 0 \tag{6.118}$$
since $z_i$ is odd and $|\psi_n|^2$ is even (nondegenerate states are even or odd under parity). For degenerate states, which one encounters in the hydrogen atom, the problem is more complicated.
Consider the 2p and 2s states of the hydrogen atom. From Eq. (3.127), it is easy to show that the energies of the $m_l = \pm 1$ states are unperturbed. For the $m_l = 0$ states, the energy shifts are given by Eq. (3.128), with 1 standing for the l = 1, $m_l = 0$ state and 2 standing for the l = 0, $m_l = 0$ state. For this case, $V_{11} = V_{22} = 0$, so that x = ± 1 and the energy shifts are
$$\Delta E = \pm\, e\,|E| \int \psi_2^{(1)*}\, z\, \psi_2^{(2)}\, d\tau \tag{6.119}$$
Thus, for the hydrogen atom, in addition to acquiring linewidths, the spectral lines split into several components (in this discussion the spin-orbit interaction has been neglected). For example, the n = 2 → n = 1 line splits into three components. This is known as the Stark effect. As the strength of the electric field becomes large ($\gtrsim 10^7$ V/m), higher order corrections have to be included, and one has what is known as the quadratic Stark effect, to distinguish it from the first order effect which is called the linear Stark effect.
Example 3
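The Zeeman splitting for the hydrogen 2p → 1s transitions is given by Eqs. (6.22) and (6.23).
For $B = 10^4$ G (1 Wb/m²), $\Delta\omega_0 = 8.78 \times 10^{10}$ rad/s, so that
$$\begin{aligned}
\Delta\lambda &\approx (\pm 2/3,\ \pm 4/3)\,(0.0069)\ \text{Å} && \text{for } 2\,{}^2P_{1/2} \to 1\,{}^2S_{1/2} \\
&\approx (\pm 1/3,\ \pm 1,\ \pm 5/3)\,(0.0069)\ \text{Å} && \text{for } 2\,{}^2P_{3/2} \to 1\,{}^2S_{1/2}
\end{aligned} \tag{6.120}$$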
Example 4
Here, the lifetime of the 2p state of the hydrogen atom is calculated using Eq. (6.70).
Without any loss of generality, we assume that the atom is originally in the l = 1, $m_l = 0$ state. Using the wave functions given in Sec. 4.1,
$$|(\mathbf{r})_{1s,2p}| = \left|\int \frac{\exp(-r/a_1)}{(\pi a_1^3)^{1/2}}\; z\; \frac{r\cos\theta\,\exp(-r/2a_1)}{(32\pi a_1^5)^{1/2}}\; r^2\,dr\; 2\pi\, d\cos\theta \right| = \frac{2^{15/2}}{3^5}\,a_1 \approx 0.745\,a_1 \tag{6.121}$$
Substituting this in Eq. (6.70) and using the relation in Eq. (6.77), the lifetime τ of the 2p state ($\Delta E = \hbar\omega = 10.2$ eV) is
$$\tau \approx 1.6 \times 10^{-9}\ \text{s} \tag{6.122}$$
which is in agreement with the experimental observation.
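The numbers in this example can be reproduced with a short sketch. Eq. (6.70) is not reproduced in this section, so the sketch assumes it is equivalent to the standard electric-dipole expression for the spontaneous decay rate, A = 4αω³|z₁₂|²/(3c²), with the lifetime τ = 1/A. The variable names are illustrative.

```python
# Lifetime of the hydrogen 2p state from the dipole matrix element of Eq. (6.121).
# Assumption: decay rate A = 4 * alpha * omega^3 * |z12|^2 / (3 c^2), tau = 1 / A.
ALPHA = 1.0 / 137.035999      # fine-structure constant
C = 2.99792458e8              # speed of light, m/s
HBAR_EV = 6.582119569e-16     # reduced Planck constant, eV s
A_BOHR = 5.29177211e-11       # Bohr radius a1, m

matrix_element_ratio = 2**7.5 / 3**5       # |z12| / a1 = 2^(15/2) / 3^5
z12 = matrix_element_ratio * A_BOHR        # dipole matrix element, m
omega = 10.2 / HBAR_EV                     # 2p -> 1s transition, rad/s

decay_rate = 4.0 * ALPHA * omega**3 * z12**2 / (3.0 * C**2)
tau = 1.0 / decay_rate

print(f"|z12|/a1 = {matrix_element_ratio:.3f}")
print(f"tau = {tau:.2e} s")
```

Under this assumption the sketch gives a matrix element of about 0.745 a1 and a lifetime of about 1.6 × 10⁻⁹ s.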
Example 5
The ratio of spontaneous transitions to stimulated transitions for particles in thermal equilibrium can be obtained from Eq. (6.63). Denoting the probability for spontaneous transitions by $P_1$ and that for stimulated transitions by $P_2$, Eq. (6.63) reads
$$P_2 + P_1 = \exp(\hbar\omega_{nm}/kT)\, P_2 \tag{6.123}$$
or
$$P_1/P_2 = \exp(\hbar\omega_{nm}/kT) - 1 \tag{6.124}$$
For very low temperatures, the transitions are predominantly spontaneous, but they become predominantly stimulated at high temperatures. This is to be expected since the radiation density increases with temperature. For example, at room temperature, the transitions between the 2p and 1s states of the hydrogen atom ($\hbar\omega_{nm} \sim 10$ eV, $kT \sim 0.026$ eV) are predominantly spontaneous. However, in some of the hot stars the surface temperatures are as high as 30000 K, so that stimulated transitions are also important there.
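A quick evaluation of Eq. (6.124) for the two cases mentioned may help. The sketch below uses the round value of 10 eV quoted in the text; the function name is illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def spontaneous_to_stimulated(energy_ev, temperature_k):
    """P1/P2 = exp(hbar*omega/kT) - 1, as in Eq. (6.124)."""
    return math.expm1(energy_ev / (K_B_EV * temperature_k))

ratio_room = spontaneous_to_stimulated(10.0, 300.0)     # room temperature
ratio_star = spontaneous_to_stimulated(10.0, 30000.0)   # hot stellar surface

print(f"300 K:   P1/P2 = {ratio_room:.2e}")
print(f"30000 K: P1/P2 = {ratio_star:.1f}")
```

At 300 K the ratio is astronomically large, while at 30000 K it drops to a few tens, so stimulated transitions are no longer negligible.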
Example 6
Lasers provide an intense, collimated beam. To estimate the power, consider the original ruby laser, which had a diameter of 1 cm and a length of 5 cm. If the ruby has about $10^{19}$ Cr atoms/cc, and all of them are excited, the total energy available is
$$E = 10^{19} \times \frac{\pi}{4} \times 5 \times (h\nu) = 11.25\ \text{J} \tag{6.125}$$
If the pulse lasts for about $10^{-7}$ s, the power during this period is about $10^8$ W.
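The estimate can be reproduced numerically. The text does not state the photon energy used; the sketch below assumes the ruby laser wavelength of 694.3 nm, which is an assumption added here.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
WAVELENGTH = 694.3e-9     # assumed ruby laser wavelength, m

volume_cc = math.pi / 4.0 * 1.0**2 * 5.0      # rod: diameter 1 cm, length 5 cm
n_atoms = 1e19 * volume_cc                    # 10^19 Cr atoms per cc
photon_energy = H * C / WAVELENGTH            # J per photon

energy = n_atoms * photon_energy              # total stored energy, J
power = energy / 1e-7                         # pulse duration 10^-7 s

print(f"E = {energy:.2f} J, P = {power:.2e} W")
```

With this wavelength the stored energy comes out close to the 11.25 J of Eq. (6.125), and the pulse power is of the order of 10⁸ W.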
Example 7
If an excited state is replenished by decays from a higher excited state, the decay rate is not given by the simple exponential function in Eq. (6.76). Consider three states with energies $E_0 < E_1 < E_2$ and decay probabilities $A_{10}$, $A_{20}$ and $A_{21}$. Then the changes in $N_1(t)$ and $N_2(t)$ in time dt are
$$dN_1(t) = -A_{10} N_1(t)\,dt + A_{21} N_2(t)\,dt \tag{6.128}$$
$$dN_2(t) = -A_2 N_2(t)\,dt \tag{6.129}$$
where $A_2 = A_{20} + A_{21}$. The solutions to these equations are
$$\begin{aligned}
N_2(t) &= \exp(-A_2 t)\, N_2(0) \\
N_1(t) &= \exp(-A_{10} t)\, N_1(0) + \frac{A_{21}}{A_{10} - A_2}\,\bigl(\exp[-A_2 t] - \exp[-A_{10} t]\bigr)\, N_2(0)
\end{aligned} \tag{6.130}$$
It is observed that if $A_{21} N_2(0) > A_{10} N_1(0)$, $N_1(t)$ will increase for small t but will eventually start decreasing.
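As a check on Eq. (6.130), the closed-form solution can be compared with a direct numerical integration of Eqs. (6.128) and (6.129). The decay probabilities and initial populations below are hypothetical values in arbitrary units, chosen so that $A_{21} N_2(0) > A_{10} N_1(0)$.

```python
import math

def closed_form(t, n1_0, n2_0, a10, a20, a21):
    """Populations N1(t), N2(t) from Eq. (6.130)."""
    a2 = a20 + a21
    n2 = math.exp(-a2 * t) * n2_0
    n1 = (math.exp(-a10 * t) * n1_0
          + a21 / (a10 - a2) * (math.exp(-a2 * t) - math.exp(-a10 * t)) * n2_0)
    return n1, n2

def integrate(t_end, n1_0, n2_0, a10, a20, a21, steps=20000):
    """Midpoint (second-order Runge-Kutta) integration of Eqs. (6.128)-(6.129)."""
    a2 = a20 + a21
    dt = t_end / steps
    n1, n2 = n1_0, n2_0
    for _ in range(steps):
        m1 = n1 + 0.5 * dt * (-a10 * n1 + a21 * n2)   # half-step estimates
        m2 = n2 + 0.5 * dt * (-a2 * n2)
        n1 += dt * (-a10 * m1 + a21 * m2)
        n2 += dt * (-a2 * m2)
    return n1, n2

# Hypothetical parameters (arbitrary units): A10, A20, A21 and N1(0) = N2(0) = 1
params = dict(n1_0=1.0, n2_0=1.0, a10=1.0, a20=0.5, a21=2.0)
print(closed_form(1.0, **params))
print(integrate(1.0, **params))
print(closed_form(0.1, **params)[0])   # N1 shortly after t = 0
```

The two results should agree closely, and for these parameters $N_1$ at t = 0.1 exceeds its initial value, showing the initial rise described above.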
PROBLEMS
1. Obtain the energy levels of the $(1s)^2\ {}^1S_0$, $(1s)(2p)\ {}^1P_1$, and $(1s)(3d)\ {}^1D_2$ states of the helium atom in the presence of a magnetic field of strength 1 Wb/m². What are the shifts in Å for the allowed transitions, if $\lambda_{P \to S}$ = 584.4 Å and $\lambda_{D \to P}$ = 6678 Å for the unperturbed states under consideration?
2. Describe the Zeeman patterns of the ${}^2D_{3/2}$ and ${}^2P_{3/2}$ states. Calculate the frequency shifts for the transitions ${}^2D_{3/2} \to {}^2P_{3/2}$ with $\Delta M_J = 0$, $\Delta M_J = 1$, and $\Delta M_J = -1$. If the sodium line with λ = 8195 Å corresponds to a ${}^2D_{3/2} \to {}^2P_{3/2}$ transition, what is the maximum shift in its wavelength when a magnetic field of 1 Wb/m² is introduced?
3. Obtain the shifts in the frequency for ${}^2D_{3/2} \to {}^2P_{1/2}$ transitions in the presence of a weak magnetic field.
4. What is the magnetic moment of sulphur in the ground state ${}^3P_2$? In the presence of a magnetic field of 2 Wb/m², what is the resonance frequency?
5. Indicate the energy levels of ${}^2P$ and ${}^2D$ states and the allowed transitions between them in the presence of a strong magnetic field.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 209
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_7
Identical bosons: Since the particles are indistinguishable, there is only one way of grouping R particles into distinguishable sets of $r_1, r_2, \ldots, r_i, \ldots$ particles. Therefore, the total number of distinguishable arrangements is given by just the product of the number of ways in which $r_i$ particles are distributed among $g_i$ states.
For determining the number of ways in which $r_i$ particles are distributed among $g_i$ states, the states are regarded as being separated by partitions. Since no partition is needed at the ends, $g_i - 1$ partitions are needed. Then the particles and the partitions are arranged in a row, e.g.
××||×|×|××× (7.5)
where each × represents a particle, each vertical line represents a partition, and the arrangement shown represents 2, 0, 1, 1, 3 particles in five states (four partitions). The number of such distinguishable arrangements in the i-th cell is given by the number of different ways of arranging $(r_i + g_i - 1)$ objects, of which the $r_i$ particles and the $g_i - 1$ partitions form two groups of indistinguishable objects, and is
$$P_i(r_i) = \frac{(r_i + g_i - 1)!}{r_i!\,(g_i - 1)!} \tag{7.6}$$
Therefore, the total number of distinguishable arrangements for the distribution of $r_1, r_2, \ldots, r_i, \ldots$ sets of bosons in $g_1, g_2, \ldots, g_i, \ldots$ states is
$$P(r_i) = \prod_{i=1}^{\infty} \frac{(r_i + g_i - 1)!}{r_i!\,(g_i - 1)!} \tag{7.7}$$
Identical fermions: Here again, there is only one way of grouping S particles into distinguishable sets of $s_1, s_2, \ldots, s_i, \ldots$ particles. For obtaining the number of ways of distributing $s_i$ particles in $h_i$ states, it is noted that each state can be occupied by at most one particle, so that the states may be arranged in a row of $h_i$ objects indicating the occupation of each state, e.g.
00××00 (7.8)
where 0 indicates that the level is unoccupied, × indicates that the level is occupied by one particle, and the particular arrangement represents 0, 0, 1, 1, 0, 0 particles in the six energy levels. Therefore, the number of such distinguishable arrangements in the i-th cell is given by the number of different ways of arranging $h_i$ objects, of which $s_i$ and $h_i - s_i$ form two groups of indistinguishable objects:
$$P_i(s_i) = \frac{h_i!}{s_i!\,(h_i - s_i)!} \tag{7.9}$$
Hence the total number of distinguishable arrangements for the distribution of $s_1, s_2, \ldots, s_i, \ldots$ sets of fermions in $h_1, h_2, \ldots, h_i, \ldots$ states is
$$P(s_i) = \prod_{i=1}^{\infty} \frac{h_i!}{s_i!\,(h_i - s_i)!} \tag{7.10}$$
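The two counting formulas, Eqs. (7.6) and (7.9), can be checked against a brute-force enumeration for the small examples shown in (7.5) and (7.8). The helper names below are illustrative.

```python
from itertools import product
from math import comb

def boson_arrangements(r, g):
    """Eq. (7.6): ways of placing r identical bosons in g states."""
    return comb(r + g - 1, r)

def fermion_arrangements(s, h):
    """Eq. (7.9): ways of placing s identical fermions in h states."""
    return comb(h, s)

def brute_force(n_particles, n_states, max_occupancy):
    """Count occupation-number lists with the given total and per-state limit."""
    return sum(
        1
        for occupation in product(range(max_occupancy + 1), repeat=n_states)
        if sum(occupation) == n_particles
    )

# Example (7.5): 7 bosons in 5 states; example (7.8): 2 fermions in 6 states
print(boson_arrangements(7, 5), brute_force(7, 5, max_occupancy=7))
print(fermion_arrangements(2, 6), brute_force(2, 6, max_occupancy=1))
```

The brute-force count simply lists all occupation numbers per state, with at most one particle per state for fermions, so it does not rely on the partition argument used in the derivation.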
Quantum Statistics 213
$$P(q_i, r_i, s_i) = (Q!)\prod_{i=1}^{\infty} \frac{f_i^{\,q_i}}{q_i!}\;\frac{(r_i + g_i - 1)!}{r_i!\,(g_i - 1)!}\;\frac{h_i!}{s_i!\,(h_i - s_i)!} \tag{7.11}$$
$$\sum_i q_i = Q, \qquad \sum_i r_i = R, \qquad \sum_i s_i = S \tag{7.12}$$
The most probable distribution corresponds to the maximum of P (qi, ri, si),
subject to the conditions (7.12) and (7.13).
In practice, it is more convenient to maximize ln P (qi, ri, si). The calculations
are greatly simplified by using the following approximation (Stirling’s formula):
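$$\ln n! = \ln 2 + \ln 3 + \cdots + \ln n = \int_1^{n+1/2} \ln x\, dx + O(1) \approx n \ln n - n \tag{7.14}$$
where for large n only the two leading terms have been retained. Then keeping only the leading terms gives
$$\begin{aligned}
\ln P = {} & Q \ln Q - Q + \sum_i \bigl(q_i \ln f_i - q_i \ln q_i + q_i\bigr) \\
& + \sum_i \bigl[(r_i + g_i)\ln(r_i + g_i) - (r_i + g_i) - r_i \ln r_i + r_i - g_i \ln g_i + g_i\bigr] \\
& + \sum_i \bigl[h_i \ln h_i - h_i - s_i \ln s_i + s_i - (h_i - s_i)\ln(h_i - s_i) + (h_i - s_i)\bigr]
\end{aligned} \tag{7.15}$$
For the most probable distribution, the variation of ln P must vanish:
$$\sum_{i=1}^{\infty} \delta q_i\,[\ln f_i - \ln q_i] + \sum_{i=1}^{\infty} \delta r_i\,[\ln(r_i + g_i) - \ln r_i] + \sum_{i=1}^{\infty} \delta s_i\,[\ln(h_i - s_i) - \ln s_i] = 0 \tag{7.16}$$
subject to the constraints
$$\sum_{i=1}^{\infty} \delta q_i = 0, \qquad \sum_{i=1}^{\infty} \delta r_i = 0, \qquad \sum_{i=1}^{\infty} \delta s_i = 0 \tag{7.17}$$
$$\sum_{i=1}^{\infty} \varepsilon_i\,(\delta q_i + \delta r_i + \delta s_i) = 0 \tag{7.18}$$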
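The quality of the Stirling approximation in Eq. (7.14) can be gauged numerically. The sketch below compares n ln n − n with the exact ln n!, obtained from the log-gamma function; the function names are illustrative.

```python
import math

def stirling(n):
    """Leading terms of Stirling's formula, Eq. (7.14): n ln n - n."""
    return n * math.log(n) - n

def relative_error(n):
    """Relative difference between ln n! and n ln n - n."""
    exact = math.lgamma(n + 1)   # ln n!
    return (exact - stirling(n)) / exact

for n in (10, 100, 1000, 10**6):
    print(f"n = {n:>7}  relative error = {relative_error(n):.2e}")
```

The relative error falls steadily with n, which is why the approximation is adequate for the macroscopic occupation numbers considered here.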
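Using the relations in Eq. (7.17) to eliminate $\delta q_1$, $\delta r_1$ and $\delta s_1$ in Eqs. (7.16) and (7.18) gives
$$\sum_{i=2}^{\infty} \delta q_i \ln\frac{f_i\, q_1}{f_1\, q_i} + \sum_{i=2}^{\infty} \delta r_i \ln\frac{(r_i + g_i)\, r_1}{(r_1 + g_1)\, r_i} + \sum_{i=2}^{\infty} \delta s_i \ln\frac{(h_i - s_i)\, s_1}{(h_1 - s_1)\, s_i} = 0 \tag{7.19}$$
$$\sum_{i=2}^{\infty} (\varepsilon_i - \varepsilon_1)\,(\delta q_i + \delta r_i + \delta s_i) = 0 \tag{7.20}$$
Eliminating $\delta q_2$ between these two relations, and treating the remaining variations as independent, gives
$$\begin{aligned}
\ln\frac{f_i\, q_1}{f_1\, q_i} - \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1}\,\ln\frac{f_2\, q_1}{f_1\, q_2} &= 0, \qquad i = 3, 4, \ldots \\
\ln\frac{(r_i + g_i)\, r_1}{(r_1 + g_1)\, r_i} - \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1}\,\ln\frac{f_2\, q_1}{f_1\, q_2} &= 0, \qquad i = 2, 3, \ldots \\
\ln\frac{(h_i - s_i)\, s_1}{(h_1 - s_1)\, s_i} - \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1}\,\ln\frac{f_2\, q_1}{f_1\, q_2} &= 0, \qquad i = 2, 3, \ldots
\end{aligned} \tag{7.22}$$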
These relations allow us to solve for the equilibrium distributions
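$$q_i = f_i\,\frac{q_1}{f_1}\,\exp[-\beta(\varepsilon_i - \varepsilon_1)] \tag{7.23}$$
$$r_i = \frac{g_i}{\dfrac{r_1 + g_1}{r_1}\,\exp[\beta(\varepsilon_i - \varepsilon_1)] - 1} \tag{7.24}$$
$$s_i = \frac{h_i}{\dfrac{h_1 - s_1}{s_1}\,\exp[\beta(\varepsilon_i - \varepsilon_1)] + 1} \tag{7.25}$$
where
$$\beta = \frac{1}{\varepsilon_2 - \varepsilon_1}\,\ln\frac{f_2\, q_1}{f_1\, q_2} \tag{7.26}$$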
These equations are identities for $q_1$, $q_2$, $r_1$ and $s_1$, so that they are valid for all i. It is often the case that the number of bosons is not restricted. In this case, $\delta r_1$ is also an independent variable. Following the same steps as before gives, instead of Eq. (7.24),
$$r_i = \frac{g_i}{\exp(\beta\varepsilon_i) - 1}, \qquad \sum_i r_i = \text{unrestricted} \tag{7.27}$$
In order to determine $f_i$, $g_i$ and $h_i$ for the translational levels, it is assumed that the system is in a cubic box of length l (the results are valid for other shapes as well, e.g. a rectangular shape), for which the energy levels are [see Eq. (3.173)]
$$E = \frac{\hbar^2\pi^2}{2ml^2}\,(n_x^2 + n_y^2 + n_z^2), \qquad n_x = 1, 2, \text{etc.} \tag{7.28}$$
Since every set of positive, nonzero integers $(n_x, n_y, n_z)$ is associated with a state, the number of states in the absence of internal degrees of freedom is approximately equal to the volume in the first octant of the n-space. Therefore, the expression for $f_i$ is
$$f_i = \frac{V}{4\pi^2\hbar^3}\,(2m)^{3/2}\,\varepsilon_i^{1/2}\,\Delta\varepsilon_i \tag{7.29}$$
and similar expressions hold for $g_i$ and $h_i$. From this, the total number Q of the distinguishable particles and their total energy can be obtained as
$$Q = \sum_i q_i = V\left(\frac{m}{2\pi\hbar^2\beta}\right)^{3/2}\frac{q_1}{f_1}\,\exp(\beta\varepsilon_1) \tag{7.30}$$
$$E = \sum_i q_i\,\varepsilon_i = V\left(\frac{m}{2\pi\hbar^2\beta}\right)^{3/2}\frac{q_1}{f_1}\,\exp(\beta\varepsilon_1)\,\frac{3}{2\beta} \tag{7.31}$$
observed that the bosons have a tendency to bunch together at low energies
[see Fig. 7.1 (a)]. Also, the number of bosons increases as T increases, at
all energies.
4. For the Fermi-Dirac distribution, $s_i/h_i$ is the probability for a state to be occupied and is seen to be less than one for all $\varepsilon_i$, as is required by the Pauli exclusion principle. In general, $\alpha_3$ is negative and it is convenient to write
$$s_i = \frac{h_i}{\exp[(\varepsilon_i - \varepsilon_f)/kT] + 1} \tag{7.38}$$
For T → 0, $s_i/h_i = 1$ for $\varepsilon_i < \varepsilon_f$ and $s_i/h_i = 0$ for $\varepsilon_i > \varepsilon_f$. This means that fermions occupy the lowest energy states available, subject to the exclusion principle. For finite but small T, $s_i/h_i \approx 1$ for $(\varepsilon_i - \varepsilon_f)/kT \ll -1$, and $s_i/h_i \approx 0$ for $(\varepsilon_i - \varepsilon_f)/kT \gg 1$. The quantity $\varepsilon_f$ is called the Fermi energy (which depends on T), and it plays an important role in the behaviour of fermions. The distribution is illustrated in Fig. 7.1 (b).
5. In principle, every system of particles which interact weakly with each
other, is described by either the Bose-Einstein or the Fermi-Dirac
distribution. However, if the particles are localized (at the lattice points
for example) and their wave functions do not overlap, they can be taken
as being distinguishable (distinguished by the region of localization). In
such cases, Maxwell-Boltzmann distribution can be applied to describe
the system.
In what follows, some important physical properties of different systems are deduced using the statistical distributions given in Sec. 7.2.
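The occupation probabilities discussed in the remarks above can be evaluated directly. The sketch below uses Eq. (7.38) for fermions and Eq. (7.27) with β = 1/kT for bosons whose number is unrestricted, with the parameter values quoted for Fig. 7.1 ($\varepsilon_f$ = 3.7 eV, T = 1000 K and 2000 K). The function names are illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def fermi_dirac(eps, eps_f, temperature):
    """Occupation s_i/h_i from Eq. (7.38); energies in eV, temperature in K."""
    if temperature == 0.0:
        # T -> 0 limit: states below eps_f are filled, states above are empty
        return 1.0 if eps < eps_f else 0.0
    return 1.0 / (math.exp((eps - eps_f) / (K_B_EV * temperature)) + 1.0)

def bose_einstein(eps, temperature):
    """Occupation r_i/g_i from Eq. (7.27) with beta = 1/kT (eps > 0)."""
    return 1.0 / math.expm1(eps / (K_B_EV * temperature))

for eps in (1.0, 3.5, 3.7, 3.9):
    print(f"eps = {eps} eV  FD(2000 K) = {fermi_dirac(eps, 3.7, 2000.0):.3f}")
for temperature in (1000.0, 2000.0):
    print(f"T = {temperature} K  BE(0.1 eV) = {bose_einstein(0.1, temperature):.3f}")
```

The fermion occupation passes through 1/2 at $\varepsilon_i = \varepsilon_f$ and never exceeds one, while the boson occupation grows without limit as the energy approaches zero and increases with temperature at every energy.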
Fig. 7.1 (a) Particle index $r_i/g_i$ for bosons with $\alpha_2 = 0$, as a function of ε (eV); curve 1: T = 2000 K, curve 2: T = 1000 K. (b) Particle index $s_i/h_i$ for fermions with $\varepsilon_f$ = 3.7 eV, as a function of ε (eV); curve 1: T = 0 K, curve 2: T = 2000 K.
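$$E = \frac{\sum_i \exp(-E_i/kT)\, f_i\, E_i}{\sum_i \exp(-E_i/kT)\, f_i} \tag{7.40}$$
It was noted in Eq. (7.32) that $E_{tr}$ is $\frac{3}{2}kT$. The average vibrational energy is
$$E_{vib} = \frac{\sum_{n=0}^{\infty} \exp[-(n + \tfrac{1}{2})\hbar\omega/kT]\,(n + \tfrac{1}{2})\hbar\omega}{\sum_{n=0}^{\infty} \exp[-(n + \tfrac{1}{2})\hbar\omega/kT]} \tag{7.42}$$
This expression can be evaluated by using Eq. (2.11) and gives
$$E_{vib} = \frac{1}{2}\hbar\omega + \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1} \tag{7.43}$$
where the first term is called the zero-point energy. The average rotational energy is
$$E_{rot} = \frac{\sum_{J} \exp(-aJ(J+1)/kT)\; aJ(J+1)\,(2J+1)}{\sum_{J} \exp(-aJ(J+1)/kT)\,(2J+1)} \tag{7.44}$$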
$$F \approx 1 + 5\exp(-6a/kT) + \int_6^{\infty} \exp(-2ax/kT)\, dx \tag{7.46}$$
where F is the denominator in Eq. (7.44). The lower limit corresponds to l = 3/2. Carrying out the integration, we obtain for para-hydrogen,
$$\begin{aligned}
E &= \frac{3}{2}kT + \frac{1}{2}\hbar\omega + \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1} - \frac{1}{F}\,\frac{\partial F}{\partial(1/kT)} \\
F &\approx 1 + 5\exp(-6a/kT) + \frac{kT}{2a}\exp(-12a/kT)
\end{aligned} \tag{7.47}$$
The values of ħω and a are obtained from the spectrum of the hydrogen molecule, and are
$$\hbar\omega \approx 0.5454\ \text{eV}, \qquad a \approx 0.00755\ \text{eV} \tag{7.48}$$
The specific heat of para-hydrogen obtained from Eq. (7.47),
$$C_v = N_{Avo}\,\frac{\partial E}{\partial T} \tag{7.49}$$
is plotted in Fig. 7.2 and is in very good agreement with the experimental observations. It may be observed that the contribution to $C_v$ from the rotational energy becomes appreciable at $T \gtrsim 75$ K, which corresponds to $kT \gtrsim a$, while the contribution from the vibrational energy becomes appreciable for $kT \gtrsim \hbar\omega$. Ordinary hydrogen is a mixture of ortho- and para-hydrogen, there being about 25% para-hydrogen at room temperature (the nuclear spin is I = 1/2 for the hydrogen atom). The specific heat of the mixture is a statistical average of the specific heats of the components. Its behaviour is similar to that given in Fig. 7.2, except that the hump around T ≈ 150 K is now absent.
Fig. 7.2 The specific heat $C_v/R$ of para-hydrogen (solid line) and ordinary hydrogen (dashed line) at constant volume, as a function of absolute temperature.
$$\bar{\varepsilon} = \frac{\sum_{n=0}^{\infty} n h\nu\; e^{-nh\nu/kT}}{\sum_{n=0}^{\infty} e^{-nh\nu/kT}} = \frac{h\nu}{e^{h\nu/kT} - 1} \tag{7.55}$$
which is the same expression encountered in Eq. (2.11) for Planck's oscillator. The specific heat of the system, including oscillations in all three directions, is
$$C_v = 3N\,\frac{\partial\bar{\varepsilon}}{\partial T} = 3R\left(\frac{h\nu}{kT}\right)^2 \frac{e^{h\nu/kT}}{(e^{h\nu/kT} - 1)^2} \tag{7.56}$$
For large T, this expression reduces to the classical value of 3R, but at low temperatures it decreases rapidly and goes to zero as $T^{-2}\exp(-h\nu/kT)$ for T → 0. Overall, the expression describes the qualitative behaviour of the specific heat quite well. However, experiments show that $C_v$ goes to zero more gently, as $T^3$ near 0 K, and not as an exponential function. Still, the result clearly indicates that quantum oscillations govern the low temperature behaviour of the specific heat of solids.
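The two limits of Eq. (7.56) are easy to verify numerically. In the sketch below the temperature is measured in units of a characteristic temperature hν/k, written `theta_e`; this symbol is introduced here for convenience and does not appear in the text.

```python
import math

def einstein_cv_over_3r(temperature, theta_e):
    """C_v / 3R from Eq. (7.56), with x = h*nu/kT = theta_e / T.

    Uses x^2 e^x / (e^x - 1)^2 = x^2 e^(-x) / (1 - e^(-x))^2 to avoid overflow.
    """
    x = theta_e / temperature
    return x * x * math.exp(-x) / (-math.expm1(-x)) ** 2

for t_ratio in (0.05, 0.2, 1.0, 5.0, 100.0):
    value = einstein_cv_over_3r(t_ratio, 1.0)
    print(f"T/theta_E = {t_ratio:6.2f}  C_v/3R = {value:.4e}")
```

At high temperatures the ratio approaches one (the classical value 3R), and at low temperatures it falls off exponentially, much faster than the observed $T^3$ behaviour.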
An improved description of the specific heat of solids was given by Debye (1912), who observed that the motion of neighbouring atoms is correlated, and that the allowed frequencies of oscillation correspond to those of the allowed standing elastic waves in the medium. The number of allowed modes for the standing waves was calculated in Sec. 2.1 [see Eq. (2.8)], and is given by
$$dN_t(\nu) = \frac{8\pi V \nu^2\, d\nu}{v_t^3} \tag{7.57}$$
for the transverse modes (which correspond to oscillations of the atoms perpendicular to the direction of propagation of the waves; there are two independent directions of transverse oscillation), where $v_t$ is the velocity of propagation for the transverse modes, and by
$$dN_l(\nu) = \frac{4\pi V \nu^2\, d\nu}{v_l^3} \tag{7.58}$$
for the longitudinal mode (which corresponds to oscillations of the atoms parallel to the direction of propagation of the waves), where $v_l$ is the velocity of propagation for the longitudinal modes, V being the volume. However, since the medium of propagation consists of discrete atoms, Debye assumed that the total number of frequency modes is equal to the total number of degrees of freedom, i.e. $3N_0$, $N_0$ being the number of atoms. This imposes an upper limit $\nu_m$ on the allowed frequencies,
$$3N_0 = 4\pi V\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\int_0^{\nu_m} \nu^2\, d\nu = \frac{4\pi V}{3}\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\nu_m^3 \tag{7.59}$$
Since each mode is associated with an average energy given by Eq. (7.55), the total thermal energy is
$$E = 4\pi V\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\int_0^{\nu_m} \frac{h\nu}{e^{h\nu/kT} - 1}\;\nu^2\, d\nu \tag{7.60}$$
to get
$$E = \frac{3}{5}\pi^4 N_0 kT\,(T/\theta)^3, \qquad \frac{\theta}{T} \gg 1 \tag{7.65}$$
and
$$C_v = \frac{12}{5}\pi^4 R\,(T/\theta)^3, \qquad \frac{\theta}{T} \gg 1 \tag{7.66}$$
The model predicts that the specific heat at low temperatures is proportional to $T^3$, in agreement with the experimental observation.
The behaviour of $C_v$ at other temperatures has to be evaluated numerically from the expression
$$C_v = 9R\left(\frac{T}{\theta}\right)^3 \int_0^{\theta/T} \frac{x^4 e^x\, dx}{(e^x - 1)^2} \tag{7.67}$$
obtained from Eq. (7.61), and gives a universal curve as a function of T/θ (Fig. 7.3). The general agreement between theory and experiment is quite good, θ being about 100 K for lead, 160 K for sodium, 220 K for silver, 340 K for copper, 400 K for aluminium, 640 K for silicon, and about 1860 K for carbon (diamond). Some of the observed differences at intermediate temperatures can be explained by taking a more realistic spectrum for the allowed frequencies.
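A minimal numerical evaluation of Eq. (7.67) is sketched below, using a simple midpoint rule for the integral. The function name and the number of integration steps are illustrative choices; any standard quadrature routine would serve equally well.

```python
import math

def debye_cv_over_r(t_over_theta, steps=2000):
    """C_v / R from Eq. (7.67), evaluated with the midpoint rule."""
    upper = 1.0 / t_over_theta           # upper limit theta / T
    width = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * width
        # x^4 e^x / (e^x - 1)^2 rewritten as x^4 e^(-x) / (1 - e^(-x))^2
        total += x**4 * math.exp(-x) / (-math.expm1(-x)) ** 2
    return 9.0 * t_over_theta**3 * total * width

low_t_limit = 12.0 * math.pi**4 / 5.0 * 0.02**3   # Eq. (7.66) at T/theta = 0.02

print(f"T/theta = 10:   C_v/R = {debye_cv_over_r(10.0):.4f}")
print(f"T/theta = 0.02: C_v/R = {debye_cv_over_r(0.02):.4e}  (T^3 law: {low_t_limit:.4e})")
```

At high temperatures the result approaches the classical value 3R, and at low temperatures it follows the $T^3$ law of Eq. (7.66).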
Photon Gas
In Sec. 2.1, Planck’s theory of blackbody radiation in terms of the allowed
standing waves and the associated harmonic oscillators was discussed. A more
modern and satisfactory description is in terms of the energy distribution of the
photons regarded as massless bosons.
Since the number of photons is unrestricted, their distribution is given by Eq. (7.27),
$$r_i = \frac{g_i}{\exp(\beta\varepsilon_i) - 1} \tag{7.68}$$
where $\varepsilon_i = h\nu$. The number of energy levels is the same as the number of allowed standing waves given by Eq. (2.8), except that the standing waves in Eq. (2.5) are to be interpreted as the energy eigenstates of photons with energy eigenvalue hν. Therefore, $g_i$ is
$$g_i = \frac{8\pi V}{c^3}\,\nu^2\, d\nu \tag{7.69}$$
V being the volume. The energy per unit volume is
$$U(\nu)\, d\nu = \frac{8\pi h}{c^3}\,\frac{\nu^3\, d\nu}{e^{h\nu/kT} - 1} \tag{7.70}$$
which agrees with Planck's expression in Eq. (2.12).
Fig. 7.3 The Debye specific heat $C_v$ as a function of T/θ, θ being the Debye temperature.
Phonon Gas
As in the case of electromagnetic waves and photons, the elastic waves in a solid have a quantum manifestation. The energy of these waves is in the form of quanta called phonons, each of which carries a quantum of energy hν, where ν is one of the allowed frequencies. These phonons are bosons, they interact with the atoms, they are absorbed and emitted, and their total energy is the thermal energy of the solid.
The number of phonons is unrestricted, so that their frequency distribution is given by Eq. (7.27),
$$r_i = \frac{g_i}{e^{\beta h\nu} - 1} \tag{7.71}$$
Phonons are transverse or longitudinal, and the number of energy levels is given by Eqs. (7.57) and (7.58), with the upper limit $\nu_m$ for the frequency given by Eq. (7.59). Therefore, the total energy of the phonon gas is
$$E = 4\pi V\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\int_0^{\nu_m} \frac{h\nu}{e^{h\nu/kT} - 1}\;\nu^2\, d\nu \tag{7.72}$$
which is the same as the relation in Eq. (7.60).
Bose-Einstein Condensation
A gas with a given number of bosons whose mass is nonzero, shows remarkable
quantum mechanical properties at low temperatures. In particular, it undergoes
a phase transition, known as Bose-Einstein condensation which is of interest
for two reasons. Firstly, it is an example which allows an exact mathematical
treatment. Secondly, the observed changes in the properties of 4He at T = 2.17 K
can be explained in terms of Bose-Einstein condensation.
$$N = \frac{1}{\exp(\alpha_2/kT) - 1} + \frac{2\pi V (2m)^{3/2}}{h^3}\int_\delta^{\infty} \frac{\varepsilon^{1/2}\, d\varepsilon}{\exp[(\varepsilon + \alpha_2)/kT] - 1} \tag{7.79}$$
where δ is a small positive quantity.
For $T > T_c$ the first term is small, e.g. at high temperatures one has the Maxwell-Boltzmann distribution, for which [using Eq. (7.76) without the unit term in the denominator]
$$\exp(\alpha_2/kT) = \left(\frac{2\pi m kT}{h^2}\right)^{3/2}\frac{V}{N} \gg 1$$
For $T < T_c$, $\alpha_2$ is small but nonzero, and the nonsingular integral in Eq. (7.79) can be evaluated at $\alpha_2 = 0$. Using the variable x = ε/kT, Eqs. (7.79) and (7.78) give
$$N = N_0 + N\,(T/T_c)^{3/2}, \qquad T \le T_c \tag{7.80}$$
where $N_0$ is the number of particles in the ground state. The fraction of particles in the ground state is
$$\frac{N_0}{N} = 1 - (T/T_c)^{3/2}, \qquad T < T_c \tag{7.81}$$
and is shown in Fig. 7.4 (a). For $T < T_c$, a significant fraction of the particles is in the ground state, and this occupation of the zero energy and zero momentum ground state is called Bose-Einstein condensation. The temperature $T_c$ below which the condensation takes place is called the condensation temperature.
The particles in the ground state have zero energy and momentum, and hence do not contribute to the viscosity of the fluid. (Viscosity arises from the interaction between particles; viscous flow is accompanied by the excitation of vortices whose quantum is called a roton. The roton has a finite energy and hence cannot easily be excited at low temperatures.) These particles, being in the ground state, do not contribute to the total energy, which is therefore obtained from the second term in Eq. (7.79) with $\alpha_2 = 0$, as
$$E = \frac{2\pi V (2m)^{3/2}}{h^3}\int_\delta^{\infty} \frac{\varepsilon^{3/2}\, d\varepsilon}{e^{\varepsilon/kT} - 1} \tag{7.82}$$
$$E = 0.77\, N k\, \frac{T^{5/2}}{T_c^{3/2}}, \qquad T < T_c \tag{7.83}$$
Fig. 7.4 (a) Fraction of particles in the ground state, $N_0/N$, as a function of $T/T_c$; (b) specific heat $C_V/R$ as a function of temperature. The solid line is for Bose-Einstein condensation and the dashed line is the experimental curve with $T_c$ = 2.17 K.
at 2.17 K. Above 2.17 K, it behaves like a normal liquid and is known as helium I.
Below this temperature, it acquires some unusual properties, e.g. it flows through
capillaries without any apparent viscosity. This form is known as helium II and
many of its properties can be described by regarding it as a mixture of two
fluids, one a normal fluid and the other a superfluid which has no viscosity.
This mixture is similar to a Bose-Einstein gas with some condensation, the
superfluid corresponding to the particles in the ground state. This would explain
the zero viscosity. The identification of the two phenomena is further
strengthened by the observation that the specific heat of 4He also shows a singular
behaviour at 2.17 K. The observed specific heat has the shape of λ [see Fig. 7.4
(b)] and hence the transition is called a λ-transition while the transition
temperature is called the λ-point. It should be noted however that careful
experiments indicate that the specific heat has a logarithmic infinity at the
λ-point Tλ. This however may be due to the fact that the particles considered in
Bose-Einstein condensation were noninteracting, which is certainly not the case
for the atoms of liquid helium. Finally, using V = 27.6 cm3/mole for liquid
helium in Eq. (7.78), one obtains Tc = 3.13 K compared with Tλ = 2.17 K. These
observations strongly suggest that the λ-transition is a form of Bose-Einstein
condensation.
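The estimate of $T_c$ quoted above can be reproduced with a short sketch. Eq. (7.78) is not reproduced in this section, so the sketch assumes it is the standard ideal Bose gas result, $T_c = (h^2/2\pi m k)\,(N/2.612\,V)^{2/3}$. The condensate fraction follows Eq. (7.81). The function names are illustrative.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
N_A = 6.02214076e23       # Avogadro number, 1/mol
AMU = 1.66053907e-27      # atomic mass unit, kg
ZETA_3_2 = 2.612          # Riemann zeta(3/2)

def condensation_temperature(mass, number_density):
    """Assumed form of Eq. (7.78): ideal Bose gas condensation temperature, K."""
    return H**2 / (2.0 * math.pi * mass * K_B) * (number_density / ZETA_3_2) ** (2.0 / 3.0)

def condensate_fraction(temperature, t_c):
    """N0/N from Eq. (7.81); zero above the condensation temperature."""
    return max(0.0, 1.0 - (temperature / t_c) ** 1.5)

# Liquid 4He: molar volume 27.6 cm^3/mole, atomic mass about 4.0026 u
tc_helium = condensation_temperature(4.0026 * AMU, N_A / 27.6e-6)
print(f"T_c = {tc_helium:.2f} K")
print(f"N0/N at T = T_c/2: {condensate_fraction(0.5 * tc_helium, tc_helium):.3f}")
```

Under this assumption the sketch gives a condensation temperature close to the 3.13 K quoted in the text.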
Liquid 3He: Helium has an isotope 3He which is a fermion (it has 2 protons,
1 neutron and 2 electrons) and which liquefies at 3.2 K. It is found that 3He,
though a fermion, undergoes a transition to the superfluid state at $2.6 \times 10^{-3}$ K.
This arises from the fact that two 3He atoms interact with each other and produce
a weakly-bound system at low temperatures. This bound system is a boson
which can undergo a transition to the superfluid state.
Hydrogen: The atoms of about half of the elements are bosons, i.e. they
obey Bose statistics. Even then, Bose-Einstein condensation is not a common
phenomenon. The reason for this is that the condensation takes its simplest
form only for an ideal gas in which the atoms do not interact with each other. In
real atoms, the electromagnetic interaction tends to bind them and most
substances go into the solid state long before the critical temperature for Bose-
Einstein condensation is reached. Therefore, condensation is expected in only
those systems where the interaction between the atoms is weak compared to the
zero-point energy of the atoms, e.g. in helium. An interesting possibility that is
being currently considered is the Bose-Einstein condensation of atomic hydrogen
[see Silvera and Walraven, Sc. Am. 246, 1, 56 (1982)]. It is true that under
ordinary conditions, the interaction between the hydrogen atoms is quite strong
and binds them into molecules in which the spins of the two electrons are
antiparallel. However, if the atoms with parallel electron spins are isolated, for
example, by using strong inhomogeneous magnetic fields, Pauli’s exclusion
Bose-Einstein Condensation
In the gas phase, the Bose-Einstein condensate (BEC) remained an unverified
theoretical prediction for many years. In 1995 the research groups of Eric Cornell
and Carl Wieman of JILA, at the University of Colorado at Boulder, produced
the first such condensate experimentally.
Ordinary condensation happens when gas molecules come together and form
a liquid. It happens because of a loss of energy. The molecules of a gas move
rapidly; when they lose energy, they slow down and begin to collect, eventually
into a drop. For example, when water is boiled, the vapour cools on the lid of
the pot and becomes a liquid again. One then has a condensate.
If a sufficiently dense gas of cold atoms can be produced without
condensation into liquid state, the matter wavelengths of the particles will be of
the same order of magnitude as the distance between them. It is at that point
that the different waves of matter can ‘sense’ one another and co-ordinate their
Fig. 7.7 Repeated release from the trap of parts of a Bose-Einstein condensate of
sodium atoms. Pulses of coherent matter fall in the gravitational field—the
phenomenon can be seen as an atom laser effect. The real size of the
picture is 2.5 mm × 5 mm.
(Source: http://www.nobelprize.org/nobel_prizes/physics/laureates/2001/public.html)
Fig. 7.8 Density of levels as a function of ε, (a) for T = 0, (b) for kT = 0.1 εf(0). [Axes: dN/dε versus ε, with εf and an interval kT marked.]
$$ N = \frac{4\pi V (2m)^{3/2}}{h^3} \int_0^{\infty} \frac{\varepsilon^{1/2}\, d\varepsilon}{\exp[(\varepsilon - \varepsilon_f)/kT] + 1} \tag{7.88} $$
At T = 0,
$$ N = \frac{4\pi V (2m)^{3/2}}{h^3} \int_0^{\varepsilon_f} \varepsilon^{1/2}\, d\varepsilon \tag{7.89} $$
$$ \varepsilon_f(T) \approx \varepsilon_f(0)\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{\varepsilon_f(0)}\right)^2\right] \tag{7.91} $$
For metals, N/V ≈ 5 × 10²² cm⁻³ for which Eq. (7.89) implies εf = 4.5 eV.
The actual value of εf (0) for some of the metals is 4.7 eV for Li, 2.1 eV for K,
7.0 eV for Cu, and 5.5 eV for Au. This means that the approximation in
Eq. (7.91) is adequate for most purposes (kT ≈ 0.026 eV at T = 300 K). For
kT << εf most of the electrons are in the lowest energy states allowed by Pauli’s
exclusion principle, and the electron gas is said to be degenerate (completely
degenerate at T = 0). It is interesting to note that because of the exclusion
principle, the average energy of the electron gas is quite substantial even at T = 0:
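$$ \bar{\varepsilon}(0) = \frac{\int_0^{\varepsilon_f} \varepsilon\,\varepsilon^{1/2}\, d\varepsilon}{\int_0^{\varepsilon_f} \varepsilon^{1/2}\, d\varepsilon} = \frac{3}{5}\,\varepsilon_f \tag{7.92} $$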
which is of the order of a few eV (compare with kT ≈ 0.026 eV at room
temperature).
Specific heat of metals: An interesting property of the specific heat of metals
is that it is described quite well by the Debye theory. Since the Debye theory
includes only the phonon contributions, i.e. lattice vibrations, this implies that
the contribution from the free electrons to the specific heat of metals is small.
This is explained by the fact that, unlike the phonons, the free electrons satisfy
Fermi-Dirac statistics. When the temperature T is increased, only a few electrons
in the range |ε – εf | ≈ kT are excited to the higher energy states (see Fig. 7.8). It
is only these electrons that contribute to the specific heat, as a result of which
the contribution of the electron gas to the specific heat is quite small. Roughly
speaking, it is seen from Fig. 7.8 that the number of electrons which are excited
is kT (dN/dε) evaluated at ε = εf, and their energy increases by an amount of about 2kT. Therefore,
$$ C_v^{el} \approx 3R\,\frac{kT}{\varepsilon_f} $$
Here, we have used Eq. (7.87) for dN/dε and Eq. (7.89). A more detailed
calculation gives
$$ C_v^{el} = \frac{\pi^2}{2}\, R\, \frac{kT}{\varepsilon_f} \tag{7.95} $$
Since kT/εf is quite small at ordinary temperatures, the electronic specific
heat also is small and the total specific heat is described quite well by the Debye
theory. It should, however, be noted that the Debye specific heat at low
temperatures is proportional to R(T/θ)³ [see Eq. (7.66)] so that at sufficiently
low temperatures the electronic specific heat becomes dominant. At low
temperatures, the total specific heat is given by
$$ C_v = \frac{12}{5}\,\pi^4 R\,(T/\theta)^3 + \frac{\pi^2}{2}\, R\, \frac{kT}{\varepsilon_f} \tag{7.96} $$
and the observed nonzero limit of Cv/T as T → 0, for metals such as copper,
indicates the presence of the linear electronic contribution. Experimentally, in
the case of copper, Cv/T for T → 0 is about 0.7 × 10⁻³ J/mol/K² whereas the value
predicted for copper (εf ≈ 7 eV), by Eq. (7.96), is about 0.5 × 10⁻³ J/mol/K². The
difference is a measure of the deviation of the model from the real situation.
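The size of the linear term in Eq. (7.96) can be checked numerically. The following is a minimal sketch, assuming εf = 7.0 eV for copper and a Debye temperature of 343 K (the value quoted in the problems of this chapter); the function names are illustrative only.

```python
import math

R = 8.314462618          # gas constant, J/(mol K)
K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def gamma_electronic(fermi_energy_ev):
    """Coefficient of the linear term of Eq. (7.96): C_el = gamma * T."""
    return (math.pi ** 2 / 2) * R * K_B_EV / fermi_energy_ev

def crossover_temperature(fermi_energy_ev, debye_temperature):
    """Temperature at which the T^3 lattice term equals the linear term."""
    lattice_coeff = (12 / 5) * math.pi ** 4 * R / debye_temperature ** 3
    return math.sqrt(gamma_electronic(fermi_energy_ev) / lattice_coeff)

# Assumed inputs for copper: eps_f = 7.0 eV, theta = 343 K.
gamma_cu = gamma_electronic(7.0)
t_cross = crossover_temperature(7.0, 343.0)
print(f"gamma = {gamma_cu:.2e} J/mol/K^2, crossover near {t_cross:.1f} K")
```

Under these assumptions the electronic term overtakes the lattice term only at a few kelvin, which is why the Debye theory describes the specific heat well at ordinary temperatures.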
Electrical and thermal conductivities: Some general characteristics of the
electrical and thermal conductivities of metals can be discussed in terms of the
free-electron theory of metals. This discussion will be based on the assumptions
that (i) the conducting electrons move with the velocity vf = (2εf/m)^{1/2}, which is
reasonable since most of the conducting electrons will be in states close to the
Fermi level, (ii) the electrons have a mean free path of λ and that they carry
information over a distance of λ (λ ≈ 500 Å).
In the presence of an external electric field E, the electrons acquire an average
drift velocity v which is equal to half of the average acceleration eE/m multiplied
by the interval λ/vf between two collisions. Therefore, the current density is
(1/2) en (eλE/mvf), where n is the electron density. This satisfies Ohm’s law since
vf, being large, is essentially independent of E. The electrical conductivity is
then
$$ \sigma = \frac{e^2 n \lambda}{2 m v_f} \tag{7.97} $$
For calculating thermal conductivity, it is noted that since the electrons
carry information over a distance of λ, the energy carried across an area by the
electrons is ε ± (1/2)λ ∂ε/∂x in opposite directions. Therefore, if (1/3)n electrons
are assumed to have a velocity perpendicular to the area, the net energy
transferred across a unit area, per unit time, is
$$ \frac{dQ}{dt} = -\frac{1}{6}\, n v_f \lambda\, \frac{\partial \varepsilon}{\partial x} \tag{7.98} $$
(where the negative sign indicates that the energy is transferred in a direction
opposite to the gradient). From this relation, the thermal conductivity is obtained
by writing ∂ε/∂x as (∂ε/∂T) (∂T/∂x) which leads to the coefficient of thermal
conductivity K,
$$ K = \frac{1}{6}\, n v_f \lambda\, \frac{\partial \varepsilon}{\partial T} \tag{7.99} $$
Since ∂ε/∂T is the specific heat per electron, using Eq. (7.95) gives
$$ K = \frac{\pi^2 n k^2 \lambda T}{6 m v_f} \tag{7.100} $$
It follows from Eqs. (7.97) and (7.100) that
$$ \frac{K}{\sigma T} = L = \frac{\pi^2}{3}\left(\frac{k}{e}\right)^2 \tag{7.101} $$
which is the same for all metals. This relation is known as the Wiedemann-Franz
law. The constant L, known as the Lorenz number, has a value of 2.45 × 10⁻⁸ J Ω/(s K²),
while the experimental values of K/σT for some of the metals at 0°C are 2.31 ×
10⁻⁸ for Ag, 2.47 × 10⁻⁸ for Pb and 2.19 × 10⁻⁸ for Na.
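The Lorenz number of Eq. (7.101) depends only on k and e, so it is easily evaluated. The sketch below, with illustrative names, computes it from CODATA constants and compares it with the experimental ratios quoted in the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C

def lorenz_number():
    """L = (pi^2 / 3) (k / e)^2 from Eq. (7.101), in W ohm / K^2."""
    return (math.pi ** 2 / 3) * (K_B / E_CHARGE) ** 2

# Experimental K/(sigma T) values at 0 deg C quoted in the text.
measured = {"Ag": 2.31e-8, "Pb": 2.47e-8, "Na": 2.19e-8}

L = lorenz_number()
deviation_percent = {metal: 100 * (value - L) / L for metal, value in measured.items()}
print(f"L = {L:.3e}")
for metal, dev in deviation_percent.items():
    print(f"{metal}: {dev:+.1f} % relative to the free-electron value")
```

The quoted metals lie within roughly ten per cent of the free-electron value.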
While it is obvious that the free electrons are responsible for transporting
charge, it is suggested by the validity of the Wiedemann-Franz law that the free
electrons play a dominant role in the transfer of energy as well, in preference to
the phonons. It is also noted that the thermal conductivity of metals is in general
greater than that of insulators, sometimes by as much as two orders of magnitude.
It is therefore reasonable to say that most of the thermal conductivity in metals is
due to the free electron gas.
Thermionic emission: When a metal is heated, electrons are emitted from
the surface. Thermionic emission can be studied by subjecting the electrons to a
small potential difference and analysing the thermionic emission current as a
function of temperature.
The electrons in a metal may be regarded as particles in a potential well
with barrier at the boundary. The barrier arises from the fact that when an electron
tries to escape from the surface, its image in the surface, being of opposite
charge, pulls it back. In order to escape, the electrons must then have a minimum
energy φ above the Fermi level. This energy φ is known as the work function,
and usually has a value of the order of a few eV, e.g. 2.3 eV for Na, about 4.5 eV
for Cu, etc.
An electron that is emitted must satisfy the condition (the metal surface is
taken to be perpendicular to the z-direction)
$$ \frac{p_z^2}{2m} \ge \varepsilon_f + \phi \tag{7.102} $$
Since the current at a point is v ρ, ρ being the charge density, the amount of
charge emitted by a unit area, per unit time is
$$ j = e \int v_z\, dN \tag{7.103} $$
Here dN is the number of electrons per unit volume, with momentum
between p and p + dp. It is equal to si/V, where si is given in Eq. (7.85) and
hi in Eq. (7.86). Using the relation p² = 2mε, 2π (2m)^{3/2} ε^{1/2} dε is replaced in hi by
4π p² dp or dpx dpy dpz. It is then integrated over only positive pz to give
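$$ j = \frac{8e}{h^3} \int_0^{\infty} dp_x \int_0^{\infty} dp_y \int_{[2m(\varepsilon_f + \phi)]^{1/2}}^{\infty} dp_z\, \frac{p_z}{m}\, \frac{1}{\exp[(\varepsilon - \varepsilon_f)/kT] + 1} \tag{7.104} $$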
Since φ is generally of the order of a few eV, the 1 in the denominator can
be ignored to get
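$$ j = \frac{8e}{h^3} \int_0^{\infty} dp_x\, e^{-p_x^2/2mkT} \int_0^{\infty} dp_y\, e^{-p_y^2/2mkT} \int_{[2m(\varepsilon_f + \phi)]^{1/2}}^{\infty} dp_z\, \frac{p_z}{m} \exp\left[-\left(\frac{p_z^2}{2m} - \varepsilon_f\right)\Big/ kT\right] = \frac{4\pi}{h^3}\, m e k^2 T^2 e^{-\phi/kT} \tag{7.105} $$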
This is known as the Richardson-Dushman equation and is generally written
as
$$ j = A T^2 \exp(-\phi/kT) \tag{7.106} $$
where A has the value 1.2 × 10⁶ A/m²/K². It is in good agreement with the
experiments provided (i) the constant A is modified to take into account the
possibility that the electron may be reflected when it comes across a change in
the potential near the surface, (ii) φ varies with temperature, with the crystal
direction and with surface impurities. The experimental values of A are usually,
though not always, smaller than the one predicted by Eq. (7.105), e.g. 0.4 × 10⁶
for Cr, 0.30 × 10⁶ for Ni, etc. in MKS units.
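The constant A in the Richardson-Dushman equation follows from Eq. (7.105) as 4πmek²/h³. A minimal sketch of the evaluation is given below; the work function of 4.5 eV used in the example is the approximate value quoted for Cu, and the temperatures are chosen only for illustration.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C

# Richardson constant A = 4 pi m e k^2 / h^3, in A/m^2/K^2.
A_RICHARDSON = 4 * math.pi * M_E * E_CHARGE * K_B ** 2 / H ** 3

def emission_current_density(temperature, work_function_ev):
    """j = A T^2 exp(-phi / kT) of Eq. (7.106), in A/m^2."""
    exponent = -work_function_ev * E_CHARGE / (K_B * temperature)
    return A_RICHARDSON * temperature ** 2 * math.exp(exponent)

j_1000 = emission_current_density(1000.0, 4.5)
j_1500 = emission_current_density(1500.0, 4.5)
print(f"A = {A_RICHARDSON:.3e} A/m^2/K^2")
print(f"j(1000 K) = {j_1000:.3e} A/m^2, j(1500 K) = {j_1500:.3e} A/m^2")
```

The exponential factor dominates: raising the temperature from 1000 K to 1500 K increases the current density by many orders of magnitude.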
7.6 SUPERCONDUCTIVITY
Superconductivity is an interesting phenomenon in which electrons, which are
fermions, behave like bosons. The reason for this is that under some special
conditions, pairs of electrons form weakly-bound states which exhibit properties
of Bose systems.
When the temperature of some metals, semiconductors and alloys is lowered
to a few degrees kelvin, the electrical resistance of the material suddenly drops
to zero [see Fig. 7.9 (a)]. The substance is then said to have become a
superconductor and the temperature Tc at which the transition takes place is
known as the critical or transition temperature, e.g. Tc = 0.015 K for tungsten,
3.72 K for tin, 9.3 K for niobium, and the highest known value 23.2 K for the
Nb₃Ge alloy. The transition to a superconducting state is quite sharp for a pure
and physically perfect specimen. In some cases it has been observed to occur
within a temperature range of 10⁻⁵ K. However, for impure or physically imperfect
specimens, the transition may be over a range as large as 0.1 K or more.
Superconductivity has not been observed in all substances. In particular, it
has not been detected in alkali metals, ferromagnetic substances, and relatively
good conductors of electricity such as Ag, Cu, Au. Matthias has pointed out
that superconductivity occurs only in substances which have an average of two
to eight valence electrons per atom. Also, a small atomic volume is favourable
for superconductivity. It is worth noting that an alloy may be a superconductor
even if it is composed of two metals which themselves are not superconductors,
e.g. Bi-Pd.
Some of the important properties of superconductors are the following:
1. The current in the superconductors persists for a very long time. This is
demonstrated by placing a loop of the superconductor in a magnetic
field, lowering its temperature below Tc and then removing the field.
The current which is set up is found to persist over a period longer than
two years without any attenuation.
2. The magnetic field does not penetrate into the body of the superconductor
(permeability µ = 0). This property, known as the Meissner effect, is the
fundamental characterization of superconductivity. However, when the
magnetic field B is greater than a critical value Bc (T) [see Fig. 7.9 (b)],
the superconductor becomes a normal conductor [Bc (T) is zero at T = Tc
and has the largest value at T = 0].
[Fig. 7.9: (a) resistivity (ohm-m) versus T (K), dropping to zero at Tc; (b) critical field Bc (tesla) versus T (K), separating the normal and superconducting regions; (c) specific heat C (joule/mole/K) versus T (K) for the normal and superconducting states.]
Magnetic Properties
The magnetic properties of superconductors are quite complicated. In the class
of superconductors known as type I superconductors (which includes most of
the elemental superconductors), the magnetic field is excluded from the body
of the superconductors for B < Bc (T) showing perfect Meissner effect. However,
the Meissner effect disappears for B > Bc (T).
For type II superconductors, an example of which is lead-indium alloy,
perfect Meissner effect occurs for B < B1 (T), but only a partial exclusion of the
field for B1 (T) < B < B2 (T), and a complete penetration of the field for B > B2
(T). The reason for this behaviour is that for B between B1 (T) and B2 (T), the
material is in a mixed state. A close investigation of the specimen shows the
presence of small circular regions in the normal state, called vortices or fluxoids.
They are surrounded by large regions which are in the superconducting state. It
is the presence of both the states which gives rise to partial penetration of the
field. Materials with high critical temperatures tend to fall in the class of the
type II superconductors.
Since usually B2 (T) >> Bc (T), carefully-prepared type II superconductors
are used for the manufacture of high-field magnets which require almost no
power input and little cooling. Technology based on superconductors would
receive a major boost if superconductivity could be produced at higher
temperatures, say at liquid nitrogen temperature (Tb = 77.4 K). This possibility
has been considered recently.
There are two additional properties which are of interest, quantization of
flux enclosed by a superconductor and Josephson junctions, which are discussed
briefly.
Quantization of Flux
Consider a superconducting loop in which a current is circulating. The current
generates a magnetic field whose flux across the area enclosed by the loop is
quantized.
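$$ \left[\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2 + V\right]\psi = E\psi \tag{7.110} $$
with q being the charge of the particles, and is given by
$$ \psi(\mathbf{r}) = \exp\left[\frac{iq}{\hbar}\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{A}\cdot d\mathbf{l}\right]\phi(\mathbf{r}) \tag{7.111} $$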
where the integral involved is a line integral and φ(r) satisfies Eq. (7.110) in the
absence of the field, i.e. for A = 0. Now, the wave function must have the same
phase even after going around the entire loop, i.e.
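$$ \frac{q}{\hbar}\oint \mathbf{A}\cdot d\mathbf{l} = 2n\pi, \quad n = 0, \pm 1, \pm 2, \ldots \tag{7.112} $$
where the integration is along the entire loop. Using Stokes theorem
$$ \frac{q}{\hbar}\oint \mathbf{A}\cdot d\mathbf{l} = \frac{q}{\hbar}\int (\nabla\times\mathbf{A})\cdot d\mathbf{S} $$
$$ \phi = \int \mathbf{B}\cdot d\mathbf{S} = \frac{h}{q}\, n, \quad n = 0, \pm 1, \pm 2, \ldots \tag{7.114} $$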
Thus the flux takes only quantum values of integral multiples of h/q. The
value of h/e = 4 × 10⁻¹⁵ Wb is quite small but is macroscopically detectable. The
quantized flux was observed experimentally by Deaver and Fairbank and
independently by Doll and Näbauer (1961). The observed flux was found to be
integral multiples of h/q with q = –2e. This is an additional confirmation of the
BCS theory according to which it is the Cooper pairs, with charge –2e each,
that are the carriers of current in superconductors.
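The size of the flux quantum in Eq. (7.114) for a single electron charge and for a Cooper pair can be compared directly. The sketch below uses illustrative function names, and the flux value in the last line is hypothetical.

```python
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def flux_quantum(charge):
    """Flux quantum h/|q| of Eq. (7.114), in Wb."""
    return H / abs(charge)

def number_of_quanta(flux, charge):
    """Flux expressed in units of h/|q|."""
    return flux / flux_quantum(charge)

single = flux_quantum(E_CHARGE)     # q = e
pair = flux_quantum(-2 * E_CHARGE)  # q = -2e, a Cooper pair
print(f"h/e  = {single:.4e} Wb")
print(f"h/2e = {pair:.4e} Wb")
# Hypothetical enclosed flux of 1e-14 Wb, in units of h/2e:
print(f"{number_of_quanta(1e-14, -2 * E_CHARGE):.2f} quanta")
```

The observed spacing of h/2e, rather than h/e, is what identifies the carriers as pairs.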
Josephson Junctions
The discovery of Josephson junctions has made the direct macroscopic
measurement of the ratio ħ/e possible.
Fig. 7.10 The Josephson junction and the associated wave functions ψ1 and ψ2.
$$ i\hbar\,\frac{\partial b_1(t)}{\partial t} = v\,(\cos\omega t)\, b_1(t) \tag{7.121} $$
$$ i\hbar\,\frac{\partial b_2(t)}{\partial t} = 0 \tag{7.122} $$
These equations are fairly easy to solve. However, it is more instructive to
solve them perturbatively. Assuming that v is small, we replace the b1 (t) on the
right hand side by b1 (0) and integrate the two sides to get
$$ b_1(t) \approx b_1(0) - \frac{i}{\hbar\omega}\, v\,(\sin\omega t)\, b_1(0) \tag{7.123} $$
$$ b_2(t) = b_2(0) \tag{7.124} $$
Now the quantum mechanical generalization of current is
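$$ j = \frac{q}{m}\,\mathrm{Re}\left(\psi^{*} p\, \psi\right) = \frac{q}{m}\,\mathrm{Re}\left(-i\hbar\, \psi^{*}\nabla\psi\right) \tag{7.125} $$
Substituting Eq. (7.119) for ψ with b1 (t) and b2 (t) given by Eqs. (7.123)
and (7.124), the current across the junction is
$$ j \approx j_0\left[\sin\left(\frac{qVt}{\hbar} + \delta_0\right) - \frac{v}{\hbar\omega}\,(\sin\omega t)\cos\left(\frac{qVt}{\hbar} + \delta_0\right)\right] \tag{7.126} $$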
where j0 and δ0 are constants. It is interesting to note that the ac current persists
even in the absence of external radiation. On taking the time average, the
contribution of the second term is nonzero if
$$ \frac{|qV|}{\hbar} = \omega \tag{7.127} $$
If higher order perturbations are included, there are nonzero contributions
to the current, for higher harmonics as well,
$$ \frac{|qV|}{\hbar} = n\omega, \quad n = 1, 2, \ldots \tag{7.128} $$
in conformity with the result in Eq. (7.116) for q = –2e. For n = 1, ν = ω/2π = 4.836 ×
10¹¹ V s⁻¹ where V is in millivolts. Since V is usually of the order of several
millivolts, the Josephson frequency is in the microwave range.
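The numerical factor relating frequency and voltage follows from Eq. (7.128) with q = –2e. A minimal sketch, with an illustrative function name, is:

```python
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def josephson_frequency(voltage, harmonic=1):
    """nu = omega / 2 pi from |qV| / hbar = n omega with |q| = 2e, in Hz."""
    return 2 * E_CHARGE * abs(voltage) / (harmonic * H)

nu_1mv = josephson_frequency(1e-3)  # junction voltage of 1 mV
print(f"nu = {nu_1mv:.4e} Hz for V = 1 mV")
```

Since the relation involves only e and h, a frequency measurement at a known voltage fixes the ratio of the two constants.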
One of the most important applications of the Josephson effect is the
determination of the fundamental constant e/ħ which occurs in Eq. (7.128) with
7.7 EXAMPLES
In this section, some examples which illustrate and extend the main ideas of
quantum statistics are discussed.
Example 1
Consider the statistical distributions of two identical particles among three sets
of states g1= 1, g2 = 2, g3 = 1 with energies 0, ε, 2ε, respectively (both the g2
states have energy ε). The populations of these sets are (n1, n2, n3). The most
probable distribution with total energy 2ε has to be found.
Distinguishable particles: The allowed population distributions are:
1. (1, 0, 1) has two distinguishable arrangements (A, 0, B), (B, 0, A)
2. (0, 2, 0) with four possible distinguishable arrangements (AB, 0), (0, AB),
(A, B) and (B, A) in the g2 set, where A and B represent the two
distinguishable particles.
Thus, the second distribution is twice as probable as the first distribution.
Bosons: For bosons, the (1, 0, 1) distribution has only one distinguishable
arrangement while (0, 2, 0) has three distinguishable arrangements (AA, 0),
(A, A), (0, AA) in the g2 set. Therefore, the (0, 2, 0) is three times as probable as
the (1, 0, 1) distribution.
Fermions: For fermions, of the two distributions (1, 0, 1) and (0, 2, 0), each
has only one possible distinguishable arrangement (it should be recalled that
the Pauli principle forbids more than one particle in each state). Thus, the two
distributions are equally probable.
The number of distributions can be verified by Eqs. (7.4), (7.7) and (7.10).
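The counting in this example can also be checked by brute-force enumeration. The sketch below labels the four single-particle states 0 to 3 (one state of energy 0, two of energy ε, one of energy 2ε); this labelling and the function name are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement, product

STATE_ENERGY = [0, 1, 1, 2]  # energy of each state, in units of epsilon
STATE_LEVEL = [0, 1, 1, 2]   # index of the set (g1, g2, g3) each state belongs to

def count_arrangements(assignments, total_energy=2):
    """Count arrangements with the given total energy, grouped by (n1, n2, n3)."""
    counts = Counter()
    for states in assignments:
        if sum(STATE_ENERGY[s] for s in states) != total_energy:
            continue
        population = [0, 0, 0]
        for s in states:
            population[STATE_LEVEL[s]] += 1
        counts[tuple(population)] += 1
    return dict(counts)

states = range(4)
# Distinguishable particles: ordered pairs (state of A, state of B).
distinguishable = count_arrangements(product(states, repeat=2))
# Bosons: unordered pairs, double occupancy allowed.
bosons = count_arrangements(combinations_with_replacement(states, 2))
# Fermions: unordered pairs, no double occupancy.
fermions = count_arrangements(combinations(states, 2))

print("distinguishable:", distinguishable)
print("bosons:", bosons)
print("fermions:", fermions)
```

The three results reproduce the ratios 2:4, 1:3 and 1:1 obtained above for the (1, 0, 1) and (0, 2, 0) distributions.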
Example 2
The extension of the classical distributions for the case of bound and ionized
atoms in equilibrium is of special interest in astrophysics and plasma physics.
The equilibrium distribution is obtained by using arguments similar to those
used in Sec. 6.4 for obtaining the Einstein coefficients A and B.
The transitions in this case are
$$ M_0 \rightleftharpoons M^{+} + e \tag{7.129} $$
where M0 is the neutral atom and M+ is its ion. In contrast to the discussion in
Sec. 6.4, here the final states form a continuum. Let N0 be the number of M0
atoms and N+ be the number of M+ ions. The number of ionization transitions to
a set of states fi, is obtained from Eq. (6.60) as
$$ N_{mn} = B_{mn}\, u(\omega)\, N_0 f_i, \qquad f_i = \frac{(2m)^{3/2} V}{4\pi^2\hbar^3}\, \varepsilon^{1/2}\, d\varepsilon \tag{7.130} $$
where fi is given in Eq. (7.29) and ħω = ε + EI, EI being the ionization energy
and ε is the energy of the electron. The number of reverse reactions, i.e.
recombinations, is given by the first equation in Eq. (6.60) except that now the
expression is also proportional to the number dNe of electrons in fi states
$$ N_{nm} = \left(B_{nm}\, u(\omega) + A_{nm}\right) N_{+}\, dN_e \tag{7.131} $$
Equating Nmn and Nnm, gives for u (ω)
$$ u(\omega) = \frac{A_{nm}}{B_{mn} f_i\, \dfrac{N_0}{N_{+}\, dN_e} - B_{nm}} \tag{7.132} $$
In analogy with Eq. (6.66) Bmn is taken to be equal to Bnm (this can be justified
by more rigorous arguments). Comparing u (ω) with the expression in Eq. (6.65)
gives
$$ \frac{N_{+}\, dN_e}{N_0} = \exp[-(\varepsilon + E_I)/kT]\, f_i \tag{7.133} $$
Substituting for fi and integrating over dNe and ε, finally gives
$$ \frac{n_{+} n_e}{n_0} = \left(\frac{2\pi m k T}{h^2}\right)^{3/2} \exp(-E_I/kT) \tag{7.134} $$
where n0 = N0/V, etc. This equation is known as the Saha equation (1920).
In deriving this relation, the degeneracy of states has been ignored. The
degeneracy can be incorporated by multiplying the right-hand side by g+ ge/g0
where g is the degeneracy of the appropriate state, in particular, ge = 2
corresponding to the two spin states of the electron. As an illustration it is noted
that if M0 is the hydrogen atom in the ground state and M+ is the proton, then
g+ = 2 and g0 = 4 so that the degeneracy factor is 1. The Saha equation is very
useful in plasma physics and also in astrophysics.
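As an illustration of how the Saha equation is used, the following sketch estimates the ionized fraction of pure hydrogen. It assumes EI = 13.6 eV, a degeneracy factor of 1 as noted above, charge neutrality (n+ = ne = x n) and n0 = (1 – x) n, where n is the total density of hydrogen nuclei; the density and temperatures in the example are hypothetical.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_E = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19    # 1 eV in J

def saha_rhs(temperature, ionization_ev, degeneracy_factor=1.0):
    """Right-hand side of Eq. (7.134), in m^-3."""
    thermal = (2 * math.pi * M_E * K_B * temperature / H ** 2) ** 1.5
    boltzmann = math.exp(-ionization_ev * EV / (K_B * temperature))
    return degeneracy_factor * thermal * boltzmann

def ionized_fraction(temperature, n_total, ionization_ev=13.6):
    """Solve x^2 / (1 - x) = S / n for x, the fraction of ionized atoms."""
    s = saha_rhs(temperature, ionization_ev) / n_total
    if s == 0.0:
        return 0.0
    # Positive root of x^2 + s x - s = 0, written in a numerically stable form.
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / s))

# Hypothetical gas with 1e20 hydrogen nuclei per m^3.
x_cool = ionized_fraction(5000.0, 1e20)
x_hot = ionized_fraction(20000.0, 1e20)
print(f"x(5000 K) = {x_cool:.2e}, x(20000 K) = {x_hot:.5f}")
```

For this density the gas goes from almost neutral to almost fully ionized over a fairly narrow temperature range, although kT remains much smaller than EI throughout.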
Example 3
As an application of Maxwell-Boltzmann statistics, consider the ratio of para-hydrogen
to ortho-hydrogen in ordinary hydrogen at room temperature. Since
ortho-hydrogen has I = 1,
$$ \frac{H(\mathrm{para})}{H(\mathrm{ortho})} = \frac{\sum_{J = 0, 2, \ldots} (2J + 1) \exp[-aJ(J + 1)]}{3 \sum_{J = 1, 3, \ldots} (2J + 1) \exp[-aJ(J + 1)]} \tag{7.135} $$
where a = 0.00755/kT, kT being in eV. In evaluating the sum, the first term is
separated out and the remaining sum is converted into an integral by replacing
J by 2l and taking x = l(2l +1). Therefore,
$$ \frac{H(\mathrm{para})}{H(\mathrm{ortho})} \approx \frac{1 + \int_1^{\infty} e^{-2ax}\, dx}{3\left[3e^{-2a} + \int_3^{\infty} e^{-2ax}\, dx\right]} = \frac{2a + e^{-2a}}{3\left(6a\, e^{-2a} + e^{-6a}\right)} \tag{7.136} $$
At T = 27°C, a ≈ 0.29 and the ratio comes out to be 0.33 which is in very
good agreement with experimental observations.
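The sums in Eq. (7.135) converge quickly and can be evaluated directly for comparison with the closed form of Eq. (7.136). In the sketch below the cutoff j_max and the function names are choices made for illustration.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def para_to_ortho_ratio(a, j_max=40):
    """Direct evaluation of the sums in Eq. (7.135), truncated at J < j_max."""
    def term(j):
        return (2 * j + 1) * math.exp(-a * j * (j + 1))
    para = sum(term(j) for j in range(0, j_max, 2))
    ortho = 3 * sum(term(j) for j in range(1, j_max, 2))
    return para / ortho

def integral_approximation(a):
    """Closed form of Eq. (7.136)."""
    numerator = 2 * a + math.exp(-2 * a)
    denominator = 3 * (6 * a * math.exp(-2 * a) + math.exp(-6 * a))
    return numerator / denominator

a_room = 0.00755 / (K_B_EV * 300.15)  # a = 0.00755 eV / kT at 27 deg C
ratio_sum = para_to_ortho_ratio(a_room)
ratio_closed = integral_approximation(a_room)
print(f"a = {a_room:.3f}, sum = {ratio_sum:.3f}, Eq. (7.136) = {ratio_closed:.3f}")
```

Both evaluations give a ratio close to 1/3, the value expected at high temperatures from the spin weights alone.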
Example 4
Copper has an atomic weight of 63.5, a density of 8.9 g/cc, and vt = 2.32 × 103
m/s and vl = 4.76 × 103 m/s. Its Debye temperature is
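$$ \theta = \frac{h\nu_m}{k}, \qquad \nu_m = \left[\frac{9N_0}{4\pi V}\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)^{-1}\right]^{1/3} \tag{7.137} $$
$$ C_v \approx 0.16\, R, \qquad \frac{T}{\theta} \approx 0.088 \tag{7.138a} $$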
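The numbers in this example can be reproduced with a short calculation. The sketch below assumes the standard Debye cutoff, in which the cube of the maximum frequency is (9N0/4πV) divided by (2/vt³ + 1/vl³), and the low-temperature form of the specific heat from Eq. (7.96); the function names are illustrative.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro number, 1/mol

def debye_temperature(atomic_weight, density_g_cc, v_t, v_l):
    """Debye temperature from the transverse and longitudinal sound speeds (m/s)."""
    n = N_A * density_g_cc * 1e6 / atomic_weight  # atoms per m^3
    nu_m_cubed = (9 * n / (4 * math.pi)) / (2 / v_t ** 3 + 1 / v_l ** 3)
    return H * nu_m_cubed ** (1 / 3) / K_B

def low_temperature_cv_over_r(t_over_theta):
    """Debye T^3 law: C_v / R = (12 / 5) pi^4 (T / theta)^3."""
    return (12 / 5) * math.pi ** 4 * t_over_theta ** 3

theta_cu = debye_temperature(63.5, 8.9, 2.32e3, 4.76e3)
cv_ratio = low_temperature_cv_over_r(0.088)
print(f"theta = {theta_cu:.0f} K, C_v/R at T/theta = 0.088: {cv_ratio:.3f}")
```

With the data given for copper this yields a Debye temperature of about 340 K.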
Example 5
For estimating the transition temperature Tc for 4He, if V = 27.6 cm³/mole
$$ \frac{N}{V} = 2.18 \times 10^{28}\ \mathrm{m^{-3}} \tag{7.139} $$
On substituting this in Eq. (7.78)
Tc = 3.13 K (7.140)
It may also be noted that α2 is a small quantity for T < Tc. Using N0 given in
Eq. (7.81) gives
$$ \frac{1}{\exp(\alpha_2/kT) - 1} \approx N\left[1 - (T/T_c)^{3/2}\right], \quad T < T_c \tag{7.141} $$
which, on expanding the exponential function, gives
$$ \alpha_2 \approx \frac{kT}{N\left[1 - (T/T_c)^{3/2}\right]}, \quad T < T_c \tag{7.142} $$
Thus α2 is very small for T < Tc except when T is close to Tc.
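The estimate of Tc can be reproduced numerically. The sketch below assumes that Eq. (7.78) has the standard ideal Bose gas form Tc = (h²/2πmk)(N/2.612V)^{2/3}, and uses the condensate fraction 1 – (T/Tc)^{3/2} that appears in Eq. (7.141); the function names are illustrative.

```python
import math

H = 6.62607015e-34         # Planck constant, J s
K_B = 1.380649e-23         # Boltzmann constant, J/K
N_A = 6.02214076e23        # Avogadro number, 1/mol
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
ZETA_3_2 = 2.612           # Riemann zeta(3/2)

def bec_temperature(number_density, mass):
    """Ideal Bose gas condensation temperature (assumed form of Eq. (7.78))."""
    prefactor = H ** 2 / (2 * math.pi * mass * K_B)
    return prefactor * (number_density / ZETA_3_2) ** (2 / 3)

def condensate_fraction(temperature, t_c):
    """Fraction of particles in the ground state, N0 / N, for T <= Tc."""
    return max(0.0, 1.0 - (temperature / t_c) ** 1.5)

n_he = N_A / 27.6e-6                 # molar volume of 27.6 cm^3
m_he = 4.002602 * ATOMIC_MASS_UNIT   # mass of a 4He atom
t_c = bec_temperature(n_he, m_he)
print(f"N/V = {n_he:.3e} m^-3, Tc = {t_c:.2f} K")
print(f"N0/N at 2.17 K: {condensate_fraction(2.17, t_c):.2f}")
```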
Example 6
The electronic properties of Cu may be deduced by assuming that each atom
contributes one free electron. The atomic weight of Cu is 63.54 and its density
is 8.96 g/cc so that
$$ \frac{N}{V} \approx 8.49 \times 10^{28}\ \mathrm{m^{-3}} $$
From Eq. (7.90), the Fermi energy at 0 K is
$$ \varepsilon_f(0) \approx 7.0\ \mathrm{eV} \tag{7.143} $$
The change in the Fermi energy as T increases from 0 K to 300 K [see Eq. (7.91)]
is very small, about –7.8 × 10⁻⁵ eV, and hence εf (T) can for most purposes
be taken to be a constant.
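These values can be verified with a short calculation. The sketch below assumes that Eq. (7.90) is the result of carrying out the integral in Eq. (7.89), i.e. εf(0) = (h²/2m)(3N/8πV)^{2/3}, and uses Eq. (7.91) for the temperature dependence; the function names are illustrative.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C (also J per eV)
N_A = 6.02214076e23         # Avogadro number, 1/mol

def fermi_energy_ev(number_density):
    """eps_f(0) = (h^2 / 2m) (3n / 8 pi)^(2/3), in eV."""
    joules = (H ** 2 / (2 * M_E)) * (3 * number_density / (8 * math.pi)) ** (2 / 3)
    return joules / E_CHARGE

def fermi_energy_shift_ev(fermi_energy_0_ev, temperature):
    """eps_f(T) - eps_f(0) from Eq. (7.91), in eV."""
    kt_ev = K_B * temperature / E_CHARGE
    return -(math.pi ** 2 / 12) * kt_ev ** 2 / fermi_energy_0_ev

n_cu = N_A * 8.96e6 / 63.54  # one free electron per atom; density in g/m^3
ef_cu = fermi_energy_ev(n_cu)
shift_cu = fermi_energy_shift_ev(ef_cu, 300.0)
print(f"n = {n_cu:.3e} m^-3, eps_f(0) = {ef_cu:.2f} eV, shift = {shift_cu:.2e} eV")
```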
Example 7
A very interesting application of the Fermi-Dirac distribution is to white dwarfs
and neutron stars, regarded as a degenerate gas of electrons and neutrons
respectively. A very sketchy and approximate discussion of the main ideas is
given here.
When a star contracts, a part of its gravitational energy escapes as radiation
but the remainder is retained as kinetic energy. At equilibrium, there is the
approximate relation
$$ \frac{GM^2}{R} \approx N \varepsilon_f \tag{7.146} $$
where G is the gravitational constant, M is the mass of the star, R is its radius,
N is the number of particles and εf is the Fermi kinetic energy of the particles
which is of the same order of magnitude as the average kinetic energy. For the
highly degenerate fermions, relativistic kinematics should be used and
$$ \varepsilon_f = \left(p_f^2 c^2 + m^2 c^4\right)^{1/2} - mc^2 \tag{7.147} $$
where m is the mass of the degenerate particles. For obtaining pf, Eq. (7.89) is
written in the form
$$ N = \frac{8\pi V}{3h^3}\, p_f^3 \tag{7.148} $$
It is interesting to note that while the validity of Eq. (7.89) is limited to
nonrelativistic situations, Eq. (7.148) is valid even for large velocities. Substituting
these relations in Eq. (7.146) gives
$$ \frac{GM^2}{R} = N\left[c^2\left(\frac{3Nh^3}{8\pi V}\right)^{2/3} + m^2 c^4\right]^{1/2} - Nmc^2 \tag{7.149} $$
or
$$ M < \frac{3}{2\pi}\, \frac{1}{m_N^2} \left(\frac{hc}{2G}\right)^{3/2} < 5\, M_{sun} \tag{7.152} $$
These calculations are only order of magnitude calculations. More refined
calculations provide a somewhat lower upper bound, < 3Msun.
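The order of magnitude of this bound can be checked numerically. The sketch below takes the bound in the form M < (3/2π)(hc/2G)^{3/2}/mN², which is what the extreme relativistic limit of Eq. (7.149) gives if V = 4πR³/3 and N = M/mN are assumed; these assumptions, the solar mass value and the function name are not taken from the text.

```python
import math

H = 6.62607015e-34            # Planck constant, J s
C = 2.99792458e8              # speed of light, m/s
G = 6.67430e-11               # gravitational constant, m^3 kg^-1 s^-2
M_NEUTRON = 1.67492749804e-27 # neutron mass, kg
M_SUN = 1.989e30              # solar mass, kg (assumed value)

def mass_bound(nucleon_mass):
    """Upper bound on the mass of a degenerate star, in kg (assumed form)."""
    return (3 / (2 * math.pi)) * (H * C / (2 * G)) ** 1.5 / nucleon_mass ** 2

bound_in_solar_masses = mass_bound(M_NEUTRON) / M_SUN
print(f"M < {bound_in_solar_masses:.1f} solar masses")
```

The result is a few solar masses, built entirely from h, c, G and the nucleon mass.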
This example demonstrates the importance of quantum distributions even
on an astronomical scale.
Example 8
As noted before, when the temperature of a material is lowered, the specific
heat shows a sudden increase when it becomes a superconductor at T = Tc. This
is because Cooper pairs are formed below T = Tc and some energy goes into
breaking them. However, the specific heat of the superconducting material goes
to zero faster than T as T → 0, unlike the linear T behaviour expected for a free-
electron gas, the reason being that the specific heat of the Cooper pairs goes to
zero faster than T as T → 0.
PROBLEMS
1. Three identical particles with total energy 6ε are distributed among four
energy levels with energies ε, 2ε, 3ε and 4ε of which the second level
has a degeneracy of 3. What are the possible distributions if the particles
are (i) distinguishable, (ii) bosons and (iii) fermions? Which is the most
probable distribution in each case?
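$$ E = 3RT_D\left[\frac{T}{T_D} - \frac{3}{8} + \frac{1}{20}\left(\frac{T}{T_D}\right)^{-1} + \ldots\right] $$
$$ C_v = 3R\left[1 - \frac{1}{20}\left(\frac{T_D}{T}\right)^2 + \ldots\right] $$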
The approximation is quite good even for T ~ TD. Estimate the specific
heat of copper at T = 300 K (TD for copper is 343 K).
6. Obtain an expression for the energy of a 2-dimensional lattice. What is
the specific heat of this lattice for T → ∞ and for T → 0? The situation
is applicable for layer structures such as graphite whose specific heat at
low temperatures is proportional to T².
7. For Cu, the lattice specific heat at low temperatures has the behaviour
Cv ~ 4.6 × 10⁻⁵ T³ J/mol K. Estimate the Debye temperature for Cu.
8. Given that the Debye temperature for diamond is 1860 K, what is the
specific heat of diamond at room temperature?
9. What is the number of excited phonons and the average energy per phonon
at a given temperature? Show that the average goes to (2/3) hνmax for T → ∞ and is
proportional to T for T → 0.
10. Show that the average kinetic energy of the electrons emitted in thermionic
emission is 2kT, and that the average value of the square of velocity
perpendicular to the surface is 2kT/m.
11. Calculate the Fermi energy at 0 K, of silver (density 10.5 g/cc, atomic
weight ≈ 107.87) and sodium (density 0.97 g/cc, atomic weight ≈ 22.99),
assuming one free electron per atom. What is the ratio of εf (T)/εf (0) for
these elements at T = 300 K?
12. Show that the electronic specific heat of Cu at room temperature is very
small compared to its lattice specific heat.
13. If electrons are treated as distinguishable particles, at what temperature
would they have an average energy of 5.5 eV (i.e. the Fermi energy of
silver)?
14. Given that the Fermi energy of Cu is 7.0 eV at room temperature, what is
the number of electrons per unit volume with energy greater than
8.0 eV?
15. Given that the electrical conductivity of aluminium is 3.55 × 10⁷ Ω⁻¹ m⁻¹,
estimate its thermal conductivity at room temperature.
16. What is the minimum frequency of radiation which can break apart
Cooper pairs in niobium?
17. Microwave radiation of frequency 10¹⁰ Hz is incident on a Josephson
junction. What is the minimum voltage across the junction for which a
jump in the current is observed?
8
Solid State Physics
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 255
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_8
Ionic Bonds
As in the case of molecules, these bonds are formed when it is energetically
favourable for a valence electron to be transferred from one atom to another
resulting in a net electrostatic attraction between the ions. However, in the solid
there are many more ions and the electrostatic interaction of an ion with all the
other ions should be taken into account. This is done by writing the electrostatic
energy per ion pair as
$$ V_{el} = -\frac{\alpha e^2}{4\pi\varepsilon_0 R} \tag{8.1} $$
where R is the distance between the nearest neighbours and α is called the
Madelung constant. For example, in the case of NaCl, KCl, etc. (but not CsCl),
the crystal structure is what is called face centred cubic structure (Fig. 8.1) with
an ion pair associated with each lattice point. The electrostatic energy per ion
pair is obtained by writing it as a sum of terms representing the interaction of a
given ion with the nearest neighbours, the next nearest neighbours, etc.
Fig. 8.1 The KCl crystal structure (fcc) with K+ denoted by circles and Cl– by crosses.
$$ V_{el} = -\frac{e^2}{4\pi\varepsilon_0 R}\left(6 - \frac{12}{2^{1/2}} + \frac{8}{3^{1/2}} - 3 + \ldots\right) \approx -\frac{e^2}{4\pi\varepsilon_0 R}\,(1.75) \tag{8.2} $$
where R is the distance between the nearest neighbours and the Madelung
constant turns out to be α ≈ 1.75.
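The series in Eq. (8.2) converges very slowly if it is summed shell by shell. One common way to evaluate it, used in the sketch below, is Evjen's method, which is not the procedure of the text: the sum is taken over a cube of ions, with the ions on the faces, edges and corners of the cube weighted by 1/2, 1/4 and 1/8 so that the cube is electrically neutral. The function name is illustrative.

```python
import math
from itertools import product

def madelung_nacl(n):
    """Evjen-weighted lattice sum over the cube |i|, |j|, |k| <= n (NaCl structure)."""
    alpha = 0.0
    for i, j, k in product(range(-n, n + 1), repeat=3):
        if i == j == k == 0:
            continue  # skip the reference ion itself
        weight = 1.0
        for coordinate in (i, j, k):
            if abs(coordinate) == n:
                weight *= 0.5  # face 1/2, edge 1/4, corner 1/8
        # Ions with odd i + j + k have the opposite charge and attract.
        sign = -1 if (i + j + k) % 2 else 1
        alpha -= sign * weight / math.sqrt(i * i + j * j + k * k)
    return alpha

for size in (1, 2, 4, 6):
    print(size, round(madelung_nacl(size), 4))
```

The first cube reproduces the leading terms 6 – 12/2^{1/2} + 8/3^{1/2} with their weights, and larger cubes settle near the value 1.75 quoted above.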
To get an idea about the details of the ionic bonds, consider solid KCl. As
in the case of the KCl molecule, here also it takes an energy of 0.54 eV to
transfer one electron from the K atom to the Cl atom (see Sec. 5.6). Representing
the van der Waals repulsion, which becomes important when the electronic
wave functions overlap, by b/Rn, the energy per ion pair is given by
$$ E_{pair} = 0.54 - \frac{14.4\, \alpha}{R} + \frac{b}{R^n} \tag{8.3} $$
where α ≈ 1.75, R is in Å and the energy is in eV (the energy in Eq. (8.3) is with
respect to the energy of the isolated neutral atoms, taken to be zero). The energy
per pair, Epair, is known as cohesive energy per ion pair. The equilibrium condition
for the ions is
$$ \frac{dE}{dR} = 0 \tag{8.4} $$
which implies
$$ E_{pair} = 0.54 - \frac{14.4\, \alpha}{R_0}\left(1 - \frac{1}{n}\right) \tag{8.5} $$
R0 being the equilibrium separation. From a knowledge of R0 ≈ 3.14 Å and
Epair ≈ – 6.67 eV, one gets n ≈ 10 which is in approximate agreement with what
is expected from a detailed theoretical analysis.
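The value of n follows from inverting Eq. (8.5). The sketch below does this for KCl and then recovers b from the equilibrium condition of Eq. (8.4), which gives b = 14.4 α R0^{n–1}/n; energies are in eV and distances in Å, and the function names are illustrative.

```python
def repulsion_exponent(e_pair, r0, alpha=1.75, transfer_energy=0.54):
    """Solve Eq. (8.5) for n, given the cohesive energy (eV) and R0 (angstrom)."""
    coulomb = 14.4 * alpha / r0
    return 1.0 / (1.0 - (transfer_energy - e_pair) / coulomb)

def pair_energy(r, n, b, alpha=1.75, transfer_energy=0.54):
    """Energy per ion pair of Eq. (8.3), in eV."""
    return transfer_energy - 14.4 * alpha / r + b / r ** n

R0 = 3.14       # equilibrium separation for KCl, angstrom
E_PAIR = -6.67  # cohesive energy per ion pair, eV

n_kcl = repulsion_exponent(E_PAIR, R0)
b_kcl = 14.4 * 1.75 * R0 ** (n_kcl - 1) / n_kcl  # from dE/dR = 0 at R0
print(f"n = {n_kcl:.2f}")
print(f"E(R0) = {pair_energy(R0, n_kcl, b_kcl):.2f} eV")
```

Evaluating the energy slightly below and above R0 confirms that R0 is a minimum of Eq. (8.3).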
The ionic bond is quite strong, the binding per pair of atoms being about
5 eV. This leads to rather high melting temperatures for ionic crystals, e.g. 801°C
for NaCl.
Covalent Bonds
Covalent bonds discussed earlier in the context of molecules (see Sec. 5.6), are
important in the formation of solids also. These bonds arise when two atoms
find it energetically favourable to share their electrons (this is particularly true
for identical atoms such as two Cl atoms). In such cases the shared electrons are
found preferentially between the atoms, with opposite spins as required by Pauli’s
principle, providing each atom with a complete shell. These bonds are especially
important in group IV elements with half-filled shells, e.g. C, Si, Ge, etc. which
can accommodate eight electrons in their outermost shells. These atoms can
form four covalent bonds which are directional. Every atom may be considered
to be at the centre of a tetrahedron, sharing an electron with each of the four
nearest neighbours which are at the four corners of the tetrahedron. Since a
covalent bond involves the sharing of two electrons, one from each atom, this
allows the atoms to have closed shells with eight electrons.
Covalent bonds are usually quite strong (binding energy is a few eV per
bond) and directional. As a consequence, crystals with covalent bonds are hard
but brittle, and have a high melting point. This is especially so for diamond,
which has a cohesive energy of about 7 eV per atom. It is the hardest material
known, and has a melting point of more than 3550°C.
If the bonds are between different types of atoms, the electrons may spend
more time with one of the atoms, so that the bonds are partially ionic and partially
covalent. An important example of this is ZnS (zinc blende) which also has
tetrahedral structure but with different atoms at the centre and the corners, e.g.
Zn at the centre and S at the corners. In this case, the bonds are partially ionic
and partially covalent.
Metallic Bonds
As was pointed out in the discussion of the free-electron theory of metals, the
valence electrons in metallic atoms, being loosely bound, escape from the atom.
These essentially free electrons provide a medium of negative charge which
helps to bind the positive ions. That this leads to a lower energy state can be
seen from the following arguments.
An electron in an isolated atom is confined to a small volume around the
nucleus. This confinement gives rise to an uncertainty in the momentum,
∆p ~ ħ/r, where r is the radius of the atom. Consequently, the electron has a
fairly substantial amount of kinetic energy, of the order of several eV. However,
in the crystalline, metallic state, the electrons are essentially free to be anywhere
in the entire crystal. As a result, there is a considerable reduction in their kinetic
energy. This is the source of metallic bonding.
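The size of the effect can be estimated from ∆p ~ ħ/r. The sketch below assumes a confinement length of 1 Å for the isolated atom and, purely for illustration, a length ten times larger for the delocalized electron; both lengths and the function name are assumptions, not values from the text.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def confinement_energy_ev(r_m):
    """Kinetic energy (dp)^2 / 2m in eV for an electron confined to a length r_m (metres)."""
    dp = HBAR / r_m                      # momentum uncertainty, dp ~ hbar / r
    return dp**2 / (2.0 * M_E) / EV

atom = confinement_energy_ev(1e-10)      # confined to about 1 Angstrom
crystal = confinement_energy_ev(1e-9)    # spread over ten times that length (illustrative)
print(f"atom: {atom:.2f} eV, delocalized: {crystal:.4f} eV")
```

Since the energy scales as 1/r², even a modest increase in the region available to the electron removes most of the confinement energy.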
The bond between two metallic atoms is somewhat weaker than ionic or
covalent bonds. This leads to relatively low melting points, for example 63°C
for K. However, the cohesive energy of metals is fairly large since each valence
electron interacts with several ions. The metallic bonds are not directional which
allows the planes of atoms to slide over each other quite easily. Hence, metals
are found to be ductile and malleable rather than brittle. The existence of
essentially free electrons gives rise to high electrical and thermal conductivity
for metals.
Hydrogen Bonds
The hydrogen atom has only one electron and would normally form a covalent
bond with only one other atom. However, if the other atom is strongly
electronegative, the electron may be transferred to the electronegative atom.
The remaining proton being small in size, about $10^{-15}$ m compared to the usual
atomic size of $10^{-10}$ m, can bind only two neighbouring negative ions (the two
spheres would be glued together by the proton in-between). This gives rise to
what is known as the hydrogen bond which connects only two atoms.
The hydrogen bond is important in the formation of ice and in the
polymerization of hydrogen fluoride.
It may be noted that in a real crystal, the actual binding is due to a mixture
of different types of bonds though one of them may be predominant.
[Figure: lattice diagram with points labelled A, B, C, D and E]
that the atoms are hard spheres and that the nearest neighbours touch each other,
we can deduce the packing fraction for a given crystal pattern.
Simple cube: The simple cubic structure is perhaps the simplest possible
form, though polonium is the only element which has this structure. The reason
for this is that the cubic structure is rather an open form with a considerable
amount of empty space.
$$f = \frac{3^{1/2}\pi}{8} \approx 68\% \tag{8.7}$$
It should be noted that some binary compounds such as CsCl, have a body
centred cubic structure, with one type of atoms (e.g., Cl) at the centres, and the
other type of atoms (e.g., Cs) at the corners. However, since these atoms are different
the lattice is a simple cubic lattice with a pair of Cs and Cl atoms associated
with each corner.
Close-packed structures: The most efficient packing of atoms in a plane is
the one shown in Fig. 8.4(a), with each atom touching six atoms in the plane.
There are two ways in which the successive layers can be placed over one
another. The second layer can have atoms in positions marked as B (or
equivalently C). The third layer may have atoms in positions A leading to the
hexagonal close-packed (hcp) structure [Fig. 8.4(b)]. The positions of the atoms
in the successive layers of this structure may be indicated by ABABAB... .
Examples of elements which have this structure are Cd, Mg, Ti, Zn, etc. On the
other hand, the third layer may have atoms in positions C leading to the face-
centred cubic (fcc) structure. The positions of the atoms in the successive layers
may be indicated by ABCABC... . The close packed layers of the face-centred
cubic structure, are in the body diagonal planes [Fig. 8.4(c)]. Some examples of
elements that have the face-centred cubic structure, are Cu, Ag, Au, Al, Pd, and
Pt. An interesting example is that of NaCl, KCl, etc., which consist of
two interpenetrating fcc sublattices, one of them made up of Na$^+$ or K$^+$ ions and
the other of Cl$^-$ ions, as shown in Fig. 8.1.
Both the close-packed structures, the hexagonal close-packed structure and
the face-centred cubic structure have a coordination number of 12 and have the
highest packing fraction. The packing fraction for the fcc structure is obtained
by noting that four atoms belong to each cell and occupy a volume of $\frac{16\pi}{3}R^3$.
The length of a side of the cube is $2^{3/2}R$ so that the packing fraction f is
$$f = \frac{\pi}{3\,(2^{1/2})} \approx 74\% \tag{8.8}$$
For the hcp structure, six atoms belong to each cell and occupy a volume of
$8\pi R^3$. The area of the base of the hexagon is $6\,(3^{1/2})R^2$ while the height of the
hexagon is $4(2/3)^{1/2}R$, from which the packing fraction is again found to be
$$f = \frac{\pi}{3\,(2^{1/2})} \approx 74\% \tag{8.9}$$
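The hard-sphere counting used for Eqs. (8.7) to (8.9) can be verified numerically. This is a minimal sketch with R set to 1; the fcc and hcp cell dimensions are those quoted above, while the bcc cell side 4R/3^{1/2} (nearest neighbours touching along the body diagonal) is the standard geometry and is stated here as an assumption. The helper name `packing_fraction` is illustrative.

```python
import math

def packing_fraction(atoms_per_cell, cell_volume, radius=1.0):
    """Fraction of the cell volume filled by hard spheres of the given radius."""
    return atoms_per_cell * (4.0 / 3.0) * math.pi * radius**3 / cell_volume

R = 1.0
# fcc: four atoms in a cube of side 2^(3/2) R
fcc = packing_fraction(4, (2**1.5 * R) ** 3)
# hcp: six atoms in a hexagonal prism, base area 6*sqrt(3) R^2, height 4*sqrt(2/3) R
hcp = packing_fraction(6, 6 * math.sqrt(3) * R**2 * 4 * math.sqrt(2 / 3) * R)
# bcc: two atoms in a cube whose body diagonal is 4R (assumed standard geometry)
bcc = packing_fraction(2, (4 * R / math.sqrt(3)) ** 3)

print(f"fcc {fcc:.4f}  hcp {hcp:.4f}  bcc {bcc:.4f}")
```

The two close-packed structures give the same value because they differ only in the stacking order of identical layers.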
[Fig. 8.4 (a), (b), (c): close-packed layers of spheres with the positions labelled A, B and C]
concept is to regard each atom of one cube (e.g., a corner of the second cube) as
being at the centre of a tetrahedron formed by the four nearest neighbours which
belong to the other cube [see Fig. 8.5(b)]. This also brings out the fact that each
atom in this structure usually forms four covalent bonds with its four nearest
neighbours. Some binary compounds such as ZnS also crystallize in the diamond
structure, with Zn atoms forming one fcc and S forming the other fcc. In these
cases, the bonds are partly ionic and partly covalent. Other compounds with
this structure are SiC, CdS, InSb, GeP, etc.
Fig. 8.5 Diamond structure, (a) circles represent an fcc lattice and
crosses represent the fcc lattice obtained by translating the
first lattice along the body diagonal by 1/4 the diagonal,
(b) a typical tetrahedron in the diamond structure.
The coordination number for the diamond structure is 4. The packing fraction
may be obtained by noting that eight atoms belong to the cell (4 are inside,
$\frac{1}{2} \times 6$ in the faces and $\frac{1}{8} \times 8$ at the corners) and that the body diagonal is equal to
$8R$. This leads to a packing fraction f of
$$f = \frac{32\pi}{3}R^3 \Big/ \left(\frac{8R}{3^{1/2}}\right)^3 = \frac{3^{1/2}\pi}{16} \approx 0.34 \tag{8.10}$$
which means that the packing in diamond structure is rather loose.
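The atom count and cell size behind Eq. (8.10) can be written out explicitly. A short sketch, assuming R = 1 and using only the quantities stated above; the function name is hypothetical.

```python
import math

def diamond_packing_fraction(radius=1.0):
    """Hard-sphere packing fraction of the diamond structure, following Eq. (8.10)."""
    atoms = 4 + 6 * 0.5 + 8 * 0.125           # inside + face-shared + corner-shared atoms
    side = 8.0 * radius / math.sqrt(3)        # cube side from body diagonal = 8R
    sphere = (4.0 / 3.0) * math.pi * radius**3
    return atoms * sphere / side**3

f_diamond = diamond_packing_fraction()
print(f"f = {f_diamond:.3f}")
```

The value is less than half of that for the close-packed structures, which reflects the open, fourfold-coordinated arrangement.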
where u′, v′ and w′ are integers. For example, the point E in the two dimensional
lattice in Fig. 8.2 is represented by 2a + b. The direction of the vector is then
represented by a set of numbers [u, v, w] which is obtained by multiplying u′, v′
and w′ by the lowest common denominator. If some of the components are
negative, this is indicated by a bar over the number, e.g., if u is negative, the
direction is given by $[\overline{|u|}, v, w]$.
A crystal plane is characterized by the intercepts u′a, v′b and w′c along
the three axes. The reciprocals 1/u′, 1/v′ and 1/w′ are reduced to the simplest
integers h, k, l by multiplying the reciprocals by their lowest common
denominator. The plane is then denoted by the set of numbers (h, k, l), called
the Miller indices. If some of the intercepts are along the negative axes, this is
indicated by a bar over the corresponding index, e.g., if u′ is negative, the Miller
indices are $(\overline{|h|}, k, l)$. It is clear from this that parallel planes (with the
corresponding intercepts having the same sign) are represented by the same
Miller indices. When an intercept is at infinity, the corresponding Miller index
is zero. For example, (1, 0, 0) represents a plane parallel to the yz plane, with a
positive intercept along the x-axis, while a plane with intercepts –2a, 3b and
parallel to the z-axis, is denoted by the Miller indices $(\bar{3}, 2, 0)$.
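The rule for Miller indices (take reciprocals of the intercepts, then clear the denominators) lends itself to a short routine. The sketch below is illustrative only: the function name `miller_indices` and the convention of passing `math.inf` for a plane parallel to an axis are assumptions made here. Negative indices are returned with a minus sign rather than a bar.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, inf

def miller_indices(intercepts):
    """Miller indices for intercepts given in units of a, b, c.

    Use math.inf for an axis the plane never crosses.
    """
    # Reciprocal of each intercept; an intercept at infinity gives index 0
    recips = [Fraction(0) if x == inf else 1 / Fraction(x) for x in intercepts]
    # Lowest common multiple of the denominators clears the fractions
    lcm = reduce(lambda p, q: p * q // gcd(p, q), (r.denominator for r in recips), 1)
    ints = [int(r * lcm) for r in recips]
    # Divide out any common factor so the integers are the simplest set
    common = reduce(gcd, (abs(i) for i in ints)) or 1
    return tuple(i // common for i in ints)

print(miller_indices((-2, 3, inf)))   # the example in the text
print(miller_indices((1, inf, inf)))  # plane parallel to the yz plane
```

Exact rational arithmetic via `Fraction` avoids rounding problems when the intercepts are fractions of the lattice constants.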
Diffraction by a Lattice
Crystal structures are determined experimentally by analysing the diffraction
of x-rays by a crystal. Since x-rays have a wavelength of about 1 Å, the atoms in
a crystal serve as a grating and produce diffraction maxima. The measurement
of the positions and intensities of these maxima gives information about the
crystal structure. The conditions for the maxima were obtained in terms of the
Bragg condition in Eq. (2.38) by considering the superposition of scattering
from different planes. This is an over-simplified picture. Here a more rigorous
and complete derivation of the maxima is presented by looking at the
superposition of scattered waves from different atoms.
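Consider radiation described by $B \exp[i(\mathbf{k}_0\cdot\mathbf{r} - \omega t)]$, $|\mathbf{k}_0| = 2\pi/\lambda$, incident
on a lattice with atoms at points
$$\mathbf{r}_n = u\mathbf{a} + v\mathbf{b} + w\mathbf{c}, \quad u = 0, \ldots, n_1;\ v = 0, \ldots, n_2;\ w = 0, \ldots, n_3 \tag{8.12}$$
The scattered wave will have the same wavelength as the incident beam,
but will propagate in some other direction. Since the scattered beam has the
same phase as the incident beam at the point of scattering, it is given by
$$\begin{aligned} A &= \sum_n B' \exp[i\mathbf{k}_0\cdot\mathbf{r}_n]\,\exp[i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}_n)], \quad |\mathbf{k}| = |\mathbf{k}_0| = \frac{2\pi}{\lambda} \\ &= B' \exp[i\mathbf{k}\cdot\mathbf{r}] \sum_{u,v,w} \exp[i\mathbf{q}\cdot(u\mathbf{a} + v\mathbf{b} + w\mathbf{c})], \quad \mathbf{q} = \mathbf{k}_0 - \mathbf{k} \end{aligned} \tag{8.13}$$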
The summation leads to
Fig. 8.6 (a) Energy bands in the tight binding approximation (3s and 3p levels, energy E
against interatomic separation in Å), (b) Brillouin zones and energy bands in the nearly
free electron approximation (E in eV, showing the energy gap). a = 2 Å, V1 = 1 eV.
be 2N levels for an s state and 6N levels for a p state (two spin states for each
electron). Since N is very large, one gets bands of closely-spaced energy levels,
called energy bands which in general are separated by some energy gaps. The
spread for the inner-lying levels will be small since they are not greatly influenced
by the presence of other atoms, and it will be larger for the outer levels [see Fig.
8.6(a)]. Since the spread is determined by the structure of each atom and the
interatomic distance, the density of levels in a band will increase with the number
of atoms (keeping the interatomic distance constant). It may happen that some
outer-lying bands overlap and this will have a profound influence on the
properties of the crystal.
The tight-binding approximation is reliable mainly for narrow bands of
low-lying energy levels, for which the effect of the interatomic interaction is
small.
$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi(x) + V_1 \cos(2\pi x/a)\,\psi(x) = E\psi(x) \tag{8.22}$$
The general form of the solutions of such an equation is given by the Bloch
theorem which states that ψ(x) can be written in the form
$$\psi(x) = \exp(ikx)\,u_k(x) \tag{8.23}$$
where $u_k(x)$ is periodic with period a. If the number of lattice points is N, a
periodic boundary condition is imposed on ψ that $\psi_k(Na) = \psi_k(0)$. This is
equivalent to closing a linear chain and implies that the allowed values of k are
$$k = \frac{2\pi n}{Na}, \quad n = 0, \pm 1, \ldots \tag{8.24}$$
These allowed values of k are conveniently divided as follows, into what
are known as the Brillouin zones:
$$k = \frac{2\pi n}{Na}, \quad n = 0, \pm 1, \pm 2, \ldots, N/2, \quad \text{1st Brillouin zone},$$
$$\psi(x) = A\,\frac{e^{ikx}}{L^{1/2}} + B\,\frac{e^{ik'x}}{L^{1/2}} \tag{8.26}$$
where for the sake of definiteness we take k′ ≤ k. We substitute this expression
in Eq. (8.22), multiply the equation by e– ikx or e– ik′x and integrate to obtain
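$$\begin{aligned} \frac{\hbar^2 k^2}{2m}A + \frac{1}{2}V_1 B &= EA \\ \frac{\hbar^2 k'^2}{2m}B + \frac{1}{2}V_1 A &= EB \end{aligned} \qquad k - k' = \frac{2\pi}{a} \tag{8.27}$$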
In these equations, the interaction connects only the states which satisfy
the condition
$$k - k' = \frac{2\pi}{a}.$$
This is related to the fact that for scattering by a one-dimensional lattice, a
Bragg maximum is obtained for precisely the same condition [see Eq. (8.15)].
The interaction which gives rise to the scattering is also responsible for mixing
the two plane-wave states in Eq. (8.26). Solving the two homogeneous equations
gives
$$E = \frac{\hbar^2}{4m}(k^2 + k'^2) \pm \left[\frac{\hbar^4 (k^2 - k'^2)^2}{16m^2} + \frac{1}{4}V_1^2\right]^{1/2} \tag{8.28}$$
$$\frac{B}{A} = \frac{2}{V_1}\left\{\frac{\hbar^2 (k'^2 - k^2)}{4m} \pm \left[\frac{\hbar^4 (k^2 - k'^2)^2}{16m^2} + \frac{1}{4}V_1^2\right]^{1/2}\right\}$$
where $k' = k - \frac{2\pi}{a}$. Clearly, the changes introduced by the perturbation V1 are
significant mainly for |k| ≈ |k′| and hence for k ≈ π/a. For k < π/a, the negative
sign corresponds to |A| ≥ |B|, which is therefore associated with the 0 ≤ k ≤ π/a
branch of the solutions, while for k > π/a, the positive sign corresponds to
|A| ≥ |B|, which is associated with the π/a < k ≤ 2π/a branch of the solutions.
The most important feature of these branches is that there is an energy gap of V1
at k = π/a between the two Brillouin zones described by these branches, and
there are 2N allowed states (including the negative k states) in each zone. This
is qualitatively similar to the energy bands. It also follows from Eq. (8.28) that
since k – k′ = 2π/a, ∂E/∂k = 0 at k = π/a. The energy bands are illustrated in
Fig. 8.6(b). The gap at k = 2π/a comes from the V2 term etc.
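The two branches of Eq. (8.28) are easy to evaluate numerically. The following sketch uses the parameters quoted for Fig. 8.6(b), a = 2 Å and V1 = 1 eV, and assumes the value ħ²/2m ≈ 3.81 eV Å² for the electron; the function name `band_energies` is illustrative.

```python
import math

HBAR2_OVER_2M = 3.81   # hbar^2 / 2m for an electron, in eV * Angstrom^2 (assumed value)

def band_energies(k, a=2.0, v1=1.0):
    """Return (E_minus, E_plus) of Eq. (8.28); k in 1/Angstrom, energies in eV."""
    kp = k - 2.0 * math.pi / a                           # k' = k - 2*pi/a
    mean = 0.5 * HBAR2_OVER_2M * (k**2 + kp**2)          # (hbar^2/4m)(k^2 + k'^2)
    half_diff = 0.5 * HBAR2_OVER_2M * (k**2 - kp**2)     # (hbar^2/4m)(k^2 - k'^2)
    root = math.sqrt(half_diff**2 + 0.25 * v1**2)
    return mean - root, mean + root

a = 2.0
e_minus, e_plus = band_energies(math.pi / a)             # zone boundary
gap = e_plus - e_minus
e_low, _ = band_energies(0.2 * math.pi / a)              # well inside the first zone
print(f"gap at k = pi/a: {gap:.3f} eV")
print(f"lower branch at 0.2*pi/a: {e_low:.3f} eV")
```

At the zone boundary the two unperturbed energies coincide and the splitting equals V1, while well inside the zone the lower branch stays close to the free-electron value ħ²k²/2m.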
In the case of the three dimensional problem the condition in Eq. (8.27) is
modified to read k – k′ = q where q is a reciprocal lattice vector defined in
Eq. (8.16). The corresponding Brillouin zones are obtained from the requirement
that |k| = |k′| at the boundaries. This leads to the condition $2\mathbf{k}\cdot\mathbf{q} - q^2 = 0$
which defines the boundaries as planes perpendicular to the reciprocal
lattice vectors q. The perturbations of energies at the edges of the Brillouin
zones lead to distorted equal-energy surfaces in the k-space. In particular, the
electrons in a crystal occupy the lowest energy levels at T = 0 K, subject to the
Pauli principle. The surface of the region of all occupied states in the k-space is
called the Fermi surface. Since the energy distortions are prominent mainly
near the surfaces of the Brillouin zones, the nearness of the Fermi surface to the
surfaces of the Brillouin zones, and its shape are of importance for the
understanding of the properties of electrons in crystals in general and metals in
particular.
Another property of the energy bands worth noting is the density of states.
The free-particle density of states is proportional to $E^{1/2}$ [see Eq. (7.29)]. However,
the periodic potential distorts the energy levels in such a way that there are no
energy levels in the energy gaps and the density of states goes to zero at the
bottom and the top of the energy band.
For obtaining information about the energy bands and the density of states,
x-rays are used to knock out electrons in the energy bands. An analysis of the
x-rays emitted when electrons from higher energy bands undergo transitions to
these vacant levels, provides information about the energy bands and the density
of states.
Effective Mass
A useful idea in the band theory of solids is that of the effective mass of an
electron in a solid. One is led to this idea in an effort to simulate an electron in
a periodic potential by a free electron but with an effective mass.
The energy of a free electron is given by
$$E = \frac{\hbar^2 k^2}{2m} \tag{8.29}$$
so that
$$m = \frac{\hbar^2}{(\partial^2 E/\partial k^2)} \tag{8.30}$$
This definition may be extended to apply to a particle in a periodic potential
so that the effective mass of an electron in a one-dimensional crystal is
$$m^* = \frac{\hbar^2}{(\partial^2 E/\partial k^2)} \tag{8.31}$$
Here, however, m* is a function of k. As can be seen from Fig. 8.6(b), m* is
positive near the bottom of each zone, negative near the top of each zone and is
infinite at the point of inflection. The large effective mass can be interpreted as
being due to the strong binding force between the electron and the lattice for
some k values, which makes it difficult to move the electron. The negative
mass may be interpreted in terms of the Bragg reflection when k is close to π/a,
2π/a, etc. on account of which a force in one direction, because of reflection,
leads to a gain of momentum in the opposite direction.
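The sign change of m* across a band can be illustrated with a finite-difference evaluation of Eq. (8.31). This is a sketch under stated assumptions: the band is taken to be the lower branch of Eq. (8.28) with a = 2 Å and V1 = 1 eV, ħ²/2m is taken as 3.81 eV Å², and the step size `h` and function names are choices made here for illustration.

```python
import math

HBAR2_OVER_2M = 3.81   # hbar^2 / 2m in eV * Angstrom^2 (assumed value)

def lower_band(k, a=2.0, v1=1.0):
    """Lower branch of Eq. (8.28), energy in eV for k in 1/Angstrom."""
    kp = k - 2.0 * math.pi / a
    mean = 0.5 * HBAR2_OVER_2M * (k**2 + kp**2)
    half_diff = 0.5 * HBAR2_OVER_2M * (k**2 - kp**2)
    return mean - math.sqrt(half_diff**2 + 0.25 * v1**2)

def effective_mass_ratio(k, a=2.0, v1=1.0, h=1e-4):
    """m*/m from Eq. (8.31), with the second derivative taken by central difference."""
    d2 = (lower_band(k + h, a, v1) - 2.0 * lower_band(k, a, v1)
          + lower_band(k - h, a, v1)) / h**2
    return 2.0 * HBAR2_OVER_2M / d2        # (hbar^2/m) divided by d^2E/dk^2

a = 2.0
m_bottom = effective_mass_ratio(0.1 * math.pi / a)    # near the bottom of the zone
m_top = effective_mass_ratio(0.99 * math.pi / a)      # near the top of the zone
print(f"m*/m near bottom: {m_bottom:.3f}, near top: {m_top:.3f}")
```

Near the bottom of the zone the ratio is close to 1, as for a free electron, while near k = π/a the curvature of E(k) is negative and so is m*.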
A detailed analysis shows that when an external electric field $\mathbf{E}$ is applied,
the acceleration of the electron is given by
$$a = -e\,|\mathbf{E}|\,\frac{1}{\hbar^2}\frac{\partial^2 E}{\partial k^2} \tag{8.32}$$
which again simulates a free particle motion with the effective mass m* given
in Eq. (8.31). For a three dimensional crystal, the anisotropy is taken into account
in the relation
$$\sum_j m^*_{ij}\,a_j = -e\,E_i, \qquad m^*_{ij} = \frac{\hbar^2}{(\partial^2 E/\partial k_i\,\partial k_j)} \tag{8.33}$$
The concept of effective mass provides a satisfactory description of the
charge carriers in crystals. In normal circumstances, the conduction of current
is by the electrons, particularly in the case of crystals in which an energy band
is only partially filled (e.g., alkali metals for which the band is only half-filled).
On the other hand, consider a band which is nearly full except for a few vacancies
near the top of the band. This situation of a full band with the vacancies in the
negative charge, negative mass states may be regarded as corresponding to the
presence of positive charge, positive mass particles. These hole states with
positive charge also act as charge carriers. In elements like Be, Zn, Cd, etc., it is
the hole states which are the dominant charge carriers and hence they have
positive Hall coefficients (see Example 3 in Sec. 8.8).
Fig. 8.7 Schematic illustration of the energy bands for (a) metals,
(b) insulators with an energy gap, and (c) semiconductors with a small gap.
The third case [Fig. 8.7(c)] is qualitatively similar to that of insulators except
that the energy gap between the conduction band and the valence band is much
smaller, 1.1 eV for Si and 0.7 eV for Ge. At 0 K, all the electrons are in the
valence band and the conduction band is empty, and the solid behaves like an
insulator. However, at room temperatures, an appreciable number of electrons
are excited to the conduction band (kT ≈ 0.026 eV compared to the energy gap
which is about 1 eV). These electrons can carry charge. Simultaneously, the
electrons in the valence band can undergo transitions to the vacant states left
behind by the transitions to the conduction band. Effectively, the holes (or the
vacancies) serve as carriers of positive charge. The conductivity of these solids
lies between those of metals and insulators, and they are known as
semiconductors.
An important characteristic which distinguishes metals from semiconductors
is the temperature dependence of their conductivities. As the temperature is
raised, more and more phonons are excited, which can scatter electrons and
hence reduce their mobility. Therefore the conductivity of metals generally
decreases as temperature increases. However, in the case of semiconductors,
the decrease in the mobility is more than compensated by the increase in the
number of carriers, electrons as well as holes. As a result, the conductivity of
semiconductors increases (at moderate temperatures) as temperature increases.
8.4 SEMICONDUCTORS
As mentioned before, semiconductors are crystals whose valence band is
completely filled but which have a small energy gap (∆E ~ 1 eV) between the
conduction band and the valence band. Their conductivity is in-between that of
metals (~ $10^{8}\ \Omega^{-1}\,\mathrm{m}^{-1}$) and that of insulators (~ $10^{-11}\ \Omega^{-1}\,\mathrm{m}^{-1}$), and increases
with temperature. Because of the narrowness of the energy gap and the proximity
of the energy levels of the impurity to the valence and conduction bands,
semiconductors have rather striking electronic properties which make them very
useful in the development of sophisticated electronic equipment. Here the
positions and populations of the semiconductor energy levels which determine
their electronic properties are discussed.
It is useful to classify semiconductors into two categories. The class of
semiconductors which are pure, such as silicon, germanium (which are group
IV elements), GaAs, PbS, etc., are known as intrinsic semiconductors. In the
second class of semiconductors known as extrinsic (or impurity) semiconductors,
the properties of the semiconductors are modified by the introduction of carefully
controlled amounts of impurities.
To see how the impurities affect the properties of semiconductors, consider
the specific examples of silicon and germanium. These are group IV elements
which have diamond structure in which each atom has a covalent bond with
each of the four nearest neighbours at the corners of a tetrahedron. If a small
amount of a group V element, such as phosphorus, arsenic or antimony, is
introduced during the formation of the crystal, the group V atom will take the
place of one of the group IV atoms and form four covalent bonds with the
nearest neighbours. However, since it has five electrons in the valence shell,
the fifth electron is only weakly bound to the atom. These ‘fifth’ electrons
occupy localized energy levels which are just below the conduction band [see
Fig. 8.8(a)]. The electrons in these levels are easily excited to the states in the
conduction band and serve as current carriers. Since group V atoms donate
electrons for conduction they are known as donors, and the new energy levels
just below the conduction band as donor levels. The charge carriers in this case
being negatively charged, the corresponding extrinsic semiconductors are known
as n-type semiconductors. Alternatively, if a small amount of a group III element
such as boron, aluminium or indium is introduced, the group III element will
form only three covalent bonds with the nearest neighbours. Thus, there is a
vacancy or a hole associated with each of these atoms. Since an electron in
these states would be fairly tightly bound, the vacant states provide localized
energy levels which lie just above the valence band [see Fig. 8.8(b)]. The
neighbouring electrons can easily be transferred to these levels, as a result of
which holes are created in the valence band. Since the states near the top of the
band have negative mass (see Sec. 8.3), these holes behave as positive mass,
positive charge carriers. The group III impurity atoms are known as acceptors
and the new energy levels just above the valence band as acceptor levels. The
charge carriers in this case being positively charged, the corresponding extrinsic
semiconductors are known as p-type semiconductors.
Fig. 8.8 Extrinsic semiconductors (a) n-type with P as the donor atom,
(b) p-type with B as the acceptor atom.
The electronic properties of the semiconductors are influenced by the
positions of the Fermi energy and the concentrations of the charge carriers.
Assuming that (εc – εf) >> kT, the unit term in the denominator can be
neglected and the integral evaluated [substitute $(\varepsilon - \varepsilon_c) = x^2$]. This then gives
$$n_c = \frac{2(2\pi m_e^* kT)^{3/2}}{h^3}\exp[(\varepsilon_f - \varepsilon_c)/kT] \tag{8.36}$$
For obtaining the number of holes in the valence band it is noted that the
probability that a state is not occupied by an electron, is
$$P_h = 1 - \frac{1}{\exp[(\varepsilon - \varepsilon_f)/kT] + 1} = \frac{1}{\exp[(\varepsilon_f - \varepsilon)/kT] + 1} \tag{8.37}$$
In analogy with Eq. (8.34), the number of hole states in the valence band
per unit volume, is taken to be
$$dN_v = \frac{4\pi(2m_h^*)^{3/2}}{h^3}(\varepsilon_v - \varepsilon)^{1/2}\,d\varepsilon \tag{8.38}$$
where mh* is the effective mass of the holes in the valence band and εv is the
highest energy in the valence band. The number of holes in the valence band,
per unit volume, is
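$$n_h = \frac{4\pi(2m_h^*)^{3/2}}{h^3}\int_{-\infty}^{\varepsilon_v}\frac{(\varepsilon_v - \varepsilon)^{1/2}\,d\varepsilon}{\exp[(\varepsilon_f - \varepsilon)/kT] + 1} \tag{8.39}$$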
Assuming that (εf – εv) >> kT, the unit term in the denominator can be
neglected, and this leads to
$$n_h = \frac{2(2\pi m_h^* kT)^{3/2}}{h^3}\exp[(\varepsilon_v - \varepsilon_f)/kT] \tag{8.40}$$
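The quality of the approximation leading from Eq. (8.39) to Eq. (8.40) can be checked numerically. In the sketch below the energies are measured in units of kT, with t = (εv – ε)/kT and η = (εf – εv)/kT; these dimensionless variables, the midpoint-rule integration and the function names are choices made here for illustration. Neglecting the unit term reduces the integral to (π^{1/2}/2) exp(–η).

```python
import math

def hole_integral(eta, x_max=8.0, steps=20000):
    """Midpoint-rule value of the integral of t^(1/2) / (exp(eta + t) + 1) over t >= 0.

    The substitution t = x^2 removes the square-root behaviour at t = 0.
    """
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += 2.0 * x * x / (math.exp(eta + x * x) + 1.0)
    return total * dx

def boltzmann_limit(eta):
    """Same integral with the unit term in the denominator neglected."""
    return 0.5 * math.sqrt(math.pi) * math.exp(-eta)

ratio_far = hole_integral(10.0) / boltzmann_limit(10.0)   # (eps_f - eps_v) = 10 kT
ratio_near = hole_integral(1.0) / boltzmann_limit(1.0)    # (eps_f - eps_v) = 1 kT
print(f"eta = 10: ratio = {ratio_far:.5f};  eta = 1: ratio = {ratio_near:.3f}")
```

For η = 10 the two expressions agree very closely, whereas for η = 1 the simplified form overestimates the hole density by roughly ten per cent, which shows why the condition (εf – εv) >> kT is needed.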
It is interesting to note that
$$\ln\sigma = -\frac{\varepsilon_c - \varepsilon_v}{2k}\,\frac{1}{T} + \frac{3}{2}\ln T + c \tag{8.49}$$
where c is a constant. Here, it has been assumed that the mobilities are
independent of T. Actually, they do vary as a function of temperature. However,
the main variation is due to the 1/T term and a plot of ln σ as a function of 1/T
gives an approximate straight line. The slope of the straight line gives an
estimation of the energy gap (εc – εv) of the semiconductor.
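The procedure of reading the gap off the slope can be tried on synthetic data generated from Eq. (8.49) itself. This is only a sketch: the temperature range of 300 K to 500 K, the gap of 1.1 eV and the function names are assumptions made for illustration, and no measured data are involved.

```python
import math

K_B = 8.617333e-5   # Boltzmann constant in eV/K

def ln_sigma(T, gap_ev, c=0.0):
    """Eq. (8.49) with the gap (eps_c - eps_v) in eV and T in kelvin."""
    return -gap_ev / (2.0 * K_B * T) + 1.5 * math.log(T) + c

def gap_from_slope(temps, ln_sigmas):
    """Least-squares slope of ln(sigma) against 1/T, converted to a gap via -2k * slope."""
    xs = [1.0 / T for T in temps]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ln_sigmas) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ln_sigmas))
             / sum((x - x_mean) ** 2 for x in xs))
    return -2.0 * K_B * slope

temps = [300.0 + 20.0 * i for i in range(11)]             # 300 K to 500 K
estimate = gap_from_slope(temps, [ln_sigma(T, 1.1) for T in temps])
print(f"estimated gap: {estimate:.3f} eV (input 1.1 eV)")
```

The estimate comes out somewhat larger than the input value because the (3/2) ln T term also contributes to the slope, so the straight-line method gives the gap only approximately.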
It may be noted that $m_e^* \approx 0.25\,m$, $m_h^* \approx 0.3\,m$ for Si and $m_e^* \approx m_h^* \approx 0.1\,m$ for
Ge, m being the electron mass. These values imply that at a temperature of
300 K, the carrier concentrations are about $2.3 \times 10^{15}$ m$^{-3}$ for Si and about $10^{18}$
m$^{-3}$ for Ge. The intrinsic conductivity at this temperature has the values of about
$10^{-4}$ (Ω m)$^{-1}$ for Si and about 0.1 (Ω m)$^{-1}$ for Ge.
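These concentrations can be reproduced from Eqs. (8.36) and (8.40). The sketch assumes that in the intrinsic case the electron and hole densities are equal, so that each equals the square root of their product and εf drops out; the gaps of 1.1 eV (Si) and 0.7 eV (Ge) are the values quoted earlier, and the function name is illustrative.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def intrinsic_concentration(gap_ev, me_ratio, mh_ratio, T=300.0):
    """sqrt(n_c * n_h) from Eqs. (8.36) and (8.40), in m^-3."""
    prefactor = 2.0 * (2.0 * math.pi * M_E * K_B * T / H**2) ** 1.5
    return (prefactor * (me_ratio * mh_ratio) ** 0.75
            * math.exp(-gap_ev * EV / (2.0 * K_B * T)))

n_si = intrinsic_concentration(1.1, 0.25, 0.3)
n_ge = intrinsic_concentration(0.7, 0.1, 0.1)
print(f"Si: {n_si:.2e} m^-3,  Ge: {n_ge:.2e} m^-3")
```

Because of the exponential factor the result is very sensitive to the gap and to kT, so agreement with the quoted figures should be expected only to within a modest factor.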
$$n_d = \left[1 - \frac{1}{\exp[(\varepsilon_d - \varepsilon_f)/kT] + 1}\right]N_d \tag{8.50}$$
From the condition that the number of electrons in the conduction band is
equal to the total number of vacancies in the donor levels and the valence band,
one gets
$$c_0(m_e^*T)^{3/2}\exp[(\varepsilon_f - \varepsilon_c)/kT] = c_0(m_h^*T)^{3/2}\exp[(\varepsilon_v - \varepsilon_f)/kT] + \frac{N_d}{\exp[(\varepsilon_f - \varepsilon_d)/kT] + 1} \tag{8.51}$$
where $c_0 = 2(2\pi k)^{3/2}/h^3$. Now εc – εd ≈ 0.01 eV for Ge and about 0.045 eV for Si,
and for the cases of practical interest Nd is of the order of $10^{22}$ m$^{-3}$. So, at ordinary
temperatures, most of the electrons in the conduction band are from the donor
levels. For T → 0, the unit term in the denominator can be neglected giving
$$c_0(m_e^*T)^{3/2}\exp[(\varepsilon_f - \varepsilon_c)/kT] \approx N_d\exp[(\varepsilon_d - \varepsilon_f)/kT] \tag{8.52}$$
which leads to
$$\varepsilon_f = \frac{1}{2}(\varepsilon_d + \varepsilon_c) + \frac{1}{2}kT\ln\frac{N_d}{c_0(m_e^*T)^{3/2}} \tag{8.53}$$
At T = 0, the Fermi level lies halfway between εc and εd.
At room temperature, εf is below εd for the cases of interest and most of the
donor atoms are ionized. In this region the number of vacancies, i.e., rhs of
Eq. (8.51) can be taken to be Nd to get
$$\varepsilon_f = \varepsilon_c + kT\ln\frac{N_d}{c_0(m_e^*T)^{3/2}}, \qquad (\varepsilon_d - \varepsilon_f) \gg kT \tag{8.54}$$
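Eq. (8.54) can be evaluated directly once the effective mass is specified. A minimal sketch, assuming me* ≈ 0.25 m for Si as quoted earlier and a donor concentration of 10^22 m^-3; the function name is hypothetical.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def donor_fermi_offset_ev(n_d, me_ratio, T=300.0):
    """eps_f - eps_c in eV from Eq. (8.54), with all donors assumed ionized."""
    # c0 (me* T)^(3/2) = 2 (2 pi me* k T)^(3/2) / h^3, in m^-3
    n_eff = 2.0 * (2.0 * math.pi * me_ratio * M_E * K_B * T / H**2) ** 1.5
    return (K_B * T / EV) * math.log(n_d / n_eff)

offset = donor_fermi_offset_ev(n_d=1e22, me_ratio=0.25)
print(f"eps_f - eps_c = {offset:.3f} eV")
```

The offset is negative because Nd is much smaller than c0(me*T)^{3/2}, which places the Fermi level below the conduction band edge.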
For example, in the case of Si doped with a donor impurity to the extent of
$10^{22}$ m$^{-3}$, the Fermi energy at 300 K is εf ≈ (εc – 0.15) eV. At higher temperatures,
a detailed analysis of Eq. (8.51) shows that εf tends to the value $\frac{1}{2}(\varepsilon_c + \varepsilon_v)$, i.e.,
the value for the intrinsic semiconductor. The conductivity for n-type
semiconductors is mainly due to the electrons in the conduction band (at not very
high temperatures) and is given by
$$\sigma \approx e\,n_e\,\mu_e \tag{8.55}$$
which leads to
$$\varepsilon_f = \frac{1}{2}(\varepsilon_a + \varepsilon_v) - \frac{1}{2}kT\ln\frac{N_a}{c_0(m_h^*T)^{3/2}} \quad \text{for } T \to 0 \tag{8.58}$$
so that at T = 0, the Fermi level is half way between εa and εv. At room
temperature, essentially all the acceptor levels are occupied and so
$$\varepsilon_f = \varepsilon_v - kT\ln\frac{N_a}{c_0(m_h^*T)^{3/2}}, \qquad (\varepsilon_f - \varepsilon_a) \gg kT \tag{8.59}$$
For Si doped with an acceptor impurity to an extent of $10^{22}$ m$^{-3}$, the Fermi
energy at 300 K is given by εf = (εv + 0.15) eV. At higher temperatures εf tends
to the value of $\frac{1}{2}(\varepsilon_c + \varepsilon_v)$. The conductivity of p-type semiconductors is primarily
due to the holes in the valence band, and therefore one has as in Eq. (8.56),
$$\ln\sigma = -\ln\{\exp[(\varepsilon_a - \varepsilon_f)/kT] + 1\} + c_2 \tag{8.60}$$
where c2 is a constant. Plotted as a function of 1/T, the behaviour of ln σ is
similar to that for n-type semiconductors.
εf for pn Junctions
Junctions between p-type and n-type semiconductors play an important role in
the development of semiconductor devices. A pn junction is a junction at the
microscopic level between a p-type and an n-type semiconductor. Such junctions
are developed by the diffusion of impurity atoms.
Fig. 8.9 The pn junction, (a) before equilibrium, (b) after equilibrium, and
(c) charge density across the boundary.
The Fermi level of an n-type semiconductor is close to εc while that of a
p-type semiconductor is close to εv (Fig. 8.9). Therefore, there are many more
electrons in the conduction band of the n-type semiconductor and many more
holes in the valence band of the p-type semiconductor. As a result, when a pn
junction is formed, electrons diffuse from the n-type to the p-type semiconductor
and occupy the vacant states there. Similarly, the holes diffuse from the p-type
to the n-type semiconductor and allow the electrons to occupy their vacant
states. As a result, there is a narrow depletion region at the boundary where
there are no charge carriers. Instead, there is a thin layer of positive charge on
the n-side (due to positive ions left behind) and a thin layer of negative charge
on the p-side (due to extra electrons occupying the acceptor levels). This double
layer of charges creates a potential difference across the junction which opposes
the flow of electrons from the n-type to p-type and of holes from the p-type to
n-type semiconductor. The flow of electrons and holes stops when the Fermi
energy on the two sides has the same value (see Fig. 8.9). It must be appreciated
that the shifting of the Fermi energy levels is due to the electric potential across
the junction and that the relative positions of the various energy levels on the
two sides, remain unchanged. The potential difference across the boundary is
equal to the difference in the Fermi levels of the separate n-type and p-type
semiconductors, and is given by
$$V_0 = \frac{1}{2}(\varepsilon_c - \varepsilon_v + \varepsilon_d - \varepsilon_a) + \frac{1}{2}kT\ln\frac{N_d N_a}{c_0^2(m_e^* m_h^* T^2)^{3/2}} \quad \text{for } T \to 0 \tag{8.61}$$
and
$$V_0 = \varepsilon_c - \varepsilon_v + kT\ln\frac{N_d N_a}{c_0^2(m_e^* m_h^* T^2)^{3/2}} \tag{8.62}$$
at room temperature.
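As an illustration of Eq. (8.62), the sketch below evaluates the potential difference (expressed as an energy in eV) for a Si junction. The doping levels Nd = Na = 10^22 m^-3, the gap of 1.1 eV and the effective masses 0.25 m and 0.3 m are the values quoted earlier in this section; the function names are hypothetical.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def effective_density(mass_ratio, T):
    """c0 (m* T)^(3/2) = 2 (2 pi m* k T)^(3/2) / h^3, in m^-3."""
    return 2.0 * (2.0 * math.pi * mass_ratio * M_E * K_B * T / H**2) ** 1.5

def contact_potential_ev(gap_ev, n_d, n_a, me_ratio, mh_ratio, T=300.0):
    """Eq. (8.62) in eV; the product of the two densities equals c0^2 (me* mh* T^2)^(3/2)."""
    kt_ev = K_B * T / EV
    product = effective_density(me_ratio, T) * effective_density(mh_ratio, T)
    return gap_ev + kt_ev * math.log(n_d * n_a / product)

v0 = contact_potential_ev(1.1, 1e22, 1e22, 0.25, 0.3)
print(f"V0 = {v0:.2f} eV")
```

The logarithm is negative for these doping levels, so the potential difference is somewhat smaller than the gap; heavier doping moves it closer to εc – εv.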
The width of the depletion region can be estimated by the following model
calculation. It is assumed that there is a width of xp in the p-type and xn in the
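n-type of semiconductor. Using Maxwell’s equation $\nabla\cdot(\kappa\varepsilon_0\mathbf{E}) = \rho$, we get
$$\kappa\varepsilon_0 E = \rho x + c \tag{8.63}$$
$$V_0 = -\frac{e}{\kappa\varepsilon_0}\left[\int_0^{x_n} n_e(x - x_n)\,dx - \int_{-x_p}^{0} n_h(x + x_p)\,dx\right] = \frac{e}{2\kappa\varepsilon_0}\left[n_e x_n^2 + n_h x_p^2\right] \tag{8.64}$$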
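The condition of overall neutrality, $n_e x_n = n_h x_p$, gives
$$V_0 = \frac{e}{2\kappa\varepsilon_0}\,\frac{n_e n_h}{n_e + n_h}\,(x_n + x_p)^2 \tag{8.66}$$
$$V = \frac{Q^2}{2\kappa\varepsilon_0 e}\,\frac{n_e + n_h}{n_e n_h} \tag{8.67}$$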
From this, the variable capacitance per unit area, is
$$C = \frac{dQ}{dV} = \frac{1}{2V^{1/2}}\left[\frac{2\kappa\varepsilon_0 e\,n_e n_h}{n_e + n_h}\right]^{1/2} \tag{8.68}$$
The variation of C as $V^{-1/2}$ is the basis of the variable capacitance diodes
(varactors) which are used in frequency locking and frequency modulation
circuits.
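Eqs. (8.66) and (8.68) give the width of the depletion region and the capacitance per unit area once the charge densities are specified. The numbers in this sketch are assumptions chosen for illustration: a relative permittivity κ = 11.7 (a typical value for Si, not given in the text), charge densities ne = nh = 10^22 m^-3 and a potential difference of 0.8 V. The function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C

def depletion_width(v0, n_e, n_h, kappa):
    """Total width x_n + x_p in metres, from Eq. (8.66)."""
    return math.sqrt(2.0 * kappa * EPS0 * v0 * (n_e + n_h) / (E_CHARGE * n_e * n_h))

def capacitance_per_area(v, n_e, n_h, kappa):
    """Capacitance per unit area in F/m^2, from Eq. (8.68)."""
    return 0.5 / math.sqrt(v) * math.sqrt(
        2.0 * kappa * EPS0 * E_CHARGE * n_e * n_h / (n_e + n_h))

KAPPA = 11.7                # assumed relative permittivity of Si
width = depletion_width(0.8, 1e22, 1e22, KAPPA)
cap = capacitance_per_area(0.8, 1e22, 1e22, KAPPA)
cap_4v = capacitance_per_area(3.2, 1e22, 1e22, KAPPA)   # four times the voltage
print(f"width = {width * 1e6:.2f} micrometres, C = {cap * 1e3:.3f} mF/m^2")
```

Two checks follow from the formulas: the capacitance equals κε0 divided by the width, as for a parallel-plate capacitor with plate separation xn + xp, and quadrupling the voltage halves the capacitance.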
Semiconductor Diodes
The pn junction can be used as a rectifier, a voltage stabilizer and in high
frequency circuits.
Consider again the pn junction discussed in Sec. 8.4. Though the net current
across the junction is zero, the electrons and the holes diffuse across the boundary
but the flow in each direction is the same. The densities of electrons and holes
in the corresponding states (similarly located with respect to εc or εv) on the two
sides are related by Boltzmann statistics,
$$\frac{n_e(p)}{n_e(n)} = \exp(-eV_0/kT) \tag{8.69}$$
$$\frac{n_h(n)}{n_h(p)} = \exp(-eV_0/kT) \tag{8.70}$$
where V0 is the potential difference across the junction. It follows from these
relations that the product nenh, of the total number of electrons and holes, has
the same value on the two sides.
To be specific, consider the flow of electrons across the boundary. Since
the motion of electrons from p to n is ‘downhill’, the rate of electron flow from
p to n is
$$I_e(p \to n) = c_1 n_e(p) \tag{8.71}$$
where ne(p) is the total number of electrons in the conduction band on the
p-side. On the other hand the electrons flowing from n to p face an ‘up-hill’
potential of V0 and only those which have an energy of eV0 will be able to cross
the boundary. The number of such electrons is proportional to ne(n) exp(–eV0/kT),
so that
$$I_e(n \to p) = c_2 n_e(n)\exp(-eV_0/kT) \tag{8.72}$$
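When a bias V is applied, the barrier in Eq. (8.72) changes while the ‘downhill’ flow of Eq. (8.71) does not, so the two flows no longer cancel. The sketch below assumes the standard ideal-diode form I = I0 [exp(eV/kT) – 1] for the resulting net current; this form, the sign convention (positive V for forward bias) and the function name are assumptions made here, and the curve corresponds to plotting I/I0 against eV/kT as in Fig. 8.10(b).

```python
import math

def diode_current_ratio(ev_over_kt):
    """I / I0 for the assumed ideal-diode law, as a function of eV/kT.

    Positive arguments correspond to forward bias, negative to reverse bias.
    """
    return math.exp(ev_over_kt) - 1.0

forward = diode_current_ratio(5.0)     # forward bias of 5 kT/e
reverse = diode_current_ratio(-10.0)   # reverse bias of 10 kT/e
print(f"forward: {forward:.1f} I0, reverse: {reverse:.5f} I0")
```

The strong asymmetry between the two directions, exponential growth in forward bias and saturation at –I0 in reverse bias, is what makes the junction act as a rectifier.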
Fig. 8.10. Rectification by a diode, (a) potential difference across the boundary,
(b) current as a function of eV/kT, V being the applied potential.
If there is reverse bias $|V| \gtrsim |V_B|$ (see Fig. 8.10), Eq. (8.74) for the current is
no longer valid. For |V| > |VB |, a very rapid increase in the current is observed
(Fig. 8.10). There are two reasons for this increase: (i) the large field at the
junction speeds up the few electrons in the p-region near the junction to such
high velocities that they knock out some of the valence electrons into the
conduction band. This process continues repeatedly and a large current is quickly
built-up. (ii) If the potential difference across the boundary is sufficiently large,
the conduction band on the n-side will overlap the valence band on the p-side
(see Fig. 8.9). In this case, it was suggested by Zender that the electrons in the
valence band on the p-side will tunnel across the boundary into the conduction
band on the n-side. The diodes based on these two effects are known as avalanche
diodes or Zener diodes. They are very useful in voltage stabilization circuits.
If the impurity concentration is very high (of the order of one part in a
thousand), the Fermi energy may move into the valence band on the p-side and
into the conduction band on the n-side [Fig. 8.11(a)]. In this case, there will be
vacant levels above εf in the valence band on the p-side and electrons below εf
in the conduction band on the n-side. When a reverse bias is applied, many
electrons will move from the valence band on the p-side into the conduction
band on the n-side, giving rise to a large current [Fig. 8.11(b)]. For a small
forward bias, electrons from the conduction band on the n-side can move not
only into the conduction band on the p-side, but also tunnel into some of the
vacant levels in the valence band on the p-side. This again gives rise to a large
current. On the other hand, if the forward bias is quite large, there will no longer
be any overlap of the valence band on the p-side and the conduction band on the
n-side. The resulting current, which is due to electrons moving across the potential
barrier from the conduction band on the n-side to the conduction band on the p-side,
actually shows a decrease [Fig. 8.11(c)]. For still higher potentials, the current
will begin to increase again, as in the case of the ordinary diode. The main
characteristic of these tunnel or Esaki diodes is the negative-resistance section,
which is used in high-frequency oscillator circuits in the microwave region.
Fig. 8.11. The characteristics of the tunnel or Esaki diode: (a) energy bands without
bias, (b) energy bands with reverse bias, (c) energy bands with forward bias,
(d) current as a function of V showing negative resistance between C and D.
Transistor
An important application of semiconductor junctions is the transistor. It consists
of two semiconductor junctions close together, which serve as an amplifier of
current or voltage.
To be specific, consider a pnp transistor (similarly one can have an npn
transistor) which consists of three regions (Fig. 8.12). The first region is the
emitter, the small, narrow, middle region is the base, and the third region is the
collector. The equilibrium potential consists of a potential barrier in the n-region.
If now the base is connected to a small negative potential Vb (the common
emitter may be taken to be at Ve = 0), and the collector to a fairly large negative
potential Vc, the potential across the two junctions is modified [Fig. 8.12(b)].
There is a forward bias across the emitter-base junction for the holes, and the
current across the junction is [Eq. (8.74)]
I = I0 [exp (eVb/kT) – 1] (8.75)
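The asymmetry between forward and reverse bias implied by Eq. (8.75) can be made concrete with a short numerical sketch. The function name, the temperature of 300 K and the bias of ±0.1 V are illustrative assumptions, not values taken from the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C

def diode_current_ratio(v_volts, temperature=300.0):
    """Return I/I0 = exp(eV/kT) - 1, with forward bias taken as positive V."""
    return math.expm1(E_CHARGE * v_volts / (K_B * temperature))

# Assumed illustrative biases of +0.1 V and -0.1 V at 300 K.
forward = diode_current_ratio(0.1)
reverse = diode_current_ratio(-0.1)
print(f"I/I0 forward: {forward:.3g}, reverse: {reverse:.3g}")
```

Under these assumptions the forward current is tens of times I0, while the reverse current saturates close to –I0, which is the rectifying behaviour sketched in Fig. 8.10(b).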
Since the base is very narrow (less than $10^{-3}$ cm in width), most of the holes
that enter the base from the emitter roll down into the collector, though a few
of them will be annihilated by the electrons in the base. Thus, a major part of
the emitter current flows through the collector while only a small fraction of it
flows out from the base. In a general way, the collector current is controlled by
the changes in the barrier introduced by Vb, and the changes in the base current
Ib are amplified into the changes in the collector current Ic. The amplification in
the current is estimated by
$$\beta = I_c/I_b = \frac{\text{hole lifetime}}{\text{base transit time}} \tag{8.76}$$
(holes annihilated in the base contribute to Ib), which in practice has a value of
about 100.
Fig. 8.12. Transistors: (a) common emitter circuit, (b) the potential
distribution, (c) common base circuit.
In the analysis so far the motion of only holes has been considered. Regarding
the electrons, there is hardly any electron flow from the collector to the base.
The flow of electrons from the base to the emitter is minimized by having a
relatively small amount of donor doping in the base. Thus the currents in
the pnp transistor, are mainly due to the motion of the holes. For an npn transistor,
the analysis is similar except that the signs of all the potentials are opposite and
the current in this case is due to the flow of the electrons.
The pnp transistor can also be used as a voltage amplifier by having a
common base [see Fig. 8.12(c)]. In this arrangement, the input potential across
Rin is amplified into the output potential across Rout. Arguments similar to those
given above imply that the emitter current Ie is approximately equal to the
collector current Ic. Therefore, the voltages across Rin and Rout are given by
$$V_{\mathrm{in}} = I_e R_{\mathrm{in}} \approx I_c R_{\mathrm{in}}, \qquad V_{\mathrm{out}} = I_c R_{\mathrm{out}} \tag{8.77}$$
It therefore follows that
$$\frac{V_{\mathrm{out}}}{V_{\mathrm{in}}} \approx \frac{R_{\mathrm{out}}}{R_{\mathrm{in}}} \tag{8.78}$$
Since Rout/Rin is usually quite large, voltage gains of the order of 500 are
quite usual. This arrangement amplifies both voltage and power.
Photodiodes
A photodiode is a pn junction used to convert radiation energy into an electric
current. Consider a photon of frequency v
about $10^{-8}$ s) and efficiently, the junction is subjected to a reverse bias so that
the carriers at the junction are subjected to a large potential difference. The pn
junction is also used as a solid-state detector for measuring the energy of other
particles such as protons and electrons, the only difference in this case being
that these particles are not absorbed but come to rest after creating several
electron-hole pairs.
In a silicon solar cell [Fig. 8.13(a)] there is a very thin, large surface of
n-type silicon, forming a junction with a large volume of p-type silicon. The
surface is made large so as to collect a substantial amount of radiation, and thin,
about $10^{-6}$ m, so that the majority of the carriers generated by the photons can
diffuse to the junction before recombining. The surface is coated with an
anti-reflection coating to increase the efficiency which can reach a value of
about 16% in silicon solar cells (efficiency is the ratio of the electrical energy
output to the radiation energy input). When the surface is exposed, the holes
move across the junction to the p-side and the photodiode becomes an energy
cell with the n-side being at a negative potential and the p-side being at a positive
potential.
[Fig. 8.13, panels (a) and (b): n-type layer on p-type silicon, incident radiation hν, load R. A further diagram is labelled: laser beam, junction plane, polished surface.]
saw. The surface of the slice is polished mechanically and chemically, to give
what is referred to as the substrate.
In the next step, a mixture of silicon tetrachloride, hydrogen and phosphine
(PH3) is passed over the substrate at about 1200°C. The silicon released by the
reduction of silicon tetrachloride, along with a suitable amount of phosphorus
(from the PH3), crystallizes on the substrate surface forming what is known as
an n-type epitaxial layer (for producing p-type epitaxial layer PH3 is replaced
by diborane, B2H6).
The surface of the epitaxial layer is oxidized by heating the substrate to
about 1100°C in steam or oxygen so as to produce a thin layer of SiO2 [Fig.
8.15(a)]. The oxide surface is coated with a photosensitive material called the
photoresist. An area of the surface is covered with a photographic mask and the
rest of the surface is exposed to ultraviolet radiation. The photoresist is then
developed and the unexposed area is washed off. The oxide in this area is
removed by immersing in hydrofluoric acid and then the exposed photoresist is
removed. This procedure is known as window opening and it effectively removes
SiO2 from specified areas.
[Fig. 8.15, panels (a) to (d): labels include SiO2, boron, phosphorus, n-type epitaxial layer, n-type substrate, emitter, base and n-collector.]
convert part of the p-region back to n-type to form the emitter [Fig. 8.15(c)].
The surface is again oxidized and two new windows are opened to expose the
emitter and base regions and a metal (usually aluminium) is evaporated into
those windows forming electrical contact with these regions. The contact with
the original epi-layer which forms the collector, can be made through the
substrate [Fig. 8.15(d)].
Other devices can be produced by the variations of the essential steps
involved in the development of the npn transistor described.
Amorphous Semiconductors
There are many amorphous substances which have significant electrical
conductivity. They are known as amorphous semiconductors. In these materials,
the conduction is by electrons. They differ from the crystalline semiconductors
in that while they have short-range order, long-range order is absent in them.
This can be illustrated by amorphous Ge. In this case, though each atom is
surrounded by four nearest neighbours, the location of the second-nearest
neighbour is not unique. In the amorphous semiconductors, the different possible
locations of farther-away neighbours are almost randomly filled leading to
disorder at long range. The effect of this long-range disorder is not significant
for energy levels deep inside an energy band but is important for those near the
edges, e.g., those near the top of the valence band and bottom of the conduction
band. It leads to narrowing of the energy gap as compared with the gap in
crystalline semiconductors. Some amorphous semiconductors are Ge, Si, Se,
As2Se3, etc.
Amorphous semiconductors are used in switching and memory components.
They are also used in xerographic processes. Here, typically a thin film of
amorphous selenium is deposited on a metallic substrate, usually Al. It is charged
electrically by means of a discharge. When a pattern of light to be copied falls
on this, the lighted areas become photoconductive and discharge their charge
whereas the dark areas retain their charge. A finely-powdered pigment is sprayed
on the surface. It is retained by the charged areas and then transferred to a sheet
of paper.
Diamagnetism
When an atom is subjected to a magnetic field, the changing magnetic flux
induces currents (via the electron orbits) which, as per Lenz’s law, oppose the
change in flux. The currents persist, and have a magnetic moment which is
opposite in sign to the magnetic field intensity. The associated magnetic
susceptibility is negative and the property is known as diamagnetism.
Diamagnetism is present in all substances but is usually obscured by the larger
effects due to permanent magnetic dipole moments of the atoms.
Essentially, diamagnetism is the consequence of the term in the Hamiltonian which is
quadratic in B,
$$H_d = \frac{e^2}{8m}\sum_i (\mathbf{r}_i \times \mathbf{B})^2 = \frac{e^2}{8m}\sum_i (r_{i\perp})^2 B^2 \tag{8.87}$$
where the summation is over all the electrons and $r_{i\perp}$ is the component of $\mathbf{r}_i$ perpendicular to B. In perturbation theory, the energy
due to this term is the average value
$$E = \frac{e^2}{8m} B^2 \sum_i \langle (r_{i\perp})^2 \rangle \tag{8.88}$$
The magnetic moment in this case is defined by
$$m_0 = -\frac{\partial E}{\partial B} = -\frac{e^2}{4m} B \sum_i \langle (r_{i\perp})^2 \rangle \tag{8.89}$$
The diamagnetic susceptibility, therefore, is
$$\chi = -\frac{e^2 \mu_0 N}{4m} \sum_i \langle (r_{i\perp})^2 \rangle \tag{8.90}$$
where N is the number of atoms per unit volume. For the evaluation of this
quantity, an approximate value for $\langle (r_{i\perp})^2 \rangle$ is normally used. For a typical value
of $r^2 \sim 10^{-20}$ m², the molar susceptibility is
$$\chi_m \sim 5 \times 10^{-8}\ \text{per kg mol, in MKS units} \tag{8.91}$$
(multiply by $10^3/4\pi$ to get the value in Gaussian units of per g mole) which
Free-Electron Paramagnetism
Free-electron paramagnetism in metals arises from the intrinsic magnetic
moment associated with the spin of the electron. In the absence of any magnetic
field, there is no preferred orientation of these magnetic moments. However, in
the presence of a magnetic field, the energies of the electron are perturbed by an
additional interaction
$$H = -\left(-\frac{e}{m}\,\mathbf{s}\right)\cdot\mathbf{B} \tag{8.92}$$
and the resulting energy eigenvalues are
$$\varepsilon' = \varepsilon \pm \frac{e\hbar}{2m} B \tag{8.93}$$
ε being the unperturbed energy. The net magnetic moment is obtained by using
the Fermi-Dirac distribution:
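$$M = \frac{2\pi V (2m)^{3/2}}{h^3} \int \left[ \frac{-e\hbar/2m}{1 + \exp\left[\left(\varepsilon + \frac{e\hbar}{2m}B - \varepsilon_f\right)/kT\right]} + \frac{e\hbar/2m}{1 + \exp\left[\left(\varepsilon - \frac{e\hbar}{2m}B - \varepsilon_f\right)/kT\right]} \right] \varepsilon^{1/2}\, d\varepsilon \tag{8.94}$$
For T → 0, this expression reduces to
$$M = \frac{2\pi V (2m)^{3/2}}{h^3}\,\frac{e\hbar}{2m} \int_{\varepsilon_f - e\hbar B/2m}^{\varepsilon_f + e\hbar B/2m} \varepsilon^{1/2}\, d\varepsilon \approx \frac{3}{2}\left(\frac{e\hbar}{2m}\right)^2 \frac{N B}{\varepsilon_f(0)} \tag{8.95}$$
where Eq. (7.89) has been used. The susceptibility therefore is positive and
given by
$$\chi \approx \frac{3}{2}\left(\frac{e\hbar}{2m}\right)^2 \frac{N \mu_0}{\varepsilon_f(0)} \tag{8.96}$$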
This is generally quite small and for sodium [εf(0) ~ 3.1 eV] the susceptibility
per unit mass is $8.3 \times 10^{-9}$ kg⁻¹ in MKS units or $6.6 \times 10^{-7}$ g⁻¹ in Gaussian units.
On including the corrections due to exchange correlation and effective mass,
the value is $8.8 \times 10^{-7}$ g⁻¹ in Gaussian units, which should be compared with the
experimental spin susceptibility of $9.8 \times 10^{-7}$ g⁻¹. For obtaining the bulk
susceptibility, the diamagnetic susceptibility due to the free electrons and the
ions should also be included.
At finite temperature, there is a slight dependence of χ on T which, for all
practical purposes, may be neglected.
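As a rough numerical check of Eq. (8.96) for sodium, the following sketch evaluates the susceptibility per unit mass. It assumes one free electron per atom and a molar mass of 22.99 g/mol (neither is stated in the text), uses the value εf(0) ~ 3.1 eV quoted above, and takes the Bohr magneton eħ/2m from standard tables.

```python
import math

MU_0 = 4e-7 * math.pi        # vacuum permeability, T m/A
MU_B = 9.2740100783e-24      # Bohr magneton e*hbar/2m, J/T
EV = 1.602176634e-19         # joules per electron volt
N_A = 6.02214076e23          # Avogadro constant, 1/mol

def pauli_susceptibility(n_electrons, fermi_energy_joule):
    """Eq. (8.96): chi ~ (3/2) (e hbar / 2m)^2 N mu_0 / eps_f(0)."""
    return 1.5 * MU_B**2 * n_electrons * MU_0 / fermi_energy_joule

# Assumptions: one conduction electron per Na atom, molar mass 22.99 g/mol.
electrons_per_kg = N_A / 22.99e-3
chi_mass_si = pauli_susceptibility(electrons_per_kg, 3.1 * EV)
# Conversion factor 10^3 / 4 pi to Gaussian units, as used in the text.
chi_mass_gaussian = chi_mass_si * 1e3 / (4 * math.pi)
print(f"{chi_mass_si:.2e} per kg (MKS), {chi_mass_gaussian:.2e} per g (Gaussian)")
```

With these inputs the result comes out within a few percent of the figures quoted above; the small difference depends on the assumed electron density and constants.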
Paramagnetism
Atoms, ions and compounds with unpaired electrons (this is the case if the
number of electrons is odd and also for some systems with even number of
electrons), have a nonzero magnetic moment. In the presence of a magnetic
field, they align with the magnetic field and produce a net, macroscopic magnetic
moment, giving rise to paramagnetism. Since the atoms (most of the subsequent
discussion applies to ions and molecules as well) are localized, Boltzmann
distribution can be used for the electron states. This gives rise to a temperature-
dependent susceptibility.
As was discussed in Sec. 6.2, the energy due to the interaction of an atom
with a magnetic field is
$$\Delta E = g\,\frac{e\hbar}{2m}\, M_J B, \qquad M_J = -J, -J+1, \ldots, J \tag{8.97}$$
where g is the Landé g-factor [Eq. (6.13)] and MJ is the z-component of the total
angular momentum. Using the Boltzmann distribution for the populations, the
magnetic moment per unit volume is
$$M = -N g\,\frac{e\hbar}{2m}\, \frac{\sum_{M_J=-J}^{J} M_J \exp(-aM_J/kT)}{\sum_{M_J=-J}^{J} \exp(-aM_J/kT)} \tag{8.98}$$
where $a = \frac{e\hbar}{2m} gB$ and N is the number of particles per unit volume. The
summations can be carried out to yield
$$M = N\,\frac{e\hbar g}{2m} \left.\frac{d}{dx}\ln f(x)\right|_{x = a/kT}, \qquad f(x) = \frac{\exp\left[\left(J+\tfrac{1}{2}\right)x\right] - \exp\left[-\left(J+\tfrac{1}{2}\right)x\right]}{e^{x/2} - e^{-x/2}} \tag{8.99}$$
For $\frac{e\hbar}{2m} B \ll kT$, the expression leads to a susceptibility
$$\chi = M/H = N\mu_0 \left(\frac{e\hbar g}{2m}\right)^2 \frac{J(J+1)}{3kT} \tag{8.100}$$
It is observed that the susceptibility is inversely proportional to temperature.
This is stated in the form
χ = C/T (8.101)
known as Curie’s law, where C is called the Curie constant.
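The passage from Eq. (8.99) to Curie's law rests on the small-x behaviour of d ln f(x)/dx. The sketch below, with hypothetical function names and an arbitrarily chosen J = 5/2, checks numerically that this derivative approaches J(J + 1)x/3 for small x and saturates at J for large x.

```python
import math

def dlnf_dx(x, j):
    """d/dx ln f(x) for f(x) = sinh((J + 1/2) x) / sinh(x / 2), as in Eq. (8.99)."""
    a = j + 0.5
    return a / math.tanh(a * x) - 0.5 / math.tanh(0.5 * x)

def curie_limit(x, j):
    """Small-x limit J (J + 1) x / 3 that underlies Eq. (8.100)."""
    return j * (j + 1) * x / 3.0

J_EXAMPLE = 2.5   # assumed value, for illustration only
for x in (0.01, 0.1, 1.0, 50.0):
    print(x, dlnf_dx(x, J_EXAMPLE), curie_limit(x, J_EXAMPLE))
```

The two columns agree closely when x = a/kT is small (the Curie regime) and part company as x grows, where the magnetization levels off at its saturation value.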
At T ≈ 300 K, the molar susceptibility χm is of the order of $5 \times 10^{-7}$ per kg mol in
MKS units ($4 \times 10^{-5}$ per g mol in Gaussian units), which is rather small, but it
becomes much larger at low temperatures. The predictions of Eq. (8.100) with
the J values given by Hund’s rule (the ground state has the largest S allowed by the
Pauli principle, the maximum L consistent with this S, and J = L + S when the
shell is more than half full and J = |L – S| otherwise) are generally in good
agreement with the experimental observations for many paramagnetic crystals,
e.g., rare earth ions, where in some cases the effect of the nearby states has to be
included.
The predictions of Eq. (8.100) are not in good agreement with experimental
observations for the ions of the iron group. The reason for this is that the partially
filled 3d shell for these ions is the outermost shell and is exposed to the strong
field due to the neighbouring ions in the crystal. This field, called the crystal
field, breaks the rotational symmetry, and the total angular momentum is no
longer a ‘good’ quantum number. Furthermore, the average value of Lz may
reduce to zero. This effect is known as the quenching of the orbital angular
momentum and implies that Eq. (8.97) should be replaced by
$$\Delta E = g\,\frac{e\hbar}{2m}\, M_S B \quad (g = 2 \text{ for spin}) \tag{8.102}$$
This leads to
$$\chi = N\mu_0 \left(\frac{e\hbar g}{2m}\right)^2 \frac{S(S+1)}{3kT}, \qquad \frac{e\hbar}{m} B \ll kT \tag{8.103}$$
for the ions of the iron group. As an example, for Mn³⁺ ($^5D_0$),
Eq. (8.100) predicts that χ = 0, whereas the prediction of Eq. (8.103) with S = 2
is in very good agreement with experimental observations.
It may be noted that adiabatic demagnetization of a paramagnetic system can
be used for attaining low temperatures, T < 1 K. This is done as follows. A
magnetic field is applied to a paramagnetic substance in good thermal contact
with the surroundings at T1. The field aligns the magnetic moments along the
direction of the field. This increase in order is equivalent to a decrease in the
entropy and hence heat flows out of the system. If now the substance is insulated,
and the field removed adiabatically, the spins gradually get out of alignment
by absorbing energy from the lattice vibrations, which leads to a lowering of the
temperature of the paramagnetic substance. Temperatures of the order of $10^{-3}$ K
have been reached by this method.
Ferromagnetism
Ferromagnetism is the phenomenon in which some materials like iron, cobalt,
nickel, and some of their alloys behave like ordinary paramagnets at high
temperatures but which below a critical temperature known as the Curie
temperature Tc, acquire a nonzero magnetic moment even in the absence of an
applied magnetic field. This is due to the interaction between the magnetic
ions, which is strong enough to align their magnetic moments against the disorder
introduced by thermal effects.
The interaction that aligns the magnetic moments is quantum mechanical
in origin and is due to the exchange properties of the electron wave functions.
When the wave functions of two atoms overlap, the electrons being
indistinguishable, belong to both the atoms. In such cases, the symmetry or the
antisymmetry of the wave functions will strongly influence the energy of the
system (as in the case of covalent bonding, see Chapter 5). In particular, it is the
exchange symmetry between the spins and the extent of the overlap of the wave
functions that determines the nature and the strength of the exchange interaction.
It is reasonable to represent the energy from the exchange interaction by
$$E = -\sum_{i,j} J_{ij}\, \mathbf{S}_i \cdot \mathbf{S}_j, \qquad i \neq j \tag{8.104}$$
where Si is the spin of the i-th atom, and Jij are symmetric constants. If the
magnetic moment is assumed to be due to spin alone, Mi = b Si, as is the case
for the iron group, the interaction energy of the i-th atom can be written as
Ei = – Mi ⋅ Bint (8.105)
where Bint is given by
$$\mathbf{B}_{int} = \frac{1}{b^2} \sum_j J_{ij}\, \mathbf{M}_j \ (i \neq j) = \lambda \mathbf{M} \tag{8.106}$$
i.e., the effective internal field is proportional to an average magnetic moment M.
It is this field, known as the Weiss field, which is responsible for the alignment
of the spins. In the case of ferromagnetic substances, Jij are quite large and λ is
positive, which gives rise to ferromagnetism.
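Consider the behaviour of ferromagnets above the Curie temperature Tc.
Writing
B = B0 + λM (8.107)
where B0 is the applied field, one gets from Eq. (8.99),
$$M = N \left(\frac{e\hbar g}{2m}\right)^2 \frac{J(J+1)}{3kT}\,(B_0 + \lambda M), \qquad \frac{e\hbar}{2m} B \ll kT \tag{8.108}$$
This leads to
$$M = \frac{C B_0/\mu_0}{T - T_c} \tag{8.109}$$
$$\chi = \frac{C}{T - T_c} \tag{8.110}$$
with
$$T_c = N\lambda \left(\frac{e\hbar g}{2m}\right)^2 \frac{J(J+1)}{3k}, \qquad C = \frac{\mu_0 T_c}{\lambda} \tag{8.111}$$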
The expression in Eq. (8.110) is known as the Curie-Weiss law and Tc is
known as the Curie temperature. The behaviour of χ given in Eq. (8.110) is
valid for T > Tc. At T = Tc, χ becomes infinite. Since M is finite, this implies that
M is nonzero even when B0 = 0, i.e., spontaneous magnetization exists. The
Curie temperature is about 1043 K for Fe, 1400 K for Co, and 631 K for Ni.
$$a = \frac{e\hbar g}{2m}\,(B_0 + \lambda M) \tag{8.112}$$
The solutions are obtained by plotting M in Eq. (8.99) as a function of x,
and also
$$M = \frac{2mkT}{e\hbar g \lambda}\, x \tag{8.113}$$
obtained from Eq. (8.112) with a = xkT, B0 = 0, and looking for the intersection
of the two curves [see Fig. 8.16(a); it can be shown that the intersection at the
origin gives an unstable solution]. At T = Tc the curve given by Eq. (8.113) is
tangential to the curve given by Eq. (8.99), at the origin, and there is no
spontaneous magnetization for T > Tc. When T < Tc, there are two equal and
opposite solutions for each T, for example corresponding to points A and A′ in
Fig. 8.16(a). One set of these spontaneous magnetizations is plotted as a function
of T (T < Tc) in Fig. 8.16(b).
For B0 ≠ 0, the magnetization is obtained from the intersection of the curve
given by Eq. (8.99) and
$$M = \frac{2mkT}{e\hbar g \lambda}\, x - \frac{1}{\lambda} B_0 \tag{8.114}$$
obtained from Eq. (8.112) with a = xkT. There are two solutions for
M corresponding to intersections at D and D′ in Fig. 8.16(a), for each B0 (T <
Tc). These solutions trace the boundary of the hysteresis curve [Fig. 8.16(c)]. It
can be shown that the third solution corresponding to intersection F in Fig.
8.16(a) is unstable. This solution is the extension of the unstable solution at the
origin for B0 = 0.
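The graphical construction for B0 = 0 can be mimicked numerically. For the special case J = 1/2 (an assumption made here for simplicity), Eqs. (8.99) and (8.113) combine into the single condition m = tanh(m Tc/T), where m is the magnetization divided by its saturation value. The sketch below finds the non-trivial root by bisection; the function name and tolerance are hypothetical.

```python
import math

def reduced_magnetization(t, tol=1e-12):
    """Solve m = tanh(m / t) for the non-negative stable root, with t = T/Tc > 0.

    Assumes J = 1/2 and B0 = 0; bisection on g(m) = tanh(m / t) - m.
    """
    if t >= 1.0:
        return 0.0              # above Tc only the solution m = 0 exists
    lo, hi = 1e-9, 1.0          # g(lo) > 0 and g(hi) < 0 for 0 < t < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.tanh(mid / t) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for t in (0.2, 0.5, 0.8, 0.95, 1.2):
    print(t, reduced_magnetization(t))
```

The values fall from nearly 1 at low T/Tc to zero at T = Tc, which is the shape of the spontaneous magnetization curve described for Fig. 8.16(b). The root with opposite sign corresponds to the second intersection A′.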
At T = 0, all the spins and the magnetic moments of the atoms are aligned
corresponding to the ground state of the system. The direction of the alignment
is introduced by the arbitrarily assumed direction of the internal field Bint, and
obviously the ground state is infinitely degenerate.
For T > 0 K, some of the spins go out of alignment. As in the case of lattice
vibrations, the disturbances are correlated, and misalignments travel as waves
known as spin waves. In analogy with photons and phonons, the excitations of
spin waves are quantized into quanta known as magnons. Magnons, which obey
Bose-Einstein statistics, play a significant role in determining the behaviour of
M(T ) at low temperatures, and contribute to the specific heat and thermal
conductivity of ferromagnets.
[Fig. 8.16: (a) M as a function of x, showing the curve M = (Neħg/2m) f′(x)/f(x) and the straight line M = (2mkT/eħgλ)x – B0/λ, with intersections A, A′, D, D′ and F; (b) M(T)/M(0) as a function of T/Tc; (c) M as a function of H = B0/μ0.]
and finally all the domains will have magnetization along a preferred or an
‘easy’ direction (determined by the crystal structure) nearly parallel to the
direction of the external field. A further increase in the field brings the alignment
closer to the direction of the external field. The progress of magnetization as
the external field increases is described by the broken line in Fig. 8.16(c). If the
external field is now removed, some of the magnetization is retained [point A in
Fig. 8.16(c)]. The variation of magnetization with the external magnetic field
now produces the well-known hysteresis curve, similar to that in Fig. 8.16(c)
except that usually B is plotted against H. The magnetization can be destroyed
either by heating or by mechanical shocks. The reality of the domains, which
are so successful in describing ferromagnets, can be demonstrated by scattering
finely divided iron on the surface of a ferromagnet, which collects along the
domain boundaries where the field is the strongest.
$$M_A = \frac{C_A}{T}\,(B_0 - \lambda_A M_B), \qquad M_B = \frac{C_B}{T}\,(B_0 - \lambda_B M_A) \tag{8.116}$$
where
$$C_{A,B} = N_{A,B} \left(\frac{e\hbar g_{A,B}}{2m}\right)^2 \frac{J_{A,B}(J_{A,B}+1)}{3k} \tag{8.117}$$
If the atoms A and B are magnetically equivalent, then
$$\chi = (M_A + M_B)/H = \frac{2\mu_0 C_A}{T + T_c}, \qquad T > T_c \tag{8.118}$$
with Tc = λACA, λA = λB, CA = CB, where Tc is the Néel temperature TN. At T = Tc,
an additional solution to Eq. (8.116) exists, MA = – MB even for B0 = 0. Thus, the
sets of atoms A and B are spontaneously magnetized for T < Tc though the net
magnetization is zero. Such materials are known as antiferromagnets. In
antiferromagnets, the crystal field aligns the magnetic moments along a preferred
direction below Tc. It can be shown that a field applied perpendicular to this
direction is associated with a susceptibility χ⊥ which is essentially independent
of T (T < Tc), whereas a field applied parallel to the preferred direction is associated
with a susceptibility χ∥ which is equal to χ⊥ at T = Tc but decreases to zero as
T → 0 K.
If the atoms A and B are magnetically inequivalent, Eq. (8.116) can be
solved for MA and MB and they lead to a susceptibility
$$\chi = \frac{\mu_0\,[T(C_A + C_B) - C_A C_B(\lambda_A + \lambda_B)]}{(T - T_c)(T + T_c)} \tag{8.119}$$
where $T_c = (C_A C_B \lambda_A \lambda_B)^{1/2}$. Thus, χ tends to infinity as T → Tc, which implies that
there is a net spontaneous magnetization for T < Tc (MA and MB are opposite in
sign but MA + MB ≠ 0). Such materials are called ferrimagnets, Fe3O4 being a
well-known example. An important class of ferrimagnets is the ferrites, which
have the formula $\mathrm{M^{2+}Fe_2^{3+}O_4^{2-}}$ where M is a member of the first transition
group. They are of great technical importance since they may have large
magnetization at room temperature, and high resistivity. They are therefore more
suitable than ferromagnets for use at high frequencies when eddy current losses
are a serious problem. They are also used for memory storage in computers.
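The step from the coupled equations (8.116) to the closed form (8.119) can be checked numerically. The sketch below solves the two linear equations by Cramer's rule and compares μ0(MA + MB)/B0 with Eq. (8.119); the numerical values of CA, CB, λA, λB and T are arbitrary illustrative numbers, not material data.

```python
import math

MU_0 = 4e-7 * math.pi   # vacuum permeability, T m/A

def sublattice_magnetizations(t, b0, c_a, c_b, lam_a, lam_b):
    """Solve Eq. (8.116) for (M_A, M_B) by Cramer's rule."""
    # Rearranged system:
    #   M_A + p * M_B = r_a,   q * M_A + M_B = r_b
    p, q = c_a * lam_a / t, c_b * lam_b / t
    r_a, r_b = c_a * b0 / t, c_b * b0 / t
    det = 1.0 - p * q
    return (r_a - p * r_b) / det, (r_b - q * r_a) / det

def chi_closed_form(t, c_a, c_b, lam_a, lam_b):
    """Eq. (8.119), with Tc = (C_A C_B lambda_A lambda_B)^(1/2)."""
    t_c = math.sqrt(c_a * c_b * lam_a * lam_b)
    num = t * (c_a + c_b) - c_a * c_b * (lam_a + lam_b)
    return MU_0 * num / ((t - t_c) * (t + t_c))

# Arbitrary illustrative parameters (assumptions, not measured values).
params = dict(c_a=2.0, c_b=3.0, lam_a=50.0, lam_b=40.0)
m_a, m_b = sublattice_magnetizations(500.0, 1.0, **params)
chi_numeric = MU_0 * (m_a + m_b) / 1.0      # chi = (M_A + M_B)/H with H = B0/mu_0
chi_formula = chi_closed_form(500.0, **params)
print(chi_numeric, chi_formula)
```

Setting CA = CB and λA = λB in the same functions reproduces the antiferromagnetic result of Eq. (8.118), since the factor (T – Tc) then cancels.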
P = ε0 χe E (8.120)
where ε0 is the permittivity of the vacuum and χe is the electric susceptibility. It
is convenient to define an atomic polarizability α by
P = N α Eloc (8.121)
where N is the number of atoms per unit volume and Eloc is the effective field at
the atom, not including the field due to the atom itself. A displacement vector
D can also be defined as
D ≡ ε ε0 E
≡ ε0 E + P (8.122)
where ε is the dielectric constant. It can be shown that for a dielectric material
with an isotropic or cubic (including simple cubic, bcc, fcc) structure, filling a
parallel plate capacitor,
$$E_{loc} = E + \frac{1}{3\varepsilon_0} P \tag{8.123}$$
This allows us to eliminate Eloc in Eq. (8.121). Solving for P from
Eqs. (8.121) and (8.122), and equating the two expressions gives
$$\frac{N\alpha}{3\varepsilon_0} = \frac{\varepsilon - 1}{\varepsilon + 2} \tag{8.124}$$
This is the Clausius-Mossotti formula, relating the atomic polarizability α
to the macroscopic dielectric constant ε.
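The Clausius-Mossotti relation can be read in either direction: from a measured ε to the combination Nα/3ε0, or back again. A minimal sketch, with hypothetical function names and an assumed illustrative refractive index n = 1.5 (so that ε = n² for a nonmagnetic material):

```python
def cm_parameter(eps):
    """Return s = N*alpha/(3*eps0) from the dielectric constant, Eq. (8.124)."""
    return (eps - 1.0) / (eps + 2.0)

def dielectric_constant(s):
    """Invert Eq. (8.124): eps = (1 + 2s) / (1 - s), valid for 0 <= s < 1."""
    if not 0.0 <= s < 1.0:
        raise ValueError("s must lie in [0, 1)")
    return (1.0 + 2.0 * s) / (1.0 - s)

n_assumed = 1.5                       # illustrative refractive index (assumption)
s_value = cm_parameter(n_assumed**2)
print(s_value, dielectric_constant(s_value))
```

The inverse form makes it clear that ε grows without bound as Nα/3ε0 approaches 1, a point that becomes important for ferroelectric crystals.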
For a nonmagnetic material, ε = n², n being the refractive index, so that
$$\frac{n^2 - 1}{n^2 + 2} = \frac{1}{3\varepsilon_0} \sum_i N_i \alpha_i \tag{8.125}$$
where a summation over i has been introduced to include the possibility that
several mechanisms contribute to the polarizability. For the polarizability of an
atom, contributions are electronic, ionic and orientational. It is possible to
experimentally separate the different contributions by observing the polarizability
as a function of frequency and by noting that the different contributions are
generally significant in different ranges. This is illustrated in Fig. 8.17(a). The
rapid changes in the polarizability are also accompanied by large absorption of
radiation [Fig. 8.17(b)].
Electronic Polarizability
The electronic contribution to polarizability arises from the displacement of
electrons in an atom, relative to the nucleus.
[Fig. 8.17: (a) polarizability α as a function of frequency ν, showing the orientational, ionic and electronic contributions; (b) absorption as a function of ν.]
$$p \approx \frac{e^2}{\hbar}\, E \cos \omega t \sum_j |z_{j0}|^2 \left[ \frac{1}{\omega_{j0} + \omega} + \frac{1}{\omega_{j0} - \omega} \right] \tag{8.128}$$
where ωj0 = (Ej – E0)/ħ. From this, the polarizability is
$$\alpha = \frac{e^2}{\hbar} \sum_j |z_{j0}|^2 \left[ \frac{1}{\omega_{j0} + \omega} + \frac{1}{\omega_{j0} - \omega} \right] \tag{8.129}$$
It is observed that α takes sudden jumps whenever ω = (Ej – E0)/ħ. Also at
ω = (Ej – E0)/ħ, there are transitions to the state j, resulting in an absorption of
radiation. Actually the singularity at ω = ωj0 is displaced by the fact that the
state j is an unstable state, which essentially requires a replacement of
$(\omega_{j0} - \omega)^{-1}$ by the real part of $(\omega_{j0} - \omega - i/2\tau_j)^{-1}$, i.e.,
$$\frac{1}{\omega_{j0} - \omega} \to \frac{\omega_{j0} - \omega}{(\omega_{j0} - \omega)^2 + (2\tau_j)^{-2}} \tag{8.130}$$
where τj is the lifetime of state j [see Eq. (6.76)]. With this modification,
Eq. (8.129) provides a qualitative explanation of the polarizability illustrated in
Fig. 8.17(a). It may also be noted that when ωj0 ≈ ω, there is a significant
probability for transition to state j, as seen from the expression in Eq. (6.55) for
aj, which leads to absorption of radiation [Fig. 8.17(b)]. While Eq. (8.129) is
valid for a general charged system, it is practically useful mainly for electronic
polarizability for which zj0 can be calculated with some reliability. A rough
order of magnitude estimation for this electronic polarizability gives
Ionic Polarizability
The ionic polarizability is due to the displacement of ions with respect to each
other. If it is assumed that the forces near equilibrium are simple harmonic, the
displacement in the presence of an electric field is given by
k ∆ x ≈ eE (8.132)
where k is the force constant. This leads to a polarizability
α ≈ e2/k (8.133)
Since k ≈ 20 N/m, αionic ≈ $10^{-39}$ F·m².
The ionic contribution is important at low frequencies (ωj0 in Eq. (8.129) is
small). This explains the fact that NaCl has ε ≈ 5.6 at low frequencies whereas
at optical frequencies ε ≈ 2.25. The difference may be ascribed to the ionic
contribution to polarizability (see Fig. 8.17).
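The order-of-magnitude estimate for the ionic polarizability follows directly from Eqs. (8.132) and (8.133). A minimal sketch, using the force constant k ≈ 20 N/m quoted above and a hypothetical function name:

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C

def ionic_polarizability(force_constant):
    """alpha ~ e^2 / k from Eqs. (8.132)-(8.133); force_constant in N/m."""
    return E_CHARGE**2 / force_constant

alpha_ionic = ionic_polarizability(20.0)
print(f"alpha_ionic ~ {alpha_ionic:.1e} F m^2")
```

The result is of the order of 10⁻³⁹ F·m², in line with the value stated in the text; a stiffer bond (larger k) gives a proportionally smaller polarizability.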
Orientational Polarizability
Molecules with permanent electric dipole moment align themselves in the
presence of an external electric field giving rise to an orientational polarizability.
The energy of a dipole p in an electric field E is
V = – p E cos θ (8.134)
where θ is the angle between the dipole and the field. Therefore, the average
dipole moment (using Boltzmann distribution) is
$$\langle p \rangle = p\, \frac{\int \cos\theta\, \exp\left(\frac{pE\cos\theta}{kT}\right) d\cos\theta}{\int \exp\left(\frac{pE\cos\theta}{kT}\right) d\cos\theta} = p \left.\frac{d}{da} \ln\left[\frac{e^{a} - e^{-a}}{a}\right]\right|_{a = pE/kT} \tag{8.135}$$
For pE << kT, ⟨p⟩ ≈ p²E/3kT which leads to
α = p²/3kT (8.136)
which at room temperatures is of the order of $10^{-39}$ F·m², comparable to the
electronic polarizability. It is distinguished by its temperature dependence, and
suggests a relation
αtot = α0 + p2/3kT (8.137)
This expression is quite useful in determining dipole moments of dipolar
substances, e.g., HCl (p = 1.1 debyes, 1 debye ≈ $3.34 \times 10^{-30}$ C·m), by looking at the
temperature dependence of αtot. It is tempting to substitute Eq. (8.136) in
Eq. (8.124) to obtain
$$\varepsilon = 1 + \frac{3T_c}{T - T_c}, \qquad T_c = \frac{Np^2}{9\varepsilon_0 k} \tag{8.138}$$
which would imply that spontaneous polarization (E = 0) sets in at T = Tc. It has
however been shown by Onsager that Eqs. (8.123) and (8.124) are not valid for
permanent electric dipoles. The theory of Onsager for permanent dipoles does
not imply the existence of a critical temperature for such dipoles.
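The weak-field result α = p²/3kT can be checked against the full expression in Eq. (8.135), whose bracket is the function coth(a) – 1/a. The sketch below evaluates both and estimates the orientational polarizability of HCl at an assumed temperature of 300 K; the debye conversion factor is the standard value and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
DEBYE = 3.33564e-30    # coulomb metres per debye (standard conversion)

def langevin(a):
    """d/da ln[(e^a - e^-a)/a] = coth(a) - 1/a, the factor in Eq. (8.135)."""
    if abs(a) < 1e-4:
        return a / 3.0          # leading series term, avoids 0/0 near a = 0
    return 1.0 / math.tanh(a) - 1.0 / a

def orientational_polarizability(p, temperature):
    """Eq. (8.136): alpha = p^2 / 3kT."""
    return p**2 / (3.0 * K_B * temperature)

p_hcl = 1.1 * DEBYE                                   # dipole moment quoted for HCl
alpha_hcl = orientational_polarizability(p_hcl, 300.0)  # 300 K is an assumption
print(f"alpha(HCl, 300 K) ~ {alpha_hcl:.2e} F m^2")
print(langevin(0.01), 0.01 / 3.0)
```

For any laboratory field a = pE/kT is far below 1, so the linear form a/3 is an excellent approximation and the polarizability is independent of the field.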
Ferroelectric Crystals
The phenomenon of spontaneous polarization (E = 0), known as ferroelectricity
is observed in (i) Rochelle salt and some of the associated salts, (ii) some crystals
with hydrogen bonds, in which the motion of protons gives rise to ferroelectric
behaviour (e.g., KH2PO4, RbH2PO4, etc.) and (iii) ionic crystals with perovskite
(CaTiO3) and ilmenite (FeTiO3) structures. The perovskite structure illustrated
by BaTiO3 is the simplest structure which exhibits ferroelectricity—it has a
cubic structure with Ba at the corners, oxygen at the face centres and Ti at the
body centre. Ferroelectricity in barium titanate (BaTiO3) is briefly described
here.
Barium titanate becomes ferroelectric at 380 K and exhibits hysteresis curves
for T < Tc, in the plot of D against E. The ferroelectricity in BaTiO3 is due to
induced electronic and ionic dipole moments. From Eq. (8.124),
$$\varepsilon = \frac{1 + \frac{2}{3\varepsilon_0} \sum_i N_i \alpha_i}{1 - \frac{1}{3\varepsilon_0} \sum_i N_i \alpha_i} \tag{8.139}$$
Now, the contribution of electronic polarizabilities to $\frac{1}{3\varepsilon_0} \sum_i N_i \alpha_i$ is about
0.61. Assuming that the ionic contribution is about 0.39 (estimations show that
this is not unreasonable), the dielectric constant tends to infinity. Expanding
the denominator as a function of temperature gives
$$\varepsilon = \frac{3/\beta}{T - T_c}, \qquad \beta = -\frac{1}{3\varepsilon_0} \left.\frac{\partial}{\partial T} \sum_i N_i \alpha_i \right|_{T = T_c} \tag{8.140}$$
Estimates of β agree well with experimental observations, i.e., 3/β ~ $10^5$ K.
It may be observed that the (incorrect) expression in Eq. (8.138) for dipolar
atoms would have given a value of 3Tc ~ 1140 K for the residue, considerably
smaller than the observed residue, which again rules out the explanation in
terms of dipolar atoms.
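The passage from Eq. (8.139) to the Curie-Weiss form (8.140) can be illustrated numerically. The sketch below assumes, purely for illustration, that the sum s = (1/3ε0) Σ Ni αi varies linearly with temperature near Tc, with β chosen so that 3/β = 10⁵ K and Tc = 380 K as quoted in the text.

```python
T_C = 380.0      # K, transition temperature quoted in the text
BETA = 3.0e-5    # 1/K, chosen so that 3/beta = 1e5 K

def s_linear(temperature):
    """Assumed linear model s(T) = 1 - beta (T - Tc) near Tc (an assumption)."""
    return 1.0 - BETA * (temperature - T_C)

def epsilon_exact(s):
    """Eq. (8.139) with s = (1 / 3 eps0) * sum_i N_i alpha_i."""
    return (1.0 + 2.0 * s) / (1.0 - s)

def epsilon_curie_weiss(temperature):
    """Eq. (8.140): eps = (3 / beta) / (T - Tc)."""
    return (3.0 / BETA) / (temperature - T_C)

for temp in (381.0, 390.0, 430.0):
    print(temp, epsilon_exact(s_linear(temp)), epsilon_curie_weiss(temp))

# With only the electronic contribution s = 0.61, eps stays modest.
print(epsilon_exact(0.61))
```

Close to Tc the two expressions agree to a small fraction of a percent, while with the electronic contribution alone (s ≈ 0.61) the dielectric constant is only of order 5 to 6. This shows how strongly ε depends on the sum approaching 1.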
It may also be noted that (i) the description of the hysteresis, etc. in terms
of domains is valid for ferroelectricity, and (ii) antiferroelectricity is observed
e.g., in WO3, PbZrO3, etc.
Piezo-electricity
Some crystals when deformed by an external stress develop a net dipole moment
which produces surface polarization charges. This is known as piezo-electricity.
Piezo-electric materials exhibit the converse effect as well, i.e., they are distorted
when placed in an electric field. The strain produced however is very small. For
example in quartz which is the most common piezo-electric substance, an electric
field of $10^{4}$ V/m produces a strain of only 1 part in $10^{8}$. Of course, this also
means that even a small strain can produce enormous electric fields.
When a crystal is subjected to a strain, there is a displacement of the ions in
the crystal. If the charge distribution in the crystal does not have inversion
symmetry about a centre, a net polarization of charges may develop giving rise
to piezo-electricity. For example, an equilateral triangle with + 3 charge at the
centre and – 1 charge each at the vertices will have zero dipole moment. Under
strain, the bond lengths may remain the same but make unequal angles with
each other giving rise to nonzero dipole moment.
Apart from quartz, other examples of piezo-electric materials are Rochelle
salt, barium titanate (BaTiO3), etc. In fact, all ferroelectric materials are piezo-
electric though the converse is not true. Piezo-electric materials are used to
convert electrical energy into mechanical energy and conversely, i.e., as
transducers. In particular, they are used in devices such as gramophone pickups,
microphones, strain gauges, etc. while the converse effect is used in ultrasonic
generators.
8.8 EXAMPLES
Here, some examples that illustrate and extend the ideas about the solid state
are considered.
Example 1
The bulk modulus of a crystalline solid can be estimated from Eq. (8.5).
A pressure P produces a decrease in length, ∆l,
P = C l0 ∆l (8.141)
where C is a constant, so that the work done is
$$W = 3CV_0\int_0^{\Delta l} x\,dx = \frac{3}{2}\,C V_0 (\Delta l)^2 \tag{8.142}$$
where V0 is the volume. This causes a change in energy given by
$$\Delta E = N\,[E(R_0 - \Delta R) - E(R_0)] \approx \frac{N}{2}\left.\frac{\partial^2 E}{\partial R^2}\right|_{R=R_0}(\Delta R)^2 \tag{8.143}$$
N being the number of ion pairs, and where the fact that E is a minimum
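at R0 has been used. Equating W and ∆E, with ∆R/R0 = ∆l/l0, gives
$$C = \frac{N R_0^2}{3V_0 l_0^2}\left.\frac{\partial^2 E}{\partial R^2}\right|_{R=R_0} \tag{8.144}$$
$$K = -\frac{P}{\Delta V/V_0} = \frac{1}{9}\frac{N}{V_0}R_0^2\left.\frac{\partial^2 E}{\partial R^2}\right|_{R=R_0} \tag{8.145}$$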
where N/V0 is the number of ion pairs/unit volume, and E is given in Eq. (8.3)
with n ≈ 10. For NaCl, the above expression gives an estimate of about
3.5 × $10^{10}$ J/m$^3$ (experimentally it is about 3 × $10^{10}$ J/m$^3$).
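As a rough numerical check of Eq. (8.145), the sketch below assumes that Eq. (8.3) has the usual Born form E(R) = −αe²/(4πε0R) + B/Rⁿ. This form, the Madelung constant α ≈ 1.7476 and the NaCl volume per ion pair of 2R0³ are assumptions made for illustration and are not stated in this example. With B fixed by the minimum at R0, the second derivative is (n − 1)αe²/(4πε0R0³).

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m


def bulk_modulus_born(r0, n, madelung, pairs_per_volume):
    """K = (1/9)(N/V0) R0^2 E''(R0), Eq. (8.145), for an assumed Born-type E(R)."""
    # E''(R0) for E = -alpha e^2/(4 pi eps0 R) + B/R^n with dE/dR = 0 at R0
    d2e = (n - 1) * madelung * E_CHARGE**2 / (4 * math.pi * EPS0 * r0**3)
    return pairs_per_volume * r0**2 * d2e / 9.0


r0 = 2.82e-10  # NaCl interatomic spacing, m
k_nacl = bulk_modulus_born(r0, n=10, madelung=1.7476,
                           pairs_per_volume=1.0 / (2 * r0**3))
print(f"K(NaCl) ~ {k_nacl:.2e} J/m^3")
```

Under these assumptions the result is of order 3 × 10^10 J/m^3; the exact figure depends on the constants chosen.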
Example 2
It can be shown that when a, b and c are mutually orthogonal, the distance
between the planes with Miller indices (h, k, l) is given by
$$d = \left(\frac{h^2}{a^2} + \frac{k^2}{b^2} + \frac{l^2}{c^2}\right)^{-1/2} \tag{8.146}$$
Let one of the planes have intercepts n1a, n2b, n3c along the three axes
(where n1, n2, n3 are integers). A translation by an integral multiple of a, or b, or
c, along the first, or the second, or the third axis, respectively, gives an equivalent
plane. It is found that the number of equivalent planes between the origin and
this plane is equal to the l.c.m. of n1, n2, and n3, say N. On the other hand, the
Miller indices are
$$h = \frac{N}{n_1}, \qquad k = \frac{N}{n_2}, \qquad l = \frac{N}{n_3} \tag{8.147}$$
If D is the perpendicular distance of the plane with intercepts n1a, n2b, n3c
from the origin, then
D = n1a cos α = n2b cos β = n3c cos γ (8.148)
where α, β and γ are the angles made by the perpendicular line with the three
axes. Since
cos2 α + cos2 β + cos2 γ = 1, (8.149)
$$D = \left(\frac{1}{n_1^2 a^2} + \frac{1}{n_2^2 b^2} + \frac{1}{n_3^2 c^2}\right)^{-1/2} \tag{8.150}$$
Using Eq. (8.147), the separation between two adjacent planes comes out
to be
$$d = D/N = \left(\frac{h^2}{a^2} + \frac{k^2}{b^2} + \frac{l^2}{c^2}\right)^{-1/2} \tag{8.151}$$
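The argument of Eqs. (8.147) to (8.151) can be followed numerically. The sketch below is illustrative only (the function names are hypothetical): it starts from integer intercepts, forms the Miller indices through the l.c.m., and compares D/N with the direct formula of Eq. (8.146).

```python
from math import gcd, sqrt


def lcm3(n1, n2, n3):
    """Least common multiple of three positive integers."""
    l12 = n1 * n2 // gcd(n1, n2)
    return l12 * n3 // gcd(l12, n3)


def spacing_from_intercepts(n1, n2, n3, a, b, c):
    """Miller indices and d = D/N for a plane with intercepts n1*a, n2*b, n3*c."""
    N = lcm3(n1, n2, n3)
    D = 1.0 / sqrt(1.0 / (n1 * a)**2 + 1.0 / (n2 * b)**2 + 1.0 / (n3 * c)**2)
    return (N // n1, N // n2, N // n3), D / N


def interplanar_spacing(h, k, l, a, b, c):
    """Direct formula, Eq. (8.146), valid for mutually orthogonal axes."""
    return 1.0 / sqrt((h / a)**2 + (k / b)**2 + (l / c)**2)


hkl, d_from_planes = spacing_from_intercepts(1, 2, 3, 1.0, 1.0, 1.0)
d_direct = interplanar_spacing(*hkl, 1.0, 1.0, 1.0)
print(hkl, d_from_planes, d_direct)
```

For intercepts (1, 2, 3) on a unit cubic lattice both routes give the (6, 3, 2) planes with spacing 1/7.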
Example 3
Hall effect provides a convenient method of determining the nature of charge
carriers and their mobility.
Consider a current flowing in the x-direction, through a thin sheet in the xy
plane. If a magnetic field Bz is applied to the current, the charge carriers are
deflected by the v × B force and build an electric field Ey in the y direction. In
the equilibrium condition
Ey + (v × B)y = 0 (8.152)
or
Ey = vx Bz (8.153)
Now the current density in the x-direction is Jx = nqvx, n being the carrier density and q
their charge, so that the Hall coefficient is
$$R_H \equiv \frac{E_y}{J_x B_z} = \frac{1}{nq} \tag{8.154}$$
Thus, a measurement of Ey for a given Jx and Bz allows us to determine the
concentration of the carriers as well as the sign of their charge. From
Eq. (8.153) vx and hence the mobility of the carriers can also be determined as:
µ = vx/Ex (8.155)
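A minimal sketch of how Eqs. (8.153) to (8.155) turn measured fields into carrier properties. It assumes carriers of charge ±e; the function name and the sample numbers are hypothetical and chosen only to exercise the formulas.

```python
E_CHARGE = 1.602176634e-19  # elementary charge, C


def hall_analysis(e_y, j_x, b_z, e_x):
    """Return (R_H, carrier density, mobility, carrier type) from Hall data."""
    r_h = e_y / (j_x * b_z)                 # Eq. (8.154)
    n = 1.0 / (abs(r_h) * E_CHARGE)         # carriers assumed to have |q| = e
    v_x = e_y / b_z                         # Eq. (8.153)
    mobility = abs(v_x / e_x)               # Eq. (8.155)
    carrier = "holes" if r_h > 0 else "electrons"
    return r_h, n, mobility, carrier


# Synthetic data for electrons: n = 1e21 m^-3, mobility 0.1 m^2/Vs, Ex = 100 V/m
v_x = -0.1 * 100.0                          # electrons drift against Ex
j_x = 1e21 * (-E_CHARGE) * v_x
b_z = 0.5
r_h, n, mobility, carrier = hall_analysis(v_x * b_z, j_x, b_z, 100.0)
print(r_h, n, mobility, carrier)
```

The sign of R_H identifies the carrier type, and its magnitude gives the density.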
Example 4
A silicon crystal contains an arsenic concentration of 1.2 × $10^{22}$/m$^3$ and a boron
concentration of 6 × $10^{21}$/m$^3$. What is the density of majority and minority carriers
at room temperature?
The electrons in the conduction band and in the acceptor levels, are from
the donor levels and the valence band:
nc + na = hd + hv (8.156)
This leads to
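$$c_0 (m_e^* T)^{3/2} \exp[(\varepsilon_f - \varepsilon_c)/kT] + \frac{N_a}{\exp[(\varepsilon_a - \varepsilon_f)/kT] + 1} = c_0 (m_h^* T)^{3/2} \exp[(\varepsilon_v - \varepsilon_f)/kT] + \frac{N_d}{\exp[(\varepsilon_f - \varepsilon_d)/kT] + 1} \tag{8.157}$$
where $c_0 = 2(2\pi k)^{3/2}/h^3$, $N_d = 1.2 \times 10^{22}$ m$^{-3}$ and $N_a = 6 \times 10^{21}$ m$^{-3}$. Assuming that
εf – εa >> kT and εd – εf >> kT, and neglecting the first term on the rhs, gives
$$\varepsilon_f \approx \varepsilon_c + kT \ln\frac{N_d - N_a}{c_0 (m_e^* T)^{3/2}} \tag{8.158}$$
Therefore nc ≈ Nd – Na and
$$h_v \approx \frac{1}{N_d - N_a}\, c_0^2 (m_e^* m_h^* T^2)^{3/2} \exp[-(\varepsilon_c - \varepsilon_v)/kT] \tag{8.159}$$
where at room temperature, me* ≈ 0.25 m, mh* ≈ 0.3 m, and for the given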
concentrations,
εf – εc ≈ – 0.24 eV (8.160)
This result shows that the neglect of the first term on the rhs of Eq. (8.157)
is justified.
Example 5
The effective masses of the carriers are determined from cyclotron resonance
experiments. A resonance is observed in a Si crystal at 3 × 1010 Hz and a field of
0.4 T. What is the value of m*?
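A minimal sketch of the calculation, assuming the standard cyclotron resonance condition ω = eB/m* with ω = 2πf (this relation is an assumption here; it is not quoted from the text above).

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # free electron mass, kg


def effective_mass(freq_hz, b_tesla):
    """m* from the resonance condition 2*pi*f = e*B/m*."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * freq_hz)


m_star = effective_mass(3e10, 0.4)
ratio = m_star / M_E
print(f"m* = {m_star:.3e} kg = {ratio:.2f} m")
```

With the numbers of the example this gives an effective mass of a few tenths of the free electron mass.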
Example 6
A germanium pn junction has 5 × $10^{22}$ phosphorus atoms/m$^3$ in the n-side and
3 × $10^{22}$ gallium atoms/m$^3$ in the p-side. What is the potential difference across
the junction at room temperature? If the current for a large reverse bias is
5 × $10^{-8}$ A, what is the current for a forward bias of 0.4 V?
Assuming complete ionization of donor atoms and occupation of acceptor
levels, one has the relations
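$$c_0 (m_e^* T)^{3/2} \exp[(\varepsilon_f - \varepsilon_c)/kT] = N_d \tag{8.163}$$
$$c_0 (m_h^* T)^{3/2} \exp[(\varepsilon_v - \varepsilon_f')/kT] = N_a \tag{8.164}$$
with $N_d = 5 \times 10^{22}$ m$^{-3}$, $N_a = 3 \times 10^{22}$ m$^{-3}$, and me* ≈ mh* ≈ 0.1m, where εf and εf′ are
the Fermi levels on the n-side and the p-side respectively. This gives
εc – εf ≈ 0.072 eV and εf′ – εv ≈ 0.085 eV. Hence, with an energy gap of 0.72 eV, the potential difference across
the junction is
V0 ≈ 0.72 – 0.072 – 0.085
= 0.56 V (8.165)
Since a large reverse bias gives a current of 5 × $10^{-8}$ A, the forward bias
current is
$$I \approx 5 \times 10^{-8}\,(\exp[e\Delta V/kT] - 1) \tag{8.166}$$
which for ∆V = 0.4 V gives I ≈ 0.24 A.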
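The numbers in this example can be reproduced with a short script. The sketch assumes T = 300 K for room temperature and takes the 0.72 eV gap from Eq. (8.165); the function names are hypothetical.

```python
import math

K_B = 1.380649e-23           # Boltzmann constant, J/K
H = 6.62607015e-34           # Planck constant, J s
M_E = 9.1093837015e-31       # free electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C


def band_density(m_ratio, temp):
    """c0 (m* T)^(3/2) with c0 = 2 (2 pi k)^(3/2) / h^3."""
    return 2.0 * (2.0 * math.pi * m_ratio * M_E * K_B * temp / H**2) ** 1.5


def junction_potential(gap_ev, n_d, n_a, m_ratio, temp):
    """Return (V0, ec - ef, ef' - ev) in eV from Eqs. (8.163)-(8.165)."""
    kt_ev = K_B * temp / E_CHARGE
    nc = band_density(m_ratio, temp)
    dn = kt_ev * math.log(nc / n_d)   # ec - ef on the n-side
    dp = kt_ev * math.log(nc / n_a)   # ef' - ev on the p-side
    return gap_ev - dn - dp, dn, dp


def diode_current(i_sat, bias_v, temp):
    """I = I_sat (exp[e dV / kT] - 1), Eq. (8.166)."""
    return i_sat * (math.exp(E_CHARGE * bias_v / (K_B * temp)) - 1.0)


T = 300.0  # assumed room temperature, K
v0, dn, dp = junction_potential(0.72, 5e22, 3e22, 0.1, T)
i_fwd = diode_current(5e-8, 0.4, T)
print(f"V0 = {v0:.2f} V, I(0.4 V) = {i_fwd:.2f} A")
```

The forward current is very sensitive to the value used for kT, so a result within about ten percent of the quoted 0.24 A is all that should be expected.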
Example 7
The diamagnetic susceptibility of helium can be estimated from the approximate
helium wave function (see Example 5 of Sec. 5.8)
$$\psi = \frac{1}{\pi a'^3}\exp[-(r_1 + r_2)/a'] \tag{8.167}$$
where $a' = 4\pi\varepsilon_0\hbar^2/(m e^2 Z')$, Z′ = 27/16. The diamagnetic susceptibility is obtained
from Eq. (8.89) to be
$$\chi = -\frac{e^2 \mu_0 N}{m}\,a'^2 \tag{8.168}$$
which comes out to be 2.1 × $10^{-8}$/kg mole in magnitude (1.67 × $10^{-6}$/g mol in Gaussian units,
compared to the experimental value of 1.9 × $10^{-6}$/g mol).
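Equation (8.168) is easy to evaluate directly. The sketch below assumes N is Avogadro's number per kg mole and converts to Gaussian units by dividing the SI molar value by 4π and changing m³ to cm³; both conventions are assumptions made for the illustration.

```python
import math

MU0 = 4e-7 * math.pi          # vacuum permeability, H/m
E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # electron mass, kg
A_BOHR = 5.29177210903e-11    # Bohr radius 4 pi eps0 hbar^2 / (m e^2), m
N_PER_KMOL = 6.02214076e26    # atoms per kg mole


def helium_chi_si(z_eff=27.0 / 16.0):
    """chi = -(e^2 mu0 N / m) a'^2 per kg mole, with a' = a0 / Z'."""
    a_prime = A_BOHR / z_eff
    return -E_CHARGE**2 * MU0 * N_PER_KMOL * a_prime**2 / M_E


chi_si = helium_chi_si()
# SI (m^3 per kg mole) -> Gaussian (cm^3 per g mol)
chi_gauss = chi_si * 1e6 / (4.0 * math.pi) / 1e3
print(f"chi = {chi_si:.2e} per kg mole, {chi_gauss:.2e} per g mol (Gaussian)")
```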
Example 8
A ferromagnetic material with J = 3/2 and g = 2 has a transition temperature
Tc = 120 K. Calculate the internal field near 0 K. What is the ratio of magnetization
at 300 K for B = 5 × $10^{-3}$ T compared to that at 0 K?
From the expression for Tc in Eq. (8.111), and
$$B_{int} = \lambda N \frac{e g \hbar}{2m} J \quad \text{at 0 K,}$$
one has
$$B_{int} = \frac{6 m k T_c}{e g \hbar\,(J + 1)} \tag{8.169}$$
which is about 108 T. The ratio of magnetization at 300 K to that at 0 K, is
$$R = \frac{B_0 (J + 1)}{3k (T - T_c)}\,\frac{e g \hbar}{2m} = 3.1 \times 10^{-5} \tag{8.170}$$
which illustrates the fact that paramagnetic effects are, in general, much smaller
than ferromagnetic effects.
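Both numbers can be checked by writing eħ/2m as the Bohr magneton µB, so that Eq. (8.169) becomes 3kTc/[gµB(J + 1)] and Eq. (8.170) becomes B0 gµB(J + 1)/[3k(T − Tc)]. A minimal sketch with hypothetical function names:

```python
K_B = 1.380649e-23        # Boltzmann constant, J/K
MU_B = 9.2740100783e-24   # Bohr magneton e*hbar/2m, J/T


def internal_field(t_c, j, g):
    """B_int = 6 m k Tc / (e g hbar (J+1)) = 3 k Tc / (g mu_B (J+1))."""
    return 3.0 * K_B * t_c / (g * MU_B * (j + 1.0))


def magnetization_ratio(b0, temp, t_c, j, g):
    """Ratio of M(T > Tc) in a field B0 to the saturation value at 0 K."""
    return b0 * (j + 1.0) * g * MU_B / (3.0 * K_B * (temp - t_c))


b_int = internal_field(120.0, 1.5, 2.0)
ratio = magnetization_ratio(5e-3, 300.0, 120.0, 1.5, 2.0)
print(f"B_int = {b_int:.0f} T, R = {ratio:.2e}")
```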
Example 9
When a photon is incident on a material with an energy gap ∆E, an electron in
the valence band may absorb this radiation and go to the conduction band if
hν > ∆E. The kinetic energy of the electron is given by
$$\frac{1}{2}mv^2 = h\nu - \Delta E - (\varepsilon_v - \varepsilon) \tag{8.171}$$
where ε is the initial energy of the valence electron, the maximum energy being
observed for ε = εv. Thus, a rapid increase is observed in the absorptivity of
radiation as ν increases through the value of ν = ∆E/h. This property is used to
determine the energy gap (the experiments are usually done at low temperatures
to reduce thermal effects). Since ∆E ~ 1 eV for semiconductors, they are
essentially transparent to infrared radiation but absorb most of the radiation in
the optical region.
The excited electrons are de-excited either immediately, in general emitting
radiation (fluorescence) of a different frequency than that of the original photon,
or wander around in the crystal until they are trapped at the luminescent centers
PROBLEMS
1. It requires an energy of 5.14 eV to remove the valence electron of Na and
an energy of 3.80 eV is released when an electron is added to Cl. Assuming
a value of n = 10 in Eq. (8.3), and an interatomic spacing of 2.82 Å, obtain
the cohesive energy/ion pair, and the repulsive energy.
2. Show that the Madelung constant for an infinite array of alternating positive
and negative charges in 1-dimension is α1 = 2 ln 2. Show that the
expressions in 2- and 3-dimensions are:
$$\alpha_2 = 2\alpha_1 - 4\sum_{n,m=1}^{\infty}\frac{(-1)^{n+m}}{(n^2 + m^2)^{1/2}}$$
$$\alpha_3 = 3\alpha_2 - 3\alpha_1 - 8\sum_{l,m,n=1}^{\infty}\frac{(-1)^{l+m+n}}{(l^2 + m^2 + n^2)^{1/2}}.$$
3. If the repulsive force is of the form $Ce^{-r/a}$, determine C and a for NaCl
if the cohesive energy/ion pair is 6.61 eV, and the interatomic separation
is 2.82 Å.
4. It is observed that x-rays of wavelength 1.2 Å produce a first order
maximum at a Bragg angle of 12.3° when reflected by the (1, 0, 0) planes
of NaCl (which has fcc structure as in Fig. 8.1). If the density of NaCl is
2.165 g/cm$^3$ and its molecular weight is 58.454, obtain the value of
Avogadro’s number.
5. For a cubic crystal of unit length $10^{-10}$ m, at what angles will the first
order maxima be observed for (1, 1, 1), (1, 1, 0) and (1, 0, 0) planes? The
incident x-ray has a wavelength of 1 Å. Will the second order maxima be
observed? Will the (2, 1, 0) planes produce maxima in this case?
6. Hard spheres of radius R are arranged in contact in simple cubic, bcc and
fcc structures. Find the radius of the largest sphere that can fit into the
largest interstices of these structures.
7. Iron undergoes a phase transition from bcc (at lower temperature) to fcc
at 1180 K. If there is no change in the density, show that the
nearest neighbour separation increases by a factor of about 1.029.
8. If an element contains both electron and hole carriers, show that the Hall
coefficient is given by
$$R_H = \frac{n_h\mu_h^2 - n_e\mu_e^2}{e\,(n_h\mu_h + n_e\mu_e)^2}$$
A material has $10^{21}$ electrons/m$^3$ and 5 × $10^{20}$ holes/m$^3$. If µe = 0.05 and
µh = 0.07 in MKS units, evaluate the conductivity and the Hall coefficient
of the material.
9. Hall coefficient of Al is – 0.3 × $10^{-10}$ MKS units. How many conduction
electrons does each atom contribute?
10. The conductivity of germanium is 0.7 Ω$^{-1}$m$^{-1}$ at 0 °C and 2 Ω$^{-1}$m$^{-1}$ at 20 °C.
What is the energy gap for germanium?
11. A measurement of 0.1% change in resistivity is possible in a silicon crystal.
What is the sensitivity at room temperature of such a crystal used as a
thermistor?
12. What is the conductivity at room temperature of (i) pure silicon (ii) silicon
containing $10^{-5}$% of phosphorus? Mobility of electrons is 0.14 m$^2$/Vs, that
of holes is 0.05 m$^2$/Vs and the number of charge carriers in pure silicon is
about 2 × $10^{16}$ m$^{-3}$, at room temperature.
13. A current of $10^{-4}$ A flows when a forward bias of 0.2 V is applied at room
temperature. Obtain the currents if (i) forward bias of 0.4 V (ii) reverse
bias of 1 V, are applied.
14. Show that the width of the depletion region at a pn junction, when a forward
bias of V is applied, is given by
$$x = \left[\frac{2\kappa\varepsilon_0 (n_e + n_h)(V_0 - V)}{e\,n_e n_h}\right]^{1/2}$$
where V0 is the equilibrium potential difference across the junction. What
is the width for a junction with $n_e = 10^{22}$ m$^{-3}$, $n_h = 10^{22}$ m$^{-3}$, κ = 14, V0 = 0.7 V,
if a reverse bias of 2 V is applied?
15. Show that the product of electron and hole carriers is
$$n_e n_h = c_0^2 (m_h^* m_e^* T^2)^{3/2} \exp[(\varepsilon_v - \varepsilon_c)/kT],$$
independent of the concentration of the impurity. Thus, if an n-type of
impurity is introduced, the number of holes decreases.
16. A silicon semiconductor is doped with 9 × $10^{21}$ donors/m$^3$ and 4 × $10^{21}$
acceptors/m$^3$. If the electron mobility is 0.15 m$^2$V$^{-1}$s$^{-1}$, estimate the
resistivity at room temperature.
$$\varepsilon_f \approx \varepsilon_d + kT \ln\left[\left(c_1 + \frac{1}{4}\right)^{1/2} - \frac{1}{2}\right]$$
where $c_1 = \dfrac{N_d h^3}{2(2\pi k m_e^* T)^{3/2}} \exp[(\varepsilon_c - \varepsilon_d)/kT]$. This expression gives the
correct expression at T → 0 as well as at room temperature.
18. An electron does not tunnel across the depletion region if the width of the
layer is more than $10^{-8}$ m. What is the minimum doping needed for a
silicon tunnel-diode to operate? Take ne ≈ nh for the two impurities, κ ≈ 12
for Si.
19. A silicon pn junction has $10^{23}$ gallium atoms/m$^3$ and $10^{22}$ arsenic
atoms/m$^3$. What is the approximate potential difference across the junction?
20. The main contribution to paramagnetism of copper sulphate comes from
the copper ions which have spin 1/2. Show that its magnetization is given
by
$$M = N\frac{e\hbar}{2m}\tanh\frac{e\hbar B}{2mkT}.$$
21. For a substance containing paramagnetic ions with S = 1/2 and orbital
angular momentum quenched, derive an expression for the energy and
specific heat of the substance. Discuss its high- and low-temperature limits.
22. A magnetic field is applied to a salt containing Cu$^{2+}$ ions. Given that
Cu$^{2+}$ has nine 3d electrons, determine the field at which 90% of the ions
are in the ground state at 1 K.
23. Show that the magnetization in a ferromagnet tends to a value (use
Eq. (8.99) with a = egλ M/2m)
$$M \to \frac{N e g J}{2m}\left[1 - \frac{1}{J}\exp\left(-\left(\frac{eg}{2m}\right)^{2}\frac{N g \lambda}{kT}\right)\right]$$
for T → 0 K. However, a more sophisticated calculation in terms of spin
waves and magnons shows that the second term vanishes as $T^{3/2}$ rather
than exponentially.
9
The Nucleus
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 317
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_9
Nuclear Constituents
The nucleus is made up of protons and neutrons. The proton is the nucleus of
the simplest atom, the hydrogen atom. It has a rest energy of 938.256 MeV or a
mass of 1.0072766 mu (1 atomic mass unit, mu, is equal to 1/12 of the $^{12}$C mass
and corresponds to 931.478 MeV), a positive charge of e and spin ħ/2. The
neutron has a rest energy of 939.550 MeV or a mass of 1.0086654 mu, zero net
charge and spin ħ/2. Since protons and neutrons have half-integral spin, they
are fermions and satisfy Fermi-Dirac statistics. The near-equality of the neutron
and proton masses is an important property and it has a bearing on nuclear
interactions.
Both the proton and the neutron have magnetic moments given by
$$\mu_p = g_p \frac{e}{m_p}\,s \tag{9.1}$$
$$\mu_n = -g_n \frac{e}{m_p}\,s$$
Binding Energies
Since nuclear forces are strong, nuclear binding energies are a significant fraction
of nuclear masses. Thus, nuclear binding energies can be obtained from the
masses.
The binding energy Eb of a nucleus of mass mA, containing Z protons and
(A – Z) neutrons, is
Eb = c2 [Zmp + (A – Z)mn – mA] (9.3)
This is the minimum energy required to separate the nucleus into its
constituent nucleons. A nucleus with Z protons and A nucleons is said to have a
charge or atomic number Z and mass number A, and is designated by $^{A}_{Z}$X
where X stands for the chemical symbol of the nucleus (e.g. one has $^{1}_{1}$H, $^{2}_{1}$H, $^{3}_{1}$H
nuclei with 0, 1, 2 neutrons respectively). Nuclei with the same number of
protons but different number of neutrons are known as isotopes, e.g. $^{1}_{1}$H, $^{2}_{1}$H etc.
or $^{16}_{8}$O, $^{17}_{8}$O etc. Nuclei with the same number of neutrons but different number
of protons are called isotones, e.g. $^{4}_{2}$He, $^{5}_{3}$Li, and nuclei with the same A but
different Z are called isobars, e.g. $^{5}_{2}$He, $^{5}_{3}$Li. One also has some nuclei which
are excited states of a stable nucleus, but which have a very long lifetime (say
τ > 0.1 s). These are called isomers.
A very useful concept in nuclear physics is the binding energy per nucleon,
Eb/A. It starts from a very low value of Eb/A ≈ 1.1 MeV for the deuteron
(Eb ≈ 2.225 MeV), rapidly increases to 7.1 MeV for the α-particle, i.e. $^{4}_{2}$He, and
reaches a peak value of about 8.7 MeV for A ≈ 56. For nuclei with larger A, it
decreases slowly reaching a value of about 7.5 MeV for the heaviest natural
element, uranium. The general dependence of the binding energy per nucleon
on the mass number is shown in Fig. 9.1, for the stable nuclei. Two important
results follow from the general behaviour of Eb/A: Energy can be released (i) in
the fission of a heavy nucleus into lighter nuclei, and (ii) in the fusion of lighter
nuclei into a heavier nucleus. For example, a nucleus with A = 220 (Eb/A ≈ 7.5
MeV), breaking into two nuclei with A = 110 each (Eb/A ≈ 8.5 MeV) will liberate
an energy of about 220 × (8.5 – 7.5) = 220 MeV. Similarly two $^{2}_{1}$H
nuclei (Eb/A ≈ 1.1 MeV) can combine into a $^{4}_{2}$He nucleus (Eb/A ≈ 7.1 MeV) to
liberate an energy of about 4 × (7.1 – 1.1) = 24 MeV. These energies are very
large compared to the few electron volts released in chemical reactions which
are governed by electromagnetic forces.
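The bookkeeping behind Eq. (9.3) and the fission and fusion estimates can be sketched as follows. The nucleon masses and the conversion factor are those quoted above; the deuteron mass of 2.013553 mu is an assumed input that does not appear in the text, and the function names are hypothetical.

```python
M_P = 1.0072766      # proton mass in mu (from the text)
M_N = 1.0086654      # neutron mass in mu (from the text)
MU_MEV = 931.478     # energy equivalent of 1 mu in MeV (from the text)


def binding_energy(z, a, m_nucleus):
    """Eb = c^2 [Z mp + (A - Z) mn - mA], Eq. (9.3), in MeV for masses in mu."""
    return MU_MEV * (z * M_P + (a - z) * M_N - m_nucleus)


def energy_release(a_total, eb_per_a_initial, eb_per_a_final):
    """Energy freed when A nucleons move to a more tightly bound configuration."""
    return a_total * (eb_per_a_final - eb_per_a_initial)


M_DEUTERON = 2.013553  # assumed nuclear mass of the deuteron, mu
eb_d = binding_energy(1, 2, M_DEUTERON)
fission = energy_release(220, 7.5, 8.5)   # A = 220 splitting into two A = 110
fusion = energy_release(4, 1.1, 7.1)      # two deuterons forming helium-4
print(f"Eb(d) = {eb_d:.3f} MeV, fission ~ {fission:.0f} MeV, fusion ~ {fusion:.0f} MeV")
```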
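[Figure: binding energy per nucleon Eb/A (vertical axis, 0 to 10 MeV) plotted against mass number A (horizontal axis, 0 to 240); the nuclei 2H, 4He, Ni, Mo, 144Nd and 208Pb are marked.]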
Fig. 9.1 The general behaviour of the binding energy per nucleon as a function of A.
Though the fission and fusion processes leading to nuclei with A ≈ 55 are
feasible, it is observed that most of the nuclei are stable. The reason for this is
that before a heavy nucleus breaks up, the components must go through an
intermediate state with higher energy than the ground state (this can be induced
by providing extra energy available in the capture of a neutron). Similarly, lighter
nuclei encounter a higher energy intermediate state with large Coulomb
repulsion, before they can combine (the fusion can take place at high temperature,
e.g. in stars).
Unstable Nuclei
If a lower energy state is available to a nucleus, it will, in general, be unstable
and decay with the emission of a photon, or an α-particle, or some leptons (i.e.
e, ν, etc.; see Sec. 10.1), provided the basic conservation laws, such as conservation
of energy, momentum, charge, etc. allow the decay. The neutron β-decay in
Eq. (9.2) is one such example. These decays are characterized by the lifetime τ
of the nucleus (the lifetime was discussed in Sec. 6.4), such that the number of
undecayed nuclei N(t) is given by
N(t) = N(0) exp (– t/τ) (9.4)
where N(0) is the number of nuclei at time t = 0. The lifetimes of the nuclei vary
from the unmeasurably small values of τ < $10^{-6}$ s (they may be indirectly
estimated), to the very large values of τ ~ 10100 years.
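A minimal sketch of Eq. (9.4). The half-life, τ ln 2, is not used in the text and is introduced here only as a convenient check that the population halves after that time.

```python
import math


def undecayed(n0, t, tau):
    """N(t) = N(0) exp(-t / tau), Eq. (9.4)."""
    return n0 * math.exp(-t / tau)


def half_life(tau):
    """Time after which half of the nuclei remain: tau * ln 2."""
    return tau * math.log(2.0)


tau = 10.0                       # lifetime in arbitrary time units
remaining = undecayed(1000.0, half_life(tau), tau)
after_one_lifetime = undecayed(1000.0, tau, tau)
print(remaining, after_one_lifetime)
```

After one lifetime τ the population has dropped to 1/e of its initial value, which is about 37%.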
The important classes of nuclear decays are the following:
1. α-decay: It may be described by the process
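$$^{A}_{Z}\mathrm{X} \to {}^{A-4}_{Z-2}\mathrm{Y} + {}^{4}_{2}\mathrm{He} \tag{9.5}$$
an example of which is
$$^{238}_{92}\mathrm{U} \to {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He} \tag{9.6}$$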
2. γ-decay: In a γ-decay, an excited nucleus undergoes transition to a lower
energy state by the emission of a photon. This may be represented by
X* → X + γ (9.7)
where X* is the excited state. The photon energies in nuclear transitions
are of the order of an MeV compared with the few eV in atomic transitions,
and the corresponding lifetimes are of the order of $10^{-14}$ s (compared to
t ~ $10^{-15}/10^{8} = 10^{-23}$ s required for a relativistic particle to traverse a
nucleus).
3. β-decay: These processes involve electrons and neutrinos, and are
exemplified by
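$$^{A}_{Z}\mathrm{X} \to {}^{A}_{Z+1}\mathrm{Y} + e^- + \bar{\nu} \tag{9.8}$$
$$^{A}_{Z}\mathrm{X} \to {}^{A}_{Z-1}\mathrm{Y} + e^+ + \nu \tag{9.9}$$
$$^{A}_{Z}\mathrm{X} + e^- \to {}^{A}_{Z-1}\mathrm{Y} + \nu \tag{9.10}$$
where $e^-$ and $e^+$ are the electron and the positron, and ν and $\bar{\nu}$ are the
neutrino and the antineutrino. In the electron-capture process [Eq. (9.10)],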
the absorbed electron is usually from the atomic shells.
The activity of the unstable nuclei, known as radioactivity, is measured in
terms of the curie which corresponds to 3.7 × $10^{10}$ disintegrations/s.
Nuclear Radius
The wave-function description of a particle does not provide an unambiguous
description of the size of a particle. However, since nuclear forces are large
only within a distance of a few fermis (1 fermi = $10^{-15}$ m), it is useful to consider
the size of the nucleus. The nuclear radius may be estimated from the scattering
of neutrons and electrons by the nucleus, or by analysing the effect of the finite
size of the nucleus on nuclear and atomic binding energies.
Fast neutrons of about 100 MeV energy, whose wavelength is small
compared to the size of the nucleus, are scattered by nuclear targets. The fraction
of neutrons scattered at various angles can be used to deduce the nuclear size.
For example, in the scattering of a high energy particle by a hard sphere, V = ∞
for r < R, V = 0 for r > R, all the incident particles within a cross-sectional area
of $2\pi R^2$ are scattered. The factor of 2 is due to the diffraction of the waves at the
edges. The results of these experiments indicate that the radius of a nucleus is
given by
$$R \approx r_0 A^{1/3} \tag{9.11}$$
where A is the mass number and r0 ≈ 1.3 – 1.4 fm. The scattering can be done
with proton beams as well. In this case, however, the effects due to Coulomb
interaction have to be separated out. The observations are in agreement with
the result in Eq. (9.11) with r0 ≈ 1.3 – 1.4 fm.
The scattering of fast electrons of energy as high as $10^{4}$ MeV, with a
wavelength of about 0.1 fm, has the advantage that it can directly measure the
charge density inside a nucleus. The results of the experiment are in agreement
with Eq. (9.11) but with a somewhat smaller value of r0 ≈ 1.2 fm. The slight
difference in the value of r0 may be ascribed to the fact that the electron scattering
measures the charge density whereas the neutron and proton scattering
experiments measure the region of large nuclear potential, which may be
expected to be somewhat larger than the size of the nucleus.
The finite size of the nucleus modifies the atomic potential (– Z/r) at short
distances. This gives rise to a small separation between the spectral lines of
atoms with the same Z value but different A values—this is known as isotope
shift. The shifts can be used to deduce the nuclear radius. The isotope shift is
much larger in muonic atoms (which have a muon in place of an electron) since
the radii of the muonic orbits are smaller than the electronic orbits by a factor of
about 200 (mµ ≈ 200 me). However, the accuracy of measurements in muonic
atoms is lower since the muons have a short lifetime, about 2 × $10^{-6}$ s. Finally,
the measurement of differences in the binding energies of mirror nuclei can
give an estimation of the nuclear radius. The mirror nuclei are nuclei which are
identical except that one proton is replaced by a neutron. They may be
characterized by $^{2Z+1}_{Z+1}$X, $^{2Z+1}_{Z}$Y. The difference between their binding energies
may be ascribed to the two different charges. A model calculation with assumed
charge distribution then provides an estimation for the nuclear radius. All these
approaches are in essential agreement with Eq. (9.11) with r0 ≈ 1.2 fm.
An important consequence of Eq. (9.11) is that the volume per nucleon is
the same for all nuclei:
$$V_1 = \frac{4\pi}{3}\,(r_0 A^{1/3})^3 / A = \frac{4\pi}{3}\,r_0^3 \tag{9.12}$$
Thus, the nuclear density is the same for all nuclei. The result is in agreement
with what might be expected from the strong, short-range forces in nuclei.
Furthermore, it implies that the nuclear forces are independent of the charge of
the nucleons. This is known as charge independence of nuclear forces.
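Equations (9.11) and (9.12) can be turned into numbers with a few lines. The sketch assumes a mass of about one atomic mass unit per nucleon when estimating the density; that approximation, and the function names, are not from the text.

```python
import math

M_U_KG = 1.66053906660e-27   # atomic mass unit, kg


def nuclear_radius_fm(a, r0_fm=1.2):
    """R = r0 A^(1/3), Eq. (9.11), in fm."""
    return r0_fm * a ** (1.0 / 3.0)


def nuclear_density(r0_fm=1.2):
    """Mass density from the volume per nucleon (4 pi / 3) r0^3, Eq. (9.12)."""
    r0_m = r0_fm * 1e-15
    return M_U_KG / (4.0 / 3.0 * math.pi * r0_m**3)


r_pb = nuclear_radius_fm(208)
rho = nuclear_density()
print(f"R(A = 208) = {r_pb:.2f} fm, density = {rho:.1e} kg/m^3")
```

The density, of order 10^17 kg/m^3, does not depend on A, which is the content of Eq. (9.12).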
$$\Delta E = \frac{1}{2}A\hbar^2\,[F(F+1) - I(I+1) - J(J+1)] \tag{9.16}$$
which leads to a separation of
$$E_F - E_{F-1} = A\hbar^2 (I+J),\ A\hbar^2 (I+J-1),\ \ldots,\ A\hbar^2 (|I-J|+1) \tag{9.17}$$
between the successive states in Eq. (9.15). The analysis of the spectral lines
corresponding to these levels gives the value of I (and also of J).
The spin of the nuclei can also be determined from the spectra of
homonuclear molecules. It was observed in Chapter 5 that the transitions in
$$Q = \frac{1}{e}\int (3z^2 - r^2)\,\rho(\mathbf{r})\,dV \tag{9.20}$$
where ρ(r) is the charge density distribution in the nucleus. For a spherically
symmetric ρ(r), Q is zero whereas for an ellipsoidal (ellipsoid is obtained by
rotating an ellipse about one of its axes) distribution,
$$Q = \frac{2q}{5e}\,(a^2 - b^2) \tag{9.21}$$
where a is the semi-axis along the axis of rotation, b is the semi-axis perpendicular to it and q is the total charge
(Q > 0 implies an elongated or prolate nucleus and Q < 0 implies a flattened
or oblate nucleus).
The nuclear quadrupole moments are determined from their effect on the
hyperfine structure of the atomic spectra. The observed values of Q range from
Q = – $10^{-28}$ m$^2$ for $^{123}$Sb to 8 × $10^{-28}$ m$^2$ for $^{176}$Lu, while deuteron has a value of
Q = 2.74 × $10^{-31}$ m$^2$.
Yukawa Forces
One of the important modern ideas of forces is that forces between particles
arise from the exchange of particles. The form of the resulting potential can be
deduced from the following arguments.
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi = 0 \tag{9.22}$$
in free space. This equation may be regarded as arising from the relation
$\frac{1}{\hbar^2 c^2}(E^2 - c^2p^2) = 0$ for the photon with zero mass, by the quantum
mechanical replacements in Eqs. (3.10) and (3.11). For a particle with nonzero
mass m, one may instead start with the relation
$$\frac{1}{\hbar^2 c^2}\,(E^2 - c^2 p^2 - m^2 c^4) = 0 \tag{9.23}$$
Implementing the quantum mechanical replacements in Eqs. (3.10) and
(3.11) and including a point source, the equation for the potential comes out to
be
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{m^2c^2}{\hbar^2}\right)\phi_m = g\,\delta(\mathbf{r}) \tag{9.24}$$
where g is a constant. The static solution to this equation is found to be
$$\phi_m = -\frac{g}{4\pi r}\exp[-r\,(mc/\hbar)] \tag{9.25}$$
which is the well known Yukawa potential. This potential has an approximate
range of r0 given by
$$r_0 = \hbar/mc \tag{9.26}$$
i.e. essentially the Compton wavelength of the quantum of the field exchanged.
The potential decreases very rapidly for r >> r0. Yukawa argued that the nuclear
forces, which have a range of r0 ≈ $10^{-15}$ m, arise from the exchange of a particle
of mass m ≈ ħ/r0c, which for r0 ≈ $10^{-15}$ m, comes out to be (expressed as rest
energy)
m ≈ 200 MeV (9.27)
The π-meson with a mass of about 140 MeV, would be a good candidate for
the quantum whose exchange gives rise to nuclear forces. Of course, there are
additional contributions to the interaction from the exchange of other, heavier
particles but with correspondingly shorter ranges [see Eq. (9.26)].
The short-range nature of the nuclear force is due to the rapidly-decreasing
exponential function. In contrast, the electromagnetic forces arising from the
exchange of zero-mass photons, have a long range.
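The relation between range and exchanged mass in Eq. (9.26) is conveniently evaluated with ħc ≈ 197.3 MeV fm. That constant is a standard value not quoted in the text, and the function names are hypothetical.

```python
HBAR_C = 197.327   # hbar * c in MeV fm (standard value, not from the text)


def exchange_mass_mev(range_fm):
    """Rest energy m c^2 = hbar c / r0 of the exchanged quantum, from Eq. (9.26)."""
    return HBAR_C / range_fm


def force_range_fm(mass_mev):
    """Range r0 = hbar / (m c) for a quantum of rest energy mass_mev."""
    return HBAR_C / mass_mev


m_yukawa = exchange_mass_mev(1.0)     # range of 1 fm
r_pion = force_range_fm(140.0)        # pion of about 140 MeV
print(f"m ~ {m_yukawa:.0f} MeV, pion range ~ {r_pion:.2f} fm")
```

A range of 1 fm corresponds to a rest energy close to 200 MeV, and a 140 MeV pion gives a range of about 1.4 fm.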
Nucleon-Nucleon Interaction
The forces between two nucleons form the basis of nuclear interactions. An
important part of these forces is from the exchange of π-mesons.
A nucleon is continuously emitting and absorbing virtual π-mesons, and is
effectively surrounded by a cloud formed by them. These processes are regarded
as virtual since total energy and momentum conservation would forbid them,
but the uncertainty principle allows them to take place over short distances and
times. The mesons emitted by the nucleon may be absorbed by another nucleon
(Fig. 9.2). This gives rise to the forces between the nucleons.
The exchange of a charged meson [Fig. 9.2(b)] gives rise to charge-exchange
forces. They are observed, for example, when a beam of neutrons passes through
hydrogen. The exchange of charged mesons converts some of the neutrons into
protons which are then observed in the beam with almost the same energy and
momentum as the initial neutrons. Similar to the exchange of charges are
processes that lead to an exchange of spin, and exchange of charge and spin.
There are additional forces due to relativistic corrections, many-body forces, etc.
Clearly, the nuclear forces, even in a two-nucleon system, are very complicated.
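[Fig. 9.2: meson exchange between two nucleons. (a) Exchange of a neutral π0 between p-n, p-p and n-n pairs. (b) Exchange of a charged π+ or π– between a neutron and a proton, which interchanges the two.]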
The second relation, Eq. (9.28), is valid only for the states which are allowed
by Fermi-Dirac statistics for the two protons, i.e. even l for S = 0 and odd l for
S = 1. The electromagnetic interaction will violate charge independence and
introduce small corrections to the above relations.
The model of a nucleon surrounded by a cloud of virtual π-mesons, provides
a qualitative explanation for the magnetic moments of the proton and the neutron.
In this picture, a neutron spends part of the time in the virtual (p + π–) state. In
this virtual state, the orbital motion of π– gives rise to a substantial negative,
magnetic moment. Similarly a proton spends part of the time in the virtual
(n + π+) state. In the (n + π+) state, the orbital motion of π+ gives rise to a large
positive, magnetic moment. These descriptions are in qualitative agreement
with the observed gyromagnetic ratios for the neutron and the proton [Eq. (9.1)].
$$E \approx \frac{\hbar^2}{m_p r_0^2} + V \tag{9.30}$$
where V is the average potential energy. Taking r0 ≈ 1 fm and E ≈ – 2 MeV
(binding energy of the deuteron)
V ≈ – 40 MeV (9.31)
This may be compared with an electrostatic energy of about 1 MeV when
two protons are separated by a distance of about 1 fm. Nuclear interaction is
thus seen to be much stronger than electromagnetic interaction, which explains
the term strong interaction used to describe it.
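The estimate in Eqs. (9.30) and (9.31) and the comparison with the Coulomb energy can be checked numerically. The constants ħc ≈ 197.3 MeV fm and e²/4πε0 ≈ 1.44 MeV fm are standard values assumed here, not taken from the text.

```python
HBAR_C = 197.327        # hbar * c, MeV fm (assumed standard value)
COULOMB_K = 1.43996     # e^2 / (4 pi eps0), MeV fm (assumed standard value)
M_P_MEV = 938.256       # proton rest energy quoted in the text, MeV


def kinetic_estimate(r0_fm):
    """hbar^2 / (m_p r0^2) in MeV, the first term of Eq. (9.30)."""
    return HBAR_C**2 / (M_P_MEV * r0_fm**2)


def average_potential(e_total_mev, r0_fm):
    """V = E - hbar^2 / (m_p r0^2), rearranged from Eq. (9.30)."""
    return e_total_mev - kinetic_estimate(r0_fm)


def coulomb_energy(r_fm):
    """Electrostatic energy of two protons a distance r apart, in MeV."""
    return COULOMB_K / r_fm


v_avg = average_potential(-2.0, 1.0)
e_coul = coulomb_energy(1.0)
print(f"V ~ {v_avg:.0f} MeV, Coulomb energy ~ {e_coul:.2f} MeV")
```

The potential comes out near the −40 MeV of Eq. (9.31), roughly thirty times the Coulomb energy at the same separation.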
Shell Model
In the shell model, nucleons are assumed to move independently of each other
in an average, centrally-symmetric potential. They occupy discrete energy levels
in this potential, taking the Pauli exclusion principle into account. The grouping
together of some of the energy levels gives rise to a shell structure of the
nucleus. The independent motion or equivalently the long, mean free path is
partially justified by the Pauli principle which forbids transitions to states that
are already occupied.
Experimentally, it is found that nuclei with the number of neutrons or protons
equal to,
Z or (A – Z) = 2, 8, 20, 28, 50, 82, 126 (9.32)
are especially stable. These numbers are known as magic numbers. The stability
is particularly pronounced for nuclei with both the number of protons and
neutrons equal to magic numbers, e.g. $^{4}_{2}$He, $^{16}_{8}$O, $^{40}_{20}$Ca, $^{48}_{20}$Ca, $^{208}_{82}$Pb. The stability
of $^{4}_{2}$He leads to its being the only composite nucleus emitted in radioactive
decays. It is also found that the 3rd, 9th, 21st, 29th, 51st, 83rd and 127th neutron
or proton is loosely bound (in fact $^{5}_{2}$He is unstable).
For describing the shell structure of nuclei, two of the potentials used are
the spherical-well potential and the simple harmonic oscillator potential. Since
the oscillator levels have already been deduced (Sec. 3.12, example 6), the
details for this case are given. The wave functions of the 3-dimensional oscillator
are products of the 1-dimensional wave functions and the energies are the sums
of the corresponding, 1-dimensional, equispaced energy levels:
$$E = \hbar\omega\,(n + 3/2), \qquad n = n_x + n_y + n_z \tag{9.33}$$
Now, corresponding to each n level, there are several degenerate states, e.g.
for n = 1, there are three states with nx or ny or nz equal to 1. These states
correspond to states with different angular momenta (n = 0, l = 0), (n = 1, l = 1),
(n = 2, l = 2, 0) etc. with the number of corresponding states (taking spin states
into account), being 2, 6, 12 etc. However, the actual potential cannot be
simulated by the oscillator potential at large distances. Since it goes to zero at
large distances, the larger angular momentum states are lowered with respect to
the smaller angular momentum states. The ordering of the energy levels with
the degeneracy removed is shown in Fig. (9.3). It should be noted that while
these levels explain the magic numbers 2, 8, 20 corresponding to closed shells,
they cannot explain the other magic numbers.
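The counting of states for the oscillator levels can be made explicit. In the sketch below the degeneracy of level n is obtained by summing 2(2l + 1) over l = n, n − 2, ..., which equals (n + 1)(n + 2); the function names are hypothetical.

```python
def oscillator_degeneracy(n):
    """Number of states in oscillator level n, including the two spin states."""
    # Allowed angular momenta are l = n, n-2, ..., down to 0 or 1
    return sum(2 * (2 * l + 1) for l in range(n, -1, -2))


def closed_shell_numbers(n_max):
    """Cumulative occupation after filling levels n = 0 .. n_max."""
    totals, running = [], 0
    for n in range(n_max + 1):
        running += oscillator_degeneracy(n)
        totals.append(running)
    return totals


degeneracies = [oscillator_degeneracy(n) for n in range(3)]
shells = closed_shell_numbers(6)
print(degeneracies, shells)
```

The closures of the pure oscillator come out as 2, 8, 20, 40, 70, 112 and 168, so only the first three coincide with the magic numbers of Eq. (9.32).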
[Fig. 9.3: ordering of the nuclear energy levels. Columns, from left to right: oscillator levels (n = 0 to 6); perturbed oscillator levels (s, p, d, f, g, h, i); levels split by spin-orbit coupling, labelled by j; resulting shell closures at 2, 8, 20, 28, 50, 82 and 126.]
j . s j ( j + 1) + s ( s + 1) − l (l + 1)
where as = = (9.36)
j. j 2 j ( j + 1)
j . l j ( j + 1) − l (l + 1) + s ( s + 1)
al = = (9.37)
j. j 2 j ( j + 1)
Now, for a given j, the allowed values of l are j ∓ 1/2, and the corresponding magnetic moments, in units of $e\hbar/2m_p$ called the nuclear magneton, are
$$\mu = g_s + \left(j - \tfrac{1}{2}\right) g_l, \qquad l = j - \tfrac{1}{2},$$
$$\mu = -\frac{j}{j+1}\, g_s + \frac{j\,(j + 3/2)}{j+1}\, g_l, \qquad l = j + \tfrac{1}{2} \qquad (9.38)$$
The plots of these moments as functions of j give what are known as Schmidt lines. The experimental values of the magnetic moments are not in good agreement with the predictions of Eq. (9.38) but do lie between the two values. This suggests the need for a more detailed analysis including a mixing of states, e.g. the states may contain components in which the pairs of nucleons do not pair off to give zero angular momentum states.
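The Schmidt values of Eq. (9.38) are easily tabulated. The sketch below assumes the convention implied by that equation, in which g_s is the intrinsic moment of the odd nucleon in nuclear magnetons. The numerical values g_s ≈ 2.793 (proton) and –1.913 (neutron), and g_l = 1 (proton) and 0 (neutron), are standard values supplied here as assumptions; they are not quoted in this section.

```python
import math

# Assumed inputs (nuclear magnetons); not taken from this section.
G_S = {"proton": 2.793, "neutron": -1.913}
G_L = {"proton": 1.0, "neutron": 0.0}


def schmidt_moment(j: float, l: int, nucleon: str) -> float:
    """Single-particle magnetic moment from Eq. (9.38)."""
    gs, gl = G_S[nucleon], G_L[nucleon]
    if math.isclose(l, j - 0.5, abs_tol=1e-9):
        return gs + (j - 0.5) * gl
    if math.isclose(l, j + 0.5, abs_tol=1e-9):
        return -j / (j + 1) * gs + j * (j + 1.5) / (j + 1) * gl
    raise ValueError("l must equal j - 1/2 or j + 1/2")


if __name__ == "__main__":
    for j in (0.5, 1.5, 2.5):
        upper = schmidt_moment(j, int(j - 0.5), "proton")
        lower = schmidt_moment(j, int(j + 0.5), "proton")
        print(f"odd proton, j = {j}: {upper:.3f} and {lower:.3f}")
```

Plotting the two values against j for each nucleon type gives the pair of Schmidt lines between which the measured moments are said to lie.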
Quadrupole moments: The predictions of the shell model for electric
quadrupole moments are not in good agreement with the experimental values.
If the quadrupole moment of an odd Z, odd A nucleus is due to the last proton,
it should be approximately of the order
$$Q \approx R^2 \qquad (9.39)$$
where R is the radius of the nucleus. While this is the case for small nuclei, some of the nuclei with large A have Q as large as $10R^2$. Similarly large quadrupole moments are observed for even Z, odd A nuclei as well. Many of these effects
are due to collective motions in nuclei, which are considered in the collective
model.
The shell model can be generalized by taking the average potential to be an
asymmetric harmonic oscillator potential. For example, in the Nilsson model
the force constant in the z-direction is taken to be different from those in the
x- and y-directions. This model retains the rotational symmetry about the z-axis
while being able to describe the observed large quadrupole moments of nuclei.
Collective Model
For nuclei with a closed shell plus one or a few nucleons, the elementary shell
model is quite successful in describing the nuclear properties. However, when
there are several nucleons outside the closed shell, the nucleus is significantly
deformed. The motion of the deformed nucleus gives rise to collective rotational
and vibrational levels of the nucleus.
In the deformed nucleus which is assumed to be ellipsoidal in shape, the
rotation can be of two types:
(i) it may be irrotational, as in the case of tidal waves, with no part of the nucleus actually going around the nucleus,
(ii) the whole nucleus may rotate as a rigid body.
Both these motions may contribute to the rotational motion of a nucleus.
In even Z, even A nuclei, the angular momenta of the nucleons pair off to a
zero value, so that the total angular momentum is also the angular momentum
due to collective rotation. Accordingly, the rotational energy levels are given
by
$$E_I = \frac{\hbar^2}{2\mathcal{I}}\, I(I+1) \qquad (9.40)$$
where $\mathcal{I}$ is the moment of inertia and I is the total angular momentum quantum number. However, since the remaining wave function (other than the rotational part) satisfies the required exchange symmetry, the rotational wave function must be even under r → –r, which effects an interchange of particles. Because of the relation $Y_{lm}(\pi - \theta, \pi + \phi) = (-1)^l\, Y_{lm}(\theta, \phi)$, this implies that only I = 0, 2, 4, ... are allowed. The observed energy levels for 238Pu, shown in Fig. 9.4(a), are in very good agreement with the levels predicted by Eq. (9.40) with I even and a moment of inertia
$$\mathcal{I} \approx 1.4 \times 10^{-54}\ \mathrm{kg\,m^2} \qquad (9.41)$$
[Fig. 9.4: Energy-level diagrams, E in MeV. (a) 238Pu: 0+ (0), 2+ (0.0441), 4+ (0.1460), 6+ (0.3036), 8+ (0.514). (b) 25Al: 5/2+ (0), 7/2+ (1.61), 9/2+ (3.44).]
Fig. 9.4 Energy levels for collective rotation for (a) 238Pu, even Z, even A nucleus, (b) 25Al, odd A nucleus.
This is quite large compared to the value of $m_p R^2 \approx 10^{-55}\ \mathrm{kg\,m^2}$ expected for the motion of a single nucleon, thus justifying the interpretation in terms of collective motion.
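The I(I + 1) pattern of Eq. (9.40) can be tested against the 238Pu levels quoted in Fig. 9.4(a). The sketch below is only an illustration: it fixes the rotational constant (the coefficient of I(I + 1), in MeV) from the 2+ level alone, which is a choice made here rather than a fit taken from the text, and then compares the predicted 4+, 6+ and 8+ energies with the observed ones.

```python
# Observed 238Pu levels from Fig. 9.4(a), in MeV, keyed by I.
OBSERVED_MEV = {2: 0.0441, 4: 0.1460, 6: 0.3036, 8: 0.514}


def rotational_constant(levels: dict) -> float:
    """Coefficient of I(I + 1) in Eq. (9.40), fixed from the 2+ level (MeV)."""
    return levels[2] / (2 * (2 + 1))


def predicted_energy(I: int, a: float) -> float:
    """Rotational energy a * I * (I + 1) in MeV."""
    return a * I * (I + 1)


if __name__ == "__main__":
    a = rotational_constant(OBSERVED_MEV)
    for I, observed in OBSERVED_MEV.items():
        predicted = predicted_energy(I, a)
        deviation = 100 * (predicted - observed) / observed
        print(f"I = {I}: predicted {predicted:.4f}, observed {observed:.4f} "
              f"({deviation:+.1f}%)")
```

With this single parameter the higher levels are reproduced to within a few per cent, the small systematic excess of the prediction growing with I.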
For odd A nuclei, the angular momentum is due both to the angular momentum of the odd nucleon and to collective rotational motion. Since the nucleon moves in a non-spherical, axially symmetric potential, only the component of its angular momentum along the axis of symmetry is a constant of motion. This component, designated by Ω, adds vectorially to the collective angular momentum R which is perpendicular to the axis of symmetry, to give the total angular momentum I. When Ω ≠ 1/2, the rotational levels are given by
$$E_I = \frac{\hbar^2}{2\mathcal{I}}\left[I(I+1) - I_0(I_0+1)\right] \qquad (9.42)$$
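$$E_m(n) = C\left(\frac{n}{A}\right)^{2/3} \qquad (9.44)$$
where
$$C = \left(\frac{9}{32\pi^2}\right)^{2/3} \frac{h^2}{2 m r_0^2} \approx 52\ \mathrm{MeV} \qquad (9.45)$$
for $r_0 \approx 1.2$ fm. Similarly the total kinetic energy is given by
$$E_r = \int \frac{p^2}{2m}\, dn = \frac{3}{5}\, n E_m \qquad (9.46)$$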
For a nucleus with Z protons and (A – Z) neutrons, the total kinetic energy
is given by
$$E = \frac{3C}{5A^{2/3}}\left[Z^{5/3} + (A - Z)^{5/3}\right] \qquad (9.47)$$
If Z = A/2, the kinetic energy of the last nucleon is obtained from Eq. (9.44) to be E ≈ 33 MeV. Since the binding energy of the last nucleon is about 8 MeV, the average depth of the potential is of the order of 41 MeV. This is in agreement with the earlier estimate in Eq. (9.31) based on the uncertainty principle. It is also seen that for a given A,
$$\frac{\partial E}{\partial Z} = 0 \quad \text{for} \quad Z = \frac{A}{2} \qquad (9.48)$$
i.e. the most stable nucleus has equal numbers of protons and neutrons. Expanding E about Z = A/2,
$$E = \frac{3C}{5\,(2)^{2/3}}\, A + \frac{2\,(2^{1/3})\, C}{3A}\left(Z - \tfrac{1}{2}A\right)^2 + \ldots \qquad (9.49)$$
The second term, which gives the increase in the energy because of the imbalance of the protons and neutrons, is
$$\delta E \approx 43.7\, \frac{\left(Z - \tfrac{1}{2}A\right)^2}{A}\ \mathrm{MeV} \qquad (9.50)$$
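The numbers in Eqs. (9.45) to (9.50) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only. It assumes ħc ≈ 197.327 MeV fm and an average nucleon rest energy of about 938.9 MeV, neither of which is quoted in this section, and uses h = 2πħ to rewrite the prefactor of Eq. (9.45) as (9π/4)^(2/3) ħ²/(2 m r0²).

```python
import math

HBAR_C = 197.327       # MeV fm (assumed constant)
NUCLEON_MC2 = 938.9    # MeV, average nucleon rest energy (assumed)


def fermi_constant(r0: float = 1.2) -> float:
    """C of Eq. (9.45) in MeV, for r0 in fm."""
    return (9 * math.pi / 4) ** (2 / 3) * HBAR_C**2 / (2 * NUCLEON_MC2 * r0**2)


def fermi_energy(n: float, A: float, C: float) -> float:
    """Kinetic energy of the last of n like nucleons, Eq. (9.44)."""
    return C * (n / A) ** (2 / 3)


def total_kinetic_energy(Z: float, A: float, C: float) -> float:
    """Total kinetic energy of Z protons and A - Z neutrons, Eq. (9.47)."""
    return 3 * C / (5 * A ** (2 / 3)) * (Z ** (5 / 3) + (A - Z) ** (5 / 3))


def asymmetry_coefficient(C: float) -> float:
    """Coefficient of (Z - A/2)^2 / A in Eq. (9.49)."""
    return 2 * 2 ** (1 / 3) * C / 3


if __name__ == "__main__":
    print(f"C from constants: {fermi_constant():.1f} MeV")
    print(f"last nucleon, Z = A/2, C = 52: {fermi_energy(50, 100, 52.0):.1f} MeV")
    print(f"asymmetry coefficient, C = 52: {asymmetry_coefficient(52.0):.1f} MeV")
```

With these constants C comes out near 53 MeV, close to the quoted 52 MeV; using C = 52 MeV reproduces both the 33 MeV of the last nucleon and the coefficient 43.7 MeV of Eq. (9.50).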
In this analysis, the Coulomb interaction, which increases the potential for the protons, has been ignored. Because of the Coulomb interaction, in stable, heavy nuclei, one finds more neutrons than protons. An important point brought out by the model is that the Pauli principle would push up the energy of a nucleus with Z very different from A/2, and implies that the least-energy state is the one with nearly equal numbers of protons and neutrons.
because of the imbalance of protons and neutrons. Finally, it is noted that the Pauli principle allows pairs of protons and neutrons with spin 1/2 to occupy the same energy state whereas an odd proton or neutron is forced to go into a higher energy state. This effect is included by a pairing term
all in MeV. The expression in Eq. (9.57) is known as the Weizsäcker mass
formula and the values in Eq. (9.58) give the best fit to the binding energy plot
in Fig. (9.1). This formula is of considerable use in the analysis of the stability
of nuclei.
modes, then its lifetime is the inverse of the sum of decay probabilities λi,
$$\tau = \left(\sum_i \lambda_i\right)^{-1} \qquad (9.61)$$
This quantity τ is the average lifetime of the nucleus (see Sec. 6.4). For example, 238U has a lifetime of about $6.5 \times 10^{9}$ years, comparable with the age of the universe.
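Equation (9.61) and the relation between the average lifetime and the half-life (half-life = τ ln 2 ≈ 0.69 τ) are illustrated below. This is a minimal sketch; the two decay probabilities in the first example are hypothetical numbers chosen only to show the arithmetic.

```python
import math


def mean_lifetime(decay_constants) -> float:
    """Average lifetime for a nucleus with several decay modes, Eq. (9.61)."""
    return 1.0 / sum(decay_constants)


def half_life(tau: float) -> float:
    """Half-life corresponding to an average lifetime tau."""
    return tau * math.log(2)


if __name__ == "__main__":
    # Hypothetical branches with decay probabilities 0.2 and 0.3 per unit time.
    print(mean_lifetime([0.2, 0.3]))
    # 238U: average lifetime of about 6.5e9 years, as quoted in the text.
    print(f"{half_life(6.5e9):.3g} years")
```

For 238U the half-life comes out at about 4.5 × 10^9 years.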
The allowed decays in general must satisfy certain conservation laws such
as charge conservation, energy-momentum conservation, etc. γ-decays involve
changes only in the energy levels, the components of the nucleus remaining the
same. Here, we concentrate on β-decay, α-decay, and fission, which alter the
composition of the nucleus.
Conservation of energy allows only those decays which satisfy the following
rules:
1. A nucleus is unstable against the emission of an electron [Eq. (9.8)], i.e.
β–-decay, if
M(Z, A) > M(Z + 1, A) + me (9.62)
It is unstable against the absorption of an electron [Eq. (9.10)] if
M(Z, A) + me > M(Z – 1, A) (9.63)
It is unstable against the emission of a positron [Eq. (9.9)], i.e. β+-decay, if
M(Z, A) > M(Z – 1, A) + me (9.64)
2. A nucleus is unstable against breakup into two fragments if
M(Z, A) > M(Z′, A′) + M(Z – Z′, A – A′) (9.65)
In particular, it may decay by emitting a proton if
M(Z, A) > M(Z – 1, A – 1) + mp (9.66)
or by emitting an α particle if
M(Z, A) > M(Z – 2, A – 4) + mα (9.67)
It may be noted that because of the large binding energy per nucleon, the
decays via the emission of a heavy particle are important mainly in heavy nuclei.
For determining the stability pattern of lighter nuclei, it is necessary to consider
only electron emission or absorption.
Beta Decay
Consider first an odd A nucleus. The Z value for the most stable nucleus is
given by the condition
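$$\left.\frac{\partial M(Z, A)}{\partial Z}\right|_{Z = Z_A} = 0 \qquad (9.68)$$
that is,
$$(m_p - m_n) + 2a_3 Z_A A^{-1/3} + 2a_4\left(Z_A - \tfrac{1}{2}A\right) A^{-1} = 0 \qquad (9.69)$$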
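Equation (9.69) is linear in Z_A and can be solved directly. The sketch below is only an estimate under stated assumptions: the coefficients a3 ≈ 0.71 MeV and a4 ≈ 95 MeV are read off the Coulomb and asymmetry terms as they appear in Eq. (9.78), and the neutron-proton mass difference of about 1.293 MeV is a standard value not quoted in this section.

```python
MN_MINUS_MP = 1.293   # MeV, neutron-proton mass difference (assumed)
A3 = 0.71             # MeV, Coulomb coefficient as it appears in Eq. (9.78)
A4 = 95.0             # MeV, asymmetry coefficient as it appears in Eq. (9.78)


def most_stable_z(A: int) -> float:
    """Solve Eq. (9.69) for Z_A at a given mass number A."""
    numerator = MN_MINUS_MP + A4
    denominator = 2 * A3 * A ** (-1 / 3) + 2 * A4 / A
    return numerator / denominator


if __name__ == "__main__":
    for A in (64, 101):
        print(f"A = {A}: Z_A = {most_stable_z(A):.2f}")
```

For A = 101 the result is close to 44, in line with 101Ru being the stable member in Fig. 9.5(a). For A = 64 it is close to 29, the position of 64Cu in Fig. 9.5(b), which lies between the two stable nuclei 64Ni and 64Zn.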
[Fig. 9.5: M(Z) – M(ZA) in MeV plotted against Z. (a) A = 101, Z = 42 to 46, with 101Ru at the minimum. (b) A = 64, Z = 27 to 31, with 64Ni and 64Zn stable and 64Cu decaying by both e– and e+ emission.]
Fig. 9.5 (a) Stability of nuclei with A = 101, (b) stability of nuclei with A = 64.
A point of caution: the usual masses of nuclei quoted are the masses of the
corresponding neutral atoms and hence include an additional mass of Z electrons.
For the sake of clarity, we refer only to the masses of the nuclei.
Alpha Decay
It is observed (Fig. 9.1) that the binding energy per nucleon decreases as nuclear
mass increases (A > 56). Therefore, a heavy nucleus would, in some circumstances, prefer to decay into lighter nuclei. However, a decay by emission
of only a proton or neutron is not observed since each nucleon in a nucleus has
a binding energy of about 8 MeV whereas the binding energy of a free nucleon
is zero. On the other hand, a decay by emission of an α particle (4He), which
has a binding energy of about 7.1 MeV per nucleon, is quite likely and is observed
in many nuclei.
The properties of α-decay, represented in Eq. (9.5), may be illustrated by taking a specific example. Consider the α-decay of 212Bi,
212Bi → 208Tl + 4He (9.72)
If Tl is in its ground state, the initial mass exceeds the final mass by 6.203
MeV. This appears in the form of kinetic energy shared by the final particles.
Since momentum conservation implies that Tl and He have equal and opposite
momenta, their kinetic energies are inversely proportional to their masses. This
means that the α particle is ejected with a kinetic energy
$$E_\alpha \approx \frac{208}{212} \times 6.203 \approx 6.086\ \mathrm{MeV} \qquad (9.73)$$
Alternatively, Tl may be in one of its excited states [see Fig. 9.6 (a)] in
which case the kinetic energy of the α particle is
$$E_\alpha \approx \frac{208}{212}\,(6.203 - \varepsilon) \qquad (9.74)$$
where ε is the excitation energy of the state. Consequently, the α particle is
observed with essentially discrete kinetic energies, 27% of the time with 6.086 MeV, 70% of the time with 6.047 MeV, and the remaining 3% of the time with the smaller allowed energies given by Eq. (9.74). The total lifetime corresponding to these transitions is about 87.7 minutes. It is important to note that subsequent to the α-decay, typically in about $10^{-8}$ to $10^{-15}$ s, the excited Tl nucleus undergoes transitions to a lower energy state by emitting a γ ray. In the general case, the excited nucleus may also lose its energy by emitting an
electron, a proton, a neutron or another α particle. Alternatively, the excess
energy may knock out one of the electrons in the atomic orbits. This is known
as internal conversion and is usually characterized by the emission of an x-ray
photon when the vacancy created by the ejection of the electron is filled by an
electron from the higher energy levels.
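The discrete α energies follow from Eq. (9.74) once the excitation energies of 208Tl are known. The sketch below takes the level energies from the values shown in Fig. 9.6(a) and treats the mass ratio as 208/212, as in the text; the function name is illustrative.

```python
Q_GROUND_MEV = 6.203   # mass difference for decay to the 208Tl ground state

# Excitation energies of 208Tl in MeV, as read from Fig. 9.6(a).
TL208_LEVELS_MEV = [0.0, 0.04, 0.327, 0.473, 0.492, 0.617]


def alpha_kinetic_energy(excitation: float = 0.0,
                         a_daughter: int = 208,
                         a_parent: int = 212) -> float:
    """Alpha kinetic energy from Eq. (9.74), in MeV."""
    return (Q_GROUND_MEV - excitation) * a_daughter / a_parent


if __name__ == "__main__":
    for level in TL208_LEVELS_MEV:
        print(f"level {level:.3f} MeV -> alpha energy "
              f"{alpha_kinetic_energy(level):.3f} MeV")
```

The first two entries reproduce the 6.086 MeV and 6.047 MeV groups quoted above; the remaining levels give the weaker, lower-energy groups.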
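[Fig. 9.6: (a) Level scheme for 212Bi → 208Tl, with 208Tl levels at E = 0, 0.04, 0.327, 0.473, 0.492 and 0.617 MeV. (b) Potential seen by the α particle as a function of r, falling off as 1/r outside the nuclear radius R.]
Fig. 9.6 (a) Alpha decay of 212Bi and the subsequent gamma decay of 208Tl, (b) tunnelling of α-particle wave function leading to alpha decay.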
An important question which arises is the following: Why does the nucleus not undergo an instantaneous decay to an energetically allowed state? The reason for this is that when the nucleus decays, it goes through some states which have higher energy and the decay products must overcome the potential barrier. For example, when Tl and He are just touching each other before separating, they have an additional Coulomb energy
$$E_c \approx \frac{e^2\,(81)(2)}{4\pi\varepsilon_0\,(r_{\mathrm{Tl}} + r_{\mathrm{He}})} \qquad (9.75)$$
Using the values for radii given by Eq. (9.11) with r0 ≈ 1.2 fm,
Ec ≈ 26 MeV (9.76)
Classically, this is a forbidden domain. Quantum mechanically, the α particle
can penetrate and tunnel through a potential [see Fig. 9.6 (b)] with some
probability. It is this property which permits the decay, with a finite lifetime.
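The estimate in Eq. (9.76) can be reproduced as follows. The sketch assumes that Eq. (9.11) has the usual form R = r0 A^(1/3), and uses e²/4πε0 ≈ 1.44 MeV fm; neither is written out in this section.

```python
E2_COULOMB = 1.44   # e^2 / (4 pi eps0) in MeV fm (assumed constant)


def nuclear_radius(A: int, r0: float = 1.2) -> float:
    """Nuclear radius in fm, assuming R = r0 * A^(1/3)."""
    return r0 * A ** (1 / 3)


def touching_coulomb_energy(z1: int, a1: int, z2: int, a2: int) -> float:
    """Coulomb energy in MeV of two nuclei just in contact, as in Eq. (9.75)."""
    separation = nuclear_radius(a1) + nuclear_radius(a2)
    return E2_COULOMB * z1 * z2 / separation


if __name__ == "__main__":
    barrier = touching_coulomb_energy(81, 208, 2, 4)
    print(f"Tl + He in contact: {barrier:.1f} MeV")
```

The result, about 26 MeV, is roughly four times the 6.2 MeV available to the decay, which is why the α particle can escape only by tunnelling.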
Nuclear Fission
In some cases, it may be energetically favourable for a heavy nucleus to break up
into fragments of nearly equal masses, accompanied by the release of a large
amount of energy. Since heavy nuclei have an overabundance of neutrons, the
fission process is usually followed by the emission of a few neutrons. An example
of this is the fission of an excited 236U* nucleus:
236U* → 144Ba + 89Kr + 3n (9.77)
An insight into the fission process is obtained by considering a simple model
in which a nucleus (A, Z) decays into only two fragments, (αA, βZ) and
[(1 – α)A, (1– β)Z]. The energy released in this process is given by the decrease
in the masses, which may be estimated from the empirical formula in Eq. (9.57):
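$$\begin{aligned}
\Delta E ={} & 17.8\, A^{2/3}\left[1 - \alpha^{2/3} - (1 - \alpha)^{2/3}\right] \\
& + 0.71\, Z^2 A^{-1/3}\left[1 - \beta^2 \alpha^{-1/3} - (1 - \beta)^2 (1 - \alpha)^{-1/3}\right] \\
& + 95\, Z^2 A^{-1}\left[1 - \beta^2 \alpha^{-1} - (1 - \beta)^2 (1 - \alpha)^{-1}\right] \qquad (9.78)
\end{aligned}$$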
It is easy to show that
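$$\frac{\partial}{\partial\alpha}(\Delta E) = \frac{\partial}{\partial\beta}(\Delta E) = 0 \quad \text{at} \quad \alpha = \beta = \tfrac{1}{2} \qquad (9.79)$$
so that the maximum energy released is
$$(\Delta E)_{\max} \approx -4.6\, A^{2/3} + 0.26\, Z^2 A^{-1/3} \qquad (9.80)$$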
For A = 236, Z = 92, this has a value of about 180 MeV. However, the fission products, when just in contact, have an additional Coulombic energy given approximately by
$$E_c \approx \frac{e^2\,(Z/2)^2}{4\pi\varepsilon_0\,(2R)} \qquad (9.81)$$
where $R \approx 1.2\,(A/2)^{1/3}$ fm. For A = 236, Z = 92, Ec has a value of about 259 MeV so that the fission products have to escape by tunnelling through this potential barrier [Ec > (∆E)max]. The tunnelling is not necessary only if (∆E)max > Ec, which leads to
$$Z^2 > 65\,A \qquad (9.83)$$
This condition is not satisfied even in the case of the heaviest known nucleus, 262Ha (hahnium, Z = 105), for which $Z^2/A \approx 42$. Thus all the observed fission processes are expected to be produced by tunnelling through the potential barrier.
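The estimates for 236U can be checked numerically. The sketch below evaluates Eq. (9.78) with the coefficients as printed there, and the contact Coulomb energy of Eq. (9.81) with e²/4πε0 ≈ 1.44 MeV fm (a standard value assumed here); the function names are illustrative.

```python
E2_COULOMB = 1.44   # e^2 / (4 pi eps0) in MeV fm (assumed constant)


def fission_energy(A: int, Z: int, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Energy released in MeV for fragments (alpha*A, beta*Z), Eq. (9.78)."""
    surface = 17.8 * A ** (2 / 3) * (
        1 - alpha ** (2 / 3) - (1 - alpha) ** (2 / 3))
    coulomb = 0.71 * Z**2 * A ** (-1 / 3) * (
        1 - beta**2 * alpha ** (-1 / 3) - (1 - beta) ** 2 * (1 - alpha) ** (-1 / 3))
    asymmetry = 95 * Z**2 / A * (
        1 - beta**2 / alpha - (1 - beta) ** 2 / (1 - alpha))
    return surface + coulomb + asymmetry


def contact_coulomb_energy(A: int, Z: int, r0: float = 1.2) -> float:
    """Coulomb energy in MeV of two equal fragments in contact, Eq. (9.81)."""
    fragment_radius = r0 * (A / 2) ** (1 / 3)
    return E2_COULOMB * (Z / 2) ** 2 / (2 * fragment_radius)


if __name__ == "__main__":
    print(f"energy released: {fission_energy(236, 92):.0f} MeV")
    print(f"contact Coulomb energy: {contact_coulomb_energy(236, 92):.0f} MeV")
    print(f"Z^2/A for Z = 105, A = 262: {105**2 / 262:.1f}")
```

The symmetric split releases roughly 180 MeV while the fragments in contact carry about 259 MeV of Coulomb energy, so the barrier exceeds the available energy and the condition Z² > 65 A is far from being met.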
A fission process is initiated by bombarding a nucleus with neutrons, giving
an excited initial state such as 236U*, and forms the basis of nuclear fission
reactors used for tapping large nuclear energies.
Gamma Decay
A nucleus may be found in an excited state if it is one of the products in an
alpha or a beta decay. The excitation energy here is provided by the higher mass
of the decaying particle. Alternatively, the excitation may take place as a result
of a collision in which the projectile excites the nucleus by transferring to it
some of its kinetic energy. In either case, the excited nucleus usually undergoes
transition to a lower state by emitting a photon,
Z* → Z + γ (9.84)
An example of γ-decay is the de-excitation of 208Tl discussed earlier. The energy of the photon emitted is
hν ≈ E* – E (9.85)
i.e. the difference in the energies of the nuclear states.
The lifetimes for γ-decay are usually of the order of $10^{-14}$ s. However, selection rules may inhibit some of these decays, leading to much longer lifetimes. If the lifetimes of the excited states are sufficiently long so as to be directly measurable, the nuclei are known as isomers. An example of extreme isomerism is 91Nb* which has a lifetime of about 87 days.
Radioactive Series
In nature one observes several decays of radioactive elements some of which
were created in the early stages of the evolution of the universe, others which
are continuously formed by the bombardment of cosmic rays (see Sec. 10.6).
Of special interest are the ones which form what are known as the radioactive
series.
All nuclei with A > 209 are unstable and normally undergo either α-decay,
β-decay, or γ-decay. Since the changes in the number of nucleons A, are due
only to α-decay, the mass numbers A of the series of nuclei produced in a series
are related by
A = a + 4n (9.86)
Thus, four series exist corresponding to a = 0, 1, 2, 3. These are summarized below along with the half-life in years (τ1/2 ≈ 0.69 τav) of the longest-lived member:
Series       A        Longest-lived member (half-life in years)   Final stable nucleus
Thorium      4n       232Th ($1.39 \times 10^{10}$)               208Pb
Neptunium    4n + 1   237Np ($2.25 \times 10^{6}$)                209Bi
Uranium      4n + 2   238U ($4.51 \times 10^{9}$)                 206Pb
Actinium     4n + 3   235U ($7.07 \times 10^{8}$)                 207Pb
Of these, the thorium, uranium and actinium series are observed in nature.
The neptunium series is not observed in nature since its longest lived member, 237Np, has a half-life of about $2.25 \times 10^{6}$ years and whatever amount was created at the early stages of the universe would have decayed by now. However, 237Np can be artificially produced, e.g. from 236U by the capture of a neutron followed by β– decay. All these elements undergo a series of α and β– decays till they are reduced to the final stable nucleus. The details of the 238U series are shown in
Fig. (9.7) where the steps with decrease in Z of two correspond to α-decays and
steps with unit increase in Z correspond to β-decays.
[Fig. 9.7: The 238U radioactive series plotted as Z (80 to 95) against A – Z (120 to 150), running from 238U down to 206Pb.]
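$$\frac{N_{235}(t)}{N_{238}(t)} \approx \exp\left[-t\left(\frac{1}{\tau_{235}} - \frac{1}{\tau_{238}}\right)\right] \qquad (9.87)$$
Using the observed value of about 1/140 for the relative abundance,
$$t \approx \frac{\tau_{235}\,\tau_{238}}{\tau_{238} - \tau_{235}}\, \ln 140 \qquad (9.88)$$
Since $\tau_{235} \approx 1.02 \times 10^{9}$ years and $\tau_{238} \approx 6.51 \times 10^{9}$ years, the age of the elements is
$$t \approx 5.98 \times 10^{9}\ \text{years} \qquad (9.89)$$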
This is in reasonable agreement with the age determined from the abundances
of other nuclei. It is also of the same order as the age of the universe deduced
from the rate of expansion of the universe (see Sec. 11.5), and hence is an
important element in the understanding of the universe.
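The arithmetic of Eqs. (9.87) to (9.89) is shown below as a short sketch. It follows Eq. (9.87) as written, which amounts to assuming that the two isotopes started with equal abundances; the function name is illustrative.

```python
import math


def age_from_abundance(tau_short: float, tau_long: float, ratio: float) -> float:
    """Age t from Eq. (9.88).

    tau_short, tau_long: average lifetimes of the two isotopes (same units).
    ratio: present abundance of the long-lived isotope relative to the
    short-lived one (about 140 for 238U relative to 235U).
    """
    return tau_short * tau_long / (tau_long - tau_short) * math.log(ratio)


if __name__ == "__main__":
    t = age_from_abundance(1.02e9, 6.51e9, 140)
    print(f"age of the elements: {t:.3g} years")
```

The result is about 5.98 × 10^9 years. It depends only logarithmically on the abundance ratio, so a moderate error in that ratio shifts the age only slightly.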
Cross-Section
The probability for a nuclear reaction to take place is expressed in terms of the
effective cross-section σ.
Consider a beam of projectiles incident on a collection of targets in the
form of a thin plate (sufficiently thin so that the nuclei do not overlap). An
effective cross-sectional area σ [Fig. 9.8 (a)] is associated with each target such
that every projectile incident within that area produces a reaction. If there are n
targets per unit volume and t is the thickness of the plate, the probability of a
single projectile producing a reaction is
P = σnt (9.94)
(it is the fraction of the effective target area). Therefore, the fraction of projectiles
∆N/N producing reactions, is
$$\frac{\Delta N}{N} = \sigma n t \qquad (9.95)$$
which leads to
$$\sigma = \frac{\Delta N}{N n t} \qquad (9.96)$$
This relation allows us to determine the effective cross-section for a nuclear reaction. It is usually expressed in terms of barns, 1 barn = $10^{-28}$ m$^2$.
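A short numerical illustration of Eq. (9.96) follows. The beam and target figures are hypothetical, chosen only so that the answer comes out as a round number; they do not describe any particular experiment.

```python
BARN = 1e-28   # m^2


def cross_section(reactions: float, projectiles: float,
                  targets_per_m3: float, thickness_m: float) -> float:
    """Effective cross-section in m^2 from Eq. (9.96)."""
    return reactions / (projectiles * targets_per_m3 * thickness_m)


if __name__ == "__main__":
    # Hypothetical run: 1e6 projectiles on a 0.1 mm foil with 5.9e28 nuclei
    # per cubic metre, producing 590 reactions.
    sigma = cross_section(590, 1e6, 5.9e28, 1e-4)
    print(f"sigma = {sigma:.2e} m^2 = {sigma / BARN:.2f} barn")
```

The thin-target condition behind Eq. (9.94) requires σnt to be much less than 1; here it is 5.9 × 10^-4.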
Nuclear reaction cross-sections vary over large ranges. Of particular interest
are the cross-sections for neutron projectiles. In this case, the neutrons, being
neutral, can enter a nucleus with ease even at low energies. In fact, since the time spent by the neutron inside the nucleus is proportional to 1/v, v being the neutron velocity, it is expected that the effective cross-section increases as 1/v at low energies,
$$\sigma \sim \frac{1}{v} \sim E^{-1/2} \quad \text{for small } E \qquad (9.97)$$
This is observed experimentally. However, it is also observed that the
reaction cross-section has a sharp maximum for neutrons of some specific
energies E. This corresponds to what is known as resonance reaction [Fig. 9.8
(b)]. For example, the neutron capture cross-section, with Rh as a target, increases
by a factor of about 100 at an energy of about 1.4 eV. The resonance reaction
takes place when the energy of the incoming neutron equals the energy required
to transfer the combined system, of the target plus the neutron, to an excited
energy level. Similar resonant absorptions of photons are observed when the
photon energy is equal to the excitation energy of a nuclear level of the target.
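[Fig. 9.8: (a) Effective cross-sectional area σ associated with a target. (b) Neutron cross-section σ in barns (0.1 to 10000) against E in eV (0.1 to 100), showing a resonance peak.]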
Compound Nucleus
A very useful concept in the understanding of nuclear reactions is that of
compound-nucleus formation proposed by Bohr (1936). It was suggested that
for incident energies less than about 50 MeV, a nuclear reaction proceeds in two stages. In the first stage, the projectile a is captured by the nucleus, forming a compound nucleus C* which is usually an excited state of a nucleus. Such a compound nucleus has a lifetime of about $10^{-18}$ s, considerably longer than the time taken by the projectile to cross the nucleus ($t_{\mathrm{cross}} \sim R/v \approx 10^{-21}$ s). During this time, the compound nucleus will have ‘forgotten’ the way it was formed. In the second stage, this compound nucleus will decay independently of the mode of its formation. The two-stage process may be indicated by
X + a → C* → Y + b (9.98)
The corresponding cross-section is factorizable, i.e.
σ [X (a,b) Y] = F (X, a) G (Y, b) (9.99)
where F and G depend on the energy, spin, etc. of X, a and Y, b respectively. An
important result which follows immediately is that the ratio of the reaction
rates at a given energy, is independent of the initial state, i.e.
$$\frac{\sigma[X(a, b)Y]}{\sigma[X(a, b')Y']} = \frac{G(Y, b)}{G(Y', b')} \qquad (9.100)$$
An example of the reactions proceeding via a compound nucleus is given
by the following processes:
p + 26Mg, d + 25Mg or α + 23Na → 27Al* → 27Al + γ, 23Na + α, 26Al + n, 26Mg + p or 25Mg + n + p (9.101)
In these reactions, the relative cross-sections for the five final states are independent of the initial state, but depend on the energy of the system.
Direct Processes
Reactions produced by fast projectiles proceed without the formation of a
compound nucleus. They are known as direct processes. As a typical example,
consider a deuteron incident on a nucleus, say 50V. When it approaches the
target, one of the nucleons may be captured by the nucleus while the other may
continue essentially undisturbed along its path (it should be remembered that
the deuteron is a loosely bound object). The net reaction is represented by
50V + d → 51V + p (9.102)
50V + d → 51Cr + n (9.103)
Such reactions are called stripping reactions and are observed for other
projectiles, such as α particles, as well. Conversely, a fast-moving neutron (or
proton) may collect one or more nucleons and move off as a deuteron or an
α particle, e.g.
27Al + n → 24Na + α (9.104)
Such processes are known as pick-up reactions.
Applications
Nuclear reactions provide a powerful tool for the investigation of nuclear
structure and properties. From the conservation of energy, the masses of the
nuclei can be deduced, while conservation of angular momentum and the angular
distribution of particles help us to determine their angular momenta. Furthermore,
the cross-sections give important information on the interaction between nuclei.
Nuclear reactions allow us to produce trans-uranic elements as well as
elementary particles. For example, plutonium (Z = 94) is produced by
bombarding uranium with 40 MeV α particles
238U + α → 241Pu + n (9.105)
Similarly, other elements such as berkelium (Z = 97), californium (Z = 98), einsteinium (Z = 99), fermium (Z = 100), nobelium (Z = 102), etc. have been produced from nuclear reactions, the heaviest of them being hahnium, 262Ha (Z = 105). Many elementary particles also have been produced and investigated in nuclear reactions. Two examples are:
p + p → p + n + π+ (9.106)
p + p → p + Σ+ + K° (9.107)
π+ and K° being the positively charged pi-meson and the neutral K-meson, respectively, and Σ+ is the positively charged sigma baryon (see Sec. 10.1).
From the point of view of practical applications, products of nuclear reactions
are of considerable utility in industry, medicine, agriculture, etc.
Their uses may be grouped in the following broad categories:
1. Radioactive isotopes, which have the same chemical properties as a given
element but decay with the emission of photons, are of general use as
tracers. These isotopes are easily detected by their characteristic radiation
and half-life. For example, the phosphorus isotope 32P (decays via beta
decay with half-life of 14.2 days) may be used to determine the proper
application of fertilizers to plants. Radioactive tracers are also used for
analysing blood circulation, flow rates of fluids, etc.
2. Neutron activation analysis is used for detecting small amounts of
impurities. The impurities are activated by exposing them to neutron
beams. The radioactive isotope formed as a result is estimated by its
nucleus therefore is in an excited state and can decay by fission, generally into
parts with mass numbers of the order of 95 and 140. The neutron numbers in
these fragments are close to the magic numbers 50 and 82 which lead to stable
nuclear structures.
Since the heavy nuclei have an excess of neutrons, the fission process is
usually accompanied by the emission of neutrons. There is a further reduction
of the neutrons in the nuclei, due to either β-decay or emission of delayed
neutrons. A typical process would be
235U + n → 236U* → 137I + 97Y + 2n (9.110)
followed by
97Y → 97Zr + e– + ν̄
        ↓ 17 h
      97Nb + e– + ν̄
        ↓ 74 min
      97Mo + e– + ν̄ (9.111)
137I → 137Xe + e– + ν̄
        ↓ 22 s
      136Xe + n (9.112)
137I emits a neutron after its β-decay into 137Xe, and since β decays proceed slowly, there is a large time delay between the fission and the emission of this
neutron. Such neutrons are called delayed neutrons.
The most important feature of neutron-induced fission is that the fission
itself provides additional neutrons which can produce additional reactions. Thus,
the process can build into a self-sustaining, possibly a growing, chain reaction
which is the basis of fission reactors. In the following, the main elements of a
fission reactor are discussed briefly.
Reactor Fuel
The essential process in a reactor is the fission of nuclei, accompanied by
neutrons and a release of energy. From a practical point of view, it would be
required that sufficient quantities of these nuclei, which form the reactor fuel,
occur naturally or can be produced. Five such nuclei are 235U, 239Pu, 233U which contain an odd number of neutrons, and 238U and 232Th which contain an even number of neutrons. The first three of these can undergo fission with the capture of a thermal neutron, whereas the last two undergo fission mainly through the capture
of fast neutrons (the captured neutron in these two cases would be an odd neutron
and hence would release less energy). It is, however, important to note that the
capture of neutrons by 238U and 232Th leads to fissionable fuel:
238U + n → 239U + γ
             ↓ 23 min
           239Np + e– + ν̄
             ↓ 2.3 days
           239Pu + e– + ν̄ (9.113)
232Th + n → 233Th + γ
             ↓ 24 min
           233Pa + e– + ν̄
             ↓ 27 days
           233U + e– + ν̄ (9.114)
In most cases, the fuel is in the form of rods or plates which are placed in a
regular array within a moderator which serves the purpose of slowing down the
neutrons to thermal energies, i.e. energies of the order of 0.1 eV. The fission
reaction is triggered either by secondary cosmic ray neutrons i.e. neutrons
produced by the cosmic rays, or neutrons from a small neutron source (usually
containing a source of α particles which react with beryllium to produce
neutrons). The neutrons emitted, if suitably controlled, can then produce chain
reactions.
As a specific example, the source may be 235U which occurs in nature
(0.7% 235U along with 99.3% 238U). It may be used in the natural form or after
concentration. One of the many known reactions produced was indicated in
Eq. (9.110). The fission of 235U produces, on the average 2.5 neutrons per nucleus
of which about 0.7% are delayed neutrons which play an important role in the
control of reactor rates. The energy released in each fission is about 200 MeV
which is distributed among the main fission fragments (about 165 MeV),
neutrons (about 5 MeV), electrons and photons (about 20 MeV), and neutrinos
(about 10 MeV).
Neutron Economy
In order that the fission process be self-sustaining, the neutrons produced in the
fission reactions should not all be lost.
Neutrons may be lost by being captured by 238U. The resulting 239U does not lead to fission, but decays by emitting a photon. The capture cross-section for 238U decreases to small values, about 3 barns, for thermal neutrons (note that the cross-section goes through a large resonant value, $2.3 \times 10^{4}$ barns at 7 eV). The capture cross-section for 235U, on the other hand, increases as $1/E^{1/2}$ for small energies [see Eq. (9.97)], and has a value of about 580 barns for thermal
neutrons. Thus, the fission-effectiveness of neutrons is increased by the
thermalization of neutrons, achieved by the moderators surrounding the fuel.
Successive scattering of neutrons by the moderator transfers the neutron energy
to the moderator and hence slows down the neutrons.
Some neutrons may be lost from the surface of the active zone. In this connection, it is noted that the fission-effectiveness of the neutrons is proportional to the volume (i.e. $l^3$) whereas the surface losses are proportional to the surface area (i.e. $l^2$). Hence, the relative surface losses can be decreased by increasing the active volume. The size at which the chain reaction is just self-sustaining is known as the critical volume, and the corresponding mass of the active material is known as the critical mass.
In addition, there are other sources of neutron losses, such as absorption by
the impurity in the moderator. These losses must be carefully controlled, for
example, by using moderators which are nearly free of impurities. The essential
requirement of continued chain reactions is that the number of fission neutrons
remaining after taking into account all the losses, must be greater than the initial
neutrons which induced the fission.
Moderators
The role of a moderator is to slow down the neutrons without absorbing them.
Elementary considerations show that maximum energy is transferred to the
target if the target mass is equal to the projectile mass.
The ideal moderator would have been hydrogen. Unfortunately, hydrogen
can capture a neutron according to the reaction p (n, γ) d. More suitable
moderators are the deuteron d (nucleus of deuterium), graphite (C) or beryllium
(Be). About 25 collisions are adequate to thermalize 2 MeV neutrons in heavy
water (D2O), and about 100 collisions in C or Be.
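The collision numbers quoted above can be estimated from elastic-scattering kinematics. The sketch below is an illustration that goes slightly beyond the text: it uses the standard result that a neutron scattering isotropically (in the centre-of-mass frame) from a nucleus of mass number A loses on average ξ = 1 + [(A – 1)²/2A] ln[(A – 1)/(A + 1)] in ln E per collision, and it assumes a thermal energy of 0.025 eV. Both the formula and the thermal energy are assumptions brought in here.

```python
import math


def max_energy_fraction(A: float) -> float:
    """Largest fraction of its energy a neutron can lose in one collision."""
    return 4 * A / (1 + A) ** 2


def log_decrement(A: float) -> float:
    """Average decrease of ln(E) per elastic collision (standard result)."""
    if A == 1:
        return 1.0   # limiting value for a target of equal mass
    return 1 + (A - 1) ** 2 / (2 * A) * math.log((A - 1) / (A + 1))


def collisions_to_thermalize(A: float, e_start_ev: float = 2e6,
                             e_thermal_ev: float = 0.025) -> float:
    """Approximate number of collisions to slow from e_start to e_thermal."""
    return math.log(e_start_ev / e_thermal_ev) / log_decrement(A)


if __name__ == "__main__":
    for name, A in (("hydrogen", 1), ("deuterium", 2),
                    ("beryllium", 9), ("carbon", 12)):
        print(f"{name}: max loss {max_energy_fraction(A):.2f}, "
              f"about {collisions_to_thermalize(A):.0f} collisions")
```

The maximum energy transfer is complete only for A = 1, in line with the equal-mass argument above. The estimate gives about 25 collisions for deuterium and of the order of 100 for beryllium and carbon.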
Control Rods
The number of fission neutrons available for the controlled chain reaction, after
taking into account the various losses, must be slightly greater than the neutrons
which caused the initial fission. To produce sustained, stable chain reactions,
the excess neutrons must be removed and controlled. This is usually done by
inserting what are known as control rods into the core of the reactor. These rods
are made of an element with a large cross-section for neutron capture.
The element often used in control rods is cadmium which has a very large
capture cross-section for thermal neutrons, 113Cd (n, γ) 114Cd. The insertion of
these rods decreases the reactivity of the reactor whereas withdrawal increases
the reactivity. It is important to observe that the response of the chain reaction
to fluctuations in neutrons, is slow, because of the delayed neutrons produced in
fission (see Example 6, Sec. 9.9). This permits the use of the control mechanism
with a time delay of about 1 min.
Coolant
The heat generated in the active region of the reactor is carried away by a heat-carrying agent, usually water or an alkali metal such as sodium (the agent should have a large thermal capacity). This agent, the coolant, gives this energy to water (see Fig. 9.9), transforming it into steam which operates the steam turbines.
Breeder Reactors
So far, mainly moderator-operated reactors based on thermal neutrons have been discussed. It is possible to run a reactor on fast neutrons by using a fuel with
enriched 235U or 239Pu. The fast neutrons released in the fission are used to
transform 238U into 239Pu (or 232Th into 233U) which can undergo fission by
capturing thermal neutrons. The core of such a reactor will contain two materials,
say, fissionable fuel 239Pu and the potential fuel 238U. If the conditions are such
as to produce more fuel than the amount burnt, the reactor is known as a breeder reactor (it breeds fissionable fuel).
[Fig. 9.9: Schematic of a fission reactor: the core, a coolant loop driven by a pump, water in, and steam out to the steam turbine.]
Controlled Fusion
To have controlled fusion reactions, it is necessary to maintain nuclei at a temperature of about $10^{7}$ to $10^{8}$ K in a confined region, so that nuclear reactions can take place. At such high temperatures, the atoms are ionized into positively charged ions and electrons, forming what is known as a plasma state. The two
main problems in achieving controlled fusion are the containment of the plasma
within a suitable volume, and the heating of the plasma to the required high
temperatures.
For the confinement of the plasma, one cannot use the walls of any vessel.
Any contact with the wall will not only quickly cool the plasma but also cause
the wall to evaporate. What is usually done is to confine the plasma in a suitable
magnetic field. The nuclei spiral along the magnetic field lines. By a suitable
arrangement of the field, the nuclei are reflected back and forth between the
bottlenecks provided by the converging lines (the lines tend to converge in regions
where the magnetic field is stronger). Such an arrangement is called a mirror
machine. Alternatively, the plasma may be confined in a toroidal region formed
by a solenoid bent in the form of a torus. In this case, the nuclei spiral along the
closed field lines inside the torus. However, there are as yet serious difficulties
in controlling the instabilities of confinement over appreciable time periods.
There are two important methods of heating a plasma. In one method, fast
neutral atoms are injected into the magnetically-confined system and are ionized
by collisions with the plasma. The energetic ions are now trapped by the magnetic
field for long enough to transfer their energy to the plasma by collisions. For
example, a plasma of H+ may be heated by a beam of energetic H, or a plasma
of D+ (nucleus of deuterium) and T+ (nucleus of tritium) by a beam of energetic
D (Deuterium). The beam energies are generally of the order of a few tens of
keV to several hundreds of keV. The energetic beams are usually produced by
accelerating low energy ions in an electrostatic field and then passing the ions
through a target gas where the ions capture electrons and are neutralized. The
other method of heating a plasma is by radio-frequency electromagnetic waves.
When the waves are incident on a plasma, under suitable conditions, their energy
is converted into ordered particle energy which is then thermalized by collisions.
An alternative approach to controlled fusion is through what is known as
inertial confinement. Here, the fusion fuel, e.g. mixture of deuterium and tritium,
in the form of a pellet, is imploded from all sides by energy sources such as
laser beams, high energy electron or ion beams. The intense compression
pressures and the high temperatures produced in the pellet may produce
conditions conducive to fusion (it is the particle inertia which provides the
basis for confinement over the required period and hence the term inertial
confinement). The difficulties in this approach are the low efficiencies of laser
or other sources, and the need to produce stable symmetrical implosion.
For controlled fusion to be a meaningful source of energy, the output energy
must be more than the input energy. There are several technical difficulties
which remain in achieving the break-even point, such as instabilities in
confinement, inefficient heating, etc. As such, controlled fusion has not yet
been realized. When realized, it will provide a virtually inexhaustible source
of energy. Deuterium, which is suitable for a fusion reaction (ordinary hydrogen
has a very small cross-section for fusion, and hence is not suitable), is readily
available, 0.03% by mass of hydrogen in water being in the form of deuterium.
Furthermore, the fusion reactions have important advantages over other sources
of energy, in that they have hardly any radioactivity problem, produce negligible
pollution, and their sources are widely distributed.
Uncontrolled Fusion
Uncontrolled fusion can be achieved by using an atom bomb whose explosion
produces temperatures of the order of 10^7 K. For example, such an atom bomb
can ignite a fuel of deuterium and tritium, leading to the fusion reaction in
Eq. (9.116). This is the source of energy in what is known as the hydrogen or
thermonuclear bomb.
Fusion reactions are the source of energy in the sun and the stars, inside
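which temperatures are of the order of 10^7 – 10^8 K. The energy there is produced
in two ways. In the proton-proton cycle, which is dominant at lower temperatures
(T ~ 10^7 K), the fusion of hydrogen takes place in the following steps:
$$
\begin{aligned}
{}^{1}\mathrm{H} + {}^{1}\mathrm{H} &\to {}^{2}\mathrm{H} + e^{+} + \nu \\
{}^{2}\mathrm{H} + {}^{1}\mathrm{H} &\to {}^{3}\mathrm{He} + \gamma \\
{}^{3}\mathrm{He} + {}^{3}\mathrm{He} &\to {}^{4}\mathrm{He} + 2\,{}^{1}\mathrm{H}
\end{aligned}
\tag{9.118}
$$
In the carbon cycle, which is dominant at higher temperatures, the steps are:
$$
\begin{aligned}
{}^{1}\mathrm{H} + {}^{12}\mathrm{C} &\to {}^{13}\mathrm{N} + \gamma, \qquad {}^{13}\mathrm{N} \to {}^{13}\mathrm{C} + e^{+} + \nu \\
{}^{1}\mathrm{H} + {}^{13}\mathrm{C} &\to {}^{14}\mathrm{N} + \gamma \\
{}^{1}\mathrm{H} + {}^{14}\mathrm{N} &\to {}^{15}\mathrm{O} + \gamma, \qquad {}^{15}\mathrm{O} \to {}^{15}\mathrm{N} + e^{+} + \nu \\
{}^{1}\mathrm{H} + {}^{15}\mathrm{N} &\to {}^{12}\mathrm{C} + {}^{4}\mathrm{He}
\end{aligned}
\tag{9.119}
$$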
The net result is the fusion of four hydrogen nuclei into one helium
nucleus, with 12C serving only as a catalyst.
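As a rough check on the energy scale, the energy released when four hydrogen atoms end up as one helium atom can be estimated from the mass difference. The sketch below assumes atomic masses of about 1.007825 u for 1H and 4.002603 u for 4He, and 1 u ≈ 931.494 MeV; these values are not quoted in the text. Using atomic masses includes the energy from the annihilation of the two positrons, and no separate account is taken of the energy carried off by the neutrinos.

```python
# Mass-energy released when four 1H atoms are converted into one 4He atom.
# The atomic masses and the u -> MeV factor are standard values assumed here;
# they are not quoted in the text.
M_H1_U = 1.007825      # atomic mass of 1H in u
M_HE4_U = 4.002603     # atomic mass of 4He in u
U_TO_MEV = 931.494     # energy equivalent of 1 u in MeV


def fusion_energy_release_mev(n_hydrogen=4):
    """Energy (MeV) released per 4He atom formed from n_hydrogen 1H atoms."""
    mass_defect_u = n_hydrogen * M_H1_U - M_HE4_U
    return mass_defect_u * U_TO_MEV


def converted_mass_fraction():
    """Fraction of the initial rest mass that is converted into energy."""
    return fusion_energy_release_mev() / (4 * M_H1_U * U_TO_MEV)


if __name__ == "__main__":
    print(f"Energy released per 4He formed: {fusion_energy_release_mev():.1f} MeV")
    print(f"Fraction of rest mass converted: {converted_mass_fraction():.4f}")
```

The result is a little under 27 MeV per helium nucleus formed, i.e. somewhat less than 1% of the rest mass.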
It may be noted that the Coulomb potential barrier (Eq. 9.117) is larger for
nuclei with higher Z values so that it is more difficult for the fusion of heavier
nuclei to take place. However, when the temperatures of stellar interiors rise,
fusion of heavier nuclei begins to take place. In particular, there is helium burning,
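$$
\begin{aligned}
{}^{4}\mathrm{He} + {}^{4}\mathrm{He} &\to {}^{8}\mathrm{Be} \\
{}^{4}\mathrm{He} + {}^{8}\mathrm{Be} &\to {}^{12}\mathrm{C} + \gamma
\end{aligned}
\tag{9.120}
$$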
producing carbon. At higher densities and temperatures, fusion of heavier
elements also takes place, ultimately leading to elements in the iron mass region
(A = 56) where the binding energy per nucleon has a maximum. Here, exothermic
fusion reactions cease. Elements heavier than 56Fe may be produced by capture
of neutrons produced in some reactions, and subsequent β-decays. Thus,
elements up to and just beyond uranium are produced. Still heavier elements
have short lifetimes and if created would quickly decay either by α-emission or
fission.
9.9 EXAMPLES
Some examples to illustrate the properties and interactions of the nucleus are
considered here.
Example 1
In the early stages of the development of nuclear physics before the neutron
was discovered, one of the models of the nucleus considered was that it was
made up of A protons and (A – Z) electrons. There are several arguments against
this model.
An electron confined to a volume of nuclear dimensions would be highly
relativistic, and its energy would be estimated by the uncertainty principle to be
$$
T \approx pc \approx \frac{\hbar c}{\Delta x} \approx 100\ \mathrm{MeV} \quad \text{for } \Delta x \approx 2\ \mathrm{fm}
\tag{9.121}
$$
The confinement of such energetic electrons would require the existence of
very strong forces for which there is no evidence (such potentials would also
create many electron-positron pairs which are not observed). Conclusive
evidence against the proton-electron model of the nucleus is that even-A, odd-Z
nuclei have integral spin. In the proton-electron picture, such a nucleus would
have A protons and A – Z electrons, i.e. the nucleus has an odd number of fermions,
and hence would be expected to have a half-integral spin. This is contrary to the
experimental observations, e.g. 14N has I = 1. Finally, the proton-electron picture
would imply the existence of nuclear magnetic moments of the order of eħ/2m_e,
whereas the observed moments are much smaller, of the order of eħ/2m_p.
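The two numerical arguments in this example can be checked with a short calculation. The sketch below assumes ħc ≈ 197.3 MeV fm and rest energies of 0.511 MeV and 938.3 MeV for the electron and the proton (standard values, not quoted in the text); the function names are illustrative.

```python
# Order-of-magnitude checks for the proton-electron model of the nucleus.
# Constants are standard values assumed for this sketch.
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV fm
M_E_MEV = 0.511           # electron rest energy in MeV
M_P_MEV = 938.272         # proton rest energy in MeV


def confinement_energy_mev(delta_x_fm):
    """Relativistic estimate T ~ pc ~ hbar*c / delta_x, as in Eq. (9.121)."""
    return HBAR_C_MEV_FM / delta_x_fm


def magneton_ratio():
    """Ratio (e*hbar/2m_e) / (e*hbar/2m_p) = m_p / m_e."""
    return M_P_MEV / M_E_MEV


if __name__ == "__main__":
    print(f"T for an electron confined to 2 fm: {confinement_energy_mev(2.0):.0f} MeV")
    print(f"Electron-scale / nuclear-scale moment: {magneton_ratio():.0f}")
```

An electron inside the nucleus would thus carry an energy far above its rest energy, and a magnetic moment nearly two thousand times larger than those observed for nuclei.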
Example 2
One can estimate the strength of the deuteron potential, by assuming the potential
to be a square well of depth V0 and radius a.
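The bound state satisfies the Schrödinger equation
$$
-\frac{\hbar^{2}}{2m}\nabla^{2}\psi + V\psi = E\psi
$$
where m is the reduced mass, m ≈ m_p/2, V = 0 for r > a and V = – V0 for r ≤ a.
It is easy to show that, for the l = 0 ground state,
$$
\psi = \frac{1}{r}
\begin{cases}
A_{1}\, e^{-kr} & \text{for } r > a \\
A_{2} \sin \alpha r & \text{for } r < a
\end{cases}
\tag{9.123}
$$
where
$$
k = \left(\frac{2m\,|E|}{\hbar^{2}}\right)^{1/2}, \qquad
\alpha = \left(\frac{2m\,(V_{0} + E)}{\hbar^{2}}\right)^{1/2}
$$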
Continuity of the wave function and the derivative of the wave function, at
r = a, give the condition
α cot αa = – k (9.124)
Since | E | is known to be small compared to V0, one has an approximate
relation
αa ≈ π/2 (9.125)
which implies V0 ≈ 23 MeV for a ≈ 2 fm. A better approximation would be
$$
\alpha a \approx \frac{\pi}{2} + \frac{k}{\alpha}
\tag{9.126}
$$
which yields a value of V0 ≈ 33 MeV for | E | ~ 2 MeV. In any case, it is seen that
the interaction is quite strong but the deuteron is a shallow bound state, i.e.
(| E |/V0) << 1.
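The matching condition in Eq. (9.124) can also be solved numerically instead of using the approximations in Eqs. (9.125) and (9.126). The sketch below is one way to do this by bisection; it assumes ħc ≈ 197.3 MeV fm and m_p c^2 ≈ 938.3 MeV, and the function names are illustrative. The outputs depend on the constants and on the value chosen for a, so they need not coincide with the rounded values quoted above.

```python
import math

# Constants are standard values assumed for this sketch.
HBAR_C_MEV_FM = 197.327             # hbar * c in MeV fm
M_P_MEV = 938.272                   # proton rest energy in MeV
REDUCED_MASS_MEV = M_P_MEV / 2.0    # m ~ m_p / 2, as in the text


def depth_first_estimate(a_fm):
    """V0 from alpha*a = pi/2 with |E| neglected, Eq. (9.125)."""
    alpha = math.pi / (2.0 * a_fm)
    return (HBAR_C_MEV_FM * alpha) ** 2 / (2.0 * REDUCED_MASS_MEV)


def depth_from_matching(a_fm, binding_mev, tol=1e-10):
    """Solve alpha*cot(alpha*a) = -k, Eq. (9.124), for V0 by bisection."""
    k = math.sqrt(2.0 * REDUCED_MASS_MEV * binding_mev) / HBAR_C_MEV_FM

    def f(x):
        # x = alpha*a; the ground state has x between pi/2 and pi
        return x / math.tan(x) + k * a_fm

    lo, hi = math.pi / 2.0, math.pi - 1e-9
    # f(lo) is about k*a > 0 and f(hi) is large and negative: a root is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi) / a_fm
    # alpha^2 = 2m(V0 - |E|) / hbar^2
    return (HBAR_C_MEV_FM * alpha) ** 2 / (2.0 * REDUCED_MASS_MEV) + binding_mev


if __name__ == "__main__":
    print(f"V0 from alpha*a = pi/2: {depth_first_estimate(2.0):.1f} MeV")
    print(f"V0 from Eq. (9.124):    {depth_from_matching(2.0, 2.2):.1f} MeV")
```

In either case the depth comes out at a few tens of MeV, much larger than the binding energy, which is the point of the example.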
Example 3
As an application of the shell model, the spin and magnetic moments of 17O and
127I are considered.
The odd nucleon in 17O is the ninth neutron which is in the d5/2 state.
Therefore, 17O has j = 5/2 and its magnetic moment [see Eq. (9.38)] is expected
to be about –1.91 (eħ/2m_p). Experimentally j is found to be 5/2 and the magnetic
moment to be –1.89 (eħ/2m_p).
For 127I, the odd nucleon is the 53rd proton. The shell model would predict that
it is in the g7/2 state. However, one finds j = 5/2 for the nucleus. Assuming that
it is in the d5/2 state (see Fig. 9.3), the shell model predicts µ ≈ 4.79 (eħ/2m_p) whereas
the experimental value is 2.81 (eħ/2m_p). The two examples illustrate the usefulness
and limitations of the shell model.
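The single-particle predictions quoted in this example can be reproduced with a few lines of code. The sketch below uses the standard single-particle (Schmidt) expression µ = j[g_l ± (g_s – g_l)/(2l + 1)] for j = l ± 1/2, which is assumed here to be equivalent to Eq. (9.38); the g-factors are standard values not quoted in the text, and the function name is illustrative.

```python
# Single-particle (Schmidt) magnetic moments in units of e*hbar/2m_p.
# The spin g-factors below are standard values assumed for this sketch.
G_S_PROTON = 5.586
G_S_NEUTRON = -3.826


def schmidt_moment(l, j, is_proton):
    """Magnetic moment of an odd nucleon with orbital l and total j = l +/- 1/2."""
    g_l = 1.0 if is_proton else 0.0
    g_s = G_S_PROTON if is_proton else G_S_NEUTRON
    if abs(j - (l + 0.5)) < 1e-9:
        sign = 1.0
    elif abs(j - (l - 0.5)) < 1e-9:
        sign = -1.0
    else:
        raise ValueError("j must equal l + 1/2 or l - 1/2")
    return j * (g_l + sign * (g_s - g_l) / (2 * l + 1))


if __name__ == "__main__":
    print(f"17O,  neutron in d5/2: {schmidt_moment(2, 2.5, False):.2f}")
    print(f"127I, proton in d5/2:  {schmidt_moment(2, 2.5, True):.2f}")
    print(f"127I, proton in g7/2:  {schmidt_moment(4, 3.5, True):.2f}")
```

The last line gives the value that would follow if the odd proton were in the g7/2 state.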
Example 4
A mass spectrometer is used for determining the masses of nuclei. It is based on
the principle that a moving particle subjected to mutually perpendicular electric
and magnetic fields, which are also perpendicular to the velocity of the particle,
is undeviated if
qE + qv × B = 0 (9.127)
or v = | E |/ | B |. If such a particle is now subjected to a magnetic field, it moves
in a circle of radius
$$
r = \frac{p}{qB}
\tag{9.128}
$$
where p is its momentum. Thus, from the knowledge of v and p, the mass of the
particle can be determined (provided q is known).
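The two steps of the measurement can be put together as follows. This is a minimal sketch assuming non-relativistic motion (p = mv) and SI units; the function and parameter names are illustrative and are not taken from the text.

```python
def selected_speed(e_field, b_selector):
    """Speed passed undeviated by the crossed fields: v = |E| / |B|, Eq. (9.127)."""
    return e_field / b_selector


def particle_mass(e_field, b_selector, b_deflect, radius, charge):
    """Mass from p = q * B * r, Eq. (9.128), and p = m * v (non-relativistic)."""
    v = selected_speed(e_field, b_selector)
    momentum = charge * b_deflect * radius
    return momentum / v


if __name__ == "__main__":
    # Hypothetical settings: E = 1e5 V/m, B = 0.5 T in both stages,
    # a singly charged ion and a measured radius of 8.24 cm.
    m = particle_mass(1.0e5, 0.5, 0.5, 0.0824, 1.602e-19)
    print(f"mass = {m:.3e} kg")
```

The charge q must be known, as noted above; for a given radius a doubly charged ion corresponds to twice the mass.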
Example 5
It is often the case that only a small amount of the target is exposed to a beam of
particles. The reaction produces an unstable isotope which decays. It is of interest
to know the number of unstable nuclei remaining after an exposure to the beam
for time t.
If the target contains N nuclei, the number of reactions per second is
n = NσF (9.129)
where F is the flux of the beam, i.e. particles/m^2/s. The net increase dP in the
isotope population over a period dt is
dP = Nσ F dt – λ P dt (9.130)
where λ is the probability of decay per unit time. The solution to this equation, with P(0) = 0, is
P(t) = Nσ F(1 – e^{–λt})/λ (9.131)
For example, consider 1 mg of 23Na exposed to a neutron beam of
flux 10^14/cm^2 s. The cross section for the reaction 23Na(n, γ) 24Na is about 0.56
barns. Since 1/λ ≈ 21.7 h, and N ≈ 2.6 × 10^19, the number of isotope nuclei is
P(t) ≈ 1.1 × 10^14 (1 – e^{–t/21.7 h})
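Equation (9.131) is easily evaluated for the numbers given in this example. The sketch below uses only the values quoted in the text together with 1 barn = 10^-24 cm^2; the function and constant names are illustrative.

```python
import math

BARN_CM2 = 1.0e-24   # 1 barn in cm^2


def activated_nuclei(n_target, sigma_barn, flux_cm2_s, mean_life_s, t_s):
    """Population P(t) = N*sigma*F*(1 - exp(-lambda*t))/lambda, Eq. (9.131)."""
    lam = 1.0 / mean_life_s
    rate = n_target * sigma_barn * BARN_CM2 * flux_cm2_s   # reactions per second
    return rate * (1.0 - math.exp(-lam * t_s)) / lam


# Numbers quoted in the text for 1 mg of 23Na
N_TARGET = 2.6e19
SIGMA_BARN = 0.56
FLUX = 1.0e14                  # neutrons per cm^2 per s
MEAN_LIFE_S = 21.7 * 3600.0    # 1/lambda = 21.7 h

if __name__ == "__main__":
    for hours in (1, 10, 21.7, 100):
        p = activated_nuclei(N_TARGET, SIGMA_BARN, FLUX, MEAN_LIFE_S, hours * 3600.0)
        print(f"t = {hours:6.1f} h   P = {p:.3e}")
```

For exposures much longer than 1/λ the population saturates at NσF/λ, when the production rate equals the decay rate.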
Example 6
The determination of the age of a sample by 14C dating is illustrated here.
Let M grams of a sample of organic carbon decay at the rate of r(t) per hour.
Then the number n(t) of 14C atoms is given by
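$$
r(t) = \frac{1}{\tau}\, n(t)
\tag{9.133}
$$
where the lifetime τ is about 7.242 × 10^7 h (the half-life is 0.6931 times τ). The
fraction of 14C in atmospheric carbon is 1.3 × 10^-12, which implies that at the
beginning there are about 6.5 × 10^10 M atoms of 14C in the sample, decaying at a
rate r(0) ≈ 902 M per hour. Since r(t) = r(0) e^{–t/τ}, the age of the sample is
$$
t = \tau \ln \frac{902\, M}{r(t)}
\tag{9.136}
$$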
For example, if a sample of 1 g yields 300 decays/h, its age is t ≈ 9100
years.
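The age estimate can be reproduced from the quantities given in this example. The sketch below assumes Avogadro's number 6.022 × 10^23 per mole, a molar mass of 12 g for carbon and 8766 hours per year (none of which are quoted in the text), so the initial rate it computes differs slightly from the rounded factor 902 in Eq. (9.136).

```python
import math

AVOGADRO = 6.022e23        # atoms per mole (standard value, assumed)
C14_FRACTION = 1.3e-12     # 14C fraction in atmospheric carbon (from the text)
TAU_HOURS = 7.242e7        # mean life of 14C in hours (from the text)
HOURS_PER_YEAR = 8766.0    # 365.25 days (assumed)


def initial_rate_per_hour(mass_g):
    """Decay rate r(0) of a fresh sample containing mass_g grams of carbon."""
    n0 = mass_g / 12.0 * AVOGADRO * C14_FRACTION   # initial number of 14C atoms
    return n0 / TAU_HOURS


def age_years(mass_g, rate_per_hour):
    """Age t = tau * ln(r(0) / r(t)), expressed in years."""
    ratio = initial_rate_per_hour(mass_g) / rate_per_hour
    return TAU_HOURS * math.log(ratio) / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"r(0) for 1 g of carbon: {initial_rate_per_hour(1.0):.0f} decays/h")
    print(f"age for 300 decays/h:   {age_years(1.0, 300.0):.0f} years")
```

A sample that still decays at the initial rate has age zero, and each halving of the rate adds one half-life to the age.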
Example 7
The delayed neutrons, though small in number, play an important role in reactor
control.
Let there be N(t) neutrons at time t, and let τ0 ≈ 10^-2 s be the period of the
cycle between two fissions in the chain reaction. In the fission, 99.3% of N(t)
are multiplied by a factor of about 2.5 and the remaining 0.7%, though multiplied
by the same factor, are emitted later, say after time τ1 ≈ 9 s. Under the equilibrium
condition 1.5 N(t) would be lost so that once again N(t) is got back after time τ0.
Suppose now that the equilibrium condition is disturbed and that (1.5 – δ)
N(t) are lost (δ > 0). Then the number of neutrons at time t + τ0 is
N(t + τ0) = 2.5 (0.993) N(t) + 2.5 (0.007) N(t – τ1)
– (1.5 – δ) N(t) (9.137)
Expanding N(t + τ0) and N(t – τ1) to first order in τ0 and τ1 gives
(τ0 + 0.0175 τ1) dN/dt = δ N(t), whose solution is
$$
N(t) = N(t_{0}) \exp\left[\frac{\delta\,(t - t_{0})}{\tau_{0} + 0.0175\,\tau_{1}}\right]
\tag{9.139}
$$
Since δ is usually of the order of 1% or less and τ1 >> τ0, the delayed neutrons
(characterized by τ1) dominate the time variation and the time-scales of change
are of the order of 0.0175 τ1/δ ≈ 10 s – 1 min.
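The effect of the delayed neutrons on the time-scale is seen most clearly by comparing the time constant in Eq. (9.139) with the one that would apply if all neutrons were prompt. The sketch below uses the values of this example as defaults; the function and parameter names are illustrative.

```python
def efolding_time(delta, tau0=1.0e-2, tau1=9.0, delayed_fraction=0.007, nu=2.5):
    """Time constant (tau0 + nu*beta*tau1) / delta of Eq. (9.139), in seconds.

    nu * delayed_fraction = 2.5 * 0.007 = 0.0175 for the values in the text.
    """
    return (tau0 + nu * delayed_fraction * tau1) / delta


def prompt_only_time(delta, tau0=1.0e-2):
    """Time constant tau0 / delta if there were no delayed neutrons."""
    return tau0 / delta


if __name__ == "__main__":
    for delta in (0.01, 0.005, 0.0025):
        print(f"delta = {delta:.4f}: with delayed neutrons "
              f"{efolding_time(delta):6.1f} s, prompt only "
              f"{prompt_only_time(delta):4.1f} s")
```

With these numbers the delayed neutrons lengthen the response time by a factor of about 17, which is what makes mechanical control feasible.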
Example 8
An interesting mechanism of producing fusion is by screening the Coulomb
repulsion between the nuclei while they are being brought together. This can be
done by using a muonic hydrogen atom (bound state of a proton and a muon).
Since the muon is about 200 times heavier than the electron, its Bohr radius is
correspondingly smaller, about 0.25 × 10^-12 m. Thus, the muon effectively screens
the proton charge, allowing it to approach another nucleus with greater ease.
For example, it may fuse with a deuteron to give 3He,
(µ–p) + 2H → µ– + 3He (9.140)
with a release of about 5.5 MeV. Since µ– has a short lifetime (τ ≈ 2.2 × 10^-6 s)
and is not easily produced, this does not appear to be an economical way of
producing fusion. However, it is being considered as a triggering mechanism to
start controlled fusion.
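The size of the muonic atom follows from scaling the Bohr radius of hydrogen by the mass ratio. The sketch below assumes a Bohr radius of 0.529 × 10^-10 m and rest energies of 0.511, 105.66 and 938.27 MeV for the electron, muon and proton (standard values, not quoted in the text); it also shows the effect of using the muon-proton reduced mass.

```python
# Constants are standard values assumed for this sketch.
BOHR_RADIUS_M = 0.529e-10   # Bohr radius of ordinary hydrogen in m
M_E_MEV = 0.511             # electron rest energy in MeV
M_MU_MEV = 105.66           # muon rest energy in MeV
M_P_MEV = 938.27            # proton rest energy in MeV


def muonic_bohr_radius(use_reduced_mass=True):
    """Bohr radius of the (mu- p) atom, scaled from hydrogen by the mass ratio."""
    m = M_MU_MEV
    if use_reduced_mass:
        m = M_MU_MEV * M_P_MEV / (M_MU_MEV + M_P_MEV)
    return BOHR_RADIUS_M * M_E_MEV / m


if __name__ == "__main__":
    print(f"simple mass scaling: {muonic_bohr_radius(False):.2e} m")
    print(f"with reduced mass:   {muonic_bohr_radius(True):.2e} m")
```

Either way the radius is a few times 10^-13 m, about two hundred times smaller than that of ordinary hydrogen.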
PROBLEMS
1. For a charge q distributed uniformly in a sphere of radius r, show that the
electrostatic energy is (3/5) q^2/(4πε0 r). For a proton this has a value of about
0.86 MeV if r ≈ 1 fm. It is believed that the small proton-neutron mass
difference is of electromagnetic origin, which depends crucially on the
magnetic properties to give a heavier neutron.
2. Nuclei ${}^{2Z+1}_{Z}\mathrm{X}$ and ${}^{2Z+1}_{Z+1}\mathrm{Y}$ are examples of mirror nuclei (which are obtained
by n ↔ p). Charge independence of the nuclear forces implies that the
mass difference between these nuclei is electromagnetic in origin. Using
the result of problem 1, obtain an expression for the mass difference of
mirror nuclei. Using r = 1.2 A^{1/3} fm, determine the mass difference between
11B and 11C, 13C and 13N, 35Cl and 35Ar. Compare with the experimental
values of 2.8, 3.0 and 6.7 MeV respectively.
3. Show that for an ellipsoid with semi-major axis a along the axis of rotation
and semi-minor axis b perpendicular to it, the quadrupole moment is
(2/5) Z (a^2 – b^2). Estimate (a – b)/R for 176Lu from the information that its
quadrupole moment is about 8 × 10^-28 m^2.
4. Show that the density of nuclear matter is about 2.3 × 10^17 kg/m^3. This is
the type of density expected in a neutron star which may be regarded as
a giant nucleus of mass comparable to that of the sun.
5. Assuming that the deuteron is a bound state in the Yukawa potential
given in Eq. (9.25), and the energy of about 30 MeV is approximately
the value of the potential at r ≈ 1 fm, show that | gε0/e^2 | ≈ 60. This gives
an idea of the strength of nuclear forces compared to the electromagnetic
forces.
6. The nucleus 121Sb has spin 5/2. What is its expected magnetic moment?
Compare the result with the observed value of 3.36 (eħ/2m_p).
7. Deduce the spin and magnetic moment of 3He, 15N, 39K and 209Bi from
the simple shell model and compare with the experimental values of
j = 1/2, 1/2, 3/2, 9/2 and µ = – 2.13, – 0.28, 0.39, 4.1 in units of nuclear
magnetons, respectively.
8. Determine the moment of inertia of 234Th given that its lowest rotational
energy levels are at 0, 0.048, 0.16 MeV. Compare it with the moment of
inertia of the whole nucleus regarded as a rigid sphere. What can you
deduce? What is the next expected rotational level?
9. The rotational ground state of 237Np has I = 5/2. The observed excited
levels have energies 0.033, 0.060, 0.076, 0.103 and 0.159 MeV. Which
of these may be expected to belong to the rotational band?
10. Obtain the masses of 106Ru, 106Rh, 106Pd, 106Ag, and 106Cd, from the
semiempirical formula and discuss the stability of these nuclei against
β± decays and electron capture.
11. Obtain the masses of 65Ni, 65Cu and 65Zn from the semi-empirical formula
and discuss the stability against β±-decay and electron capture.
12. Estimate the Coulomb barrier for α emission by 238U. What is the energy
of the α particle and of 234Th in the process 238U → 234Th + 4He?
(mU – mTh – mHe ≈ 4.3 MeV).
13. The Q value for the α-decay of 213Po into 209Pb is 8.52 MeV. What is the
energy of the α particle in the transition between these states? If some
α particles come out with 7.60 MeV, what is the energy of the
corresponding excited state of Pb?
14. The element 32P decays into 32S (mS ≈ 31.972072 mu, including the mass
of electrons) by β– -emission. If the maximum kinetic energy of the
electron is 1.7 MeV, what is the mass of 32P?
15. The element 7Be decays by electron capture. If the masses of 7Be and
7Li are 7.016930 and 7.016004 mu respectively, what is the energy and
momentum of the recoil nucleus?
16. In the uranium series, the initial nucleus is 238U and the final nucleus is
206Pb. How many α particles are emitted by each uranium nucleus? How
many electrons are emitted by each nucleus? If the lifetime is 6.5 × 10^9
years, how much helium is released from 1 g of 238U in 1 year? A mineral
sample contains 206Pb and 238U in the ratio of 1 to 4. Assuming that the
Pb is from the decay of U, estimate the age of the sample.
17. For the reaction 6Li + n → 3H + 4He with thermal neutrons, determine
the kinetic energy of 4He. It is given that the Q value for the reaction is
4.78 MeV.
18. A beam of neutrons is incident on a piece of gold. Show that the intensity
of the beam as a function of the depth t of penetration is given by
I(t) = I(0) exp (– σnt) where σ is the capture cross-section and n is the
number of target nuclei per volume. If the emerging intensity is 74% of
the original intensity for t = 0.05 cm, what is the capture cross-section of
gold?
19. If an average energy of 200 MeV is released in the fission of each 235U
nucleus, how much 235U is used in one day in a reactor operating at a
power of 50 MW?
20. There is evidence to believe the sun has been in the present stable
condition for the last 5 × 10^9 years. Assuming that the stable condition
implies that the change in the composition is less than 10%, argue that
nuclear fusion is the only feasible source of energy and that the sun is
likely to remain in the present condition for another 5 × 10^9 years. The
sun radiates an energy of about 4 × 10^26 J/s, and its mass is about
2 × 10^30 kg.
10
Elementary Particles
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_10
same mass as the leptons but with other properties such as the charge, being
opposite to those of the leptons (antiparticles for fermions are the holes in the
negative-energy sea of Dirac discussed in Sec. 4.8).
Similar to the lepton doublets, there are doublets of quarks which have not
so far been observed, but which have proved extremely useful in describing the
properties of their composites, such as the proton, the neutron, etc. They are
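$$
\begin{pmatrix} u \\ d \end{pmatrix}, \quad
\begin{pmatrix} c \\ s \end{pmatrix}, \quad
\begin{pmatrix} t \\ b \end{pmatrix}
\tag{10.3}
$$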
They have half-integral spin and have the properties shown in Table 10.1.
Since they have not been observed directly, their masses are inferred from the
masses of their composites, and are known as first, second and third generation
quark doublets, in the order of increasing masses. For reasons mentioned later,
each quark is supposed to come in three varieties, known as red, white and blue
quarks, equal mixtures of which are said to be colourless. As in the case of
leptons, one has antiquarks which have the same mass as the quarks, but all the
other properties such as charge, baryon number, etc. are opposite to those of the
quarks.
Table 10.1 Properties of quarks, charge in units of e, and masses in MeV
Leptons and quarks interact with each other via the exchange of some bosons.
These bosons are gluons (not yet observed directly) which give rise to strong
forces between quarks binding them into observed strongly interacting particles
such as the proton, the pi-meson, etc., the photons which give rise to
electromagnetic forces, the W-bosons which give rise to weak-interaction forces,
and the gravitons which give rise to gravitational interaction forces. These
bosons are necessary for the description of the interaction between different
particles.
Particle   Mass (MeV)   Q     I(Iz)        S     Quark content
π+         139.6        1     1(1)         0     u d̄
π0         135.1        0     1(0)         0     u ū, d d̄
π–         139.6        –1    1(–1)        0     d ū
K̄0         498          0     1/2(1/2)     –1    s d̄
K–         494          –1    1/2(–1/2)    –1    s ū
η          549          0     0            0     u ū, d d̄, s s̄
Isospin Symmetry
The most striking feature to be noted in the properties of the hadrons is that
they come in multiplets with components of very nearly the same mass, e.g.
938.2 MeV for the proton and 939.5 MeV for the neutron. Such a property has
been noted for bound states in central potentials, where the states with different m
values but the same l value, have the same energy. It is therefore suggested that
we postulate an abstract space in which there is an abstract spin called isospin
I, and the different components of a multiplet are states with the same I but
different Iz. For example, P and N have I = 1/2 and Iz = 1/2, – 1/2 respectively.
The equality of the masses of the different components, would then follow
from the invariance of the interaction under rotations in the abstract isospin
space.
It may be observed in Table 10.2, that the different components of an isospin
multiplet differ only in their u, d components. Therefore, an equality of the
masses of the u and d quarks would imply an equality of the masses of the
components of each multiplet. Thus, if the interactions do not distinguish between
u and d quarks, it is suggested that the interactions are invariant under the
transformations
| u′〉 = x1 | u 〉 + x2 | d 〉
| d′ 〉 = y1 | u 〉 + y2 | d 〉 (10.5)
with the condition that | u′ 〉 and | d′ 〉 are orthonormal (the notation | u 〉, etc. is
used to designate the states), which implies
| x1 |^2 + | x2 |^2 = | y1 |^2 + | y2 |^2 = 1
x1 y1* + x2 y2* = 0 (10.6)
One also imposes a phase condition on the determinant of the matrix formed
by the coefficients xi and yi
x1y2 – x2y1 = 1 (10.7)
The linear transformations in Eq. (10.5), with the conditions in Eqs. (10.6)
and (10.7) define the group SU(2) (group of special unitary transformations in
2-dimensions) which is closely related to the usual 3-dimensional rotations.
Invariance under these transformations gives rise to SU(2) or isospin symmetry.
It allows the characterization of states by isospin I and its z-component Iz (similar
to l and m in the case of ordinary rotations). Thus, (u,d) have I = 1/2 and
Iz = ± 1/2, (s) has I = 0 and Iz = 0, (P, N) have I = 1/2 and Iz = ± 1/2, (Σ+, Σ0, Σ–)
have I = 1 and Iz = 1, 0, –1, etc. Furthermore, these quantum numbers are
conserved in processes which involve only strong interaction.
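The conditions in Eqs. (10.6) and (10.7) can be checked numerically for any proposed set of coefficients. The sketch below does this in plain Python; the two-angle parametrization in `su2_matrix` is only one convenient family of SU(2) elements chosen for illustration, not the general element, and the function names are not from the text.

```python
import cmath
import math


def su2_matrix(theta, phi):
    """A sample set of coefficients (x1, x2, y1, y2); parametrization assumed."""
    x1 = math.cos(theta) * cmath.exp(1j * phi)
    x2 = math.sin(theta)
    y1 = -math.sin(theta)
    y2 = math.cos(theta) * cmath.exp(-1j * phi)
    return x1, x2, y1, y2


def satisfies_su2(x1, x2, y1, y2, tol=1e-12):
    """Check orthonormality, Eq. (10.6), and unit determinant, Eq. (10.7)."""
    norm_u = abs(x1) ** 2 + abs(x2) ** 2
    norm_d = abs(y1) ** 2 + abs(y2) ** 2
    overlap = x1 * y1.conjugate() + x2 * y2.conjugate()
    det = x1 * y2 - x2 * y1
    return (abs(norm_u - 1) < tol and abs(norm_d - 1) < tol
            and abs(overlap) < tol and abs(det - 1) < tol)


if __name__ == "__main__":
    print(satisfies_su2(*su2_matrix(0.3, 1.1)))   # member of the sample family
    print(satisfies_su2(1, 0, 0, -1))             # unitary, but determinant is -1
```

The second case shows why the determinant condition is a separate requirement: the matrix is unitary but is excluded from SU(2).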
For specific applications, the (P, N) system is considered which can have
I = 1 or 0. Designating the isospin states by | I, Iz〉,
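$$|1, 1\rangle = |PP\rangle \tag{10.8}$$
$$|1, 0\rangle = \frac{1}{2^{1/2}} \left( |PN\rangle + |NP\rangle \right) \tag{10.9}$$
$$|1, -1\rangle = |NN\rangle \tag{10.10}$$
$$|0, 0\rangle = \frac{1}{2^{1/2}} \left( |PN\rangle - |NP\rangle \right) \tag{10.11}$$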
These relations follow from the usual quantum-mechanical rules for
combining two angular momenta (also see Problem 1). Isospin symmetry then
implies that the matrix elements of the transition operator T, which are essentially
the probability amplitudes for the processes, satisfy the relations
$$
\langle PP |\, T \,| PP \rangle = \langle NN |\, T \,| NN \rangle
= \frac{1}{2} \langle PN + NP |\, T \,| PN + NP \rangle
\tag{10.12}
$$
Another useful application is obtained by noting that the deuteron D appears
in only one charge state and hence is assigned I = 0. Since the π-meson multiplet
has I = 1, the Dπ state is an I = 1 state. Conservation of isospin then gives the
result
$$
\langle D\pi^{0} |\, T \,| PN \rangle = \frac{1}{2^{1/2}} \langle D\pi^{+} |\, T \,| PP \rangle
\tag{10.13}
$$
Experimentally, this relation was verified at an energy of 340 MeV, to within
a few per cent by Hildebrand (1953), which supports the general ideas of
isospin invariance in strong interaction.
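The factor 1/2^{1/2} in Eq. (10.13) follows from expanding the two-nucleon charge states in the isospin basis, which is the inverse of Eqs. (10.8) to (10.11). The sketch below carries out this bookkeeping; the dictionary layout and the reduced amplitude `t1` (an arbitrary placeholder value) are illustrative assumptions.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)

# Two-nucleon charge states expanded in the isospin basis |I, Iz>,
# obtained by inverting Eqs. (10.8)-(10.11).
NUCLEON_STATES = {
    "PP": {(1, 1): 1.0},
    "PN": {(1, 0): INV_SQRT2, (0, 0): INV_SQRT2},
    "NP": {(1, 0): INV_SQRT2, (0, 0): -INV_SQRT2},
    "NN": {(1, -1): 1.0},
}

# D has I = 0, so D + pi is a pure I = 1 state with Iz fixed by the pion charge.
D_PI_STATES = {"Dpi+": (1, 1), "Dpi0": (1, 0), "Dpi-": (1, -1)}


def amplitude(final, initial, t1=1.0):
    """<final| T |initial> for an isospin-conserving T.

    t1 is the single I = 1 amplitude, the same for all Iz.
    """
    return t1 * NUCLEON_STATES[initial].get(D_PI_STATES[final], 0.0)


if __name__ == "__main__":
    ratio = amplitude("Dpi0", "PN") / amplitude("Dpi+", "PP")
    print(f"<D pi0|T|PN> / <D pi+|T|PP> = {ratio:.4f}")
```

Since cross-sections go as the square of the amplitude, the relation implies that the cross-section for P + N → D + π0 is half of that for P + P → D + π+ at the same energy.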
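[Figure: baryons plotted with hypercharge Y = S + B on the vertical axis (–2 to 1) and Iz on the horizontal axis (–2 to 2): N and P at Y = 1; Σ–, Σ0, Σ+ and Λ at Y = 0; Ξ– and Ξ0 at Y = –1.]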
up of c c̄ (c has charm 1 while c̄ has charm –1). Since each charmed quark
retains its charm, these Ψ-mesons cannot decay into particles which do not contain
c or c̄ as constituents, which explains their long lifetimes (other particles which
contain charm are too heavy to provide decay channels).
only hadrons and 10^-8 s for decays of excited atoms emitting electromagnetic
radiation. The interaction which causes these decays is called the weak
interaction. It was noticed in these decays that the electron does not carry away
all the energy (∆E ≈ mn c^2 – mn′ c^2) but has a continuous energy distribution with
a cut-off in the energy equal to (mn – mn′) c^2. It was also found experimentally
that no photons were emitted in the process.
In order to save the law of conservation of energy, Pauli made a bold
suggestion (1930) that an electrically neutral particle with spin 1/2, accompanies
the emission of the electron. This particle is the neutrino. It has a very small
mass, mν < 60 eV, possibly zero (theories prefer a zero mass for the neutrino),
and as in the case of other particles, there is an antineutrino as well. Indeed,
the emission of an electron in Eq. (10.21) is accompanied by the emission of an
antineutrino (a neutrino would accompany the emission of a positron). The
basic β-decay process of radioactive decay is
N → P + e– + ν̄e (10.22)
The neutrinos do not have direct strong or electromagnetic interaction. They
interact very weakly with matter (a neutrino of 1 MeV energy has a path length
of 10^18 m in lead) and hence are very difficult to detect. However, nuclear reactors
provide intense beams of neutrinos (about 10^17 m^-2 s^-1), and these were detected by
Reines and Cowan (1956) in the reaction
ν̄e + P → N + e+ (10.23)
which is essentially the inverse β-decay process (e+ is the positron). There are
other examples of reactions due to weak interaction in which neutrinos
accompany other leptons, e.g.
π+ → µ+ + νµ (10.24)
The beam of neutrinos (of energy about 500 MeV) from the decay of π+
produced in accelerators, was allowed to interact with neutrons (Lederman and
Schwartz, 1962) and produced reactions
νµ + N → P + µ– (10.25)
but not νµ + N → P + e–. Thus, the neutrinos produced in reactions of the type
given in Eq. (10.24), accompanying muons, are different from those produced
in the β-decay. There are, therefore, two types of neutrinos, νe and νµ, which are
associated with the electrons and the muons respectively. With the recent
discovery of τ leptons, there should also be ντ associated with τ leptons.
Along with the antineutrinos, there would be altogether
six different neutrinos and antineutrinos.
Strangeness
The weak interaction plays an important role in the behaviour of what are known
as strange particles.
The K-mesons and Λ, Σ, Ξ baryons were discovered in the cosmic rays
(which are rays of generally high energy particles originating from the outer
space) in the early fifties. After the construction of high energy accelerators,
they could be produced and studied in a controlled manner. They are produced
in reactions of the type
π– + P → K0 + Λ (10.26)
The rate of their production is typical of strongly interacting particles (e.g.
comparable to the production of π0 N). However, the decay of Λ,
Λ → π– + P or π0 + N (10.27)
is very slow. The lifetime of strange particles is generally of the order of 10^-8 to
10^-10 s (except for Σ0 which decays into Λ + γ in less than 10^-14 s) whereas the
typical lifetimes of decays of strongly interacting particles are of the order of
10^-22 s. The unusual behaviour of these particles, as strongly interacting particles
in production and as weakly interacting particles in decays, brought them the
name of strange particles.
It was observed that the strange particles are produced in pairs [K0 and Λ in
Eq. (10.26)], called associated production, whereas the decay processes involve
individual strange particles. This is reminiscent of a neutral system (e.g. radiation)
producing a pair of oppositely charged particles (e.g. e– and e+) but a charged
particle being forbidden to decay into a neutral system by charge conservation.
Using this analogy, Gell-Mann and Nishijima introduced a new quantum number
S called the strangeness which is conserved in strong interaction. Thus K0 is
assigned strangeness 1 while Λ is assigned strangeness –1, and π– and P are
assigned strangeness zero. Thus, the total strangeness is conserved in the reaction
given in Eq. (10.26) (being zero both before and after the reaction). However, it
is not conserved in the decay process given in Eq. (10.27) and hence the decay
would be forbidden by strong interaction. Strangeness is conserved in
electromagnetic interactions as well, so that the decay in Eq. (10.27) proceeds
via the weak interaction which does not conserve strangeness. This would explain
the long lifetime of Λ. Indeed the strength of the interaction for the decay in
Eq. (10.27) is of the same order as the strength of the interaction which leads to
the β-decay of the neutron in Eq. (10.22), once the dependence of the decay on
the masses is separated out. Thus, it is the weak interaction which governs the
strangeness-changing processes, e.g. decay of Λ.
The strangeness of a particle is given by the relation
$$
Q = \frac{1}{2}(S + B) + I_{z}
\tag{10.28}
$$
where Q is the charge, B is the baryon number, and Iz is the z-component of the
isotopic spin. The combination S + B is called the hypercharge Y and is often
more convenient to use than strangeness. Since charge and baryon number are
conserved, this relation implies that the conservation of strangeness is equivalent
to the conservation of Iz. It
needs a slight modification once particles with nonzero charm are included:
$$
Q = \frac{1}{2}(S + B + C) + I_{z}
\tag{10.29}
$$
where C is the charm of the particle. The relation will require further modification
if top and bottom quarks are included.
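The relations in Eqs. (10.28) and (10.29) are easy to check against the quantum numbers of familiar particles. The sketch below uses exact fractions so that half-integral and third-integral values are handled without rounding; the function names are illustrative, and the quantum numbers in the usage lines are the assignments given in the text or the standard ones.

```python
from fractions import Fraction


def charge(strangeness, baryon_number, iz, charm=0):
    """Q = (S + B + C)/2 + Iz, Eqs. (10.28) and (10.29)."""
    total = Fraction(strangeness) + Fraction(baryon_number) + Fraction(charm)
    return total / 2 + Fraction(iz)


def strangeness_from_charge(q, baryon_number, iz, charm=0):
    """Invert the relation: S = 2(Q - Iz) - B - C."""
    return 2 * (Fraction(q) - Fraction(iz)) - Fraction(baryon_number) - Fraction(charm)


if __name__ == "__main__":
    half, third = Fraction(1, 2), Fraction(1, 3)
    print("P      :", charge(0, 1, half))
    print("N      :", charge(0, 1, -half))
    print("Lambda :", charge(-1, 1, 0))
    print("K0     :", charge(1, 0, -half))
    print("u quark:", charge(0, third, half))
    print("s quark:", charge(-1, third, 0))
```

The same function covers the quarks, for which B = 1/3, and reproduces their fractional charges.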
Parity Violation
Weak interaction violates another important conservation law, namely the
conservation of parity.
It had been observed that the laws of nature generally do not appear to
distinguish between a coordinate frame and an inverted coordinate frame, i.e.
the equations of motion are the same whether coordinates (x, y, z) or
(–x, – y, –z) are used. This is termed as invariance under space inversion or
parity transformation. Let us define an operator P called the parity operator,
which takes the wave function ψ(x, y, z) in a coordinate frame, into a wave
function ψ′(x, y, z) observed for the same state but in a coordinate frame with
inverted axes:
ψ′(x, y, z) = Pψ(x, y, z) (10.29a)
However, ψ′(x, y, z) is essentially the same as ψ(–x, –y, –z) except for a
possible phase factor pi, so that
ψ′(x, y, z) = pi ψ(–x, –y, –z) (10.30)
Now a second operation by P leads us back to the original wave function so
that
ψ(x, y, z) = Pψ′(x, y, z)
= pi^2 ψ(x, y, z) (10.31)
which implies that
pi = ± 1 (10.32)
The states with pi = 1 are said to be even intrinsic parity states and the states
with pi = –1 are said to be odd intrinsic parity states. Furthermore, let us assume
that
ψ(–x, –y, –z) = pe ψ(x, y, z) (10.33)
where pe is called the spatial parity of the state. This relation, together with
Eqs. (10.29a) and (10.30), implies
Pψ(x, y, z) = pi pe ψ(x, y, z) (10.34)
so that there exist eigenstates of parity with eigenvalues equal to pi pe which are
the product of intrinsic parity and spatial parity eigenvalues.
Now, if nature does not distinguish between the coordinate frame and the
inverted coordinate frame, i.e. space inversion symmetry exists, then ψ′(x, y, z)
also must satisfy the Schrödinger equation
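$$
i\hbar \frac{\partial}{\partial t} P\psi(x, y, z) = H P \psi(x, y, z)
\tag{10.35}
$$
Operating on this equation with P from the left, using P^2 = 1, and comparing with
the Schrödinger equation for ψ(x, y, z) gives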
PHP = H (10.36)
or HP = PH (10.37)
Thus, P commutes with the Hamiltonian. Therefore, parity is conserved
and states which are simultaneous eigenstates of H and P can be obtained
(see Eq. (3.42) and the discussion which follows it). This is valid provided
nature exhibits space-inversion symmetry (which has been shown to be
equivalent to having HP = PH).
In the fifties, two mesons called the τ-meson and the θ-meson were
discovered in cosmic rays. They have the same mass, around 498 MeV, the
same lifetime, around 1.2 × 10^-8 s, and the same production rates in nuclear
reactions of the type π+ + N → Λ + (τ+ or θ+). However, they had different decay
modes: the τ-meson decayed into only three π-mesons (τ+ → 2π+ + π–) while the
θ-meson decayed into only two π-mesons (θ+ → π+ + π0). Now, the intrinsic
parity for a π-meson is –1 (as deduced from the strong interaction of the
π-mesons) so that pi = 1 for the two π-meson states and pi = –1 for the three
π-meson states. It was also shown by Dalitz
(1953) from the energy distribution of the pions, that pe = 1 for both
final states. Thus, the decay products of the τ-meson decay are in a
negative parity state pi pe = –1, while the decay products of the θ-meson decay
are in a positive parity state pi pe = 1. If parity is conserved, an unusual situation
occurs, viz. that there are two particles with almost the same mass but opposite
parity. The other possibility is that τ and θ are one and the same particle but the
weak interaction which is responsible for the decay (lifetimes indicate that the
interaction is weak), violates parity invariance, i.e. parity is not conserved in
weak interaction. Lee and Yang (1956) suggested this possibility after a critical
examination of processes involving weak interaction, and proposed an
experiment to test parity noncompensation in weak interaction.
Before describing the experiment, it is noted that space inversion is
equivalent to a reflection and a rotation, e.g. a reflection in the xz plane
(changes y → –y), and a rotation about the y-axis by 180° (changes x → –x and
z → –z). Since rotational invariance is a universal symmetry, it gives the result
that in addition to ψ′(x, y, z) = Pψ(x, y, z) satisfying the Schrödinger equation,
the space-reflected wave function, e.g. Rψ(x, y, z) where R denotes the operator
which changes y → –y, also satisfies the Schrödinger equation. This is known
as right-left symmetry. Consider the nuclei of Co60 whose spins are aligned
along the z-direction with the aid of a magnetic field. It was then found that the
[Figure: panels (a) and (b), showing the emitted electrons e–]
The V-A theory brings in the possibility that the weak interaction is invariant
under the combined operation of spatial reflection and charge conjugation which
changes a particle into an antiparticle. As observed before, mirror reflection
takes a left-handed neutrino into a right-handed neutrino which under charge
conjugation goes into a right-handed antineutrino which is included in the theory.
However, even this invariance under CP, i.e. the combined operations of parity
and charge conjugation, is violated to a small extent. This was observed by
Christenson, Cronin, Fitch and Turlay (1964) in the decay of the long-lived
component of the neutral K-meson which, while it decays mainly into three π-meson
states with CP = –1, also decays to a small extent into two π-meson states with
CP = 1, indicating a small violation of CP invariance.
Cosmic Rays
Before the development of powerful accelerators, the cosmic rays were the
only source of particles with sufficient energy to produce mesons and strange
baryons. Many particles such as the positrons, the µ-mesons, the π-mesons,
and several strange particles were first observed in the cosmic rays.
Primary cosmic rays are a flux of energetic charged particles, mainly protons
(about 89% protons, 9% helium nuclei, 1% heavier elements and
about 1% electrons) that are incident on the earth. The energy of these particles
varies from about 10³ MeV to 10¹⁴ MeV (average energy is about 10⁴ MeV).
When these energetic particles encounter the earth’s atmosphere, they undergo
inelastic collisions, producing what are called secondary cosmic rays which
consist of mesons, protons, neutrons, strange particles etc. These secondary
cosmic rays will themselves undergo additional inelastic collisions, producing
nucleonic cascades. Ultimately they reach the ground with a composition of
about 80% µ-mesons, the remaining being protons, neutrons and some strange
particles.
The origin of the rays is thought to be supernova explosions, with additional
contributions from the sun (low energy cosmic rays), the centre of the galaxy,
etc. Some high energy particles may be from outside our galaxy.
While the cosmic rays have proved to be important for the discovery of
many particles, they have the disadvantage that neither their energy, nor their
intensity, can be monitored to our convenience.
(such as the protons) can be accelerated. This accelerator can be used for
accelerating protons to energies of about 15 MeV and is very useful in low-
energy nuclear physics. The limitations of linear accelerators are in general due
to the length, instability and loss of voltage.
The Cyclotron
The cyclotron is based on the principle that charged particles (nonrelativistic)
in a constant magnetic field B perform circular motion whose frequency is
independent of the magnitude of the velocity. The frequency is obtained by the
force-acceleration relation
$$evB = \frac{mv^2}{r}$$ (10.41)
or
$$\omega = \frac{v}{r} = \frac{eB}{m}$$ (10.42)
In a cyclotron, protons move in spiral orbits (Fig. 10.4) between the poles of two
magnets and are accelerated by pulses across the hollow D-shaped electrodes
which enclose the particle chamber. The radio frequency pulse across the
electrodes has the frequency given by Eq. (10.42) and gives an extra energy of
eV for every traversal across the gap (V being the voltage across the electrodes).
The cyclotron can be used for obtaining protons of energy about 20 MeV (this
would require about 1000 pulses of V = 20 000 V). The limitations of the
cyclotron are due to the fact that the relativistic effects reduce the frequency in
Eq. (10.42) as the particle speeds up so that it is no longer independent of the
velocity of the particle.
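The numbers quoted above can be checked with a short calculation. The sketch below is only illustrative: the field of 1.5 T is an assumed value which does not appear in the text, the constants are standard values, and the function names are hypothetical. It evaluates Eq. (10.42), counts the gap traversals needed to reach 20 MeV with V = 20 000 V, and estimates how much the relativistic increase of the energy lowers the rotation frequency at the final energy.

```python
import math

E_CHARGE = 1.602176634e-19     # proton charge in coulomb
M_PROTON = 1.67262192e-27      # proton mass in kg
MC2_PROTON_MEV = 938.272       # proton rest energy in MeV

def cyclotron_frequency(b_tesla, charge=E_CHARGE, mass=M_PROTON):
    """Angular frequency omega = eB/m of Eq. (10.42), in rad/s."""
    return charge * b_tesla / mass

def gap_crossings(final_energy_ev, gap_voltage):
    """Number of gap traversals when each traversal adds an energy eV."""
    return final_energy_ev / gap_voltage

b_field = 1.5                                # tesla (assumed, illustrative)
omega = cyclotron_frequency(b_field)
f_mhz = omega / (2.0 * math.pi) / 1.0e6      # radio frequency in MHz

crossings = gap_crossings(20.0e6, 20000.0)   # 20 MeV with 20 000 V per crossing

# Nonrelativistic estimate of the final orbit radius from evB = mv^2/r.
kinetic_joule = 20.0e6 * E_CHARGE
speed = math.sqrt(2.0 * kinetic_joule / M_PROTON)
radius = M_PROTON * speed / (E_CHARGE * b_field)

# Relativistic factor at 20 MeV kinetic energy; the rotation frequency
# becomes eB/(gamma m), i.e. it is lowered by the factor 1/gamma.
gamma = 1.0 + 20.0 / MC2_PROTON_MEV

print("radio frequency (MHz):", f_mhz)
print("gap traversals:", crossings)
print("final orbit radius (m):", radius)
print("relative drop in frequency:", 1.0 - 1.0 / gamma)
```

Under these assumptions the frequency drops by about two per cent at 20 MeV, which is enough for the pulses to fall out of step over many revolutions.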
[Fig. 10.4: The cyclotron. Labels: leads to the alternating voltage; dee electrodes.]
Synchrotron
To overcome the voltage pulses getting out of phase with the rotational frequency
in Eq. (10.42), the pulse frequency can be gradually changed so as to keep it in
step with the circulating particles. Machines based on this idea are called synchro-
cyclotrons. A further modification was to change the magnetic field as well as
the pulse frequency so as to keep the protons in circular orbits of approximately
the same radius (magnetic field must increase as the velocity of the protons
increases) and to keep the pulse frequency in step with the particles. Such an
accelerator is called the synchrotron, and is capable of providing proton beams
of an energy of a few GeV (1 GeV = 10⁹ eV).
One of the problems of synchrotrons is the focussing of the particles. If
there is no focussing, particles with velocities slightly different from the average
velocity will spread out and only a few particles with the final energy will be
obtained. In velocity focussing the particles are kept together by adjusting the
timing of the pulses. In spatial focussing, the particles are kept together (though
they perform small oscillations) by controlling spatial variation of the magnetic
field. An important advance in the accelerators was introduced by what is known
as strong focussing. This was achieved by using magnet sections with alternating
magnetic field gradients—that is the magnetic field increases radially in one
section and decreases in the next section. This allows one to obtain focussing
in both radial and axial directions and to reduce the radial and axial oscillations.
Synchrotrons using strong focussing are called alternating-gradient synchrotrons
(AGS) (see Fig. 10.5) and have been used to obtain protons of energies of about
30 GeV.
High-energy beams of electrons have been obtained by linear accelerators
(where the radiation loss in the energy due to radial acceleration, synchrotron
radiation, is avoided), and beams of photons have been obtained from the
synchrotron radiation of electrons, e– – e+ annihilation, etc.
Colliding Beams
For the production of heavy particles and the observation of interactions at
high energies, it is advantageous to work with colliding beams of energetic
particles. To see this, consider a collision between two particles of mass m
each. The effective energy for a process may be standardized in terms of the
total energy in the centre of mass (cm) system (Ptot = 0 in the cm system). If one
of the particles is at rest and the other is moving with an energy E (E includes
kinetic energy and rest energy) and momentum p, the total cm energy Et is
given by
$$\frac{1}{c^2}E_t^2 = \frac{1}{c^2}(E + mc^2)^2 - \mathbf{p}\cdot\mathbf{p} = 2m^2c^2 + 2Em$$ (10.43)
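The advantage of colliding beams can be made concrete with Eq. (10.43). The sketch below works in units with c = 1 and energies in GeV. The colliding-beam value Et = 2E is the standard result for a head-on collision of two particles of equal mass and equal energy (the total momentum is then zero in the laboratory); it is quoted here as an assumption, since it is not part of Eq. (10.43). The beam energy of 270 GeV and the function names are illustrative.

```python
import math

M_PROTON_GEV = 0.938272   # proton rest energy in GeV (units with c = 1)

def cm_energy_fixed_target(e_total, m=M_PROTON_GEV):
    """Total cm energy from Eq. (10.43) with c = 1: Et^2 = 2 m^2 + 2 E m."""
    return math.sqrt(2.0 * m * m + 2.0 * e_total * m)

def cm_energy_colliding(e_total):
    """Head-on collision of equal-mass, equal-energy beams (assumed case):
    the laboratory frame is already the cm frame, so Et = 2E."""
    return 2.0 * e_total

def beam_energy_for_cm(e_cm, m=M_PROTON_GEV):
    """Invert Eq. (10.43): beam energy needed on a fixed target for a given Et."""
    return (e_cm * e_cm - 2.0 * m * m) / (2.0 * m)

beam = 270.0                                  # GeV per beam (illustrative)
fixed = cm_energy_fixed_target(beam)
colliding = cm_energy_colliding(beam)
equivalent = beam_energy_for_cm(colliding)

print("cm energy, fixed target (GeV):", fixed)
print("cm energy, colliding beams (GeV):", colliding)
print("fixed-target beam energy for the same cm energy (GeV):", equivalent)
```

For a fixed target the cm energy grows only as the square root of the beam energy, which is why the same 270 GeV beam is far more effective when it meets another beam head-on.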
[Fig. 10.5: Alternating-gradient synchrotron. Labels: injection of protons from a linear accelerator; proton beam; magnets with alternating gradients; radio-frequency acceleration regions; to the target.]
energy of 26 GeV and then to an energy of about 270 GeV in a super proton
synchrotron. The same synchrotrons are used for speeding-up protons as well,
and providing a beam of 240 GeV protons. Since the protons and the antiprotons
are oppositely charged, they move in opposite directions in the synchrotrons.
It was in the collision of 270 GeV proton-antiproton
beams at CERN that the W and Z bosons were first produced, and identified
by their characteristic decays. These bosons, which are essential for the
propagation of unified weak-electromagnetic interaction, were found to have a
mass of mW ≈ 81 GeV and mZ ≈ 95 GeV. This has been an important step in the
confirmation of the theory of unified weak-electromagnetic interaction.
Finally, it may be noted that colliding-beam experiments have also been
performed with electron and positron beams, which have provided important
information about the properties of weak-electromagnetic interaction and of
elementary particles.
alcohol vapour, from the top (kept at about 10°C) to the bottom (kept at
about –70°C by using solid carbon dioxide). This gives a layer of supersaturated
vapour (a few centimeters thick) near the bottom. The diffusion chamber has
the advantage that it can work continuously.
[Figure labels: counter; charged particle; pulse generator; counter]
Emulsion Chamber
Charged particles interact with photographic emulsions in the same way as
photons, and hence the emulsions can be used for recording the tracks of charged
particles. A photographic emulsion such as silver bromide (which has a density
of about 4 g/cc) is an efficient stopper of high-energy ionizing particles. Several
hundreds of layers of these emulsion sheets, each about 1/2 mm thick, may be
exposed to cosmic rays or to high energy particles from accelerators. After the
development of the sheets, the path of the particles can be traced from layer to
layer.
Geiger Counter
This is a very compact and sturdy instrument used for detecting energetic
particles. It consists of a metal tube with a wire along its axis. It is filled with a
suitable gas mixture at low pressure. The tube is insulated from the wire and a
high potential difference is maintained between the tube and the wire. When an
energetic particle enters the tube, it produces some ions in the gas. This causes
a small discharge current to flow between the wire and the tube. Recording of
these signals allows us to count the number of incident particles. Since the ions
and the electrons take a time of about 10⁻³ s to recombine, the counter can
record only a few hundred counts per second. Its main advantage is that it is
very inexpensive and easy to operate.
10.7 EXAMPLES
A few examples are discussed in this section.
Example 1
It is interesting to note that several particles in the SU (3) multiplet scheme,
such as the η-meson, were discovered only after the scheme predicted the
existence of these particles. Particularly noteworthy is the Ω– which is a member
of the baryon decuplet. Gell-Mann predicted it to have spin of 3/2, strangeness
–3, and a mass of about 1676 MeV. Since there is no baryonic system with
strangeness –3, and which is lighter than 1676 MeV, the Ω– is stable against
decay through strong interaction (which conserves strangeness). The Ω– can
decay via weak interaction (which does not conserve strangeness) but would
have a lifetime of about 10⁻¹⁰ s, characteristic of weak decays. A particle with
precisely these characteristics, with a mass of 1675 MeV was observed in 1964,
through its weak decay
Ω– → Ξ– + π0 (10.46)
which does not conserve strangeness.
The Ω– is made of three strange quarks (sss). The three quarks are in the
ground state with spatial angular momentum equal to zero and in the totally
symmetric spin 3/2 state. This would violate Fermi statistics. In order to get
out of this difficulty, an additional property called colour was assigned to the
quarks. The three quarks in the baryons are supposed to be of different colours
so that the exchange statistics is not applicable.
Example 2
The β-decay in Eq. (10.21) is to be regarded as a spontaneous emission of e– and
ν̄e and not an escape of an electron bound in the nucleus. This is indicated by
the fact that a bound electron would have a momentum p ~ ħ/r0 (follows from
the uncertainty principle) and a kinetic energy
$$KE \sim \left(m^2c^4 + c^2\hbar^2/r_0^2\right)^{1/2} - mc^2$$ (10.47)
which for r0 ~ 10⁻¹⁵ m comes out to be approximately 200 MeV. Such a highly
energetic electron would quickly escape in about 10⁻²³ s, and the slow rate of
the β-decay cannot be explained.
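The estimate in Eq. (10.47) is easily evaluated. The following sketch uses the standard values ħc ≈ 197.327 MeV fm and mc² ≈ 0.511 MeV for the electron; the crossing time r0/c is included only as an order-of-magnitude indication of the escape time, and the function names are illustrative.

```python
import math

HBAR_C_MEV_FM = 197.327       # hbar * c in MeV fm
ME_C2_MEV = 0.511             # electron rest energy in MeV
C_M_PER_S = 2.99792458e8      # speed of light in m/s

def confinement_kinetic_energy(r0_fm, mc2=ME_C2_MEV):
    """Kinetic energy of Eq. (10.47) for a particle confined within r0 (in fm)."""
    pc = HBAR_C_MEV_FM / r0_fm            # p c ~ hbar c / r0, from p ~ hbar / r0
    return math.sqrt(mc2**2 + pc**2) - mc2

ke_mev = confinement_kinetic_energy(1.0)  # r0 ~ 1e-15 m = 1 fm
crossing_time = 1.0e-15 / C_M_PER_S       # time to cross r0 at nearly the speed of light

print("kinetic energy (MeV):", ke_mev)
print("crossing time (s):", crossing_time)
```

Since pc is much larger than mc² here, the electron would be extremely relativistic and the kinetic energy is essentially ħc/r0.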
Example 3
The parity of π– is determined from the reaction
π– + d → N + N (10.48)
The low-energy π– is captured in a Bohr orbit (with a radius about 2 ×
Example 4
The behaviour of neutral K-mesons provides a very interesting application of
the superposition principle. Neutral K-mesons have the decay modes
$$K^0 \to \pi^+ + \pi^-,\ \pi^0 + \pi^0$$
$$\bar{K}^0 \to \pi^+ + \pi^-,\ \pi^0 + \pi^0$$ (10.49)
strangeness so that the states with well-defined energy (and also lifetime) are
not expected to be states with well-defined strangeness.
Considering the possibility of CP being conserved in weak interaction, one
may define
$$CP\,|K^0\rangle = |\bar{K}^0\rangle$$
$$CP\,|\bar{K}^0\rangle = |K^0\rangle$$ (10.51)
where P is the parity operator and C is the charge-conjugation operator (which
changes a particle into an antiparticle with opposite charge, strangeness, etc.).
If the Hamiltonian commutes with CP, the eigenstates of energy can also be
eigenstates of CP. Such eigenstates are the superpositions
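$$|K_1^0\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar{K}^0\rangle\right), \qquad CP = 1$$ (10.52)
$$|K_2^0\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle - |\bar{K}^0\rangle\right), \qquad CP = -1$$ (10.53)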
and it is these states which are expected to have well-defined masses and
lifetimes.
In the 2π-meson decays given in Eq. (10.49), the two π-mesons have l = 0 and
hence are in a CP = 1 state. Hence, while the $|K_1^0\rangle$ can decay into two π-mesons,
the decay of $|K_2^0\rangle$ into two π-mesons is forbidden. Thus $|K_1^0\rangle$ has a short
lifetime of about 0.86 × 10⁻¹⁰ s whereas $|K_2^0\rangle$ has a much longer lifetime of
about 5.4 × 10⁻⁸ s (the masses of $|K_1^0\rangle$ and $|K_2^0\rangle$ are also slightly different).
This leads to some interesting observations for the $|K^0\rangle$ created in a process
such as Eq. (10.26). The $|K^0\rangle$ may be regarded as
$$|K^0\rangle = \frac{1}{\sqrt{2}}\left(|K_1^0\rangle + |K_2^0\rangle\right)$$ (10.54)
of which $|K_1^0\rangle$ decays quickly into two π-mesons [giving the decays in
Eq. (10.49)] so that after a few lifetimes of $|K_1^0\rangle$, only the $|K_2^0\rangle$ is left. This
long-lived component decays slowly into other modes such as the ones in
Eq. (10.50). It may be observed that though only $|K^0\rangle$ existed at first, the
component $|K_2^0\rangle$ which remains at the end contains equal mixtures of $|K^0\rangle$
and $|\bar{K}^0\rangle$.
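The evolution described above may be followed numerically. The sketch below starts from Eq. (10.54) and lets the two components decay with the lifetimes quoted in the text. It is a simplified illustration: the small mass difference between the two components, which would introduce an oscillating relative phase, is neglected (an assumption made here for brevity), and the function name is hypothetical.

```python
import math

TAU_1 = 0.86e-10   # lifetime of the short-lived component K1, in s (from the text)
TAU_2 = 5.4e-8     # lifetime of the long-lived component K2, in s (from the text)

def strangeness_probabilities(t):
    """Probabilities of finding K0 and anti-K0 at time t for a state that is
    pure K0 at t = 0. The K1-K2 mass difference is neglected (assumption)."""
    a1 = math.exp(-t / (2.0 * TAU_1)) / math.sqrt(2.0)   # amplitude of K1
    a2 = math.exp(-t / (2.0 * TAU_2)) / math.sqrt(2.0)   # amplitude of K2
    # Inverting Eqs. (10.52) and (10.53): K1 = (K0 + K0bar)/sqrt2, K2 = (K0 - K0bar)/sqrt2
    p_k0 = (a1 + a2) ** 2 / 2.0
    p_k0bar = (a1 - a2) ** 2 / 2.0
    return p_k0, p_k0bar

start = strangeness_probabilities(0.0)
late = strangeness_probabilities(20.0 * TAU_1)   # well after K1 has decayed

print("t = 0:        P(K0), P(anti-K0) =", start)
print("t = 20 tau_1: P(K0), P(anti-K0) =", late)
```

At t = 0 the state is purely $|K^0\rangle$; after many lifetimes of the short-lived component the two probabilities become practically equal, as stated in the text.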
PROBLEMS
1. Using the operators in Eq. (4.127), except for the factor of ħ, as the isospin
operators for I = 1/2, and the representation in Eq. (4.126) for the P and
N states respectively, obtain the isospin values of the states for the
proton-neutron system in Eqs. (10.8) to (10.11).
2. If a positron of energy E annihilates an electron at rest, giving out two
photons, e+ + e– → γ + γ, obtain the angular distribution of energy.
3. Determine the maximum energy of π– when the K+ at rest decays into
π+ π+ π–.
4. Why is the lifetime of Σ0 so much smaller than that of Σ+ or Σ–?
5. What is the range of weak-interaction forces originating from the exchange
of W-bosons of mass about 80 GeV?
6. Obtain from dimensional arguments, the lifetime of β-decay, given that it
is proportional to $(m_W^2/\alpha)^2$ and that the remaining factor is a function of
ħ, c and me.
7. For a particle moving with high velocity, the force-acceleration relation
in Eq. (10.41) changes to $evB = mv^2/[r(1 - v^2/c^2)^{1/2}]$. What is the field
needed to keep a proton of 5 GeV energy in a circular orbit of radius
20 m?
11
General Relativity and Cosmology
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 391
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_11
Our discussion so far has been about modern ideas in the domain of high velocity
(special theory of relativity), and small distances (quantum theory and its
applications). There have been important developments in our understanding
of large-distance, large-body phenomena as well. These refer primarily to the
general theory of relativity as applied to large stars, galaxies and cosmology,
i.e. the science of the universe as a whole.
In cosmology, the events taking place in distant objects such as galaxies,
which not only may be moving with high velocity with respect to us but may
also be accelerating, have to be interpreted. It is to be expected that observation
and interpretation are simpler in the galaxy where the events are taking place.
Therefore, a relativistic theory, which can relate observations in frames which
are in arbitrary motion with respect to each other, is needed. Einstein’s theory of
general relativity provides the framework for relating observations of space-time
events in arbitrary frames. Indeed, the theory accomplishes more than
that. It incorporates gravitational effects as well. This is based on the observation
that the motion of a particle in a gravitational field, with a given initial position
and velocity, is independent of its mass. This implies, as will be discussed later,
that the effect of gravity can be locally simulated by an accelerating frame but
without gravity, so that gravitational effects can be described by the theory of
general relativity. Furthermore, in the discussion of cosmological dynamics, it
is mainly the long-range gravitational forces which are important. Therefore
general relativity is also an appropriate theory for the analysis of the development
of the universe.
Here, an elementary and brief consideration of the main ideas of the theory
of general relativity is presented and the ideas are applied to discuss some
predictions of the dynamical properties of the universe. It is quite appropriate
to end with the description of general relativity, a theory which is grand in its
concepts and structure and awe-inspiring in its predictions. An exposition of
modern ideas in physics cannot be said to be complete without a discussion of
general relativity, and anything to follow would only be an anticlimax.
Mach’s Principle
The inertial frames may be determined by considering the inertial forces on the
surface of the earth. For example, the rate of rotation of the earth with respect to
the inertial frames may be estimated by the measurement of the centrifugal and
Coriolis forces on the earth. The rate of rotation thus obtained is found to be
approximately the same as the rate of the earth’s rotation with respect to distant
matter, e.g. the distant galaxies. This leads to an important result that the average
motion of distant galaxies with respect to the inertial frame is zero.
According to Mach (1872), the above result is not an accident. It suggests
that the inertial frame is not an absolute frame but is related to the distribution
of matter in the universe. In fact, Mach asserted that the concept of inertia (and
the inertial frame) can be given meaning only in terms of background stars and
galaxies. This is known as Mach’s principle. It implies that if there were no
matter in the universe except for a given body, there would be no inertia or
inertial forces and it is meaningless to ask whether it is accelerating with respect
to an inertial frame.
A theory of the universe which incorporates Mach’s principle cannot include
an inertial frame without reference to the distribution of matter in the universe.
Principle of Equivalence
An important input of Einstein’s theory of general relativity is the observation
that the effect of gravity can be simulated locally by a noninertial frame.
Consider the motion of an object of inertial mass mI, in a region where the
gravitational acceleration g is approximately constant. If there is a
nongravitational force F acting on it, its motion is given by
$$m_I\mathbf{a} = m_g\mathbf{g} + \mathbf{F}$$ (11.1)
where the gravitational mass mg, in principle, may be different from the inertial
mass mI. Alternatively, consider a frame of reference without a gravitational
force, but moving with acceleration – g. If the acceleration of mass mI in this
frame is a, the corresponding acceleration in the inertial frame is a – g so that
the equation of motion is
$$m_I(\mathbf{a} - \mathbf{g}) = \mathbf{F}$$ (11.2)
or
$$m_I\mathbf{a} = m_I\mathbf{g} + \mathbf{F}$$ (11.3)
Experimentally, it is observed to a high accuracy (about 1 part in 10¹¹) that
mI = mg, so that Eqs. (11.1) and (11.3) describe the same motion. This result
was generalized by Einstein into what is known as the principle of equivalence:
The physical laws are locally the same in an inertial frame with gravitational
acceleration g and a noninertial frame with acceleration – g but no gravity.
It is important to note that the equivalence is local since the gravitational
field tends to zero at large distances whereas the inertial forces in general do
not vanish at infinity.
The principle of equivalence gives special importance to freely-falling
frames. In these frames, the local effect of gravity is cancelled by the inertial
forces so that they form local inertial frames and the considerations of special
relativity suffice to describe the physical observations in them. They allow us
to deduce two interesting results without going into additional details of the
general theory.
1. Bending of light in a gravitational field: Consider a beam of light in a
gravitational field. Observed from a freely-falling frame which is locally
inertial, the beam travels in a straight line. However, the frame with the
gravitational field moves with acceleration – g with respect to the freely-
falling frame so that in this frame the beam appears to bend in the direction
of g. This is a remarkable result since just the finiteness of the velocity
of light implies that light interacts with gravitational fields. The bending
of light in a gravitational field is observed in the deflection of light from
the stars, moving past the sun, seen during total solar eclipses. However,
since g is not constant along the path of the beam, quantitative calculations
are rather complicated. It may be noted that the predictions of general
theory agree with the observations within experimental accuracy.
2. Gravitational shift of spectral lines: Consider a photon emitted at t = 0,
and moving in a direction opposite to the gravitational acceleration g.
Observed from a freely-falling frame which is at rest with respect to the
source at t = 0, the frequency of the photon is ν0. If the photon meets the
frame at t, it will have travelled a distance h = ct during this time and the
frame at that instant will be moving with a velocity of gt. Hence the
frequency ν of the photon observed at a height of h, from a frame at rest
with respect to the source (and moving with velocity –gt with respect to
the freely-falling frame), is given by
$$\frac{\nu_0 - \nu}{\nu_0} \approx \frac{gt}{c} = \frac{gh}{c^2}$$ (11.4)
This result, which can also be deduced from energy conservation (see
Example 2), has been verified by Pound and Rebka (1960) using the Mössbauer
effect. They found that a photon falling through a height of 22.6 m shows a
shift of ∆λ/λ0 ≈ –gh/c² ≈ –(2.57 ± 0.26) × 10⁻¹⁵ compared with the predicted
value of –2.46 × 10⁻¹⁵.
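The predicted value can be reproduced directly from Eq. (11.4). The sketch below assumes g = 9.81 m/s² and the standard value of c; neither number is given in the text, and the function name is illustrative.

```python
G_ACCEL = 9.81            # m/s^2, assumed value of g at the surface of the earth
C_LIGHT = 2.99792458e8    # m/s

def fractional_shift(height_m, g=G_ACCEL, c=C_LIGHT):
    """Magnitude of the fractional frequency shift gh/c^2 of Eq. (11.4)."""
    return g * height_m / c**2

predicted = fractional_shift(22.6)    # height quoted in the text
measured = 2.57e-15                   # central value quoted in the text
uncertainty = 0.26e-15                # uncertainty quoted in the text

print("predicted |shift|:", predicted)
print("agrees within quoted error:", abs(measured - predicted) <= uncertainty)
```

The extreme smallness of the effect is the reason why the sharp lines available through the Mössbauer effect were needed for its observation.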
The theory of general relativity is based on the principle of equivalence and
includes the gravitational effects in terms of the geometry of the space. The
theory makes no distinction between gravitational and inertial effects, both being
related to the energy and momentum distribution of matter and hence
incorporates Mach’s principle to some extent (there are some difficulties in the
interpretation of boundary conditions in the case of an infinite universe).
$$t' = \frac{1}{(1 - v^2/c^2)^{1/2}}\left(t - \frac{v}{c^2}x\right)$$
Trajectories may be characterized by an invariant variable τ, the proper
time, which gives the invariant interval between two events as
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{1}{c^2}(\Delta\mathbf{r})^2$$ (11.6)
Now, a gravitational field with acceleration a in the x-direction can be
introduced by going over to a frame with acceleration – a. The required
transformations are given approximately by
$$\mathbf{r}' = \mathbf{r} + \frac{1}{2}\mathbf{a}t^2, \qquad t' = t$$ (11.7)
in terms of which the infinitesimal proper time interval is
$$(\Delta\tau)^2 = \left(1 - \frac{a^2t'^2}{c^2}\right)(\Delta t')^2 - \frac{1}{c^2}(\Delta\mathbf{r}')^2 + \frac{2t'}{c^2}\,\mathbf{a}\cdot\Delta\mathbf{r}'\,\Delta t'$$ (11.8)
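The intermediate steps leading from Eqs. (11.6) and (11.7) to Eq. (11.8) may be sketched as follows. This is only a worked outline, using the differentials of the transformation (11.7) and keeping terms up to second order in the small intervals.

```latex
\begin{aligned}
\Delta t &= \Delta t', \qquad
\Delta\mathbf{r} = \Delta\mathbf{r}' - \mathbf{a}\,t'\,\Delta t', \\
(\Delta\tau)^2 &= (\Delta t)^2 - \frac{1}{c^2}(\Delta\mathbf{r})^2 \\
&= (\Delta t')^2 - \frac{1}{c^2}\left[(\Delta\mathbf{r}')^2
   - 2t'\,\mathbf{a}\cdot\Delta\mathbf{r}'\,\Delta t'
   + a^2t'^2(\Delta t')^2\right] \\
&= \left(1 - \frac{a^2t'^2}{c^2}\right)(\Delta t')^2
   - \frac{1}{c^2}(\Delta\mathbf{r}')^2
   + \frac{2t'}{c^2}\,\mathbf{a}\cdot\Delta\mathbf{r}'\,\Delta t'.
\end{aligned}
```

The cross term in the last line is what distinguishes the accelerated frame from an inertial one; it vanishes when a = 0.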
$$g^0_{00} = 1, \qquad g^0_{11} = g^0_{22} = g^0_{33} = -\frac{1}{c^2}, \qquad g^0_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu$$ (11.10)
corresponding to flat space. If the coordinates in the second frame are given by
the functional relation xµ = xµ (x′),
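$$\Delta x^\mu = \frac{\partial x^\mu}{\partial x'^\nu}\,\Delta x'^\nu$$ (11.11)
and
$$(\Delta\tau)^2 = g_{\mu\nu}\,\Delta x'^\mu\,\Delta x'^\nu$$ (11.12)
where
$$g_{\mu\nu} = g^0_{\alpha\beta}\,\frac{\partial x^\alpha}{\partial x'^\mu}\,\frac{\partial x^\beta}{\partial x'^\nu}$$ (11.13)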
It should, however, be noted that the metric tensor here is given in terms of
only four independent functions xμ(x′), though an arbitrary but symmetric gμν(x′)
consists of 10 independent functions. The restricted gμν given in Eq. (11.13) can
describe the local gravitational field. Einstein then postulated that the general
gravitational field is described by an arbitrary metric gμν(x) with 10 independent
functions. Having defined the means of describing the gravitational and inertial
effects, one must now provide (i) the framework for the determination of the
metric gμν(x) and (ii) the dynamical equations for a given metric.
$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\frac{8\pi G}{c^4}T_{\mu\nu}$$ (11.14)
2 c
where Tμν is the energy-momentum tensor which acts as the source and G = 6.67
× 10⁻¹¹ N·m²/kg² is the usual gravitational constant. The quantities Rμν and
R are related to the Riemann-Christoffel curvature tensor Rμανβ which in turn is
related to the metric gμν. The discussion of these relations is beyond the scope
of this book (the interested reader may refer to Ref. 11). We only note that these
field equations determine the metric for a given energy-momentum distribution.
In the limiting case of a weak gravitational field φ, one has
$$T_{00} = \rho c^2, \qquad g_{00} = 1 + \frac{2\phi}{c^2}, \qquad R_{00} = -\frac{1}{2}R = -\frac{1}{c^2}\nabla^2\phi$$
with which Eq. (11.14), to first order in φ, reduces to the Poisson equation for
Newton’s gravitational potential, ∇²φ = 4πGρ.
Geodesics
The path followed by a particle in the presence of gravitational forces is
determined by the geometry of the equivalent metric space. To obtain the relation
between the path and the metric, it is noted that in 3-dimensional Euclidean
space, a particle moves in a straight line, i.e. it chooses a path which corresponds
to the shortest distance between any two points. However, the ordinary length
is not invariant even in special relativity, and is not a suitable quantity for
determining the trajectory in the general case.
The proper time τ, defined in Eqs. (11.6) and (11.12), is invariant under
general transformations, and the allowed paths may be regarded as corresponding
to extrema of τ. It turns out that τ is actually a maximum for the allowed trajectory
in the case of special relativity [because of the negative sign of spatial terms in
Eq. (11.10)]. For example, the proper time corresponding to points (0, 0) and
(t, 0), with the flat-space metric of Eq. (11.10), is
τ1 = t (11.15)
whereas that corresponding to two segments (0, 0) → (t1, x1) and (t1, x1) → (t, 0)
is
$$\tau_2 = \left(t_1^2 - \frac{1}{c^2}x_1^2\right)^{1/2} + \left((t - t_1)^2 - \frac{1}{c^2}x_1^2\right)^{1/2}$$ (11.16)
which for x1 ≠ 0 is less than t (note that the intermediate point has to be taken so
that each proper-time interval is real). Thus, the straight line between (0, 0) and
(t, 0) corresponds to the maximum proper time. The general situation may be
covered by the requirement that the total proper time
$$\tau_{AB} = \int_A^B d\tau = \int_A^B\left[g_{\mu\nu}\,\frac{dx^\mu}{ds}\,\frac{dx^\nu}{ds}\right]^{1/2}ds$$ (11.17)
is an extremum, where Eq. (11.12) has been used, and s is an arbitrary parameter.
The extremum paths are called geodesics.
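The statement that the straight path has the maximum proper time can be checked with Eqs. (11.15) and (11.16). The numbers in the sketch below (a total coordinate time of 10 s and a turning point reached at 0.6 c) are illustrative choices, and the function names are hypothetical.

```python
import math

C_LIGHT = 2.99792458e8   # m/s

def proper_time_straight(t):
    """Proper time along the straight path (0, 0) -> (t, 0), Eq. (11.15)."""
    return t

def proper_time_detour(t, t1, x1, c=C_LIGHT):
    """Proper time along (0, 0) -> (t1, x1) -> (t, 0), Eq. (11.16)."""
    first = t1**2 - (x1 / c) ** 2
    second = (t - t1) ** 2 - (x1 / c) ** 2
    if first < 0.0 or second < 0.0:
        # The intermediate point must keep each proper-time interval real.
        raise ValueError("intermediate point makes a segment space-like")
    return math.sqrt(first) + math.sqrt(second)

t_total = 10.0                         # s, illustrative
t_turn = 5.0                           # s, illustrative
x_turn = 0.6 * C_LIGHT * t_turn        # turning point reached at speed 0.6 c

tau_1 = proper_time_straight(t_total)
tau_2 = proper_time_detour(t_total, t_turn, x_turn)

print("tau_1 (straight):", tau_1)
print("tau_2 (detour):  ", tau_2)
```

Any detour with x1 ≠ 0 lowers the proper time, so that the straight world line is the path of maximum τ, in contrast to the minimum length which characterizes straight lines in Euclidean space.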
The integral condition can be converted into a set of differential equations
in the following way. To be specific, let s be the proper time of the geodesic,
with values τA and τB at the end points. Now consider a set of curves xµ (τ, ε)
which connect points A and B, such that
xµ (τ, ε) = xµ (τ, 0) + εhµ (τ) (11.18)
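where $x^\mu(\tau, 0)$ is the geodesic needed, $h^\mu(\tau_A) = h^\mu(\tau_B) = 0$, and ε is a small
parameter. The proper time along such a curve is
$$\tau(\varepsilon) \equiv \int_{\tau_A}^{\tau_B} f\!\left(\frac{dx^\mu}{d\tau},\, x^\mu\right)d\tau$$ (11.19)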
To first order in ε, this expression reduces to
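$$\tau(\varepsilon) = \tau(0) + \varepsilon\int_{\tau_A}^{\tau_B}\left[\frac{\partial f}{\partial(dx^\mu/d\tau)}\,\frac{dh^\mu}{d\tau} + \frac{\partial f}{\partial x^\mu}\,h^\mu\right]d\tau$$ (11.20)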
which on integration by parts (and using hµ (τA) = hµ (τB) = 0) gives
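$$\tau(\varepsilon) = \tau(0) - \varepsilon\int_{\tau_A}^{\tau_B}\left[\frac{d}{d\tau}\,\frac{\partial f}{\partial(dx^\mu/d\tau)} - \frac{\partial f}{\partial x^\mu}\right]h^\mu\,d\tau$$ (11.21)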
Since τ(ε) is an extremum at ε = 0, the second term should vanish for all
hµ(τ) which implies that
$$\frac{d}{d\tau}\,\frac{\partial f}{\partial(dx^\mu/d\tau)} - \frac{\partial f}{\partial x^\mu} = 0$$ (11.22)
Equation (11.22) gives four equations, for µ = 0, 1, 2, 3. One of these can
be replaced by using the relation
$$g_{\mu\nu}\,\frac{dx^\mu}{d\tau}\,\frac{dx^\nu}{d\tau} = 1$$ (11.23)
which follows from Eq. (11.12). For the special case of ∂f/∂xμ = 0, Eq. (11.22)
simplifies to
$$\frac{\partial f}{\partial(dx^\mu/d\tau)} = \text{constant}.$$ (11.24)
These differential equations, i.e. Eq. (11.22) or Eq. (11.24), with Eq. (11.23),
determine the geodesics and hence the dynamics of a particle.
As an illustration, the metric tensor in Eq. (11.8) is considered for the case
of acceleration a in the x-direction,
$$g_{00} = 1 - \frac{a^2t^2}{c^2}, \qquad g_{11} = -\frac{1}{c^2}, \qquad g_{01} = g_{10} = \frac{at}{c^2}$$ (11.25)
which is equivalent to a space with gravitational field characterized by
acceleration a. The corresponding function f in 2-dimensions is
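$$f = \left[\left(1 - \frac{a^2t^2}{c^2}\right)\left(\frac{dt}{d\tau}\right)^2 - \frac{1}{c^2}\left(\frac{dx}{d\tau}\right)^2 + \frac{2at}{c^2}\,\frac{dt}{d\tau}\,\frac{dx}{d\tau}\right]^{1/2}$$ (11.26)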
Using Eq. (11.23) and Eq. (11.24) for µ = 1,
$$-\frac{dx}{d\tau} + at\,\frac{dt}{d\tau} = A$$ (11.27)
$$\left(\frac{dt}{d\tau}\right)^2 = 1 + \frac{A^2}{c^2}$$
which lead to dx/dt = at + constant, and therefore reproduce the usual equations
for motion in a constant gravitational field.
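The steps between Eq. (11.26) and the final result may be outlined as follows. This is a sketch under the conditions stated in the text: f does not depend on x, so that Eq. (11.24) applies for µ = 1, and f = 1 along the path by Eq. (11.23); the factor 1/c² has been absorbed into the constant A.

```latex
\begin{aligned}
\frac{\partial f}{\partial(dx/d\tau)}
 &= \frac{1}{f}\left(-\frac{1}{c^2}\frac{dx}{d\tau} + \frac{at}{c^2}\frac{dt}{d\tau}\right)
  = \text{constant}
 \;\;\Rightarrow\;\;
 -\frac{dx}{d\tau} + at\,\frac{dt}{d\tau} = A, \\
1 &= \left(1 - \frac{a^2t^2}{c^2}\right)\left(\frac{dt}{d\tau}\right)^2
   - \frac{1}{c^2}\left(\frac{dx}{d\tau}\right)^2
   + \frac{2at}{c^2}\,\frac{dt}{d\tau}\,\frac{dx}{d\tau}
  = \left(\frac{dt}{d\tau}\right)^2 - \frac{A^2}{c^2}, \\
\frac{dx}{dt} &= \frac{dx/d\tau}{dt/d\tau}
  = at - \frac{A}{\left(1 + A^2/c^2\right)^{1/2}}.
\end{aligned}
```

The second line follows on substituting dx/dτ = at (dt/dτ) – A into Eq. (11.23); the terms containing a cancel identically. The last line shows that the velocity increases uniformly with t, the constant being fixed by the initial velocity.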
Curvature of Space
It is clear from the above example that the geodesics in an arbitrary metric
space are, in general, curved lines. The space is then said to be curved (in contrast
to the flat spaces of inertial frames). A measure of the curvature of the space is
given by what is known as the curvature tensor. Only the simple case of
2-dimensions is considered here. The curvature of a surface in two dimensions,
as given by Rindler, is the following: Draw the geodesics starting from a point
P, and consider the circle formed by the locus of points which are at a distance
[Figure: a circle of geodesic radius a about the point P on a sphere of radius R; the element of geodesic length is Δr/(1 – r²/R²)^{1/2}]
$$a = R\sin^{-1}(r_0/R)$$
while the length of the circumference of the circle is
$$l = 2\pi r_0 = 2\pi R\sin(a/R)$$ (11.31)
Hence, the curvature is
$$K = \frac{3}{\pi}\lim_{a\to 0}\frac{2\pi a - 2\pi R\sin(a/R)}{a^3} = \frac{1}{R^2}$$ (11.32)
For a flat surface (R → ∞), the curvature is zero. In some cases, such as at
a saddle point, it can be negative.
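The limit in Eq. (11.32) can be verified numerically by evaluating the expression for decreasing geodesic radius a. The sphere radius R = 2 used below is an arbitrary illustrative value, and the function name is hypothetical.

```python
import math

def curvature_estimate(a, radius):
    """Finite-a value of (3/pi) (2 pi a - l) / a^3 with l = 2 pi R sin(a/R),
    i.e. the expression whose limit a -> 0 is taken in Eq. (11.32)."""
    circumference = 2.0 * math.pi * radius * math.sin(a / radius)
    return (3.0 / math.pi) * (2.0 * math.pi * a - circumference) / a**3

R = 2.0   # radius of the sphere (illustrative)
estimates = [(a, curvature_estimate(a, R)) for a in (1.0, 0.1, 0.01)]

for a, k in estimates:
    print("a =", a, " K estimate =", k, " 1/R^2 =", 1.0 / R**2)
```

The circumference of the circle falls short of 2πa by an amount proportional to a³, and it is the coefficient of this deficit which measures the curvature.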
An interesting point which emerges from our example is that a is multi-
valued (corresponding to going around the sphere an arbitrary number of times),
$$(\Delta\tau)^2 = e(r)\,(\Delta t)^2 - \frac{1}{c^2}\left[f(r)\,(\Delta r)^2 + r^2(\Delta\theta)^2 + r^2\sin^2\theta\,(\Delta\phi)^2\right]$$ (11.34)
with
$$e(r) = \frac{1}{f(r)} = 1 - \frac{2GM}{c^2 r}$$ (11.35)
This is known as the Schwarzschild metric. It may be observed that for
r → ∞, the Schwarzschild metric reduces to the metric of the flat space as it
should.
The interpretation of the different variables in Eq. (11.33) should be carefully
noted. The variable t is the coordinate which, in the absence of any gravitational
potential, would represent the time variable. The proper time interval ∆τ, on the
other hand, corresponds to the rate at which local clocks are running. Similarly,
the interpretation of r is that the distance measurements for ∆t = 0 give a value
[f(r)(∆r)² + r²(∆θ)² + r² sin²θ (∆φ)²]^{1/2}.
The Schwarzschild metric can be used to explain several important
observations. A few of the applications are discussed here.
Rate of Clocks
Consider an atomic clock in the presence of a gravitational field due to mass M.
The time interval it shows is
$$\Delta\tau_0 = \left(1 - \frac{2GM}{c^2 r}\right)^{1/2}\Delta t$$ (11.36)
$$\frac{\nu_0 - \nu}{\nu_0} \approx \frac{GM}{c^2 r}$$ (11.40)
This is essentially the gravitational red shift deduced earlier, in Eq. (11.4),
from the equivalence principle.
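The size of this shift, and the accuracy of the first-order expression in Eq. (11.40) relative to the square-root factor of Eq. (11.36), can be illustrated for light leaving the surface of the sun. The solar values of GM and of the radius are standard figures assumed here; they are not given in the text, and the function names are illustrative.

```python
import math

GM_SUN = 1.32712440e20    # m^3/s^2, assumed standard value
R_SUN = 6.957e8           # m, assumed standard value of the solar radius
C_LIGHT = 2.99792458e8    # m/s

def shift_first_order(gm, r, c=C_LIGHT):
    """Fractional shift GM/(c^2 r) of Eq. (11.40)."""
    return gm / (c**2 * r)

def shift_from_clock_rate(gm, r, c=C_LIGHT):
    """1 - (1 - 2GM/(c^2 r))^(1/2), using the clock-rate factor of Eq. (11.36)."""
    return 1.0 - math.sqrt(1.0 - 2.0 * gm / (c**2 * r))

approx = shift_first_order(GM_SUN, R_SUN)
full = shift_from_clock_rate(GM_SUN, R_SUN)

print("first-order shift:", approx)
print("from the square-root factor:", full)
```

For the sun the two expressions differ only in the sixth significant figure, so the first-order form is entirely adequate for ordinary stars.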
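$$f = \left[e(r)\left(\frac{dt}{d\tau}\right)^2 - \frac{f(r)}{c^2}\left(\frac{dr}{d\tau}\right)^2 - \frac{r^2}{c^2}\left(\frac{d\phi}{d\tau}\right)^2\right]^{1/2}$$ (11.41)
They lead to
$$r^2\,\frac{d\phi}{d\tau} = A$$
$$e(r)\,\frac{dt}{d\tau} = B$$ (11.42)
$$e(r)\left(\frac{dt}{d\tau}\right)^2 - \frac{f(r)}{c^2}\left(\frac{dr}{d\tau}\right)^2 - \frac{r^2}{c^2}\left(\frac{d\phi}{d\tau}\right)^2 = 1$$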
Using the first two equations and Eq. (11.35), the last equation simplifies
to:
$$\frac{A^2}{2r^4}\left(\frac{dr}{d\phi}\right)^2 + \left(1 - \frac{2GM}{c^2 r}\right)\frac{A^2}{2r^2} - \frac{GM}{r} = \frac{1}{2}c^2(B^2 - 1) \tag{11.43}$$
where the constant on the right-hand side may be identified with the energy.
Compared to the corresponding Newton’s equation, this equation has the extra
term $-GMA^2/(c^2 r^3)$. With this additional effective interaction, the planetary orbits
are no longer closed ellipses but may be simulated by slowly rotating ellipses.
This gives rise to a shift of the perihelion of planets (perihelion is the point on
the orbit nearest to the sun). Compared with the leading potential $-GM/r$, the
additional term is small,
$$\frac{GMA^2/(c^2 r^3)}{GM/r} \approx \frac{v^2}{c^2} \tag{11.44}$$
where v ≈ A/r is the orbital speed; for Mercury this ratio is about 10⁻⁷ to 10⁻⁸. The quantitative effect of this term can be
calculated from perturbation theory. It gives rise to a rotation of the perihelion
of Mercury by about 43″ per century, in good agreement with the observed
rotation.
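The quoted 43″ per century can be reproduced from the standard first-order result for the perihelion advance per orbit, Δφ = 6πGM/[c²a(1 − e²)], where a is the semi-major axis and e the eccentricity. That formula is not derived in this section and is used here as an assumption, as are the orbital elements of Mercury and the function names in this sketch.

```python
import math

GM_SUN = 1.327e20                         # G times the solar mass, m^3/s^2
C = 2.998e8                               # speed of light, m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def perihelion_shift_per_orbit(a_m, ecc):
    """First-order relativistic perihelion advance per orbit, in radians."""
    return 6.0 * math.pi * GM_SUN / (C**2 * a_m * (1.0 - ecc**2))

def shift_arcsec_per_century(a_m, ecc, period_days):
    """Accumulated advance over 100 Julian years, in seconds of arc."""
    orbits_per_century = 36525.0 / period_days
    return perihelion_shift_per_orbit(a_m, ecc) * orbits_per_century * ARCSEC_PER_RAD

def correction_ratio(r_m):
    """Size of the extra term relative to GM/r, about v^2/c^2 = GM/(c^2 r) for a near-circular orbit."""
    return GM_SUN / (C**2 * r_m)

if __name__ == "__main__":
    # Assumed elements for Mercury: a = 5.79e10 m, e = 0.2056, period = 87.97 days.
    print(shift_arcsec_per_century(5.79e10, 0.2056, 87.97))
    print(correction_ratio(5.79e10))
```

With these inputs the advance comes out close to 43″ per century, and the ratio of Eq. (11.44) is a few times 10⁻⁸.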
Bending of Light
The equations for the trajectory of light in the Schwarzschild metric can be
deduced from Eqs. (11.42). It should however be noted that since ∆τ = 0 for the
propagation of light, the constants A and B are infinite, though A/B is finite.
Dividing the last two equations by the first equation in Eqs. (11.42),
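$$\frac{e(r)}{r^2}\frac{dt}{d\phi} = \frac{B}{A}$$
$$e(r)\left(\frac{dt}{d\phi}\right)^2 - \frac{f(r)}{c^2}\left(\frac{dr}{d\phi}\right)^2 - \frac{r^2}{c^2} = 0 \tag{11.45}$$
which, in terms of x = 1/r, lead to
$$\left(\frac{dx}{d\phi}\right)^2 + x^2 - \frac{2GM}{c^2}x^3 = D \tag{11.46}$$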
where D = (cB/A)². For GM/c² = 0, x = D^{1/2} cos φ. Treating the gravitational term
as a small perturbation, we find to first order in GM/c²
$$x = D^{1/2}\cos\phi + \frac{GMD}{c^2}(2 - \cos^2\phi) \tag{11.47}$$
The bending of light is then deduced by obtaining the angles φ for r → ∞ or
x → 0, which to first order in GM/c² satisfy the relation
$$\cos\phi \approx -\frac{2GMD^{1/2}}{c^2} \tag{11.48}$$
or
$$\phi_\pm = \pm\left(\frac{\pi}{2} + \frac{2GMD^{1/2}}{c^2}\right) \tag{11.49}$$
Noting that D^{1/2} ≈ 1/r_min, the deflection of light comes out to be
$$\Delta\phi \equiv (\phi_+ - \phi_- - \pi) \approx \frac{4GM}{c^2 r_{\min}} \tag{11.50}$$
For light just grazing the sun (M ≈ 2 × 10³⁰ kg, r_min ≈ 7 × 10⁸ m), this has a
value of ∆φ ≈ 1.75″. The bending of starlight grazing the sun during an eclipse
(so as to minimize glare), is found to be about 1.89″ which agrees well with
Einstein’s prediction. The corresponding prediction of Newton’s theory (particles
with velocity c accelerated by gravity) is half of Einstein’s prediction, i.e. about
0.875″.
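The numbers quoted above follow directly from Eq. (11.50). The short sketch below evaluates it with the values of M and r_min given in the text; the function names and the constants G and c are assumed standard values.

```python
import math

G = 6.674e-11                             # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8                               # speed of light, m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def light_deflection_arcsec(mass_kg, r_min_m):
    """Deflection 4GM/(c^2 r_min) of Eq. (11.50), in seconds of arc."""
    return 4.0 * G * mass_kg / (C**2 * r_min_m) * ARCSEC_PER_RAD

def newtonian_deflection_arcsec(mass_kg, r_min_m):
    """Half of the relativistic value, as stated in the text for Newton's theory."""
    return 0.5 * light_deflection_arcsec(mass_kg, r_min_m)

if __name__ == "__main__":
    print(light_deflection_arcsec(2e30, 7e8))       # light grazing the sun
    print(newtonian_deflection_arcsec(2e30, 7e8))
```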
Black Holes
Going back to the Schwarzschild metric in Eqs. (11.34), (11.35), it is seen that
the metric is singular at r = rs,
$$r_s = \frac{2GM}{c^2} \tag{11.51}$$
where rs is known as the Schwarzschild radius. In most cases, the Schwarzschild
radius is quite small; it is about 3 km for the sun. However, the metric is applicable
only outside the mass distribution. Inside the distribution, the metric is modified
to a nonsingular form. Therefore, in cases where rs is much smaller than the
radius of the mass distribution, the singularity is not relevant. On the other
hand, in the case of very massive stars (m > 3msun), it is expected that the stars
may ultimately collapse to size smaller than their Schwarzschild radius. These
stars whose radius is smaller than their Schwarzschild radius, are known as
black holes and the metric singularity gives rise to some unusual behaviour for
them.
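Eq. (11.51) is easy to evaluate for familiar bodies. The following sketch is illustrative only; the function names and the masses and radius used are assumptions.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

def schwarzschild_radius(mass_kg):
    """r_s = 2GM/c^2, Eq. (11.51), in metres."""
    return 2.0 * G * mass_kg / C**2

def is_inside_own_horizon(mass_kg, radius_m):
    """True if a body of the given radius lies within its Schwarzschild radius."""
    return radius_m < schwarzschild_radius(mass_kg)

if __name__ == "__main__":
    print(schwarzschild_radius(1.989e30))          # sun: roughly 3 km
    print(schwarzschild_radius(5.972e24))          # earth: roughly 9 mm
    print(is_inside_own_horizon(1.989e30, 6.96e8)) # the sun is far from being a black hole
```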
For radial motion (dφ/dτ = 0), one obtains from Eqs. (11.42),
$$B^2 - \frac{1}{c^2}\left(\frac{dr}{d\tau}\right)^2 = 1 - \frac{2GM}{c^2 r} \tag{11.52}$$
If the particle starts at large r, with zero velocity, B² = 1 and the equation of
motion reduces to
$$\frac{dr}{d\tau} = -\left(\frac{2GM}{r}\right)^{1/2} \tag{11.53}$$
The solution to this equation is
$$r(\tau) = (2GM)^{1/3}\left(d - \frac{3}{2}\tau\right)^{2/3} \tag{11.54}$$
where d is a constant. It shows that r, as a function of the proper time, does not
show any singular behaviour at r = rs, and therefore the fall through r = rs is
smooth.
The behaviour as seen by an observer outside the field, on the other hand, is
different. Since B = 1 for the case under consideration, it follows from
Eqs. (11.42) and (11.53), that the time interval dt, as seen by the observer, is
$$dt = \frac{d\tau}{1 - r_s/r} = \frac{r^{1/2}\,dr}{-(2GM)^{1/2}(1 - r_s/r)} \tag{11.55}$$
Clearly the time interval ∆t → ∞ as r → rs. Physically, this means that with
respect to an outside observer, a black hole is frozen at r = rs.
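The contrast between the finite proper time of Eq. (11.53) and the diverging coordinate time of Eq. (11.55) can be made concrete. The sketch below works in units where lengths are measured in r_s and times in r_s/c, so that u = r/r_s; the closed-form antiderivative and all function names are part of this illustration, not of the text.

```python
import math

def proper_time(u_start, u_end):
    """Proper time to fall from u_start to u_end (units of r_s/c), from Eq. (11.53)."""
    return (2.0 / 3.0) * (u_start**1.5 - u_end**1.5)

def _antiderivative(u):
    """Antiderivative of u**1.5/(u - 1), valid for u > 1."""
    s = math.sqrt(u)
    return (2.0 / 3.0) * s**3 + 2.0 * s + math.log((s - 1.0) / (s + 1.0))

def coordinate_time(u_start, u_end):
    """Coordinate time for the same fall (units of r_s/c), from Eq. (11.55); needs u_end > 1."""
    return _antiderivative(u_start) - _antiderivative(u_end)

if __name__ == "__main__":
    print("proper time from 10 r_s to r_s:", proper_time(10.0, 1.0))
    for eps in (1e-3, 1e-6, 1e-9):
        # Grows without bound, like -ln(eps), as the end point approaches r_s.
        print(eps, coordinate_time(10.0, 1.0 + eps))
```

The proper time stays finite all the way to u = 1, while the coordinate time increases by about ln 10 for every factor of ten by which the end point is moved closer to r_s.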
It is easy to show that no light can escape from a black hole. For a light
signal one has ∆τ = 0. Therefore, from the last equation in Eqs. (11.42)
$$dt = \frac{dr}{c\,|1 - r_s/r|} \tag{11.56}$$
which implies that the time taken by the signal to escape from the black hole is
infinite:
$$t_{12} = \int_{r_1 < r_s}^{r_2} \frac{dr}{c\,|1 - r_s/r|} \tag{11.57}$$
$$\lim_{r_2 \to r_s} t_{12} \to \infty$$
This explains the name ‘black hole’ given to the object. It should be
mentioned that this result does not take quantum effects into account. It has
been shown by Hawking that quantum effects do allow some radiation to come
out from the black holes so that, strictly speaking, a black hole is not a black
hole.
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{1}{c^2}\left[\frac{(\Delta r)^2}{1 - Kr^2} + r^2(\Delta\theta)^2 + r^2\sin^2\theta\,(\Delta\phi)^2\right] \tag{11.58}$$
where K is the curvature of space. The spatial part of the metric is similar to the
metric of a spherical surface given in Eq. (11.29). The curvature of the space
may be written as
$$K = \frac{k}{R^2(t)} \tag{11.59}$$
where R(t) is called the cosmic scale factor, with k = 1 for positive curvature,
k = –1 for negative curvature and k = 0 for flat space. The metric may be written
in a more convenient form in terms of the dimensionless variable σ,
σ = r/R(t) (11.60)
in terms of which the separation between fundamental observers does not change
with time. In terms of the co-moving coordinate σ, the metric is
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{R^2(t)}{c^2}\left[\frac{(\Delta\sigma)^2}{1 - k\sigma^2} + \sigma^2(\Delta\theta)^2 + \sigma^2\sin^2\theta\,(\Delta\phi)^2\right] \tag{11.61}$$
which is known as the Robertson-Walker metric.
Distances
Distances in the Robertson-Walker metric, with ∆θ = ∆φ = 0, are given by
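$$D(t) = R(t)\int_0^{\sigma}\frac{d\sigma'}{(1 - k\sigma'^2)^{1/2}} \tag{11.62}$$
$$= \begin{cases} R(t)\sin^{-1}\sigma & \text{for } k = 1 \\ R(t)\,\sigma & \text{for } k = 0 \\ R(t)\sinh^{-1}\sigma & \text{for } k = -1 \end{cases} \tag{11.63}$$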
As might have been expected the distances between fundamental observers
are proportional to the scale factor. In the k = 1 case (positive curvature), the
distance D(t) is ambiguous to the extent of 2πnR(t), corresponding to going
around the closed universe n times. The surface area of the sphere is
given by 4πr², or
$$S = 4\pi R^2(t)\sin^2[D(t)/R(t)] \tag{11.64}$$
which is bounded. This is analogous to the length of the circumference of a
circle in the two dimensional case [see Eqs. (11.31) and (11.30)].
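The three cases of Eq. (11.63) can be checked against a direct numerical evaluation of the integral in Eq. (11.62). This is a minimal sketch with assumed function names; for k = 1 it returns only the principal value of the inverse sine, ignoring the 2πnR(t) ambiguity mentioned above.

```python
import math

def proper_distance(R, sigma, k):
    """Closed forms of Eq. (11.63) for k = 1, 0, -1."""
    if k == 1:
        return R * math.asin(sigma)
    if k == 0:
        return R * sigma
    if k == -1:
        return R * math.asinh(sigma)
    raise ValueError("k must be 1, 0 or -1")

def numerical_distance(R, sigma, k, steps=20000):
    """Midpoint-rule evaluation of the integral in Eq. (11.62)."""
    h = sigma / steps
    total = sum(1.0 / math.sqrt(1.0 - k * ((i + 0.5) * h) ** 2) for i in range(steps))
    return R * total * h

def sphere_area(R, D):
    """Surface area of Eq. (11.64) for the closed (k = 1) case."""
    return 4.0 * math.pi * R**2 * math.sin(D / R) ** 2

if __name__ == "__main__":
    for k in (1, 0, -1):
        print(k, proper_distance(2.0, 0.5, k), numerical_distance(2.0, 0.5, k))
```

Because sin[D/R] = σ for k = 1, `sphere_area` reduces to 4πR²σ² = 4πr², which is the bounded area referred to in the text.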
Velocities
The relative velocities of fundamental observers, are obtained from Eq. (11.62),
$$v = \frac{dD(t)}{dt} = \frac{\dot{R}(t)}{R(t)}\,D(t) \tag{11.65}$$
This allows us to identify the constant of proportionality in Hubble’s law. It
is observed that distant galaxies appear to be moving away with speeds
proportional to their distances. This is described by Hubble’s law v = Hr where
H is called Hubble’s constant. Comparison of this relation with Eq. (11.65)
leads to
$$H(t) = \frac{\dot{R}(t)}{R(t)} \tag{11.66}$$
Thus, Hubble’s constant, in general, is a function of time. At present it has
a value of about 1.8 × 10⁻¹⁸ s⁻¹.
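The value of H quoted above is given in SI units; converting it to the units commonly used by astronomers, and to a characteristic time 1/H, helps in comparing it with the ages quoted later in the chapter. The sketch below is illustrative; the conversion factors and function names are assumptions.

```python
H0 = 1.8e-18               # s^-1, the value quoted in the text
KM_PER_MPC = 3.0857e19     # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.156e7

def hubble_in_km_s_mpc(h_per_second):
    """Convert H from s^-1 to km/s per megaparsec."""
    return h_per_second * KM_PER_MPC

def hubble_time_years(h_per_second):
    """Characteristic time 1/H in years."""
    return 1.0 / (h_per_second * SECONDS_PER_YEAR)

def recession_velocity(distance_m, h_per_second=H0):
    """Hubble's law v = H D, in m/s for a distance in metres."""
    return h_per_second * distance_m

if __name__ == "__main__":
    print(hubble_in_km_s_mpc(H0))   # about 55 km/s per Mpc
    print(hubble_time_years(H0))    # about 1.8e10 years
```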
Red Shifts
The Robertson-Walker metric provides the proper framework for the description
of cosmological red shifts. Since the proper time for the propagation of radiation
is zero, ∆τ = 0, the time interval for the propagation is given by
$$\Delta t = \frac{R(t)\,\Delta\sigma}{c\,(1 - k\sigma^2)^{1/2}} \tag{11.67}$$
Consider now two crests emitted from a galaxy at times te and te + ∆te,
which are received by another galaxy at times t0 and t0 + ∆t0 respectively,
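$$\int_{t_e}^{t_0}\frac{dt}{R(t)} = \frac{1}{c}\int_{\sigma_0}^{\sigma_e}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} \tag{11.68}$$
$$\int_{t_e + \Delta t_e}^{t_0 + \Delta t_0}\frac{dt}{R(t)} = \frac{1}{c}\int_{\sigma_0}^{\sigma_e}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} \tag{11.69}$$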
$$= (t_0 - t_e)H_0 + \left(\frac{1}{2}q_0 + 1\right)H_0^2\,(t_0 - t_e)^2 + \ldots \tag{11.73}$$
The parameter q0, known as the deceleration parameter, is important in
the determination of the nature of the universe. For example, a positive q0 implies
a slowing down of the expansion of the universe. It is possible to estimate the
value of (t0 – te) from the study of the apparent brightness (essentially the radiation
received) of galaxies which, together with a knowledge of the red shifts, would
give an estimate of q0. Though the experimental uncertainties are too large at
present to yield a reliable value for q0, there are some indications that it is positive
(see Ref. 10).
$$ma = -\frac{Gm\,\frac{4\pi}{3}r^3\rho(t)}{r^2} \tag{11.74}$$
which in terms of R(t) [see Eq. (11.60)] reads as
$$\dot{R}^2 = \frac{8\pi}{3}\,G\rho(t_0)\,\frac{R^3(t_0)}{R(t)} - kc^2 \tag{11.76}$$
Here, the constant of integration −kc² is a measure of the total energy of
the particle, and is related to the curvature index k (k = ±1, 0) by the solutions to
the field equations. The relation suggests that the universe is closed for k = 1,
i.e. $\dot{R}(t)$ becomes zero for sufficiently large R(t) and changes its sign, but open
for k = −1 or 0, i.e. $\dot{R}(t) \neq 0$. This can be compared with what happens to a body
thrown up from the surface of the earth. If the initial velocity v_in is less than the
escape velocity v_esc (E_tot < 0), the body will reach a maximum height and return
back to the earth, corresponding to the closed universe. If v_in > v_esc (E_tot > 0), the
body will go on forever, corresponding to the open universe. When v_in = v_esc
(E_tot = 0), the body just manages to escape from the earth.
In all the three cases of Eq. (11.76), $\dot{R}$ is positive now, and was larger at
earlier times. Hence the universe described by Eq. (11.76) had a big-bang origin.
The solution for R(t) can be obtained by integrating Eq. (11.76) in the usual
way:
[Fig. 11.2 The scale factor R(t) of the universe as a function of time t, with curves for k = −1, k = 0 and k = 1. Axis marks: R_m on the R(t) axis; 0, t_{1/2} and 2t_{1/2} on the t axis.]
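For the closed case k = 1, Eq. (11.76) has a well-known parametric (cycloid) solution, R = (R_m/2)(1 − cos η), t = (R_m/2c)(η − sin η), where R_m is the maximum scale factor. This parametrisation is not written out in the text above and is used here as an assumption, together with the function names; the sketch checks that it satisfies Eq. (11.76) with (8π/3)Gρ(t0)R³(t0) = c²R_m and that the full cycle lasts πR_m/c.

```python
import math

def closed_universe(eta, R_m=1.0, c=1.0):
    """Cycloid solution for k = +1; returns (t, R) for the parameter eta."""
    R = 0.5 * R_m * (1.0 - math.cos(eta))
    t = 0.5 * (R_m / c) * (eta - math.sin(eta))
    return t, R

def friedmann_residual(eta, R_m=1.0, c=1.0):
    """Rdot**2 - (c**2 * R_m / R - c**2); zero if Eq. (11.76) with k = 1 holds."""
    dR_deta = 0.5 * R_m * math.sin(eta)
    dt_deta = 0.5 * (R_m / c) * (1.0 - math.cos(eta))
    R_dot = dR_deta / dt_deta
    _, R = closed_universe(eta, R_m, c)
    return R_dot**2 - (c**2 * R_m / R - c**2)

if __name__ == "__main__":
    print(closed_universe(math.pi))        # maximum expansion: R = R_m at t = pi*R_m/(2c)
    print(closed_universe(2.0 * math.pi))  # recollapse: R = 0 at t = pi*R_m/c
    print(friedmann_residual(1.0))         # close to zero
```

In these units the time of maximum expansion plays the role of t_{1/2} and the recollapse time that of 2t_{1/2} in the figure.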
$$2t_{1/2} = \frac{\pi R_m}{c} \approx 1.2 \times 10^{11}\ \text{years} \tag{11.82}$$
The present age of the universe is estimated from Eq. (11.77) to be
$$t_0 \approx 10^{10}\ \text{years} \tag{11.83}$$
Since the universe is closed in this case, there are no problems of boundary
conditions at infinity, and Mach’s principle is incorporated in the theory.
(iii) k = −1, ever-expanding universe: In this case, R(t) increases as t^{2/3} for
small t but increases as t for large t. In this model, following the same
steps as for k = 1, the present age of the universe comes out to be somewhat
larger [ρ(t0) ≈ 3 × 10⁻²⁸ kg/m³],
$$t_0 \approx 1.8 \times 10^{10}\ \text{years} \tag{11.84}$$
Though the evidence from various sources, e.g. from the estimates of
deceleration parameter, is not definitive, it does favour a closed universe. This
means that not only did the universe start with a big bang, it will also collapse to
the original dense state. What it will do beyond that point is not very clear—it
may end at that point or start again giving an oscillating universe.
matter must have been in an ionized state. Since matter in ionized state has
much greater interaction with radiation than does matter in atomic state, the radiation
had an equilibrium black-body spectrum characterized by temperature T. In this
condition, the spectrum of photons, i.e. number of photons in the range v and v
+ dv, is deduced from the Planck expression [Eq. (2.12)],
$$dN = \frac{8\pi\nu^2\,V\,d\nu}{c^3[\exp(h\nu/kT) - 1]} \tag{11.85}$$
As the universe expands, the temperature falls. It can be shown that the
photon spectrum maintains its black-body characteristics, but with a lower
temperature. As a result of the expansion, V changes to V′,
$$V' = \frac{R'^3}{R^3}\,V \tag{11.86}$$
Furthermore, the frequency gets red-shifted [see Eq. (11.72); the red shift
is present even for reflection by a body moving away] and is given by
$$\nu' = \frac{R}{R'}\,\nu \tag{11.87}$$
Hence the spectrum is now given by
$$dN = \frac{8\pi\nu'^2\,V'\,d\nu'}{c^3[\exp(h\nu'/kT') - 1]} \tag{11.88}$$
where T′ = (R/R′)T. Furthermore, the energy density is given by
$$\varepsilon'_{\rm rad} = \frac{4}{c}\,\sigma T'^4 = \frac{4\sigma R^4 T^4}{cR'^4} \tag{11.89}$$
where σ is the Stefan-Boltzmann constant, σ = 5.67 × 10⁻⁸ W m⁻² K⁻⁴. Such
radiation was observed by Penzias and Wilson (1965), with a characteristic
temperature T′ = 2.7 K. This provides strong support to the big bang theory.
The equivalent mass density is
$$\rho'_{\rm rad} = \frac{4}{c^3}\,\sigma T'^4$$
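The energy density of Eq. (11.89) and the equivalent mass density can be evaluated for the observed temperature of 2.7 K. The sketch below is illustrative; the function names are assumptions, and the value of c is an assumed standard constant.

```python
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W m^-2 K^-4 (value from the text)
C = 2.998e8       # speed of light, m/s

def radiation_energy_density(T):
    """Energy density 4*sigma*T^4/c of black-body radiation, in J/m^3."""
    return 4.0 * SIGMA * T**4 / C

def radiation_mass_density(T):
    """Equivalent mass density 4*sigma*T^4/c^3, in kg/m^3."""
    return radiation_energy_density(T) / C**2

def temperature_after_expansion(T, expansion_ratio):
    """T' = T * R/R' for an expansion by the factor R'/R = expansion_ratio."""
    return T / expansion_ratio

if __name__ == "__main__":
    print(radiation_energy_density(2.7))   # about 4e-14 J/m^3
    print(radiation_mass_density(2.7))     # about 4.5e-31 kg/m^3
```

The mass density of the radiation is therefore well below the matter density ρ(t0) ≈ 3 × 10⁻²⁸ kg/m³ quoted in the text.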
$$= b'/l_0^{3/2} \tag{11.96}$$
where $l_0 = a/r_0^2$, and a, b, b′ are constants. This implies that
$$\ln N = -\frac{3}{2}\ln l_0 + d \tag{11.97}$$
Observationally, the number of faint sources appears to be larger,
$$\ln N \sim -2.0\,\ln l_0 + d' \tag{11.98}$$
This is an interesting result. In the big-bang theory, it only means that since
the radiation from far away sources was emitted earlier, there must have been
(i) more numerous radio sources and/or (ii) brighter radio sources, at early times.
allow for a unified approach. In a sense, the distinctions between the different
domains are becoming blurred and ultimately there may be only the core of
basic laws of physics in terms of which all observations can be explained.
11.7 EXAMPLES
In this section, some examples related to the ideas discussed earlier are given.
Example 1
Olbers argued that in an infinite, static universe, every part of the sky must have
a brightness comparable to that of the sun.
In an infinite, static universe, every part of the sky will be covered by a star.
Let the visible part of one such star subtend a solid angle ∆Ω at the earth. Now,
the intensity of radiation received at the earth, due to this star at distance d, is
proportional to the exposed area d²∆Ω of the star, and decreases as 1/d². Therefore
it is proportional to d²∆Ω (1/d²) ~ ∆Ω, which is independent of the distance of
the star. Therefore, an equivalent area (d²_sun ∆Ω) of the star might as well be at
a distance equal to that of the sun. If we make the reasonable assumption that
the stars in general have approximately the same inherent brightness as that of
the sun, we would then expect every part of the sky to have the brightness of the
sun, day or night. Any attempt at an explanation in terms of absorption does not
succeed since at equilibrium the absorbing material must emit as much energy
as it absorbs.
Example 2
The red shift of radiation in the presence of a gravitational field, to the leading
order in the field, can be shown to follow from energy conservation.
Consider an atom of mass m₂ which goes to a state of lower mass m₁, with
the emission of a photon of frequency ν₀. Then energy conservation implies
$$m_2 c^2 = m_1 c^2 + h\nu_0 \tag{11.99}$$
If the atom is placed in a gravitational potential φ, its initial energy is
m₂c² + m₂φ while the final energy is m₁c² + m₁φ. Then the frequency ν of the
photon, which comes out of the potential, is given by
$$h\nu = (m_2 c^2 + m_2\phi) - (m_1 c^2 + m_1\phi) = (m_2 c^2 - m_1 c^2)(1 + \phi/c^2) \tag{11.100}$$
Using Eq. (11.99),
$$\nu = \nu_0\,(1 + \phi/c^2) \tag{11.101}$$
which agrees with the general expression in Eq. (11.39) to the leading order.
For emission from the sun, φ/c² ≈ −2 × 10⁻⁶.
Example 3
When two clocks accelerate with respect to each other, they show different
proper times. This provides a solution to the twin paradox.
The metric near the surface of the earth, in the local inertial frame (at rest
with respect to distant galaxies) is the Schwarzschild metric in Eq. (11.34).
Consider two clocks, 1 and 2 which go around with angular velocities ω1 and
ω2. If they are together at the beginning and again at the end, the proper times
shown by them are
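$$\tau_i = \int_0^t\left(1 - \frac{2GM}{c^2 r} - \frac{r^2\omega_i^2}{c^2}\right)^{1/2}dt = \left(1 - \frac{2GM}{c^2 r} - \frac{r^2\omega_i^2}{c^2}\right)^{1/2}t \tag{11.102}$$
where t is the time coordinate. Therefore
$$\frac{\tau_1 - \tau_2}{\tau_1} \approx \frac{r^2(\omega_2^2 - \omega_1^2)}{2c^2} \tag{11.103}$$
This relation was verified by keeping clock 1 at rest on Earth, ω₁ = 2π rad/day,
and taking clock 2 around the earth with velocity v, ω₂ = ω₁ ± v/r. For
v ≈ 800 km/h,
$$\frac{\tau_1 - \tau_2}{\tau_1} \approx \begin{cases} 1.42 \times 10^{-12} & \text{for } \omega_2 = \omega_1 + v/r, \\ -0.87 \times 10^{-12} & \text{for } \omega_2 = \omega_1 - v/r \end{cases} \tag{11.104}$$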
It is important to note that a clock with greater acceleration shows smaller
time, which explains the longer lifetime observed for particles going around in
accelerators.
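The two values in Eq. (11.104) follow from Eq. (11.103) once a radius is chosen. The sketch below assumes the mean radius of the earth, r = 6.371 × 10⁶ m, and a 24-hour day; neither value is stated in the example, and the function name is illustrative.

```python
import math

C = 2.998e8             # speed of light, m/s
R_EARTH = 6.371e6       # assumed mean radius of the earth, m
OMEGA_1 = 2.0 * math.pi / 86400.0   # 2*pi rad/day in rad/s (24-hour day assumed)

def fractional_proper_time_difference(r, omega1, omega2):
    """(tau1 - tau2)/tau1 from Eq. (11.103)."""
    return r**2 * (omega2**2 - omega1**2) / (2.0 * C**2)

if __name__ == "__main__":
    v = 800.0 / 3.6     # 800 km/h in m/s
    eastward = fractional_proper_time_difference(R_EARTH, OMEGA_1, OMEGA_1 + v / R_EARTH)
    westward = fractional_proper_time_difference(R_EARTH, OMEGA_1, OMEGA_1 - v / R_EARTH)
    print(eastward)     # about  1.42e-12
    print(westward)     # about -0.87e-12
```

The asymmetry between the two directions comes from the cross term 2rω₁v, which changes sign with the direction of flight, while the v² term does not.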
Example 4
An interesting idea in cosmology is what is called the object horizon. This is
the value σoh of the farthest object which is visible to us. The signal reaching us,
from this object, must have been emitted at the beginning of the universe, i.e. at
t = 0. Since ∆τ = 0 for the propagation of light, one has [Eq. (11.61)]
$$\int_0^{t_0}\frac{dt}{R(t)} = \frac{1}{c}\int_0^{\sigma_{oh}}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} \tag{11.105}$$
$$\sigma_{oh} = \sin\left(c\int_0^{t_0}\frac{dt}{R(t)}\right) \tag{11.106}$$
PROBLEMS
1. A photon is moving horizontally on the surface of the earth. What is the
height through which it falls in travelling 100 m?
2. Starting from the flat-space metric, obtain the metric for the frame which
rotates with angular velocity ω along the z-direction. Write down the
equations for geodesics in this frame. Exhibit the Coriolis and centrifugal
forces in the nonrelativistic approximation.
3. For a particle going around an accelerator, show that the lifetime is given
by τ = τ0 (1 − ω²r²/c²)^{−1/2}, where ω is the angular velocity, and r is the
radius of the orbit. What is the expression for τ if ω is changing but
r remains a constant?
4. Curvature of a surface may be defined in terms of area also. Show that
curvature of a spherical surface is given by
$$K = \frac{12}{\pi}\lim_{a \to 0}\frac{\pi a^2 - A}{a^4}$$
where A is the area of the surface and a is the distance of any point on the
circumference of the circle from the centre of the surface.
5. What is the Schwarzschild radius of the earth?
6. Show that for circular motion in the Schwarzschild metric
$$\left(\frac{d\phi}{d\tau}\right)^2\left(1 - \frac{3GM}{rc^2}\right) = \frac{GM}{r^3}$$
$r = 3GM/c^2$ and $d\phi/dt = c/(3^{1/2}\,r)$.
8. Consider the Robertson-Walker metric with k = 0. If a signal is emitted
at t and received at t0,
(a) show that
$$\frac{D(t)}{R(t)} = \sigma = \left(\frac{12c}{R_m}\right)^{1/3}\left(t_0^{1/3} - t^{1/3}\right)$$
(b) show that
$$v(t) = \frac{2}{3t}\,D(t)$$
(c) show that
$$D(t) = \frac{3}{2}\,t_0\,\frac{v}{(1 + v/2c)^3}.$$
9. In the steady state theory of the universe, the decrease in the density of
matter due to expansion of the universe is compensated by continuous
creation of matter. Using continuity equation show that the rate of creation
is given by
$$\frac{d\rho_c}{dt} = 3\rho_0 H$$
Given that ρ0 ≈ 3 × 10⁻²⁸ kg/m³, estimate the rate of creation in terms of
protons/m³/s. Argue that the steady state theory does not imply a sky with
a uniform brightness equal to that of the sun.
References
General Books on Modern Physics
1. Leighton R.B., Principles of Modern Physics, McGraw-Hill, New York,
1959.
2. Richtmyer F.K., E.H. Kennard and J.N. Cooper, Introduction to Modern
Physics, McGraw-Hill, New York, 1969.
3. French A.P., Principles of Modern Physics, John Wiley, London, 1958.
4. Weidner R.T. and R.L. Sells, Elementary Modern Physics, Allyn and
Bacon, Boston, 1980.
5. Sproull R.L. and W.A. Phillips, Modern Physics, John Wiley, New York,
1980.
6. Savelyev I.V., Physics, a General Course, vol. III, Mir, Moscow, 1981.
7. Beiser A., Perspectives of Modern Physics, McGraw-Hill, New York,
1973.
Quantum Mechanics
14. Pauling L. and E.B. Wilson, Introduction to Quantum Mechanics,
McGraw-Hill, New York, 1935.
15. Fermi E., Notes on Quantum Mechanics, University of Chicago Press,
Chicago, 1961.
© The Editor(s) (if applicable) and The Author(s), under exclusive license 419
to Springer Nature Switzerland AG 2021
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7
Nuclear Physics
36. Elton L.R.B., Introductory Nuclear Theory, Interscience, New York, 1959.
37. Segre E., Nuclei and Particles, Benjamin, New York, 1965.
38. Enge H.A., Introduction to Nuclear Physics, Addison-Wesley, Reading,
1966.
39. Bethe H.A. and P. Morrison, Elementary Nuclear Theory, John Wiley,
New York, 1956.
40. Preston M.A., Physics of the Nucleus, Addison-Wesley, Reading, 1962.
41. Murray R.L., Nuclear Energy, Pergamon, New York, 1975.
Elementary Particles
42. Longo M.J., Fundamentals of Elementary Particles, McGraw-Hill, New
York, 1973.
43. Yang C.N., Elementary Particles, Princeton University Press, Princeton,
1962.
44. Livingston M.S., Particle Physics, McGraw-Hill, New York, 1968.
Answers to Problems
Chapter 1
1. Time period is $(1 + v/c)/(1 - \beta^2)^{1/2}$
5. 14%; 10 km 7. 0.9974c
9. (M2 – m2) c2/2M; 224.6 MeV, 3.4 eV
11. 0.875c, 0.999994c 12. 287 km/s; 6556.7 Å
Chapter 2
1. 5800 K; for significant number of hydrogen atoms to be in the excited
states, kT ~ 10 eV
2. 7.13 × 10³ J/s; 0.019 J/s 4. 19; 4 × 10¹⁷ m⁻² s⁻¹
5. $[2mc^2(h\nu - \varepsilon) + h^2\nu^2 - 2h\nu\,(2mc^2(h\nu - \varepsilon))^{1/2}\cos\phi]/2Mc^2$
Chapter 3
1. $\frac{1}{\pi}\left(\frac{h}{2a}\right)^{1/2}\frac{\sin(ka/\hbar)}{k}$
2. T = 4r/(1 + r)², r = (E + V₀)^{1/2}/E^{1/2}, R = 1 − T
3. ν = hn/4ml² = v/2l
4. A = (πa³)^{−1/2}, a = 4πε₀ħ²/me², E = −ħ²/2ma²; P = 13e⁻⁴
5. a = (km)^{1/2}/2ħ; A = (8/3)^{1/2}(2a)^{5/4}/π^{1/4}
Chapter 4
1. 2E₁, −E₁ 2. 13e⁻⁴, 0
3. E = E₁/4 4. a₁, 4e⁻²/a₁
5. 10⁻¹⁰ Z² E₁
$\nu_2 = 2.462 \times 10^{15}\left(1 + \frac{5}{16}\alpha^2\right)\ \mathrm{s^{-1}}$
9. 7 lines
10. S0, S1, P0, P1, P1, P2; P1 → S1, P1 → S0,
P0 → S1, P2 → S1, P1 → S1, P1 → S0
Chapter 5
3. 1, 9
6. $^2S_{1/2}$, $^1S_0$, $^2D_{3/2}$, $^3F_2$, $^4F_{3/2}$, $^7S_3$, $^6S_{5/2}$, $^5D_4$, $^4F_{9/2}$, $^3F_4$, $^2S_{1/2}$, $^1S_0$
7. CLS 2 = 0.6 eV 8. 2.1 eV
9. 25.6 kV, 3.8 kV; 0.57 Å; 0.48 Å, 3.3 Å
10. 40; 11.1 keV 11. 4.13 × 10–15
12. 6.8 keV 13. 3.8 eV
14. 470 N/m, 1.27 Å 15. 0.059
Chapter 6
1. ∆E = ± eħB/2m, 0; ± 0.00160 Å, 0; ± 0.208 Å, 0
2. $\Delta\nu = \frac{eB}{4\pi m}\left(\frac{4}{5}M_J - \frac{4}{3}M_{J'}\right)$; max ∆λ is 0.5 Å
3. $\Delta\nu = \frac{eB}{4\pi m}\left(\frac{4}{5}M_J - \frac{2}{3}M_{J'}\right)$
4. g = 3/2; 4.2 × 10¹⁰ s⁻¹ 6. g = 0.40
7. 0.025
8. 2 exp (−E₂/kT) = exp (−E₁/kT) + exp (−E₃/kT)
9. 4.55 × 10⁻⁴ % 10. ∆λ = 3.98 Å, 6.64 Å
Chapter 7
1. Allowed occupation numbers are A = (1, 0, 0, 2),
B = (0, 1, 1, 1), C = (0, 0, 3, 0)
(a) 3, 12, 8 (b) 1, 2, 4 (c) 0, 1, 0 are the numbers of arrangements for A,
B,C
2. $\frac{1}{2}kT$; 1.35 × 10⁻⁶ 3. 7.5 × 10⁻⁷ %
4. 11.7 %, 1.6 % 5. 2.8 R
6. $E = 4N_0 kT\left(\frac{T}{\theta}\right)^2\int_0^{\theta/T}\frac{x^2\,dx}{e^x - 1}$
7. 350 K
8. R mol⁻¹ K⁻¹ 9. $4\pi V(2v_t^{-3} + v_l^{-3})\left(\frac{kT}{h}\right)^3\int_0^{\theta/T}\frac{x^2\,dx}{e^x - 1}$
11. 5.5 eV, 3.3 eV; 1 − 1.8 × 10⁻⁵, 1 − 5.1 × 10⁻⁵
Chapter 8
1. 6.7 eV, 0.89 eV 3. C = 8.7 keV, a = 0.31 Å
4. 6.0 × 10²³ mol⁻¹
5. sin θ = 0.50, 0.71, 0.87; no; no
6. $R(3^{1/2} - 1)$; $R(2/3^{1/2} - 1)$; $R(3^{1/2}/2^{1/2} - 1)$
21. $E = \frac{Neg\hbar B}{4m}\,\frac{e^{-y} - e^{y}}{e^{-y} + e^{y}}$, $y = \frac{eg\hbar B}{4mkT}$
22. 3T
22. 3T
Chapter 9
2. ∆E ≈ 0.72 A^{2/3} MeV 3. 0.32
6. 4.79 eħ/2m_p for l = 2
7. s_{1/2}, p_{1/2}, s_{1/2}, h_{9/2}; −1.91, −0.26, 2.79, 2.62 in units of eħ/2m_p
8. 3.9 × 10⁻⁵⁴ kg m², I_tot = 8.7 × 10⁻⁵⁴ kg m², E = 0.336 MeV
9. 0.033 MeV and 0.076 MeV
10. Ru $\xrightarrow{\beta^-}$ Rh $\xrightarrow{\beta^-}$ Pd, Ag $\xrightarrow{\beta^+}$ Pd, Cd $\xrightarrow{\beta^+}$ Ag;
Ag and Cd decay also by electron capture.
11. Ni $\xrightarrow{\beta^-}$ Cu, Zn → Cu by electron capture
12. 27.9 MeV, 4.23 MeV, 0.07 MeV
13. 8.36 MeV, 0.076 MeV
14. 31.9739 m_u
15. 5.5 × 10⁻⁵ MeV, 4.6 × 10⁻²² kg m/s
16. 8; 6; 1.54 × 10⁻¹⁰ g; 1.45 × 10⁹ years
17. 2.05 MeV 18. 1.93 × 10⁻²⁶ m²
19. 53 g
20. Fusion would require 4.6% change and fission 30% change in 5 × 10⁹
years
Chapter 10
2. $h\nu = mc^2\,\frac{(E + mc^2)^{1/2}}{(E + mc^2)^{1/2} - (E - mc^2)^{1/2}\cos\theta}$
3. $(m_K^2 - 3m_\pi^2)/2m_K$
4. Σ⁰ → Λ⁰ + γ decay is due to electromagnetic interaction, whereas the
decay of Σ± is due to weak interaction
5. About 2 × 10⁻¹⁸ m 6. 4 × 10³ s
7. 0.82 Wb/m²
Chapter 11
1. 5.5 × 10⁻¹³ m
2. $g_{00} = 1 - \frac{\omega^2}{c^2}(x^2 + y^2)$, $g_{0x} = \frac{\omega}{c^2}\,y$, $g_{0y} = -\frac{\omega}{c^2}\,x$
3. $\tau_0 = \int_0^{\tau}\left(1 - \frac{r^2\omega^2}{c^2}\right)^{1/2}dt$ 5. 9 × 10⁻³ m
9. 2 × 10⁻¹⁸ m⁻³ s⁻¹